IBM power PS700, PS704, PS703 Problem Determination And Service Manual

Power Systems
Problem Determination and Service Guide for the IBM Power PS700 (8406-70Y)

GI11-9831-00
Note
Before using this information and the product it supports, read the information in “Notices,” on page 271, “Safety notices” on page v, the IBM Systems Safety Notices manual, G229-9054, and the IBM Environmental Notices and User Guide, Z125–5823.
This edition applies to IBM Power Systems servers that contain the POWER7 processor and to all associated models.
© Copyright IBM Corporation 2010, 2011.
US Government Users Restricted Rights – Use, duplication or disclosure restricted by GSA ADP Schedule Contract with IBM Corp.

Contents

Safety notices ............v
Chapter 1. Introduction ........1
Related documentation ...........1
Notices and statements ...........2
Features and specifications..........2
Supported DIMMs ............4
Blade server control panel buttons and LEDs . . . 5
Turning on the blade server .........6
Turning off the blade server .........7
System-board layouts ...........8
System-board connectors .........8
System-board LEDs ...........9
Chapter 2. Diagnostics ........11
Diagnostic tools .............11
Collecting dump data ...........13
Location codes .............14
Reference codes .............15
System reference codes (SRCs) .......16
1xxxyyyy SRCs ...........17
6xxxyyyy SRCs ...........21
A1xxyyyy service processor SRCs .....24
AA00E1A8 to AA260005 Partition firmware
attention codes ...........25
Bxxxxxxx Service processor early termination
SRCs ..............28
B200xxxx Logical partition SRCs .....29
B700xxxx Licensed internal code SRCs . . . 39
BA000010 to BA400002 Partition firmware
SRCs ..............48
POST progress codes (checkpoints) .....84
C1001F00 to C1645300 Service processor
checkpoints ............85
C2001000 to C20082FF Virtual service
processor checkpoints .........93
IPL status progress codes .......102
C700xxxx Server firmware IPL status
checkpoints ...........102
CA000000 to CA2799FF Partition firmware
checkpoints ............102
D1001xxx to D1xx3FFF Service processor
dump codes ............120
D1xx3y01 to D1xx3yF2 Service processor
dump codes ...........125
D1xx900C to D1xxC003 Service processor
power-off checkpoints ........128
Service request numbers (SRNs) ......129
Using the SRN tables .........129
101-711 through FFC-725 SRNs .....129
A00-FF0 through A24-xxx SRNs .....157
SCSD Devices SRNs (ssss-102 to ssss-640) 177
Failing function codes 151 through 2E33 . . 181
Error logs ..............183
Checkout procedure ...........184
About the checkout procedure.......184
Performing the checkout procedure .....184
Verifying the partition configuration......186
Running the diagnostics program ......186
Starting AIX concurrent diagnostics .....186
Starting stand-alone diagnostics from a CD . . 187
Starting stand-alone diagnostics from a NIM
server ...............188
Using the diagnostics program ......189
Boot problem resolution..........190
Troubleshooting tables ..........191
General problems ...........191
Drive problems............192
Intermittent problems .........192
Management module service processor
problems ..............193
Memory problems ...........193
Microprocessor problems ........194
Network connection problems.......194
PCI expansion card (PIOCARD) problem
isolation procedure ..........194
Optional device problems ........196
Power problems ...........196
POWER Hypervisor (PHYP) problems ....198
Service processor problems........200
Software problems...........213
Universal Serial Bus (USB) port problems . . . 213
Light path diagnostics ..........214
Viewing the light path diagnostic LEDs . . . 214
Light path diagnostics LEDs .......215
Isolating firmware problems ........218
Save vfchost map data ..........218
Restore vfchost map data .........219
Recovering the system firmware .......220
Starting the PERM image ........220
Starting the TEMP image ........221
Recovering the TEMP image from the PERM
image ...............221
Verifying the system firmware levels ....221
Committing the TEMP system firmware image 222
Solving shared BladeCenter resource problems . . 222
Solving shared media tray problems.....223
Solving shared network connection problems 225
Solving shared power problems ......225
Solving shared video problems ......226
Solving undetermined problems .......227
Calling IBM for service ..........228
Chapter 3. Parts listing, Type 8406 229
Chapter 4. Removing and replacing
blade server components ......233
Installation guidelines ..........233
System reliability guidelines .......234
Handling static-sensitive devices ......234
Returning a device or component .....234
Removing the blade server from a BladeCenter
unit ................235
Installing the blade server in a BladeCenter unit 236
Removing and replacing Tier 1 CRUs .....237
Removing the blade server cover ......237
Installing and closing the blade server cover . . 239
Removing the bezel assembly .......240
Installing the bezel assembly .......240
Removing a drive ...........241
Installing a drive ...........242
Removing a memory module .......244
Installing a memory module .......245
Removing and installing an I/O expansion card 246
Removing a CIOv form-factor expansion card 247
Installing a CIOv form-factor expansion card 247
Removing a combination-form-factor
expansion card ...........249
Installing a combination-form-factor
expansion card ...........249
Removing the battery .........250
Installing the battery ..........251
Removing the disk drive tray .......252
Installing the disk drive tray .......253
Removing the tier 2 management card .....255
Installing the tier 2 management card .....256
Obtaining a PowerVM Virtualization Engine
system technologies activation code ......257
Replacing the FRU system-board and chassis
assembly ...............260
Chapter 5. Configuring .......263
Updating the firmware ..........263
Configuring the blade server ........264
Using the SMS utility...........265
Starting the SMS utility .........265
SMS utility menu choices ........265
Creating a CE login ...........265
Configuring the Gigabit Ethernet controllers . . . 266
Blade server Ethernet controller enumeration . . . 267
MAC addresses for host Ethernet adapters . . . 267
Configuring a RAID array .........268
Updating IBM Director ..........268
Appendix. Notices .........271
Trademarks ..............272
Electronic emission notices .........273
Class A Notices............273
Class B Notices............277
Terms and conditions...........280

Safety notices

Safety notices may be printed throughout this guide:
v DANGER notices call attention to a situation that is potentially lethal or extremely hazardous to people.
v CAUTION notices call attention to a situation that is potentially hazardous to people because of some existing condition.
v Attention notices call attention to the possibility of damage to a program, device, system, or data.
World Trade safety information
Several countries require the safety information contained in product publications to be presented in their national languages. If this requirement applies to your country, a safety information booklet is included in the publications package shipped with the product. The booklet contains the safety information in your national language with references to the U.S. English source. Before using a U.S. English publication to install, operate, or service this product, you must first become familiar with the related safety information in the booklet. You should also refer to the booklet any time you do not clearly understand any safety information in the U.S. English publications.
German safety information
Das Produkt ist nicht für den Einsatz an Bildschirmarbeitsplätzen im Sinne § 2 der Bildschirmarbeitsverordnung geeignet.
Laser safety information
IBM® servers can use I/O cards or features that are fiber-optic based and that utilize lasers or LEDs.
Laser compliance
IBM servers may be installed inside or outside of an IT equipment rack.
DANGER
When working on or around the system, observe the following precautions:
Electrical voltage and current from power, telephone, and communication cables are hazardous. To avoid a shock hazard:
v Connect power to this unit only with the IBM provided power cord. Do not use the IBM provided power cord for any other product.
v Do not open or service any power supply assembly.
v Do not connect or disconnect any cables or perform installation, maintenance, or reconfiguration of this product during an electrical storm.
v The product might be equipped with multiple power cords. To remove all hazardous voltages, disconnect all power cords.
v Connect all power cords to a properly wired and grounded electrical outlet. Ensure that the outlet supplies proper voltage and phase rotation according to the system rating plate.
v Connect any equipment that will be attached to this product to properly wired outlets.
v When possible, use one hand only to connect or disconnect signal cables.
v Never turn on any equipment when there is evidence of fire, water, or structural damage.
v Disconnect the attached power cords, telecommunications systems, networks, and modems before you open the device covers, unless instructed otherwise in the installation and configuration procedures.
v Connect and disconnect cables as described in the following procedures when installing, moving, or opening covers on this product or attached devices.
To Disconnect:
1. Turn off everything (unless instructed otherwise).
2. Remove the power cords from the outlets.
3. Remove the signal cables from the connectors.
4. Remove all cables from the devices.
To Connect:
1. Turn off everything (unless instructed otherwise).
2. Attach all cables to the devices.
3. Attach the signal cables to the connectors.
4. Attach the power cords to the outlets.
5. Turn on the devices.
(D005)
DANGER
Observe the following precautions when working on or around your IT rack system:
v Heavy equipment–personal injury or equipment damage might result if mishandled.
v Always lower the leveling pads on the rack cabinet.
v Always install stabilizer brackets on the rack cabinet.
v To avoid hazardous conditions due to uneven mechanical loading, always install the heaviest
devices in the bottom of the rack cabinet. Always install servers and optional devices starting from the bottom of the rack cabinet.
v Rack-mounted devices are not to be used as shelves or work spaces. Do not place objects on top
of rack-mounted devices.
v Each rack cabinet might have more than one power cord. Be sure to disconnect all power cords in
the rack cabinet when directed to disconnect power during servicing.
v Connect all devices installed in a rack cabinet to power devices installed in the same rack
cabinet. Do not plug a power cord from a device installed in one rack cabinet into a power device installed in a different rack cabinet.
v An electrical outlet that is not correctly wired could place hazardous voltage on the metal parts of
the system or the devices that attach to the system. It is the responsibility of the customer to ensure that the outlet is correctly wired and grounded to prevent an electrical shock.
CAUTION
v Do not install a unit in a rack where the internal rack ambient temperatures will exceed the
manufacturer's recommended ambient temperature for all your rack-mounted devices.
v Do not install a unit in a rack where the air flow is compromised. Ensure that air flow is not
blocked or reduced on any side, front, or back of a unit used for air flow through the unit.
v Consideration should be given to the connection of the equipment to the supply circuit so that
overloading of the circuits does not compromise the supply wiring or overcurrent protection. To provide the correct power connection to a rack, refer to the rating labels located on the equipment in the rack to determine the total power requirement of the supply circuit.
v (For sliding drawers.) Do not pull out or install any drawer or feature if the rack stabilizer brackets
are not attached to the rack. Do not pull out more than one drawer at a time. The rack might become unstable if you pull out more than one drawer at a time.
v (For fixed drawers.) This drawer is a fixed drawer and must not be moved for servicing unless
specified by the manufacturer. Attempting to move the drawer partially or completely out of the rack might cause the rack to become unstable or cause the drawer to fall out of the rack.
(R001)
CAUTION: Removing components from the upper positions in the rack cabinet improves rack stability during relocation. Follow these general guidelines whenever you relocate a populated rack cabinet within a room or building:
v Reduce the weight of the rack cabinet by removing equipment starting at the top of the rack
cabinet. When possible, restore the rack cabinet to the configuration of the rack cabinet as you received it. If this configuration is not known, you must observe the following precautions:
– Remove all devices in the 32U position and above.
– Ensure that the heaviest devices are installed in the bottom of the rack cabinet.
– Ensure that there are no empty U-levels between devices installed in the rack cabinet below the
32U level.
v If the rack cabinet you are relocating is part of a suite of rack cabinets, detach the rack cabinet from
the suite.
v Inspect the route that you plan to take to eliminate potential hazards.
v Verify that the route that you choose can support the weight of the loaded rack cabinet. Refer to the
documentation that comes with your rack cabinet for the weight of a loaded rack cabinet.
v Verify that all door openings are at least 760 x 2030 mm (30 x 80 in.).
v Ensure that all devices, shelves, drawers, doors, and cables are secure.
v Ensure that the four leveling pads are raised to their highest position.
v Ensure that there is no stabilizer bracket installed on the rack cabinet during movement.
v Do not use a ramp inclined at more than 10 degrees.
v When the rack cabinet is in the new location, complete the following steps:
– Lower the four leveling pads.
– Install stabilizer brackets on the rack cabinet.
– If you removed any devices from the rack cabinet, repopulate the rack cabinet from the lowest
position to the highest position.
v If a long-distance relocation is required, restore the rack cabinet to the configuration of the rack
cabinet as you received it. Pack the rack cabinet in the original packaging material, or equivalent. Also lower the leveling pads to raise the casters off of the pallet and bolt the rack cabinet to the pallet.
(R002)
(L001)
(L002)
(L003)
All lasers are certified in the U.S. to conform to the requirements of DHHS 21 CFR Subchapter J for class 1 laser products. Outside the U.S., they are certified to be in compliance with IEC 60825 as a class 1 laser product. Consult the label on each part for laser certification numbers and approval information.
CAUTION: This product might contain one or more of the following devices: CD-ROM drive, DVD-ROM drive, DVD-RAM drive, or laser module, which are Class 1 laser products. Note the following information:
v Do not remove the covers. Removing the covers of the laser product could result in exposure to
hazardous laser radiation. There are no serviceable parts inside the device.
v Use of the controls or adjustments or performance of procedures other than those specified herein
might result in hazardous radiation exposure.
(C026)
CAUTION: Data processing environments can contain equipment transmitting on system links with laser modules that operate at greater than Class 1 power levels. For this reason, never look into the end of an optical fiber cable or open receptacle. (C027)
CAUTION: This product contains a Class 1M laser. Do not view directly with optical instruments. (C028)
CAUTION: Some laser products contain an embedded Class 3A or Class 3B laser diode. Note the following information: laser radiation when open. Do not stare into the beam, do not view directly with optical instruments, and avoid direct exposure to the beam. (C030)
Power and cabling information for NEBS (Network Equipment-Building System) GR-1089-CORE
The following comments apply to the IBM servers that have been designated as conforming to NEBS (Network Equipment-Building System) GR-1089-CORE:
The equipment is suitable for installation in the following:
v Network telecommunications facilities
v Locations where the NEC (National Electrical Code) applies
The intrabuilding ports of this equipment are suitable for connection to intrabuilding or unexposed wiring or cabling only. The intrabuilding ports of this equipment must not be metallically connected to the interfaces that connect to the OSP (outside plant) or its wiring. These interfaces are designed for use as intrabuilding interfaces only (Type 2 or Type 4 ports as described in GR-1089-CORE) and require isolation from the exposed OSP cabling. The addition of primary protectors is not sufficient protection to connect these interfaces metallically to OSP wiring.
Note: All Ethernet cables must be shielded and grounded at both ends.
The ac-powered system does not require the use of an external surge protection device (SPD).
The dc-powered system employs an isolated DC return (DC-I) design. The DC battery return terminal shall not be connected to the chassis or frame ground.

Chapter 1. Introduction

This problem determination and service information helps you solve problems that might occur in your PS700 blade server. The information describes the diagnostic tools that come with the blade server, error codes and suggested actions, and instructions for replacing failing components.
Replaceable components are of three types:
v Tier 1 customer replaceable unit (CRU): Replacement of Tier 1 CRUs is your responsibility. If IBM installs a Tier 1 CRU at your request, you are charged for the installation.
v Tier 2 customer replaceable unit: You can install a Tier 2 CRU yourself or request IBM to install it, at no additional charge, under the type of warranty service that is designated for your blade server.
v Field replaceable unit (FRU): FRUs must be installed only by trained service technicians.
The serial number for the PS700 blade server can be found in the following locations:
v The bottom front of the blade server in the right corner on the 1S label.
v The bottom rear of the blade server in the right corner.
v Under the front cover door.
For information about the terms of the warranty and getting service and assistance, see the information center or the Warranty and Support Information document on the IBM BladeCenter® Documentation CD.

Related documentation

Documentation for the PS700 blade server includes documents in Portable Document Format (PDF) on the IBM BladeCenter Documentation CD and the online information center.
The most recent version of all BladeCenter documentation is in the BladeCenter information center.
The online BladeCenter information center is available at http://publib.boulder.ibm.com/infocenter/bladectr/documentation/index.jsp.
PDF versions of the following documents are on the IBM BladeCenter Documentation CD and in the online information center:
v Installation and User's Guide
This document contains general information about the blade server, including how to install supported options and how to configure the blade server.
v Safety Information
This document contains translated caution and danger statements. Each caution and danger statement that appears in the documentation has a number that you can use to locate the corresponding statement in your language in the Safety Information document.
v Warranty and Support Information
This document contains information about the terms of the warranty and about getting service and assistance.
Additional documents might be included in the online information center and on the IBM BladeCenter Documentation CD.
The blade server might have features that are not described in the documentation that comes with the blade server. Occasional updates to the documentation might include information about those features, or technical updates might be available to provide additional information that is not included in the documentation that comes with the blade server.
Review the online information or the Planning Guide and the Installation Guide for your IBM BladeCenter unit. The information can help you prepare for system installation and configuration. The most current version of each document is available in the BladeCenter information center.

Notices and statements

The caution and danger statements in this document are also in the multilingual Safety Information. Each statement is numbered for reference to the corresponding statement in your language in the Safety Information document.
The following notices and statements are used in this document:
v Note: These notices provide important tips, guidance, or advice.
v Important: These notices provide information or advice that might help you avoid inconvenient or problem situations.
v Attention: These notices indicate potential damage to programs, devices, or data. An attention notice is placed just before the instruction or situation in which damage might occur.
v Caution: These statements indicate situations that can be potentially hazardous to you. A caution statement is placed just before the description of a potentially hazardous procedure step or situation.
v Danger: These statements indicate situations that can be potentially lethal or extremely hazardous to you. A danger statement is placed just before the description of a potentially lethal or extremely hazardous procedure step or situation.

Features and specifications

Features and specifications of the IBM BladeCenter PS700 blade server are summarized in this overview.
The PS700 Type 8406 is a single-wide (non-expandable) blade server. The PS700 blade server is used in an IBM BladeCenter H (8852 and 7989), BladeCenter HT (8740 and 8750), or BladeCenter S (8886 and 7779) chassis unit.
Notes:
v Power, cooling, removable-media drives, external ports, and advanced system management are
provided by the BladeCenter unit.
v The operating system in the blade server must provide support for the Universal Serial Bus (USB), to
enable the blade server to recognize and communicate internally with the removable-media drives and front-panel USB ports.
Core electronics:
v 64-bit POWER7 processors (12S technology)
v Four core, single socket (4-way) processors at 3.0 GHz
v 64 GB maximum in 8 very low profile (VLP) DIMM slots; supports 4 GB DDR3 at 1066 MHz and 8 GB DDR3 at 800 MHz
v P5IOC2 I/O hub
On-board, integrated features:
v Two 1 Gb Ethernet ports (HEA) (two on each side)
v SAS controller
v USB 2.0
v 1 Serial over LAN (SOL) console using FSP
FSP1 Service Processor - IPMI and SOL:
v The baseboard management controller (BMC) is a flexible service processor (FSP1) with Intelligent Platform Management Interface (IPMI), Serial over LAN (SOL), and Wake on LAN (WOL) firmware support.
Local Storage:
v First DASD bay: zero or one 2.5" SAS HDD
v Second DASD bay: zero or one 2.5" SAS HDD
v SAS HDDs are 300 GB and 600 GB
v Hardware mirroring
Daughter card I/O options:
v 1 1Xe expansion card (CIOv)
v SAS Pass-through using 1Xe
v 1 High-Speed expansion card (CFFh)
Integrated functions:
v RS-485 interface for communication with the management module
v Automatic server restart (ASR)
v SOL through FSP
v Two Universal Serial Bus (USB 2.0) buses on base planar for communication with removable-media drives
v Optical media available by shared chassis feature
Environment:
v Air temperature:
– Blade server on: 10° to 35°C (50° to 95°F). Altitude: 0 to 914 m (3000 ft)
– Blade server on: 10° to 32°C (50° to 90°F). Altitude: 914 m to 2133 m (3000 ft to 7000 ft)
– Blade server off: -40° to 60°C (-40° to 140°F)
v Humidity:
– Blade server on: 8% to 80%
– Blade server off: 8% to 80%
PS700 Size:
v Height: 24.5 cm (9.7 inches)
v Depth: 44.6 cm (17.6 inches)
v Width: 30 mm (1.14 inches)
Systems management:
v Supported by BladeCenter chassis management module
v Front panel LEDs
v IBM Director
v Hardware Management Console (HMC)
v Integrated Virtualization Manager (IVM)
v EnergyScale thermal management for power management/oversubscription (throttling) and environmental sensing
v Active Energy Manager
Clusters support for:
v IBM Director
v xCAT
Virtualization support for:
PowerVM® Standard Edition hardware feature, which provides the Integrated Virtualization Manager, Virtual I/O Server, and Director Power Systems Manager (DPSM).
Reliability and service features:
v Dual alternating current power supply
v BladeCenter chassis redundant and hot plug power and cooling modules
v Boot-time processor deallocation
v Blade server hot plug
v Customer setup and expansion
v Automatic reboot on power loss
v Internal and ambient temperature monitors
v ECC, chipkill memory
v System management alerts
Electrical input: 12 V dc
See the ServerProven Web site for information about supported operating-system versions and all PS700 blade server optional devices.
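The environmental limits listed above can be expressed as a simple range check. The following Python sketch is illustrative only: the names Reading and within_spec are hypothetical and are not part of any IBM tool, and treating exactly 914 m as belonging to the lower altitude band is an assumption, because both bands in the specification include that value.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reading:
    """One environmental sample (hypothetical structure for this sketch)."""
    temp_c: float        # air temperature in degrees Celsius
    altitude_m: float    # installation altitude in metres
    humidity_pct: float  # relative humidity in percent
    powered_on: bool     # True when the blade server is turned on


def within_spec(r: Reading) -> bool:
    """Return True if the reading is inside the documented limits."""
    # Humidity limit is the same whether the blade server is on or off.
    if not 8.0 <= r.humidity_pct <= 80.0:
        return False
    if not r.powered_on:
        # Blade server off: -40 to 60 C; no altitude band is listed.
        return -40.0 <= r.temp_c <= 60.0
    # Blade server on: only altitudes from 0 to 2133 m are documented.
    if not 0.0 <= r.altitude_m <= 2133.0:
        return False
    # Assumption: 914 m itself is treated as part of the lower band.
    upper_c = 35.0 if r.altitude_m <= 914.0 else 32.0
    return 10.0 <= r.temp_c <= upper_c
```

For example, a powered-on blade server at 34°C is within limits at 500 m but outside them at 1500 m, where the upper limit drops to 32°C.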

Supported DIMMs

Each planar in the PS700 blade server contains eight very low profile (VLP) memory connectors for registered dual inline memory modules (RDIMMs). The maximum size for a single DIMM is 8 GB. The total memory capacity for the PS700 ranges from a minimum of 4 GB to a maximum of 64 GB.
See Chapter 3, “Parts listing, Type 8406,” on page 229 for memory modules that you can order from IBM.
Memory module rules:
v Install DIMM fillers in unused DIMM slots for proper cooling.
v Install DIMMs in pairs (1 and 3, 6 and 8, 2 and 4, 5 and 7).
v Both DIMMs in a pair must be the same size, speed, type, and technology. You can mix compatible DIMMs from different manufacturers.
v Each DIMM within a processor-support group (1-4 and 5-8) must be the same size and speed.
v Install only supported DIMMs, as described on the ServerProven® Web site. See http://www.ibm.com/servers/eserver/serverproven/compat/us/.
v Installing or removing DIMMs changes the configuration of the blade server. After you install or remove a DIMM, the blade server is automatically reconfigured, and the new configuration information is stored.
v See “System-board connectors” on page 8 for DIMM connector locations.
Table 1 shows allowable placement of DIMMs:
Table 1. Memory module combinations
DIMM count    PS700 base blade planar (P1) DIMM slots
              1 2 3 4 5 6 7 8
2             XX
4             XX XX
6             XXXXXX
8             XXXXXXXX
Figure 1. DIMM connectors. Base unit connectors
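The memory module rules above can be checked mechanically. The following Python sketch is a hedged illustration, not an IBM utility: the function names are hypothetical, only DIMM size is modelled (speed, type, and technology are not), and populating pairs in the order in which the rules list them (1 and 3, then 6 and 8, then 2 and 4, then 5 and 7) is an assumption made for this example.

```python
# Pair order as listed in the memory module rules. Treating the listing
# order as the population order is an assumption made for this sketch.
PAIR_ORDER = [(1, 3), (6, 8), (2, 4), (5, 7)]
SUPPORTED_SIZES_GB = {4, 8}          # DIMM sizes named in the specifications
GROUPS = [range(1, 5), range(5, 9)]  # processor-support groups 1-4 and 5-8


def slots_for(dimm_count: int) -> list[int]:
    """Return the slots to populate for a given DIMM count (assumed order)."""
    if dimm_count not in (2, 4, 6, 8):
        raise ValueError("DIMMs are installed in pairs: 2, 4, 6, or 8")
    pairs = PAIR_ORDER[: dimm_count // 2]
    return sorted(slot for pair in pairs for slot in pair)


def check_layout(sizes_by_slot: dict[int, int]) -> list[str]:
    """Map of slot number -> DIMM size in GB; returns rule violations."""
    problems = []
    for slot, size in sizes_by_slot.items():
        if slot not in range(1, 9):
            problems.append(f"slot {slot} does not exist")
        if size not in SUPPORTED_SIZES_GB:
            problems.append(f"slot {slot}: {size} GB is not a listed size")
    for a, b in PAIR_ORDER:
        has_a, has_b = a in sizes_by_slot, b in sizes_by_slot
        if has_a != has_b:
            problems.append(f"slots {a} and {b} must be populated together")
        elif has_a and sizes_by_slot[a] != sizes_by_slot[b]:
            problems.append(f"slots {a} and {b} must hold same-size DIMMs")
    for group in GROUPS:
        sizes = {sizes_by_slot[s] for s in group if s in sizes_by_slot}
        if len(sizes) > 1:
            problems.append(
                f"slots {group.start}-{group.stop - 1} must use one DIMM size"
            )
    return problems
```

An empty list from check_layout means that no modelled rule is violated. It does not confirm that a DIMM is supported; the ServerProven list remains the reference for that.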

Blade server control panel buttons and LEDs

Blade server control panel buttons and LEDs provide operational controls and status indicators.
Note: Figure 2 shows the control-panel door in the closed (normal) position. To access the power-control button, you must open the control-panel door.
Figure 2. Blade server control panel buttons and LEDs
1 Media-tray select button: Press this button to associate the shared BladeCenter unit media tray (removable-media drives and front-panel USB ports) with the blade server. The LED on the button flashes while the request is being processed, then is lit when the ownership of the media tray has been transferred to the blade server. It can take approximately 20 seconds for the operating system in the blade server to recognize the media tray.
If there is no response when you press the media-tray select button, use the management module to determine whether local control has been disabled on the blade server.
Note: The operating system in the blade server must provide USB support for the blade server to recognize and use the removable-media drives and USB ports.
2 Information LED: When this amber LED is lit, it indicates that information about a system error for the blade server has been placed in the management-module event log. The information LED can be turned off through the Web interface of the management module or through IBM Director Console.
3 Blade-error LED: When this amber LED is lit, it indicates that a system error has occurred in the blade server. The blade-error LED will turn off after one of the following events:
v Correcting the error
v Reseating the blade server in the BladeCenter unit
v Cycling the BladeCenter unit power
4 Power-control button: This button is behind the control panel door. Press this button to turn on or turn off the blade server.
The power-control button has effect only if local power control is enabled for the blade server. Local power control is enabled and disabled through the Web interface of the management module.
Press the power button for 5 seconds to begin powering down the blade server.
5 NMI reset (recessed): The nonmaskable interrupt (NMI) reset dumps the partition. Use this recessed button only as directed by IBM Support.
6 Power-on LED: This green LED indicates the power status of the blade server in the following manner:
v Flashing rapidly: The service processor is initializing the blade server.
v Flashing slowly: The blade server has completed initialization and is waiting for a power-on command.
v Lit continuously: The blade server has power and is turned on.
Note: The enhanced service processor can take as long as three minutes to initialize after you install the BladeCenter PS700 blade server, at which point the LED begins to flash slowly.
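The three power-on LED states described above map directly to a small lookup. The following Python sketch is only an illustration of that mapping; the names PowerLed, MEANING, and ready_for_power_on are hypothetical, and any LED state not described in this guide is deliberately left out.

```python
from enum import Enum


class PowerLed(Enum):
    """Power-on LED states described in this guide."""
    FLASHING_RAPIDLY = "flashing rapidly"
    FLASHING_SLOWLY = "flashing slowly"
    LIT_CONTINUOUSLY = "lit continuously"


# Meaning of each state, paraphrased from the control panel description.
MEANING = {
    PowerLed.FLASHING_RAPIDLY:
        "the service processor is initializing the blade server",
    PowerLed.FLASHING_SLOWLY:
        "initialization is complete; waiting for a power-on command",
    PowerLed.LIT_CONTINUOUSLY:
        "the blade server has power and is turned on",
}


def ready_for_power_on(led: PowerLed) -> bool:
    """True only when the blade server is waiting for a power-on command."""
    return led is PowerLed.FLASHING_SLOWLY
```

A monitoring script could call ready_for_power_on before issuing a power-on request, which mirrors the instruction to wait for the slow flash.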
7 Activity LED: When this green LED is lit, it indicates that there is activity on the hard disk drive or network.
8 Location LED: When this blue LED is lit, it has been turned on by the system administrator to aid in visually locating the blade server. The location LED can be turned off through the Web interface of the management module or through IBM Director Console.

Turning on the blade server

After you connect the blade server to power through the BladeCenter unit, you can start the blade server after the discovery and initialization process is complete.
You can start the blade server in any of the following ways.
v Start the blade server by pressing the power-control button on the front of the blade server.
The power-control button is behind the control panel door, as described in “Blade server control panel buttons and LEDs” on page 5.
After you push the power-control button, the power-on LED continues to blink slowly for about 15 seconds, then is lit solidly when the power-on process is complete.
Wait until the power-on LED on the blade server flashes slowly before you press the blade server power-control button. If the power-on LED is flashing rapidly, the service processor is initializing the blade server. The power-control button does not respond during initialization.
Note: The enhanced service processor can take as long as three minutes to initialize after you install the BladeCenter PS700 blade server, at which point the LED begins to flash slowly.
v Start the blade server automatically when power is restored after a power failure.
If a power failure occurs, the BladeCenter unit and then the blade server can start automatically when power is restored. You must configure the blade server to restart through the management module.
v Start the blade server remotely using the management module.
After you initiate the power-on process, the power-on LED blinks slowly for about 15 seconds, then is lit solidly when the power-on process is complete.
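The power-on LED behavior described in this section amounts to a small decision rule. The following Python sketch is illustrative only: the `PowerLed` enumeration, the `LED_MEANING` table, and the `can_press_power_button` helper are hypothetical names, not part of any IBM tool, and the LED state is assumed to be observed by a person and entered by hand.

```python
from enum import Enum


class PowerLed(Enum):
    """Power-on LED states named in this section (hypothetical enumeration)."""
    FLASHING_RAPIDLY = "flashing rapidly"
    FLASHING_SLOWLY = "flashing slowly"
    LIT_SOLIDLY = "lit solidly"


# Meaning of each state, paraphrased from the text of this section.
LED_MEANING = {
    PowerLed.FLASHING_RAPIDLY: "service processor is initializing; the power-control button does not respond",
    PowerLed.FLASHING_SLOWLY: "initialization is complete, or a power-on is in progress (about 15 seconds)",
    PowerLed.LIT_SOLIDLY: "power-on process is complete",
}


def can_press_power_button(led: PowerLed) -> bool:
    """Return True only for the state in which the text says to press the button."""
    return led is PowerLed.FLASHING_SLOWLY


if __name__ == "__main__":
    for state in PowerLed:
        print(f"{state.value}: {LED_MEANING[state]} "
              f"(press button: {can_press_power_button(state)})")
```

The helper deliberately answers only one question (whether to press the button); it does not try to distinguish a blade that is ready from one that is already partway through the 15-second power-on interval, because the LED alone does not make that distinction.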

Turning off the blade server

When you turn off the blade server, it is still connected to power through the BladeCenter unit. The blade server can respond to requests from the service processor, such as a remote request to turn on the blade server. To remove all power from the blade server, you must remove it from the BladeCenter unit.
Shut down the operating system before you turn off the blade server. See the operating-system documentation for information about shutting down the operating system.
You can turn off the blade server in one of the following ways.
• Turn off the blade server by pressing the power-control button for at least 5 seconds.
The power-control button is on the blade server behind the control panel door. See “Blade server control panel buttons and LEDs” on page 5 for the location.
Note: The power-control LED can remain on solidly for up to 1 minute after you push the power-control button. After you turn off the blade server, wait until the power-control LED is blinking slowly before you press the power-control button to turn on the blade server again.
If the operating system stops functioning, press and hold the power-control button for more than 5 seconds to force the blade server to turn off.
• Use the management module to turn off the blade server.
The power-control LED can remain on solidly for up to 1 minute after you initiate the power-off process. After you turn off the blade server, wait until the power-control LED is blinking slowly before you initiate the power-on process from the advanced management module (AMM) to turn on the blade server again.
Use the management-module Web interface to configure the management module to turn off the blade server if the system is not operating correctly.
For additional information, see the online documentation or the User's Guide for the management module.

System-board layouts

Illustrations show the connectors and LEDs on the system board. The illustrations might differ slightly from your hardware.

System-board connectors

Blade server components attach to the connectors on the system board.
Figure 3 shows the connectors on the base unit system board in the blade server.
Figure 3. PS700 system-board connectors
Table 2 shows connector descriptions.
Table 2. PS700 connectors
Callout PS700 blade server connectors
1 Operator panel connector
2 DIMM 1-4 connectors (See Figure 4 on page 9 for individual connectors.) Expansion unit (SMP) connector
3 Management card connector (P1-C9)
4 SAS hard disk drive connector (P1-D2)
5 Light Path Blue Button
6 SAS hard disk drive (P1-C10)
7 CIOv (1Xe) expansion card connector (P1-C11)
8 High-Speed (CFFh) expansion card connector (P1-C12)
9 DIMM 5-8 connectors (See Figure 4 on page 9 for individual connectors.)
10 3V lithium battery connector (P1-E1)
Figure 4 on page 9 shows individual DIMM connectors.
Figure 4. DIMM connectors. Base unit connectors

System-board LEDs

Use the illustration of the LEDs on the system board to identify a light emitting diode (LED).
Remove the blade server from the BladeCenter unit, open the cover, press the blue button to see any error LEDs that were turned on during error processing, and use Figure 5 to identify the failing component.
Figure 5 shows the locations of LEDs on the system board. Table 3 shows LED descriptions.
Figure 5. LED locations on the system board of the PS700 blade server
Table 3. PS700 LEDs
Callout Base unit LEDs
1 3V lithium battery LED
2 DIMM 1-4 LEDs
3 Management card LED
4 Light path power LED
5 System board LED
6 HDD1 LED
7 Interposer LED
8 CIOv (1Xe) expansion card connector LED
9 High-Speed (CFFh) expansion card connector LED
10 HDD2 LED
11 DIMM 5-8 LEDs

Chapter 2. Diagnostics

Use the available diagnostic tools to help solve any problems that might occur in the blade server.
The first and most crucial component of a solid serviceability strategy is the ability to accurately and effectively detect errors when they occur. While not all errors are a threat to system availability, those that go undetected are dangerous because the system does not have the opportunity to evaluate and act if necessary. POWER7® processor-based systems are specifically designed with error-detection mechanisms that extend from processor cores and memory to power supplies and hard drives.
POWER7 processor-based systems contain specialized hardware detection circuitry for detecting erroneous hardware operations. Error checking hardware ranges from parity error detection coupled with processor instruction retry and bus retry, to ECC correction on caches and system buses.
IBM hardware error checkers have these distinct attributes:
• Continuous monitoring of system operations to detect potential calculation errors
• Attempted isolation of physical faults based on runtime detection of each unique failure
• Initiation of a wide variety of recovery mechanisms designed to correct a problem
POWER7 processor-based systems include extensive hardware and firmware recovery logic.
Machine check handling
Machine checks are handled by firmware. When a machine check occurs, the firmware analyzes the error to identify the failing device and creates an error log entry.
If the system degrades to the point that the service processor cannot reach standby state, the ability to analyze the error does not exist. If the error occurs during POWER® hypervisor (PHYP) activities, the PHYP initiates a system reboot.
In partitioned mode, an error that occurs during partition activity is reported to the operating system in the partition.

Diagnostic tools

Tools are available to help you diagnose and solve hardware-related problems.
• Power-on self-test (POST) progress codes (checkpoints), error codes, and isolation procedures
The POST checks out the hardware at system initialization. IPL diagnostic functions test some system components and interconnections. The POST generates eight-digit checkpoints to mark the progress of powering up the blade server.
Use the management module to view progress codes. The documentation of a progress code includes recovery actions for system hangs. See “POST progress codes (checkpoints)” on page 84 for more information.
If the service processor detects a problem during POST, an error code is logged in the management module event log. Error codes are also logged in the Linux syslog or AIX® diagnostic log, if possible. See “System reference codes (SRCs)” on page 16.
The service processor can generate codes that point to specific isolation procedures. See “Service processor problems” on page 200.
• Light path diagnostics
Use the light path diagnostic LEDs on the system board to identify failing hardware. If the system error LED on the system LED panel on the front or rear of the BladeCenter unit is lit, one or more error LEDs on the BladeCenter unit components also might be lit.
Light path diagnostics help identify failing customer replaceable units (CRUs). CRU location codes are included in error codes and the event log.
LED locations
See “System-board LEDs” on page 9.
Front panel
See “Blade server control panel buttons and LEDs” on page 5.
• Troubleshooting tables
Use the troubleshooting tables to find solutions to problems that have identifiable symptoms. See “Troubleshooting tables” on page 191.
• Dump data collection
In some circumstances, an error might require a dump to show more data. The Integrated Virtualization Manager (IVM) or Hardware Management Console (HMC) sets up a dump area. Specific IVM or HMC information is included as part of the information that can optionally be sent to IBM support for analysis.
See “Collecting dump data” on page 13 for more information.
• Stand-alone diagnostics
The AIX-based stand-alone diagnostics CD is in the ship package and is also available from the IBM Web site. Boot the diagnostics from a CD drive or from an AIX network installation manager (NIM) server if the blade server cannot boot to an operating system, no matter which operating system is installed.
Functions provided by the stand-alone diagnostics include:
– Analysis of errors reported by the platform, such as microprocessor and memory errors
– Testing of resources, such as I/O adapters and devices
– Service aids, such as firmware update, format disk, and Raid Manager
• Diagnostic utilities for the AIX operating system
If AIX is functioning, run AIX concurrent diagnostics instead of the stand-alone diagnostics. Functions provided by disk-based AIX diagnostics include:
– Automatic error log analysis
– Analysis of errors reported by the platform, such as microprocessor and memory errors
– Testing of resources, such as I/O adapters and devices
– Service aids, such as firmware update, format disk, and Raid Manager
• Diagnostic utilities for Linux operating systems
Linux on POWER service and productivity tools include hardware diagnostic aids, productivity tools, and installation aids. The installation aids are provided in the IBM Installation Toolkit for Linux on POWER, a set of tools that aids the installation of Linux on IBM servers with POWER architecture. You can also use the tools to update the PS700 blade server firmware.
Diagnostic utilities for the Linux operating system are available from IBM at https://www14.software.ibm.com/webapp/set2/sas/f/lopdiags/home.html.
• Diagnostic utilities for other operating systems
You can use the stand-alone diagnostics CD to perform diagnostics on the PS700 blade server, no matter which operating system is loaded on the blade server. However, other supported operating systems might have diagnostic tools that are available through the operating system. See the documentation for your operating system for more information.

Collecting dump data

A dump might be critical for fault isolation when the built-in First Failure Data Capture (FFDC) mechanisms are not capturing sufficient fault data. Even when a fault is identified, dump data can provide additional information that is useful in problem determination.
All hardware state information is part of the dump if a hardware checkstop occurs. When a checkstop occurs, the service processor attempts to dump data that is necessary to analyze the error from appropriate parts of the system.
Note: If you power off the blade through the management module while the service processor is performing a dump, platform dump data is lost.
You might be asked to retrieve a dump to send it to IBM Support for analysis. The location of the dump data varies by operating system.
• Collect an AIX dump from the /var/adm/platform directory.
• Collect a Linux dump from the /var/log/dump directory.
• Collect an Integrated Virtualization Manager (IVM) dump from the IVM-managed PS700 blade server through the Manage Dumps task in the IVM console.
• To collect a system dump by using the Hardware Management Console (HMC), complete these steps:
1. Perform a controlled shutdown of all partitions.
Note: A system dump will abnormally terminate any running partitions.
2. In the navigation area, open Systems Management.
3. Select the server and open it.
4. Select Serviceability > Manage Dumps > Action > Initiate System Dump. The dump is automatically saved to the HMC. For details on how to copy, report, or delete a dump after you have completed a dump, see Managing dumps.
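The per-environment dump locations listed in this section can be captured in a small lookup table. The following Python sketch is an illustration under stated assumptions: the `DUMP_DIRECTORIES` table and the `dump_directory` function are hypothetical names, and only the two directory paths come from this guide. IVM and HMC dumps are retrieved through their consoles, so they are recorded without a directory.

```python
from typing import Optional

# Directory paths are taken from the list in this section. IVM and HMC dumps
# are collected through console tasks, so no directory is recorded for them.
DUMP_DIRECTORIES = {
    "aix": "/var/adm/platform",
    "linux": "/var/log/dump",
    "ivm": None,  # use the Manage Dumps task in the IVM console
    "hmc": None,  # use Serviceability > Manage Dumps on the HMC
}


def dump_directory(environment: str) -> Optional[str]:
    """Return the dump directory for an environment, or None for console-based collection."""
    key = environment.strip().lower()
    if key not in DUMP_DIRECTORIES:
        raise ValueError(f"unknown environment: {environment!r}")
    return DUMP_DIRECTORIES[key]


if __name__ == "__main__":
    for name in ("AIX", "Linux", "IVM", "HMC"):
        print(name, "->", dump_directory(name))
```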

Location codes

Location codes identify components of the blade server. Location codes are displayed with some error codes to identify the blade server component that is causing the error.
See “System-board connectors” on page 8 for component locations.
Notes:
1. Location codes do not indicate the location of the blade server within the BladeCenter unit. The codes identify components of the blade server only.
2. For checkpoints with no associated location code, see “Light path diagnostics” on page 214 to identify the failing component when there is a hang condition.
3. For checkpoints with location codes, use the following table to identify the failing component when there is a hang condition.
4. For 8-digit codes not listed in Table 4, see the “Checkout procedure” on page 184.
Table 4. Location codes
Components Physical Location Code CRU LED
Un location codes are for enclosure and VPD locations.
Un = Utttt.mmm.sssssss
tttt = system machine type
mmm = system model number
sssssss = system serial number
DIMM 1 Un-P1-C1 Yes
DIMM 2 Un-P1-C2 Yes
DIMM 3 Un-P1-C3 Yes
DIMM 4 Un-P1-C4 Yes
DIMM 5 Un-P1-C5 Yes
DIMM 6 Un-P1-C6 Yes
DIMM 7 Un-P1-C7 Yes
DIMM 8 Un-P1-C8 Yes
2.5" SAS HDD1 Un-P1-D1 Yes
2.5" SAS HDD2 Un-P1-D2 Yes
Management Card Un-P1-C9 Yes
Battery Un-P1-E1 Yes
PCIe High Speed Expansion Card Un-P1-C12 Yes
1Xe Card Un-P1-C11 Yes
USB Port 1 (CDROM/FDD) Un-P1-T1 No
USB Port 2 (CDROM/FDD) Un-P1-T2 No
SAS controller Un-P1-T3 No
Ethernet HEA0_A Un-P1-T4 No
Ethernet HEA0_B Un-P1-T5 No
Machine Location Code Utttt.mmm.sssssss No
Um codes are for firmware. The format is the same as for a Un location code.
Um = Utttt.mmm.sssssss
Firmware version Um-Y1
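As a worked illustration of the Utttt.mmm.sssssss format, the following Python sketch splits a location code into its fields and looks up the component suffix in a small subset of Table 4. It is a sketch only: `parse_location_code`, `LOCATION_RE`, and `COMPONENTS` are hypothetical names, the serial number in the example is invented, and the assumption that each field contains only digits and uppercase letters is not stated in this guide.

```python
import re
from typing import Optional

# Subset of Table 4: component suffix -> component name.
COMPONENTS = {
    "P1-C1": "DIMM 1",
    "P1-C8": "DIMM 8",
    "P1-D1": '2.5" SAS HDD1',
    "P1-C9": "Management Card",
    "P1-E1": "Battery",
    "P1-T4": "Ethernet HEA0_A",
}

# Assumption: tttt, mmm, and sssssss consist of digits and uppercase letters.
LOCATION_RE = re.compile(
    r"^U(?P<machine_type>[0-9A-Z]{4})\."
    r"(?P<model>[0-9A-Z]{3})\."
    r"(?P<serial>[0-9A-Z]{7})"
    r"(?:-(?P<suffix>.+))?$"
)


def parse_location_code(code: str) -> Optional[dict]:
    """Split a Un-style location code into fields; return None if it does not match."""
    match = LOCATION_RE.match(code.strip())
    if match is None:
        return None
    fields = match.groupdict()
    # The suffix is None for a bare machine location code.
    fields["component"] = COMPONENTS.get(fields["suffix"])
    return fields


if __name__ == "__main__":
    # The serial number 1234567 is a made-up example value.
    print(parse_location_code("U8406.70Y.1234567-P1-C1"))
```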

Reference codes

Reference codes are diagnostic aids that help you determine the source of a hardware or operating system problem. To use reference codes effectively, use them in conjunction with other service and support procedures.
The BladeCenter PS700 Type 8406 blade server produces several types of codes.
Progress codes: The power-on self-test (POST) generates eight-digit status codes that are known as checkpoints or progress codes, which are recorded in the management-module event log. The checkpoints indicate which blade server resource is initializing.
Error codes: The First Failure Data Capture (FFDC) error checkers capture fault data, which the service processor then analyzes. For unrecoverable errors (UEs), for recoverable events that meet or exceed their service thresholds, and for fatal system errors, an unrecoverable checkstop service event triggers the service processor to analyze the error, log the system reference code (SRC), and turn on the system attention LED.
The service processor logs the nine-word, eight-digit per word error code in the BladeCenter management-module event log. Error codes are either system reference codes (SRCs) or service request numbers (SRNs). A location code might also be included.
Isolation procedures: If the fault analysis does not determine a definitive cause, the service processor might indicate a fault isolation procedure that you can use to isolate the failing component.
Viewing the codes
The PS700 blade server does not display checkpoints or error codes on the remote console. The shared BladeCenter unit video also does not display the codes.
If the POST detects a problem, a 9-word, 8-digit error code is logged in the BladeCenter management-module event log. A location code that identifies a component might also be included. See “Error logs” on page 183 for information about viewing the management-module event log.
Service request numbers can be viewed using the AIX diagnostics CD, or various operating system utilities, such as AIX diagnostics or the Linux service aid “diagela”, if it is installed.

System reference codes (SRCs)

System reference codes indicate a server hardware or software problem that can originate in hardware, in firmware, or in the operating system.
A blade server component generates an error code when it detects a problem. An SRC identifies the component that generated the error code and describes the error. Use the SRC information to identify a list of possibly failing items and to find information about any additional isolation procedures.
The following table shows the syntax of a nine-word B700xxxx SRC as it might be displayed in the event log of the management module.
The first word of the SRC in this example is the message identifier, B7001111. This example numbers each word after the first word to show relative word positions. The seventh word is the direct select address, which is 77777777 in the example.
Table 5. Nine-word system reference code in the management-module event log
Index | Sev | Source | Date/Time | Text
1 | E | Blade_05 | 01/21/2008, 17:15:14 | (PS700-BC1BLD5E) SYS F/W: Error. Replace UNKNOWN (5008FECF B7001111 22222222 33333333 44444444 55555555 66666666 77777777 88888888 99999999)
Depending on your operating system and the utilities you have installed, error messages might also be stored in an operating system log. See the documentation that comes with the operating system for more information.
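The word positions described for the Table 5 example can be checked mechanically. The following Python sketch pulls the nine SRC words out of the event text shown in that example. It is a sketch under assumptions: `extract_src_words` is a hypothetical helper, and it assumes that the SRC is the last nine eight-digit hexadecimal words in the final parenthesized group, which matches the example (where B7001111 is the first word) but is not documented here as a general rule.

```python
import re

# A parenthesized run of eight-digit hexadecimal words separated by whitespace.
HEX_GROUP_RE = re.compile(r"\(([0-9A-Fa-f]{8}(?:\s+[0-9A-Fa-f]{8})*)\)")


def extract_src_words(event_text: str) -> list:
    """Return the nine SRC words from an event-log text, or [] if none are found."""
    groups = HEX_GROUP_RE.findall(event_text)
    if not groups:
        return []
    words = groups[-1].split()
    # Assumption: any word before the last nine is not part of the SRC.
    return words[-9:] if len(words) >= 9 else []


if __name__ == "__main__":
    text = ("(PS700-BC1BLD5E) SYS F/W: Error. Replace UNKNOWN "
            "(5008FECF B7001111 22222222 33333333 44444444 55555555 "
            "66666666 77777777 88888888 99999999)")
    words = extract_src_words(text)
    print("message identifier:", words[0])
    print("direct select address (seventh word):", words[6])
```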
The management module can display the most recent 32 SRCs and time stamps. Manually refresh the list to update it.
Select Blade Service Data > blade_name in the management module to see a list of the 32 most recent SRCs.
Table 6. Management module reference code listing
Unique ID System Reference Code Timestamp
00040001 D1513901 2005-11-13 19:30:20
00000016 D1513801 2005-11-13 19:30:16
Any message with more detail is highlighted as a link in the System Reference Code column. Click the message to cause the management module to present the additional message detail:
D1513901 Created at: 2007-11-13 19:30:20 SRC Version: 0x02 Hex Words 2-5: 020110F0 52298910 C1472000 200000FF
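If the additional message detail is copied out of the management module as text, its fields can be separated with a regular expression. The following Python sketch assumes the detail appears on one line in exactly the layout shown in the example above; `parse_src_detail` and `DETAIL_RE` are hypothetical names, and the management module might render the detail differently.

```python
import re
from datetime import datetime

# Assumption: the detail text follows the single-line layout of the example.
DETAIL_RE = re.compile(
    r"(?P<src>[0-9A-F]{8})\s+"
    r"Created at:\s+(?P<created>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\s+"
    r"SRC Version:\s+(?P<version>0x[0-9A-Fa-f]+)\s+"
    r"Hex Words 2-5:\s+(?P<words>[0-9A-F]{8}(?:\s+[0-9A-F]{8}){3})"
)


def parse_src_detail(text: str):
    """Parse an SRC detail line into a dictionary; return None if it does not match."""
    match = DETAIL_RE.search(text)
    if match is None:
        return None
    return {
        "src": match["src"],
        "created": datetime.strptime(match["created"], "%Y-%m-%d %H:%M:%S"),
        "version": int(match["version"], 16),
        "hex_words_2_5": match["words"].split(),
    }


if __name__ == "__main__":
    detail = ("D1513901 Created at: 2007-11-13 19:30:20 SRC Version: 0x02 "
              "Hex Words 2-5: 020110F0 52298910 C1472000 200000FF")
    print(parse_src_detail(detail))
```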
SRC formats
SRCs are strings of either six or eight alphanumeric characters. The first two characters designate the reference code type.
The first character indicates the type of error. In a few cases, the first two characters indicate the type of error:
• 1xxxxxxx - System power control network (SPCN) error
• 6xxxxxxx - Virtual optical device error
• A1xxxxxx - Attention required (Service processor)
• AAxxxxxx - Attention required (Partition firmware)
• B1xxxxxx - Service processor error, such as a boot problem
• B6xxxxxx - Licensed Internal Code or hardware event error
• B9xxxxxx - Software installation error or IBM i IPL error. See "Recovering from IPL or system failures" in the IBM i Information Center at http://publib.boulder.ibm.com/infocenter/powersys/v3r1m5/index.jsp?topic=/ipha5_p5/iplprocedure.htm.
• BAxxxxxx - Partition firmware error
• Cxxxxxxx - Checkpoint (must hang to indicate an error)
• Dxxxxxxx - Dump checkpoint (must hang to indicate an error)
To find a description of an SRC that is not listed in this PS700 blade server documentation, refer to the POWER7 Reference Code Lookup page at http://publib.boulder.ibm.com/infocenter/powersys/v3r1m5/index.jsp?topic=/ipha8/codefinder.htm.
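The prefix rules under "SRC formats" can be expressed as a two-step lookup: try the first two characters, then fall back to the first character. The following Python sketch is illustrative; `classify_src` and the two tables are hypothetical names, and the descriptions are copied from the prefix list in this section. Prefixes that the list does not mention are reported as not listed rather than guessed.

```python
# Prefix -> meaning, copied from the prefix list in this section.
TWO_CHAR_PREFIXES = {
    "A1": "Attention required (Service processor)",
    "AA": "Attention required (Partition firmware)",
    "B1": "Service processor error, such as a boot problem",
    "B6": "Licensed Internal Code or hardware event error",
    "B9": "Software installation error or IBM i IPL error",
    "BA": "Partition firmware error",
}
ONE_CHAR_PREFIXES = {
    "1": "System power control network (SPCN) error",
    "6": "Virtual optical device error",
    "C": "Checkpoint (must hang to indicate an error)",
    "D": "Dump checkpoint (must hang to indicate an error)",
}


def classify_src(src: str) -> str:
    """Classify an SRC by its leading characters, two-character prefixes first."""
    code = src.strip().upper()
    return (
        TWO_CHAR_PREFIXES.get(code[:2])
        or ONE_CHAR_PREFIXES.get(code[:1])
        or "not listed in this section"
    )


if __name__ == "__main__":
    for example in ("BA000010", "D1513901", "632BCFC1", "B7001111"):
        print(example, "->", classify_src(example))
```

Checking the two-character table first keeps the lookup unambiguous if a one-character entry for A or B is ever added.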
1xxxyyyy SRCs
The 1xxxyyyy system reference codes are system power control network (SPCN) reference codes.
Look for the rightmost 4 characters (yyyy in 1xxxyyyy) in the error code; this is the reference code. Find the reference code in Table 7.
Perform all actions before exchanging failing items.
Table 7. 1xxxyyyy SRCs
• Follow the suggested actions in the order in which they are listed in the Action column until the problem is solved. If an action solves the problem, then you can stop performing the remaining actions.
• See Chapter 3, “Parts listing, Type 8406,” on page 229 to determine which components are CRUs and which components are FRUs.
1xxxyyyy Error Codes | Description | Action
00AC | Informational message: AC loss was reported | No action is required.
00AD | Informational message: A service processor reset caused the blade server to power off | No action is required.
1F02 | Informational message: The trace logs reached 1K of data. | No action is required.
1F03 | Informational message: Invalid TMS of location code. | No action is required.
2600 | Power good (pGood) master fault | 1. Go to “Checkout procedure” on page 184. 2. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
2610 | pGood fault | 1. Go to “Checkout procedure” on page 184. 2. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
2620 | 12V dc pGood input fault | 1. Go to “Checkout procedure” on page 184. 2. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
2629 | 1.5V reg_pgood fault | 1. Go to “Checkout procedure” on page 184. 2. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
262B | 1.8V reg_pgood fault | 1. Go to “Checkout procedure” on page 184. 2. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
262C | 5V reg_pgood fault | 1. Go to “Checkout procedure” on page 184. 2. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
262D | 3.3V reg_pgood fault | 1. Go to “Checkout procedure” on page 184. 2. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
262E | 2.5V reg_pgood fault | 1. Go to “Checkout procedure” on page 184. 2. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
2630 | VRM CP0 core pGood fault | 1. Go to “Checkout procedure” on page 184. 2. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
2632 | VRM CP0 cache pGood fault | 1. Go to “Checkout procedure” on page 184. 2. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
2647 | 12V "or-ing" FET short | 1. Go to “Checkout procedure” on page 184. 2. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
2648 | Blade power latch fault | 1. Go to “Checkout procedure” on page 184. 2. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
2649 | Blade power fault | 1. Go to “Checkout procedure” on page 184. 2. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
2670 | The BladeCenter encountered a problem, and the blade server was automatically shut down as a result | 1. Check the management-module event log for entries that were made around the time that the PS700 blade server shut down. 2. Resolve any problems that are found. 3. Reboot the blade server. 4. If the problem is not resolved, replace the system-board and chassis assembly, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
2671 | 12V power fault in the blade server | 1. Go to “Checkout procedure” on page 184. 2. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
2672 | Blades PEU3 voltage alert | Perform the DTRCARD symbolic CRU isolation procedure by completing the following steps: 1. Reseat the PCIe expansion card. 2. If the problem persists, replace the expansion card. 3. If the problem persists, go to the “Checkout procedure” on page 184. 4. If the problem persists, replace the system-board and chassis assembly, as described in “Replacing the FRU system-board and chassis assembly” on page 260. The DTRCARD symbolic CRU isolation procedure is in “Service processor problems” on page 200.
2675 | 1.1 Reg_CPU0_P5IO2C_Vio_pGood fault | 1. Go to “Checkout procedure” on page 184. 2. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
2676 | VTTA pGood fault | 1. Go to “Checkout procedure” on page 184. 2. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
2677 | VTTA pGood fault | 1. Go to “Checkout procedure” on page 184. 2. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
2678 | PROC_Vmem_controller_pGood 1.0V fault | 1. Go to “Checkout procedure” on page 184. 2. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
2679 | Vmem_controller_pGood 1.5V reg fault | 1. Go to “Checkout procedure” on page 184. 2. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
267A | HSDC/4xel_A0_pGood fault | Perform the DTRCARD symbolic CRU isolation procedure by completing the following steps: 1. Reseat the PCIe expansion card. 2. If the problem persists, replace the expansion card. 3. If the problem persists, go to the “Checkout procedure” on page 184. 4. If the problem persists, replace the system-board and chassis assembly, as described in “Replacing the FRU system-board and chassis assembly” on page 260. The DTRCARD symbolic CRU isolation procedure is in “Service processor problems” on page 200.
267B | HSDC/4xel_B0_pGood fault | Perform the DTRCARD symbolic CRU isolation procedure by completing the following steps: 1. Reseat the PCIe expansion card. 2. If the problem persists, replace the expansion card. 3. If the problem persists, go to the “Checkout procedure” on page 184. 4. If the problem persists, replace the system-board and chassis assembly, as described in “Replacing the FRU system-board and chassis assembly” on page 260. The DTRCARD symbolic CRU isolation procedure is in “Service processor problems” on page 200.
267C | REG_P5IO2C_core 1.2V pGood fault | 1. Go to “Checkout procedure” on page 184. 2. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
267D | 2.0_PLL_pGood fault | 1. Go to “Checkout procedure” on page 184. 2. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
2710 | pGood output/P7_VRM_PVID_gate fault | 1. Go to “Checkout procedure” on page 184. 2. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
3120 | VRM voltage adjustment failure | 1. Go to “Checkout procedure” on page 184. 2. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
3134 | Fault on the hardware monitoring chip | Perform the DTRCARD symbolic CRU isolation procedure by completing the following steps: 1. Reseat the PCIe expansion card. 2. If the problem persists, replace the expansion card. 3. If the problem persists, go to the “Checkout procedure” on page 184. 4. If the problem persists, replace the system-board and chassis assembly, as described in “Replacing the FRU system-board and chassis assembly” on page 260. The DTRCARD symbolic CRU isolation procedure is in “Service processor problems” on page 200.
8400 | Invalid configuration decode | 1. Check for server firmware updates. 2. Apply any available updates. 3. If the problem persists: a. Go to “Checkout procedure” on page 184. b. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
8402 | Unable to get VPD from the concentrator | 1. Go to “Checkout procedure” on page 184. 2. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
8413 | Invalid processor 1 VPD | 1. Go to “Checkout procedure” on page 184. 2. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
8414 | Invalid processor 2 VPD | 1. Go to “Checkout procedure” on page 184. 2. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
8423 | No processor VPD was found | 1. Go to “Checkout procedure” on page 184. 2. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
8480 | Bad or missing memory controller VID | 1. Go to “Checkout procedure” on page 184. 2. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
84A0 | No backplane VPD was found | 1. Go to “Checkout procedure” on page 184. 2. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
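The lookup described for 1xxxyyyy SRCs (take the rightmost four characters, then find them in Table 7) can be sketched in Python as follows. This is an illustration only: `spcn_reference_code`, `describe_spcn_src`, and `SPCN_CODES` are hypothetical names, the table holds just a few rows copied from Table 7, and the full SRC in the example uses invented xxx characters.

```python
from typing import Optional

# A few rows copied from Table 7; the table is intentionally incomplete.
SPCN_CODES = {
    "00AC": "Informational message: AC loss was reported",
    "2600": "Power good (pGood) master fault",
    "2670": ("The BladeCenter encountered a problem, and the blade server "
             "was automatically shut down as a result"),
    "8413": "Invalid processor 1 VPD",
}


def spcn_reference_code(src: str) -> str:
    """Return yyyy, the rightmost four characters of a 1xxxyyyy SRC."""
    code = src.strip().upper()
    if len(code) != 8 or not code.startswith("1"):
        raise ValueError(f"not a 1xxxyyyy SRC: {src!r}")
    return code[-4:]


def describe_spcn_src(src: str) -> Optional[str]:
    """Look up the description for an SRC; None means the row is not in this subset."""
    return SPCN_CODES.get(spcn_reference_code(src))


if __name__ == "__main__":
    # The "100" in the middle of this SRC is a made-up example value.
    print(spcn_reference_code("11002600"), "->", describe_spcn_src("11002600"))
```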
6xxxyyyy SRCs
The 6xxxyyyy system reference codes are virtual optical reference codes.
Look for the rightmost 4 characters (yyyy in 6xxxyyyy) in the error code; this is the reference code. Find the reference code in Table 8.
Table 8. 6xxxyyyy SRCs
- Follow the suggested actions in the order in which they are listed in the Action column until the problem is solved. If an action solves the problem, then you can stop performing the remaining actions.
- See Chapter 3, “Parts listing, Type 8406,” on page 229 to determine which components are CRUs and which components are FRUs.

632Byyyy codes are Network File System (NFS) virtual optical SRCs

632BCFC1
Description: A virtual optical device cannot access the file containing the list of volumes.
Action: On this partition and on the Network File System server, verify that the proper file is specified and that the proper authority is granted.

632BCFC2
Description: A non-recoverable error was detected while reading the list of volumes.
Action: Resolve any errors on the Network File System server.

632BCFC3
Description: The data in the list of volumes is not valid.
Action: On the Network File System server, verify that the proper file is specified, that all files are entered correctly, that there are no blank lines, and that the character set used is valid.

632BCFC4
Description: A virtual optical device cannot access the file containing the specified optical volume.
Action: On the Network File System server, verify that the proper file is specified in the list of volumes, and that the proper authority is granted.

632BCFC5
Description: A non-recoverable error was detected while reading a virtual optical volume.
Action: Resolve any errors on the Network File System server.

632BCFC6
Description: The file specified does not contain data that can be processed as a virtual optical volume.
Action: On the Network File System server, verify that all the files specified in the list of optical volumes are correct.

632BCFC7
Description: A virtual optical device detected an error reported by the Network File System server that cannot be recovered.
Action: Resolve any errors on the Network File System server.

632BCFC8
Description: A virtual optical device encountered a non-recoverable error.
Action: Install any available operating system updates.

632Cyyyy codes are virtual optical SRCs

632CC000
Description: Informational system log entry only.
Action: No corrective action is required.

632CC002
Description: Self configuring SCSI device (SCSD) selection or reselection timeout occurred.
Action: Refer to the hosting partition for problem analysis.

632CC010
Description: Undefined sense key returned by device.
Action: Refer to the hosting partition for problem analysis.

632CC020
Description: Configuration error.
Action: Refer to the hosting partition for problem analysis.

632CC100
Description: SCSD bus error occurred.
Action: Refer to the hosting partition for problem analysis.

632CC110
Description: SCSD command timeout occurred.
Action: Refer to the hosting partition for problem analysis.

632CC210
Description: Informational system log entry only.
Action: No corrective action is required.
632CC300
Description: Media or device error occurred.
Action: Refer to the hosting partition for problem analysis.

632CC301
Description: Media or device error occurred.
Action: Refer to the hosting partition for problem analysis.

632CC302
Description: Media or device error occurred.
Action: Refer to the hosting partition for problem analysis.

632CC303
Description: Media has an unknown format.
Action: No corrective action is required.

632CC333
Description: Incompatible media.
Action:
1. Verify that the disk has a supported format.
2. If the format is supported, clean the disk and attempt the failing operation again.
3. If the operation fails again with the same system reference code, ask your media source for a replacement disk.

632CC400
Description: Physical link error detected by device.
Action: Refer to the hosting partition for problem analysis.

632CC402
Description: An internal program error occurred.
Action: Install any available operating system updates.

632CCFF2
Description: Informational system log entry only.
Action: No corrective action is required.

632CCFF4
Description: Internal device error occurred.
Action: Refer to the hosting partition for problem analysis.

632CCFF6
Description: Informational system log entry only.
Action: No corrective action is required.

632CCFF7
Description: Informational system log entry only.
Action: No corrective action is required.

632CCFFE
Description: Informational system log entry only.
Action: No corrective action is required.

632CFF3D
Description: Informational system log entry only.
Action: No corrective action is required.

632CFF6D
Description: Informational system log entry only.
Action: No corrective action is required.
A1xxyyyy service processor SRCs
An A1xxyyyy system reference code (SRC) is an attention code that offers information about a platform or service processor dump or confirms a control panel function request. Take the steps in the Action column only if the BladeSystem appears to hang on an attention code.
Table 9 shows A1xxyyyy SRCs.
Table 9. A1xxyyyy service processor SRCs

A1xxyyyy
Description: Attention code
Action:
1. Go to “Checkout procedure” on page 184.
2. Replace the system board and chassis assembly, as described in “Replacing the FRU system-board and chassis assembly” on page 260.

A2xxyyyy Logical partition SRCs
An A2xxyyyy SRC is a logical partition reference code that is related to logical partitioning.
Table 10. A2xxyyyy Logical partition SRCs

A2xxyyyy
Description: See the description for the B200yyyy error code with the same yyyy value.
Action: Perform the action described in the B200yyyy error code with the same yyyy value.

A2D03000
Description: User-initiated immediate termination and MSD of a partition.
Action: No corrective action is required.

A2D03001
Description: User-initiated RSCDUMP of RPA partition's PFW content.
Action: No corrective action is required.

A2D03002
Description: User-initiated RSCDUMP of IBM i partition's SLIC bootloader and PFW content.
Action: No corrective action is required.
A700yyyy Licensed internal code SRCs
An A700yyyy system reference code (SRC) is an error or event code that is related to licensed internal code.
Table 11. A700yyyy Licensed internal code SRCs
A700173C
Description: Informational system log entry only.
Action: No corrective action is required.

A7003000
Description: A user-initiated platform dump occurred.
Action: No service action required.

A7004700
Description: Informational system log entry only.
Action: No corrective action is required.

A7004712
Description: A problem occurred when initializing, reading, or using system VPD.
Action: Replace the management card, as described in “Removing the tier 2 management card” on page 255 and “Installing the tier 2 management card” on page 256.

A7004713
Description: A problem occurred when initializing, reading, or using system VPD.
Action: Replace the management card, as described in “Removing the tier 2 management card” on page 255 and “Installing the tier 2 management card” on page 256.

A7004715
Description: A problem occurred when initializing, reading, or using system VPD.
Action: Replace the management card, as described in “Removing the tier 2 management card” on page 255 and “Installing the tier 2 management card” on page 256.
A7004721
Description: The World Wide Port Name (WWPN) Prefix is not valid.
Action: https://www-912.ibm.com/supporthome.nsf/document/51455410

A7004730
Description: Informational system log entry only.
Action: No corrective action is required.

A7004740
Description: Informational system log entry only.
Action: No corrective action is required.

A7004741
Description: Informational system log entry only.
Action: No corrective action is required.

A7004788
Description: Informational system log entry only.
Action: No corrective action is required.

A70047FF
Description: Informational system log entry only.
Action: No corrective action is required.

A7013003
Description: Partition-initiated PHYP-content RSCDUMP.
Action: No corrective action is required.

A700yyyy
Description: For any other A7xxyyyy SRC not listed here, see the description for the B7xxyyyy error code with the same xxyyyy value.
Action: Perform the action in the B7xxyyyy error code with the same xxyyyy value.
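The cross-reference rule stated in the A2xxyyyy and A700yyyy tables (an unlisted A2xxyyyy code maps to B200yyyy, and an unlisted A7xxyyyy code maps to B7xxyyyy) can be expressed as a small Python sketch. The function name and the `LISTED` set are assumptions made for illustration; the set simply repeats the codes that have their own rows in those two tables.

```python
# Codes that have their own rows in the A2xxyyyy and A700yyyy tables.
# For these, use the listed description and action directly.
LISTED = {
    "A2D03000", "A2D03001", "A2D03002",
    "A700173C", "A7003000", "A7004700", "A7004712", "A7004713",
    "A7004715", "A7004721", "A7004730", "A7004740", "A7004741",
    "A7004788", "A70047FF", "A7013003",
}


def related_b_code(src: str):
    """Return the B-code to look up for an unlisted A2/A7 SRC, or None if it is listed."""
    src = src.strip().upper()
    if len(src) != 8:
        raise ValueError(f"expected 8 characters: {src!r}")
    if src in LISTED:
        return None
    if src.startswith("A2"):
        # A2xxyyyy -> B200yyyy with the same yyyy value
        return "B200" + src[4:]
    if src.startswith("A7"):
        # A7xxyyyy -> B7xxyyyy with the same xxyyyy value
        return "B7" + src[2:]
    raise ValueError(f"not an A2xxyyyy or A7xxyyyy SRC: {src!r}")
```

Checking the listed codes first keeps the explicit table rows from being overridden by the general rule.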
AA00E1A8 to AA260005 Partition firmware attention codes
AAxx attention codes provide information about the next target state for the platform firmware. These codes might indicate that you need to perform an action.
Table 12 describes the partition firmware codes that might be displayed if the POST detects a problem. Each message description includes a suggested action to correct the problem.
Table 12. AA00E1A8 to AA260005 Partition firmware attention codes
- Follow the suggested actions in the order in which they are listed in the Action column until the problem is solved. If an action solves the problem, then you can stop performing the remaining actions.
- See Chapter 3, “Parts listing, Type 8406,” on page 229 to determine which components are CRUs and which components are FRUs.

AA00E1A8
Description: The system is booting to the open firmware prompt.
Action: At the open firmware prompt, type dev /packages/gui obe and press Enter; then, type 1 to select SMS Menu.

AA00E1A9
Description: The system is booting to the System Management Services (SMS) menus.
Action:
1. If the system or partition returns to the SMS menus after a boot attempt failed, use the SMS menus to check the progress indicator history for a BAxx xxxx error, which may indicate why the boot attempt failed. Follow the actions for that error code to resolve the boot problem.
2. Use the SMS menus to establish the boot list and restart the blade server.

AA00E1B0
Description: Waiting for the user to select the language and keyboard. The menu should be visible on the console.
Action:
1. Check for server firmware updates.
2. Apply any available updates.
3. If the problem persists:
   a. Go to “Checkout procedure” on page 184.
   b. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
AA00E1B1
Description: Waiting for the user to accept or decline the license
Action:
1. Check for server firmware updates.
2. Apply any available updates.
3. If the problem persists:
   a. Go to “Checkout procedure” on page 184.
   b. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.

AA060007
Description: A keyboard was not found.
Action: Verify that a keyboard is attached to the USB port that is assigned to the partition.

AA06000B
Description: The system or partition was not able to find an operating system on any of the devices in the boot list.
Action:
1. Use the SMS menus to modify the boot list so that it includes devices that have a known-good operating system and restart the blade server.
2. If the problem remains, go to “Boot problem resolution” on page 190.

AA06000C
Description: The media in a device in the boot list was not bootable.
Action:
1. Replace the media in the device with known-good media or modify the boot list to boot from another bootable device.
2. If the problem remains, go to “Boot problem resolution” on page 190.

AA06000D
Description: The media in the device in the bootlist was not found under the I/O adapter specified by the bootlist.
Action:
1. Verify that the media from which you are trying to boot is bootable or modify the boot list to boot from another bootable device.
2. If the problem remains, go to “Boot problem resolution” on page 190.

AA06000E
Description: The adapter specified in the boot list is not present or is not functioning.
Action:
- For an AIX operating system:
  1. Try booting the blade server from another bootable device; then, run AIX online diagnostics against the failing adapter.
  2. If AIX cannot be booted from another device, boot the blade server using the stand-alone diagnostics CD or a NIM server; then, run diagnostics against the failing adapter.
- For a Linux operating system, boot the blade server using the stand-alone diagnostics CD or a NIM server; then, run diagnostics against the failing adapter.
AA060011
Description: The firmware did not find an operating system image and at least one hard disk in the boot list was not detected by the firmware. The firmware is retrying the entries in the boot list.
Action: This might occur if a disk enclosure that contains the boot disk is not fully initialized or if the boot disk belongs to another partition. Verify that:
- The boot disk belongs to the partition from which you are trying to boot.
- The boot list in the SMS menus is correct.

AA130013
Description: Bootable media is missing from a USB CD-ROM
Action: Verify that a bootable CD is properly inserted in the CD or DVD drive and retry the boot operation.

AA130014
Description: The media in a USB CD-ROM has changed.
Action:
1. Retry the operation.
2. Check for server firmware updates; then, install the updates if available and retry the operation.

AA170210
Description: Setenv/$setenv parameter error - the name contains a null character.
Action:
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.

AA170211
Description: Setenv/$setenv parameter error - the value contains a null character.
Action:
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.

AA190001
Description: The hypervisor function to get or set the time-of-day clock reported an error.
Action:
1. Use the operating system to set the system clock.
2. If the problem persists, check for server firmware updates.
3. Install any available updates and retry the operation.

AA260001
Description: Enter the Type Model Number (Must be 8 characters)
Action: Enter the machine type and model of the blade server at the prompt.

AA260002
Description: Enter the Serial Number (Must be 7 characters)
Action: Enter the serial number of the blade server at the prompt.

AA260003
Description: Enter System Unique ID (Must be 12 characters)
Action: Enter the system unique ID number at the prompt.

AA260004
Description: Enter WorldWide Port Number (Must be 12 characters)
Action: Enter the worldwide port number of the blade server at the prompt.

AA260005
Description: Enter Brand (Must be 2 characters)
Action: Enter the brand number of the blade server at the prompt.
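The AA260001 to AA260005 prompts each require a fixed number of characters. As a minimal sketch, a value could be checked against those lengths before it is typed at the prompt. The `REQUIRED_LENGTHS` dictionary and `check_entry` function are hypothetical names; only the lengths come from the table, and the sample values in the usage note are made up.

```python
# Field name and required length for each prompt, taken from Table 12.
REQUIRED_LENGTHS = {
    "AA260001": ("Type Model Number", 8),
    "AA260002": ("Serial Number", 7),
    "AA260003": ("System Unique ID", 12),
    "AA260004": ("WorldWide Port Number", 12),
    "AA260005": ("Brand", 2),
}


def check_entry(code: str, value: str) -> bool:
    """Return True when the value has the length that the prompt requires."""
    _name, length = REQUIRED_LENGTHS[code.strip().upper()]
    return len(value) == length
```

For example, `check_entry("AA260001", "8406-70Y")` passes the length check because the string has 8 characters. The sketch checks length only; it does not confirm that the value is in the format the firmware expects.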
Bxxxxxxx Service processor early termination SRCs
A Bxxxxxxx system reference code (SRC) is an error code that is related to an event or exception that occurred in the service processor firmware.
To find a description of an SRC that is not listed in this PS700 blade server documentation, refer to the POWER7 Reference Code Lookup page at http://publib.boulder.ibm.com/infocenter/powersys/v3r1m5/index.jsp?topic=/ipha8/codefinder.htm.
Table 13 describes error codes that might occur if POST detects a problem. The description also includes suggested actions to correct the problem.
Note: For problems persisting after completing the suggested actions, see “Solving undetermined problems” on page 227.
Table 13. B181xxxx Service processor early termination SRCs
B181 xxxx error codes (the rightmost four characters are listed). The Action for every code in this table is: Go to “Checkout procedure” on page 184.

7200: Invalid boot request
7201: Service processor failure
7202: The permanent and temporary firmware sides are both marked invalid
7203: Error setting boot parameters
7204: Error reading boot parameters
7205: Boot code error
7206: Unit check timer was reset
7207: Error reading from NVRAM
7208: Error writing to NVRAM
7209: The service processor boot watchdog timer expired and forced the service processor to attempt a boot from the other firmware image in the service processor flash memory
720A: Power-off reset occurred. FipsDump should be analyzed: Possible software problem
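Because each SRC family in this chapter is identified by its leading characters, a first triage step can be sketched as a prefix match. The `FAMILIES` list and `src_family` function are hypothetical names, and the list covers only the families described in this part of the chapter, so an unmatched code simply returns `None`.

```python
# (prefix, family) pairs drawn from the section headings in this chapter.
FAMILIES = [
    ("6", "Virtual optical SRC"),
    ("A1", "Service processor attention code"),
    ("A2", "Logical partition SRC"),
    ("A7", "Licensed internal code SRC"),
    ("AA", "Partition firmware attention code"),
    ("B181", "Service processor early termination SRC"),
    ("B200", "Logical partition SRC"),
]


def src_family(src: str):
    """Return the family name for an SRC, or None if no known prefix matches."""
    src = src.strip().upper()
    # Try longer prefixes first so a more specific prefix always wins.
    for prefix, family in sorted(FAMILIES, key=lambda p: len(p[0]), reverse=True):
        if src.startswith(prefix):
            return family
    return None
```

Trying longer prefixes first is a defensive choice: the prefixes listed here do not overlap, but the ordering keeps the sketch correct if broader prefixes are added later.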
B200xxxx Logical partition SRCs
A B200xxxx SRC is a logical partition reference code that is related to logical partitioning.
Table 14 describes system reference codes that might be displayed if system firmware detects a problem. Suggested actions to correct the problem are also listed.
Note: For problems persisting after completing the suggested actions, see “Checkout procedure” on page 184 and “Solving undetermined problems” on page 227.
Table 14. B200xxxx Logical partition SRCs
- Follow the suggested actions in the order in which they are listed in the Action column until the problem is solved. If an action solves the problem, then you can stop performing the remaining actions.
- See Chapter 3, “Parts listing, Type 8406,” on page 229 to determine which components are CRUs and which components are FRUs.

B2001130
Description: A problem occurred during the migration of a partition. You attempted to migrate a partition to a system that has a power or thermal problem. The migration will not continue.
Action: Look for and fix power or thermal problems and then retry the migration.

B2001131
Description: A problem occurred during the migration of a partition. The migration of a partition did not complete.
Action: Check for server firmware updates; then, install the updates if available.

B2001132
Description: A problem occurred during the startup of a partition. A platform firmware error occurred while it was trying to allocate memory. The startup will not continue.
Action: Collect a platform dump and then go to “Isolating firmware problems” on page 218.

B2001133
Description: A problem occurred during the migration of a partition. The migration of a partition did not complete.
Action: Check for server firmware updates; then, install the updates if available.

B2001134
Description: A problem occurred during the migration of a partition. The migration of a partition did not complete.
Action: Check for server firmware updates; then, install the updates if available.

B2001140
Description: A problem occurred during the migration of a partition. The migration of a partition did not complete.
Action: Check for server firmware updates; then, install the updates if available.

B2001141
Description: A problem occurred during the migration of a partition. The migration of a partition did not complete.
Action: Check for server firmware updates; then, install the updates if available.
B2001142
Description: A problem occurred during the migration of a partition. The migration of a partition did not complete.
Action: Check for server firmware updates; then, install the updates if available.

B2001143
Description: A problem occurred during the migration of a partition. The migration of a partition did not complete.
Action: Check for server firmware updates; then, install the updates if available.

B2001144
Description: A problem occurred during the migration of a partition. The migration of a partition did not complete.
Action: Check for server firmware updates; then, install the updates if available.

B2001148
Description: A problem occurred during the migration of a partition. The migration of a partition did not complete.
Action: Check for server firmware updates; then, install the updates if available.

B2001150
Description: During the startup of a partition, a partitioning configuration problem occurred.
Action: Go to “Verifying the partition configuration” on page 186.

B2001151
Description: A problem occurred during the migration of a partition. The migration of a partition did not complete.
Action: Check for server firmware updates; then, install the updates if available.

B2001170
Description: During the startup of a partition, a failure occurred due to a validation error.
Action: Go to “Verifying the partition configuration” on page 186.

B2001225
Description: A problem occurred during the startup of a partition. The partition attempted to start up prior to the platform fully initializing. Restart the partition after the platform has fully completed and the platform is not in standby mode.
Action: Restart the partition.

B2001230
Description: During the startup of a partition, a partitioning configuration problem occurred; the partition is lacking the necessary resources to start up.
Action: Go to “Verifying the partition configuration” on page 186.

B2001260
Description: A problem occurred during the startup of a partition. The partition could not start at the Timed Power On setting because the partition was not set to Normal.
Action: Set the partition to Normal.
B2001265
Description: The partition could not start up. An operating system Main Storage Dump startup was attempted with the startup side on D-mode, which is not a valid operating system startup scenario. The startup will be halted. This SRC can occur when a D-mode SLIC installation fails and attempts a Main Storage Dump.
Action: Correct the startup settings.

B2001266
Description: The partition could not start up. You are attempting to start up an operating system that is not supported.
Action: Install a supported operating system and restart the partition.

B2001280
Description: A problem occurred during a partition Main Storage Dump. A mainstore dump startup did not complete due to a configuration mismatch.
Action: Go to “Isolating firmware problems” on page 218.

B2001281
Description: A partition memory error occurred. The failed memory will no longer be used.
Action: Restart the partition.

B2001282
Description: A problem occurred during the startup of a partition.
Action: Go to “Isolating firmware problems” on page 218.

B2001320
Description: A problem occurred during the startup of a partition. No default load source was selected. The startup will attempt to continue, but there may not be enough information to find the correct load source.
Action: Configure a load source for the partition. Then restart the partition.

B2001321
Description: A problem occurred during the startup of a partition.
Action: Verify that the correct slot is specified for the load source. Then restart the partition.

B2001322
Description: In the partition startup, code failed during a check of the load source path.
Action: Verify that the path for the load source is specified correctly. Then restart the partition.

B2002048
Description: A problem occurred during a partition Main Storage Dump. A mainstore dump startup did not complete due to a copy error.
Action: Go to “Isolating firmware problems” on page 218.

B2002054
Description: A problem occurred during a partition Main Storage Dump. A mainstore dump IPL did not complete due to a configuration mismatch.
Action: Go to “Isolating firmware problems” on page 218.
B2002058
Description: A problem occurred during a partition Main Storage Dump. A mainstore dump startup did not complete due to a copy error.
Action: Go to “Isolating firmware problems” on page 218.

B2002210
Description: Informational system log entry only.
Action: No corrective action is required.

B2002220
Description: Informational system log entry only.
Action: No corrective action is required.

B2002250
Description: During the startup of a partition, an attempt to toggle the power state of a slot has failed.
Action: Check for server firmware updates; then, install the updates if available.

B2002260
Description: During the startup of a partition, the partition firmware attempted an operation that failed.
Action: Go to “Isolating firmware problems” on page 218.

B2002300
Description: During the startup of a partition, an attempt to toggle the power state of a slot has failed.
Action: Check for server firmware updates; then, install the updates if available.

B2002310
Description: During the startup of a partition, the partition firmware attempted an operation that failed.
Action: Go to “Isolating firmware problems” on page 218.

B2002320
Description: During the startup of a partition, the partition firmware attempted an operation that failed.
Action: Go to “Isolating firmware problems” on page 218.

B2002425
Description: During the startup of a partition, the partition firmware attempted an operation that failed.
Action: Go to “Isolating firmware problems” on page 218.

B2002426
Description: During the startup of a partition, the partition firmware attempted an operation that failed.
Action: Go to “Isolating firmware problems” on page 218.

B2002475
Description: During the startup of a partition, a slot that was needed for the partition was either empty or the device in the slot has failed.
Action: Check for server firmware updates; then, install the updates if available.

B2002485
Description: During the startup of a partition, the partition firmware attempted an operation that failed.
Action: Go to “Isolating firmware problems” on page 218.

B2003000
Description: Informational system log entry only.
Action: No corrective action is required.

B2003081
Description: During the startup of a partition, the startup did not complete due to a copy error.
Action: Check for server firmware updates; then, install the updates if available.

B2003084
Description: A problem occurred during the startup of a partition. The adapter type might not be supported.
Action: Verify that the adapter type is supported.

B2003088
Description: Informational system log entry only.
Action: No corrective action is required.
B200308C
Description: A problem occurred during the startup of a partition. The adapter type cannot be determined.
Action: Verify that a valid I/O Load Source is tagged.

B2003090
Description: A problem occurred during the startup of a partition.
Action: Go to “Isolating firmware problems” on page 218.

B2003110
Description: A problem occurred during the startup of a partition.
Action: Go to “Isolating firmware problems” on page 218.

B2003113
Description: A problem occurred during the startup of a partition.
Action: Look for B7xx xxxx errors and resolve them.

B2003114
Description: A problem occurred during the startup of a partition.
Action: Look for other errors and resolve them.

B2003120
Description: Informational system log entry only.
Action: No corrective action is required.

B2003123
Description: Informational system log entry only.
Action: No corrective action is required.

B2003125
Description: During the startup of a partition, the blade server firmware could not obtain a segment of main storage within the blade server to use for managing the creation of a partition.
Action: Check for server firmware updates; then, install the updates if available.

B2003128
Description: A problem occurred during the startup of a partition. A return code for an unexpected failure was returned when attempting to query the load source path.
Action: Look for and resolve B700 69xx errors.

B2003130
Description: A problem occurred during the startup of a partition.
Action: Check for server firmware updates; then, install the updates if available.

B2003135
Description: A problem occurred during the startup of a partition.
Action: Check for server firmware updates; then, install the updates if available.

B2003140
Description: A problem occurred during the startup of a partition. This is a configuration problem in the partition.
Action: Reconfigure the partition to include the intended load source path.

B2003141
Description: Informational system log entry only.
Action: No corrective action is required.

B2003142
Description: Informational system log entry only.
Action: No corrective action is required.

B2003143
Description: Informational system log entry only.
Action: No corrective action is required.

B2003144
Description: Informational system log entry only.
Action: No corrective action is required.

B2003145
Description: Informational system log entry only.
Action: No corrective action is required.

B2003200
Description: Informational system log entry only.
Action: No corrective action is required.

B2004158
Description: Informational system log entry only.
Action: No corrective action is required.

B2004400
Description: A problem occurred during the startup of a partition.
Action: Check for server firmware updates; then, install the updates if available.
B2005106
Description: A problem occurred during the startup of a partition. There is not enough space to contain the partition main storage dump. The startup will not continue.
Action: Verify that there is sufficient memory available to start the partition as it is configured. If there is already enough memory, then go to “Isolating firmware problems” on page 218.

B2005109
Description: A problem occurred during the startup of a partition. There was a partition main storage dump problem. The startup will not continue.
Action: Go to “Isolating firmware problems” on page 218.

B2005114
Description: A problem occurred during the startup of a partition. There is not enough space to contain the partition main storage dump. The startup will not continue.
Action: Go to “Isolating firmware problems” on page 218.

B2005115
Description: A problem occurred during the startup of a partition. There was an error reading the partition main storage dump from the partition load source into main storage. The startup will attempt to continue.
Action: If the startup does not continue, look for and resolve other errors.

B2005117
Description: A problem occurred during the startup of a partition. A partition main storage dump has occurred but cannot be written to the load source device because a valid dump already exists.
Action: Use the Main Storage Dump Manager to rename or copy the current main storage dump.

B2005121
Description: A problem occurred during the startup of a partition. There was an error writing the partition main storage dump to the partition load source. The startup will not continue.
Action: Look for related errors in the "Product Activity Log" and fix any problems found. Use virtual control panel function 34 to retry the current Main Store Dump startup while the partition is still in the failed state.

B2005122
Description: Informational system log entry only.
Action: No corrective action is required.

B2005123
Description: Informational system log entry only.
Action: No corrective action is required.

B2005135
Description: A problem occurred during the startup of a partition. There was an error writing the partition main storage dump to the partition load source. The main store dump startup will continue.
Action: Look for other errors and resolve them.

B2005137
Description: A problem occurred during the startup of a partition. There was an error writing the partition main storage dump to the partition load source. The main store dump startup will continue.
Action: Look for other errors and resolve them.
B2005145 A problem occurred during the startup of a partition. There was an error writing the partition main storage dump to the partition load source. The main store dump startup will continue. Look for other errors and resolve them.
B2005148 A problem occurred during the startup of a partition. An error occurred while doing a main storage dump that would have caused another main storage dump. The startup will not continue. Go to “Isolating firmware problems” on page 218.
B2005149 A problem occurred during the startup of a partition while doing a Firmware Assisted Dump that would have caused another Firmware Assisted Dump. Check for server firmware updates; then, install the updates if available.
B200514A A Firmware Assisted Dump did not complete due to a copy error. Check for server firmware updates; then, install the updates if available.
B200542A A Firmware Assisted Dump did not complete due to a read error. Check for server firmware updates; then, install the updates if available.
B200542B A Firmware Assisted Dump did not complete due to a copy error. Check for server firmware updates; then, install the updates if available.
B200543A A Firmware Assisted Dump did not complete due to a copy error. Check for server firmware updates; then, install the updates if available.
B200543B A Firmware Assisted Dump did not complete due to a copy error. Check for server firmware updates; then, install the updates if available.
B200543C Informational system log entry only. No corrective action is required.
B200543D A Firmware Assisted Dump did not complete due to a copy error. Check for server firmware updates; then, install the updates if available.
B2006006 During the startup of a partition, a system firmware error occurred when the partition memory was being initialized; the startup will not continue. Go to “Isolating firmware problems” on page 218.
B2006006 A problem occurred during the startup of a partition. The partition could not reserve the memory required for IPL. Contact IBM support.
B2006012 During the startup of a partition, the partition LID failed to completely load into the partition main storage area. Go to “Isolating firmware problems” on page 218.
B2006015 A problem occurred during the startup of a partition. The load source media is corrupted or not valid.
B2006025 A problem occurred during the startup of a partition. This is a problem with the load source media being corrupt or not valid.
B2006027 During the startup of a partition, a failure occurred when allocating memory for an internal object used for firmware module load operations.
B2006110 A problem occurred during the startup of a partition. There was an error on the load source device. The startup will attempt to continue.
B200690A During the startup of a partition, an error occurred while copying open firmware into the partition load area.
B2007200 Informational system log entry only. No corrective action is required.
B2008080 Informational system log entry only. No corrective action is required.
B2008081 During the startup of a partition, an internal firmware time-out occurred; the partition might continue to start up but it can experience problems while running.
B2008105 During the startup of a partition, there was a failure loading the VPD areas of the partition; the load source media has been corrupted or is unsupported on this server.
B2008106 A problem occurred during the startup of a partition. The startup will not continue.
B2008107 During the startup of a partition, there was a problem getting a segment of main storage in the blade server main storage.
B2008109 During the startup of a partition, a failure occurred. The startup will not continue.
B2008111 A problem occurred during the startup of a partition.
Replace the load source media.
Replace the load source media.
1. Make sure that enough main storage was allocated to the partition.
2. Retry the operation.
Look for other errors and resolve them.
Go to “Isolating firmware problems” on page 218.
Check for server firmware updates; then, install the updates if available.
Check for server firmware updates; then, install the updates if available.
Replace the load source media.
Check for server firmware updates; then, install the updates if available.
1. Make sure that there is enough memory to start up the partition.
2. Check for server firmware updates; then, install the updates if available.
Check for server firmware updates; then, install the updates if available.
B2008112 During the startup of a partition, a failure occurred; the startup will not continue. Check for server firmware updates; then, install the updates if available.
B2008113 During the startup of a partition, an error occurred while mapping memory for the partition startup. Check for server firmware updates; then, install the updates if available.
B2008114 During the startup of a partition, there was a failure verifying the VPD for the partition resources during startup. Check for server firmware updates; then, install the updates if available.
B2008115 During the startup of a partition, there was a low level partition-to-partition communication failure. Check for server firmware updates; then, install the updates if available.
B2008117 During the startup of a partition, the partition did not start up due to a system firmware error. Check for server firmware updates; then, install the updates if available.
B2008121 During the startup of a partition, the partition did not start up due to a system firmware error. Go to “Isolating firmware problems” on page 218.
B2008123 During the startup of a partition, the partition did not start up due to a system firmware error. Go to “Isolating firmware problems” on page 218.
B2008125 During the startup of a partition, the partition did not start up due to a system firmware error. Go to “Isolating firmware problems” on page 218.
B2008127 During the startup of a partition, the partition did not start up due to a system firmware error. Go to “Isolating firmware problems” on page 218.
B2008129 During the startup of a partition, the partition did not start up due to a system firmware error. Go to “Isolating firmware problems” on page 218.
B200813A There was a problem establishing a console. Go to “Isolating firmware problems” on page 218.
B2008140 Informational system log entry only. No corrective action is required.
B2008141 Informational system log entry only. No corrective action is required.
B2008142 Informational system log entry only. No corrective action is required.
B2008143 Informational system log entry only. No corrective action is required.
B2008144 Informational system log entry only. No corrective action is required.
B2008145 Informational system log entry only. No corrective action is required.
B2008150 System firmware detected an error. Collect a platform dump and then go to “Isolating firmware problems” on page 218.
B2008151 System firmware detected an error. Use the Integrated Virtualization Manager (IVM) or management console to increase the Logical Memory Block (LMB) size, and to reduce the number of virtual devices for the partition.
B2008152 No active system processors. Verify that processor resources are assigned to the partition.
B2008160 A problem occurred during the migration of a partition. Contact IBM support.
B2008161 A problem occurred during the migration of a partition. Contact IBM support.
B200A100 A partition ended abnormally; the partition could not stay running and shut itself down.
1. Check the error logs and take the actions for the error codes that are found.
2. Go to “Isolating firmware problems” on page 218.
B200A101 A partition ended abnormally; the partition could not stay running and shut itself down.
1. Check the error logs and take the actions for the error codes that are found.
2. Go to “Isolating firmware problems” on page 218.
B200A140 A lower priority partition lost a usable processor to supply it to a higher priority partition with a bad processor. Evaluate the entire LPAR configuration. Adjust partition profiles with the new number of processors available in the system.
B200B07B Informational system log entry only. No corrective action is required.
B200B215 A problem occurred after a partition ended abnormally. There was a communications problem between this partition's service processor and the platform's service processor. Restart the platform.
B2005127 Timeout occurred during a main store dump IPL. There was not enough memory available for the dump to complete before the timeout occurred. Retry the main store dump IPL, or else power on the partition normally.
B2D03001 Informational system log entry only. No corrective action is required.
B2D03002 Informational system log entry only. No corrective action is required.
B200C1F0 An internal system firmware error occurred during a partition shutdown or a restart. Go to “Isolating firmware problems” on page 218.
B200D150 A partition ended abnormally; there was a communications problem between this partition and the code that handles resource allocation. Check for server firmware updates; then, install the updates if available.
B200E0AA A problem occurred during the power off of a partition. Go to “Isolating firmware problems” on page 218.
B200F001 A problem occurred during the startup of a partition. An operation has timed out. Look for other errors and resolve them.
B200F003 During the startup of a partition, the partition processor(s) did not start the firmware within the time-out window. Collect the partition dump information; then, go to “Isolating firmware problems” on page 218.
B200F004 Informational system log entry only. No corrective action is required.
B200F005 Informational system log entry only. No corrective action is required.
B200F006 During the startup of a partition, the code load operation for the partition startup timed out.
1. Check the error logs and take the actions for the error codes that are found.
2. Go to “Isolating firmware problems” on page 218.
B200F007 During a shutdown of the partition, a time-out occurred while trying to stop a partition. Check for server firmware updates; then, install the updates if available.
B200F008 Informational system log entry only. No corrective action is required.
B200F009 Informational system log entry only. No corrective action is required.
B200F00A Informational system log entry only. No corrective action is required.
B200F00B Informational system log entry only. No corrective action is required.
B200F00C Informational system log entry only. No corrective action is required.
B200F00D Informational system log entry only. No corrective action is required.
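The rule stated in the table notes (perform the suggested actions in the order listed, and stop as soon as one of them solves the problem) can be sketched in Python. This is only an illustration under stated assumptions, not part of any IBM tooling: the function `run_actions`, the `problem_solved` callback, and the simulated actions are hypothetical names introduced here.

```python
def run_actions(actions, problem_solved):
    """Perform actions in their listed order and stop once the problem is solved.

    actions: list of (label, callable) pairs, in Action-column order.
    problem_solved: callable that returns True when the problem no longer occurs.
    Returns the labels of the actions that were actually performed.
    """
    performed = []
    for label, action in actions:
        action()                # carry out this suggested action
        performed.append(label)
        if problem_solved():    # the remaining actions are skipped
            break
    return performed


# Hypothetical usage: the first action is simulated as fixing the problem.
state = {"solved": False}


def check_error_logs():
    state["solved"] = True      # simulated outcome, for illustration only


def isolate_firmware_problem():
    pass                        # placeholder for the manual procedure


steps = [
    ("Check the error logs", check_error_logs),
    ("Isolating firmware problems", isolate_firmware_problem),
]
performed = run_actions(steps, lambda: state["solved"])
```

In this simulated run only the first step is performed, because the check after it reports that the problem is solved.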
B700xxxx Licensed internal code SRCs
A B700xxxx system reference code (SRC) is an error code or event code that is related to licensed internal code.
Table 15 describes the system reference codes that might be displayed if system firmware detects a problem. Suggested actions to correct the problem are also listed.
Note: For problems persisting after completing the suggested actions, see “Checkout procedure” on page 184 and “Solving undetermined problems” on page 227.
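Because Table 15 lists only the last four characters of each B700xxxx code, it can help to split a full SRC into its prefix and error code before looking it up. The following Python sketch shows one way to do that, assuming an SRC is written as eight hexadecimal characters. The names `SRC_FAMILIES`, `split_src`, and `describe_src` are hypothetical, and the sketch covers only the two SRC families described in this part of the guide.

```python
import re

# Hypothetical lookup from SRC prefix to the table that documents it.
# Only the two families covered in this part of the guide are listed.
SRC_FAMILIES = {
    "B200": "Logical partition SRC (Table 14)",
    "B700": "Licensed internal code SRC (Table 15)",
}


def split_src(src):
    """Split an SRC into (prefix, error code), assuming 8 hexadecimal characters."""
    code = src.strip().upper()
    if not re.fullmatch(r"[0-9A-F]{8}", code):
        raise ValueError(f"not an 8-character hexadecimal SRC: {src!r}")
    return code[:4], code[4:]


def describe_src(src):
    """Return a short hint about which table to consult for this SRC."""
    prefix, error_code = split_src(src)
    family = SRC_FAMILIES.get(prefix, "not covered by this sketch")
    return f"{prefix}{error_code}: {family}; look up error code {error_code}"
```

For example, `describe_src("b7006990")` normalizes the input to upper case and points to Table 15 with error code 6990.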
Table 15. B700xxxx Licensed internal code SRCs
• Follow the suggested actions in the order in which they are listed in the Action column until the problem is solved. If an action solves the problem, then you can stop performing the remaining actions.
• See Chapter 3, “Parts listing, Type 8406,” on page 229 to determine which components are CRUs and which components are FRUs.
B700 xxxx Error Codes Description Action
0102 System firmware detected an error. A machine check occurred during startup.
1. Collect the event log information.
2. Go to “Isolating firmware problems” on page 218.
0103 System firmware detected a failure.
1. Collect the event log information.
2. Collect the platform dump information.
3. Go to “Isolating firmware problems” on page 218.
0104 System firmware failure. Machine check, undefined error occurred.
1. Check for server firmware updates.
2. Update the firmware.
0105 System firmware detected an error. More than one request to terminate the system was issued. Go to “Isolating firmware problems” on page 218.
0106 System firmware failure.
1. Collect the event log information.
2. Collect the platform dump information.
3. Go to “Isolating firmware problems” on page 218.
0107 System firmware failure. The system detected an unrecoverable machine check condition.
1. Collect the event log information.
2. Collect the platform dump information.
3. Go to “Isolating firmware problems” on page 218.
0200 System firmware has experienced a low storage condition. No immediate action is necessary. Continue running the system normally. At the earliest convenient time or service window, work with IBM Support to collect a platform dump and restart the system; then, go to “Isolating firmware problems” on page 218.
0201 System firmware detected an error. No immediate action is necessary. Continue running the system normally. At the earliest convenient time or service window, work with IBM Support to collect a platform dump and restart the system; then, go to “Isolating firmware problems” on page 218.
0202 Informational system log entry only. No corrective action is required.
0302 System firmware failure.
1. Collect the platform dump information.
2. Go to “Isolating firmware problems” on page 218.
0441 Service processor failure. The platform encountered an error early in the startup or termination process. Replace the system-board and chassis assembly, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
0443 Service processor failure. Replace the system-board and chassis assembly, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
0601 Informational system log entry only. No corrective action is required.
Note: This code and associated data can be used to determine why the time of day for a partition was lost.
0602 System firmware detected an error condition.
1. Collect the event log information.
2. Go to “Isolating firmware problems” on page 218.
0611 There is a problem with the system hardware clock; the clock time is invalid. Use the operating system to set the system clock.
0621 Informational system log entry only. No corrective action is required.
0641 System firmware detected an error.
1. Collect the platform dump information.
2. Go to “Isolating firmware problems” on page 218.
0650 System firmware detected an error. Resource management was unable to allocate main storage. A platform dump was initiated.
1. Collect the event log.
2. Collect the platform dump data.
3. Collect the partition configuration information.
4. Go to “Isolating firmware problems” on page 218.
0651 The system detected an error in the system clock hardware. Replace the system-board and chassis assembly, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
0803 Informational system log entry only. No corrective action is required.
0804 Informational system log entry only. No corrective action is required.
0A00 Informational system log entry only. No corrective action is required.
0A01 Informational system log entry only. No corrective action is required.
0A10 Informational system log entry only. No corrective action is required.
1150 Informational system log entry only. No corrective action is required.
1151 Informational system log entry only. No corrective action is required.
1152 Informational system log entry only. No corrective action is required.
1160 Service processor failure
1. Go to “Isolating firmware problems” on page 218.
2. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
1161 Informational system log entry only. No corrective action is required.
1730 The VPD for the system is not what was expected at startup. Replace the management card, as described in “Removing the tier 2 management card” on page 255 and “Installing the tier 2 management card” on page 256.
1731 The VPD on a memory DIMM is not correct and the memory on the DIMM cannot be used, resulting in reduced memory. Replace the MEMDIMM symbolic CRU, as described in “Service processor problems” on page 200.
1732 The VPD on a processor card is not correct and the processor card cannot be used, resulting in reduced processing power. Replace the system-board and chassis assembly, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
1733 System firmware failure. The startup will not continue. Look for and correct B1xxxxxx errors. If there are no serviceable B1xxxxxx errors, or if correcting the errors does not correct the problem, contact IBM support to reset the server firmware settings.
Attention: Resetting the server firmware settings results in the loss of all of the partition data that is stored on the service processor. Before continuing with this operation, manually record all settings that you intend to preserve.
The service processor reboots after IBM Support resets the server firmware settings.
If the problem persists, replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
173A A VPD collection overflow occurred.
1. Look for and resolve other errors.
2. If there are no other errors:
a. Update the firmware to the current level, as described in “Updating the firmware” on page 263.
b. You might also have to update the management module firmware to a compatible level.
173B A system firmware failure occurred during VPD collection. Look for and correct other B1xxxxxx errors.
4091 Informational system log entry only. No corrective action is required.
4400 There is a platform dump to collect.
1. Collect the platform dump information.
2. Go to “Isolating firmware problems” on page 218.
4401 System firmware failure. The system firmware detected an internal problem. Go to “Isolating firmware problems” on page 218.
4402 A system firmware error occurred while attempting to allocate the memory necessary to create a platform dump. Go to “Isolating firmware problems” on page 218.
4705 System firmware failure. A problem occurred when initializing, reading, or using the system VPD. The Capacity on Demand function is not available. Restart the system.
4710 Informational system log entry only. No corrective action is required.
4714 Informational system log entry only. No corrective action is required.
4715 A problem occurred when initializing, reading, or using system VPD. Replace the management card, as described in “Removing the tier 2 management card” on page 255 and “Installing the tier 2 management card” on page 256.
4750 Informational system log entry only. No corrective action is required.
4788 Informational system log entry only. No corrective action is required.
47CB Informational system log entry only. No corrective action is required.
5120 System firmware detected an error. If the system is not exhibiting problematic behavior, you can ignore this error. Otherwise, go to “Isolating firmware problems” on page 218.
5121 System firmware detected a programming problem for which a platform dump may have been initiated.
1. Collect the event log information.
2. Collect the platform dump information.
3. Go to “Isolating firmware problems” on page 218.
5122 An error occurred during a search for the load source. If the partition fails to start up, go to “Isolating firmware problems” on page 218. Otherwise, no corrective action is required.
5123 Informational system log entry only. No corrective action is required.
5190 Operating system error. The server firmware detected a problem in an operating system. Check for error codes in the partition that is reporting the error and take the appropriate actions for those error codes.
5191 System firmware detected a virtual I/O configuration error.
1. Use the Integrated Virtualization Manager (IVM) or management console to verify or reconfigure the invalid virtual I/O configuration.
2. Check for server firmware updates; then, install the updates if available.
5209 Informational system log entry only. No corrective action is required.
5219 Informational system log entry only. No corrective action is required.
5300 System firmware detected a failure while partitioning resources. The platform partitioning code encountered an error. Check the management-module event log for error codes; then, take the actions associated with those error codes.
5301 User intervention required. The system detected a problem with the partition configuration. Use the Integrated Virtualization Manager (IVM) or management console to reallocate the system resources.
5302 An unsupported Preferred Operating System was detected. The Preferred Operating System specified is not supported. The IPL will not continue. Work with IBM support to select a supported Preferred Operating System; then, re-IPL the system.
5303 An unsupported Preferred Operating System was detected. The Preferred Operating System specified is not supported. The IPL will continue. Work with IBM support to select a supported Preferred Operating System; then, re-IPL the system.
5304 The number of available World Wide Port Names (WWPN) is low. See https://www-912.ibm.com/supporthome.nsf/document/51455410
5305 The number of available World Wide Port Names (WWPN) is low. See https://www-912.ibm.com/supporthome.nsf/document/51455410
5400 System firmware detected a problem with a processor. Replace the system-board and chassis assembly, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
5442 System firmware detected an error. Replace the system-board and chassis assembly, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
54DD Informational system log entry only. No corrective action is required.
5600 Informational system log entry only. No corrective action is required.
5601 System firmware failure. There was a problem initializing, reading, or using system location codes. Go to “Isolating firmware problems” on page 218.
5602 The system has out-of-date VPD LIDs. Check for server firmware updates; then, install the updates if available.
5603 Enclosure feature code and/or serial number not valid. Verify that the machine type, model, and serial number are correct for this server.
6900 PCI host bridge failure.
1. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
2. If the problem persists, use the “PCI expansion card (PIOCARD) problem isolation procedure” on page 194 to determine the failing component.
6906 System bus error. Replace the system-board and chassis assembly, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
6907 System bus error
1. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
2. Go to “Isolating firmware problems” on page 218.
6908 System bus error. Replace the system-board and chassis assembly, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
6909 System bus error.
1. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
2. Go to “Isolating firmware problems” on page 218.
6911 Platform LIC unable to find or retrieve VPD LID file. Check for server firmware updates; then, install the updates if available.
6912 Platform LIC unable to find or retrieve VPD LID file. Check for server firmware updates; then, install the updates if available.
6944 Informational system log entry only. No corrective action is required.
6950 A platform dump has occurred.
1. Collect the platform dump information.
2. Go to “Isolating firmware problems” on page 218.
6951 An error occurred because a partition needed more NVRAM than was available. Use the Integrated Virtualization Manager (IVM) or management console to delete one or more partitions.
6952 Informational system log entry only. No corrective action is required.
6953 PHYP NVRAM is unavailable after a service processor reset and reload. Go to “Isolating firmware problems” on page 218.
6954 Informational system log entry only. No corrective action is required.
6955 Informational system log entry only. No corrective action is required.
6956 An NVRAM failure was detected. Go to “Isolating firmware problems” on page 218.
6965 Informational system log entry only. No corrective action is required.
6970 PCI host bridge failure
1. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
2. If the problem persists, use the “PCI expansion card (PIOCARD) problem isolation procedure” on page 194 to determine the failing component.
6971 PCI bus failure.
1. Use the “PCI expansion card (PIOCARD) problem isolation procedure” on page 194 to determine the failing component.
2. If the problem persists, replace the system-board and chassis assembly, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
6972 System bus error. Replace the system-board and chassis assembly, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
6973 System bus error.
1. Use the “PCI expansion card (PIOCARD) problem isolation procedure” on page 194 to determine the failing component.
2. If the problem persists, replace the system-board and chassis assembly, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
6974 Informational system log entry only. No corrective action is required.
6978 Informational system log entry only. No corrective action is required.
6979 Informational system log entry only. No corrective action is required.
697C Connection from service processor to system processor failed. Replace the system-board and chassis assembly, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
6980 RIO, HSL or 12X controller failure. Replace the system-board and chassis assembly, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
6981 System bus error. Replace the system-board and chassis assembly, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
6984 Informational system log entry only. No corrective action is required.
6985 Remote I/O (RIO), high-speed link (HSL), or 12X loop status message. Replace the system-board and chassis assembly, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
6987 Remote I/O (RIO), high-speed link (HSL), or 12X connection failure. Replace the system-board and chassis assembly, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
6990 Service processor failure. Replace the system-board and chassis assembly, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
6991 System firmware failure. Go to “Isolating firmware problems” on page 218.
6993 Service processor failure.
1. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
2. Go to “Isolating firmware problems” on page 218.
6994 Service processor failure. Replace the system-board and chassis assembly, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
6995 Informational system log entry only. No corrective action is required.
69C2 Informational system log entry only. No corrective action is required.
69C3 Informational system log entry only. No corrective action is required.
69D9 Host Ethernet Adapter (HEA) failure. Replace the system-board and chassis assembly, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
69DA Informational system log entry only. No corrective action is required.
69DB System firmware failure.
1. Collect the platform dump information.
2. Go to “Isolating firmware problems” on page 218.
BAD1 The platform firmware detected an error. Go to “Isolating firmware problems” on page 218.
BAD2 System firmware detected an error.
1. Collect the event log information.
2. Go to “Isolating firmware problems” on page 218.
F103 System firmware failure.
1. Collect the event log information.
2. Collect the platform dump information.
3. Go to “Isolating firmware problems” on page 218.
F104 Operating system error. System firmware terminated a partition. Check the management-module event log for partition firmware error codes (especially BA00F104); then, take the appropriate actions for those error codes.
F105 System firmware detected an internal error.
1. Collect the event log information.
2. Collect the platform dump information.
3. Go to “Isolating firmware problems” on page 218.
F106 System firmware detected an error. Replace the system-board and chassis assembly, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
F107 System firmware detected an error.
1. Collect the event log information.
2. Go to “Isolating firmware problems” on page 218.
F108 A firmware error caused the system to terminate.
1. Collect the event log information.
2. Go to “Isolating firmware problems” on page 218.
F10A System firmware detected an error. Look for and correct B1xxxxxx errors.
F10B A processor resource has been disabled due to hardware problems. Replace the system-board and chassis assembly, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
F10C The platform LIC detected an internal problem performing Partition Mobility.
1. Collect the event log information.
2. Go to “Isolating firmware problems” on page 218.
F120 Informational system log entry only. No corrective action is required.
F130 Thermal Power Management Device firmware error was detected. Check for server firmware updates; then, install the updates if available.
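The lookup that Table 15 supports can be sketched in code. The following Python fragment is only an illustration, not an IBM tool: the names `B700_ACTIONS` and `actions_for` are hypothetical, the dictionary holds only a few rows copied from the table, and the action strings are abbreviated. It returns the actions in the order in which the table lists them, which is the order in which they are meant to be performed.

```python
# Hypothetical lookup built from a few rows of Table 15; not an IBM utility.
# Keys are the last four characters of a B700xxxx SRC.
B700_ACTIONS = {
    "6990": ["Replace the system-board and chassis assembly (page 260)"],
    "6991": ["Go to 'Isolating firmware problems' (page 218)"],
    "6995": [],  # informational log entry; no corrective action is required
    "69DB": [
        "Collect the platform dump information",
        "Go to 'Isolating firmware problems' (page 218)",
    ],
    "F103": [
        "Collect the event log information",
        "Collect the platform dump information",
        "Go to 'Isolating firmware problems' (page 218)",
    ],
}


def actions_for(src: str) -> list[str]:
    """Return the ordered actions for a B700xxxx SRC from the subset above."""
    code = src.strip().upper()
    if len(code) != 8 or not code.startswith("B700"):
        raise ValueError(f"not a B700xxxx SRC: {src!r}")
    suffix = code[4:]
    if suffix not in B700_ACTIONS:
        raise KeyError(f"SRC {code} is not in this illustrative subset")
    # Return a copy so that callers cannot modify the table.
    return list(B700_ACTIONS[suffix])


if __name__ == "__main__":
    # Print the actions in the order in which they should be attempted.
    for step, action in enumerate(actions_for("B70069DB"), start=1):
        print(f"{step}. {action}")
```

An empty list corresponds to an informational entry. A code that is missing from the subset raises `KeyError` rather than returning an empty list, so that an unknown SRC is never mistaken for one that needs no action.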
BA000010 to BA400002 Partition firmware SRCs
The power-on self-test (POST) might display an error code that the partition firmware detects.
Table 16 describes error codes that might be displayed if POST detects a problem. The description also includes suggested actions to correct the problem.
Note: If a problem persists after you complete the suggested actions, see “Checkout procedure” on page 184 and “Solving undetermined problems” on page 227.
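To decide quickly whether a logged SRC belongs to this table, the range in the table title can be treated as a pair of hexadecimal bounds. The following Python sketch rests on that assumption, and the function name `in_partition_firmware_range` is hypothetical. A code that falls inside the range is not necessarily listed in the table.

```python
# Assumption: "BA000010 to BA400002" is read as an inclusive hexadecimal range.
RANGE_START = 0xBA000010
RANGE_END = 0xBA400002


def in_partition_firmware_range(src: str) -> bool:
    """Return True if src is an 8-character hex SRC inside the table's range."""
    code = src.strip().upper()
    if len(code) != 8:
        return False
    try:
        value = int(code, 16)
    except ValueError:
        # Not a hexadecimal string, so it cannot be an SRC in this range.
        return False
    return RANGE_START <= value <= RANGE_END


if __name__ == "__main__":
    for candidate in ("BA010004", "B7006990", "BA400003"):
        print(candidate, in_partition_firmware_range(candidate))
```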
Table 16. BA000010 to BA400002 Partition firmware SRCs
• Follow the suggested actions in the order in which they are listed in the Action column until the problem is solved. If an action solves the problem, then you can stop performing the remaining actions.
• See Chapter 3, “Parts listing, Type 8406,” on page 229 to determine which components are CRUs and which components are FRUs.
Error code Description Action
BA000010 The device data structure is corrupted.
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA000020 Incompatible firmware levels were found.
1. Reboot the blade server.
2. If the problem persists:
   a. Go to “Checkout procedure” on page 184.
   b. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA000030 An lpevent communication failure occurred.
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA000031 An lpevent communication failure occurred.
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA000032 The firmware failed to register the lpevent queues.
1. Reboot the blade server.
2. If the problem persists:
   a. Go to “Checkout procedure” on page 184.
   b. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA000034 The firmware failed to exchange capacity and allocate lpevents.
1. Reboot the blade server.
2. If the problem persists:
   a. Go to “Checkout procedure” on page 184.
   b. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA000038 The firmware failed to exchange virtual continuation events.
1. Reboot the blade server.
2. If the problem persists:
   a. Go to “Checkout procedure” on page 184.
   b. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA000040 The firmware was unable to obtain the RTAS code lid details.
1. Reboot the blade server.
2. If the problem persists:
   a. Go to “Checkout procedure” on page 184.
   b. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA000050 The firmware was unable to load the RTAS code lid.
1. Reboot the blade server.
2. If the problem persists:
   a. Go to “Checkout procedure” on page 184.
   b. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA000060 The firmware was unable to obtain the open firmware code lid details.
1. Reboot the blade server.
2. If the problem persists:
   a. Go to “Checkout procedure” on page 184.
   b. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA000070 The firmware was unable to load the open firmware code lid.
1. Reboot the blade server.
2. If the problem persists:
   a. Go to “Checkout procedure” on page 184.
   b. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA000080 The user did not accept the license agreement.
Accept the license agreement and restart the blade server.
If the problem persists:
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA000081 Failed to get the firmware license policy.
1. Reboot the blade server.
2. If the problem persists:
   a. Go to “Checkout procedure” on page 184.
   b. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA000082 Failed to set the firmware license policy.
1. Reboot the blade server.
2. If the problem persists:
   a. Go to “Checkout procedure” on page 184.
   b. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA000091 Unable to load a firmware code update module.
1. Reboot the blade server.
2. If the problem persists:
   a. Go to “Checkout procedure” on page 184.
   b. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA00E820 An lpevent communication failure occurred.
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA00E830 Failure when initializing ibm,event-scan.
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA00E840 Failure when initializing PCI hot-plug.
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA00E843 Failure when initializing the interface to AIX or Linux.
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA00E850 Failure when initializing dynamic reconfiguration.
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA00E860 Failure when initializing sensors.
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA010000 There is insufficient information to boot the systems.
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA010001 The client IP address is already in use by another network device. Verify that all of the IP addresses on the network are unique; then, retry the operation.
BA010002 Cannot get gateway IP address. Perform the following actions that checkpoint CA00E174 describes:
1. Verify that:
   • The bootp server is correctly configured; then, retry the operation.
   • The network connections are correct; then, retry the operation.
2. If the problem persists:
   a. Go to “Checkout procedure” on page 184.
   b. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA010003 Cannot get server hardware address. Perform the following actions that checkpoint CA00E174 describes:
1. Verify that:
   • The bootp server is correctly configured; then, retry the operation.
   • The network connections are correct; then, retry the operation.
2. If the problem persists:
   a. Go to “Checkout procedure” on page 184.
   b. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA010004 Bootp failed. Perform the following actions that checkpoint CA00E174 describes:
1. Verify that:
   • The bootp server is correctly configured; then, retry the operation.
   • The network connections are correct; then, retry the operation.
2. If the problem persists:
   a. Go to “Checkout procedure” on page 184.
   b. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA010005 File transmission (TFTP) failed. Perform the following actions that checkpoint CA00E174 describes:
1. Verify that:
   • The bootp server is correctly configured; then, retry the operation.
   • The network connections are correct; then, retry the operation.
2. If the problem persists:
   a. Go to “Checkout procedure” on page 184.
   b. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA010006 The boot image is too large. Start up from another device with a bootable image.
BA010007 The device does not have the required device_type property.
1. Reboot the blade server.
2. If the problem persists:
   a. Go to “Checkout procedure” on page 184.
   b. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA010008 The device_type property for this device is not supported by the iSCSI initiator configuration specification.
1. Reboot the blade server.
2. If the problem persists:
   a. Go to “Checkout procedure” on page 184.
   b. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA010009 The arguments specified for the ping function are invalid.
The embedded host Ethernet adapters (HEAs) help provide iSCSI, which is supported by iSCSI software device drivers on either AIX or Linux. Verify that all of the iSCSI configuration arguments on the operating system comply with the configuration for the iSCSI Host Bus Adapter (HBA), which is the iSCSI initiator.
BA01000A The itname parameter string exceeds the maximum length allowed.
The embedded host Ethernet adapters (HEAs) help provide iSCSI, which is supported by iSCSI software device drivers on either AIX or Linux. Verify that all of the iSCSI configuration arguments on the operating system comply with the configuration for the iSCSI Host Bus Adapter (HBA), which is the iSCSI initiator.
BA01000B The ichapid parameter string exceeds the maximum length allowed.
The embedded host Ethernet adapters (HEAs) help provide iSCSI, which is supported by iSCSI software device drivers on either AIX or Linux. Verify that all of the iSCSI configuration arguments on the operating system comply with the configuration for the iSCSI Host Bus Adapter (HBA), which is the iSCSI initiator.
BA01000C The ichappw parameter string exceeds the maximum length allowed.
The embedded host Ethernet adapters (HEAs) help provide iSCSI, which is supported by iSCSI software device drivers on either AIX or Linux. Verify that all of the iSCSI configuration arguments on the operating system comply with the configuration for the iSCSI Host Bus Adapter (HBA), which is the iSCSI initiator.
BA01000D The iname parameter string exceeds the maximum length allowed.
The embedded host Ethernet adapters (HEAs) help provide iSCSI, which is supported by iSCSI software device drivers on either AIX or Linux. Verify that all of the iSCSI configuration arguments on the operating system comply with the configuration for the iSCSI Host Bus Adapter (HBA), which is the iSCSI initiator.
BA01000E The LUN specified is not valid.
The embedded host Ethernet adapters (HEAs) help provide iSCSI, which is supported by iSCSI software device drivers on either AIX or Linux. Verify that all of the iSCSI configuration arguments on the operating system comply with the configuration for the iSCSI Host Bus Adapter (HBA), which is the iSCSI initiator.
BA01000F The chapid parameter string exceeds the maximum length allowed.
The embedded host Ethernet adapters (HEAs) help provide iSCSI, which is supported by iSCSI software device drivers on either AIX or Linux. Verify that all of the iSCSI configuration arguments on the operating system comply with the configuration for the iSCSI Host Bus Adapter (HBA), which is the iSCSI initiator.
BA010010 The chappw parameter string exceeds the maximum length allowed.
The embedded host Ethernet adapters (HEAs) help provide iSCSI, which is supported by iSCSI software device drivers on either AIX or Linux. Verify that all of the iSCSI configuration arguments on the operating system comply with the configuration for the iSCSI Host Bus Adapter (HBA), which is the iSCSI initiator.
BA010011 SET-ROOT-PROP could not find / (root) package.
1. Reboot the blade server.
2. If the problem persists:
   a. Go to “Checkout procedure” on page 184.
   b. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA010013 The information in the error log entry for this SRC provides network trace data. Informational message. No action is required.
BA010014 The information in the error log entry for this SRC provides network trace data. Informational message. No action is required.
BA010015 The information in the error log entry for this SRC provides network trace data. Informational message. No action is required.
BA010020 A trace entry addition failed because of a bad trace type.
1. Reboot the blade server.
2. If the problem persists:
   a. Go to “Checkout procedure” on page 184.
   b. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA012010 Opening the TCP node failed.
1. Reboot the blade server.
2. If the problem persists:
   a. Go to “Checkout procedure” on page 184.
   b. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA012011 TCP failed to read from the network.
1. Reboot the blade server.
2. If the problem persists:
   a. Go to “Checkout procedure” on page 184.
   b. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA012012 TCP failed to write to the network.
1. Reboot the blade server.
2. If the problem persists:
   a. Go to “Checkout procedure” on page 184.
   b. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA012013 Closing TCP failed.
1. Reboot the blade server.
2. If the problem persists:
   a. Go to “Checkout procedure” on page 184.
   b. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA017020 Failed to open the TFTP package. Verify that the Trivial File Transfer Protocol (TFTP) parameters are correct.
BA017021 Failed to load the TFTP file. Verify that the TFTP server and network connections are correct.
BA01B010 Opening the BOOTP node failed.
1. Reboot the blade server.
2. If the problem persists:
   a. Go to “Checkout procedure” on page 184.
   b. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA01B011 BOOTP failed to read from the network. Perform the following actions that checkpoint CA00E174 describes:
1. Verify that:
   • The bootp server is correctly configured; then, retry the operation.
   • The network connections are correct; then, retry the operation.
2. If the problem persists:
   a. Go to “Checkout procedure” on page 184.
   b. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA01B012 BOOTP failed to write to the network. Perform the following actions that checkpoint CA00E174 describes:
1. Verify that:
   • The bootp server is correctly configured; then, retry the operation.
   • The network connections are correct; then, retry the operation.
2. If the problem persists:
   a. Go to “Checkout procedure” on page 184.
   b. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA01B013 The discover mode is invalid.
1. Reboot the blade server.
2. If the problem persists:
   a. Go to “Checkout procedure” on page 184.
   b. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA01B014 Closing the BOOTP node failed.
1. Reboot the blade server.
2. If the problem persists:
   a. Go to “Checkout procedure” on page 184.
   b. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA01B015 The BOOTP discover server timed out. Perform the following actions that checkpoint CA00E174 describes:
1. Verify that:
   • The bootp server is correctly configured; then, retry the operation.
   • The network connections are correct; then, retry the operation.
2. If the problem persists:
   a. Go to “Checkout procedure” on page 184.
   b. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA01D001 Opening the DHCP node failed.
1. Reboot the blade server.
2. If the problem persists:
   a. Go to “Checkout procedure” on page 184.
   b. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA01D020 DHCP failed to read from the network.
1. Verify that the network cable is connected, and that the network is active.
2. If the problem persists:
   a. Go to “Checkout procedure” on page 184.
   b. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA01D030 DHCP failed to write to the network.
1. Verify that the network cable is connected, and that the network is active.
2. If the problem persists:
   a. Go to “Checkout procedure” on page 184.
   b. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA01D040 The DHCP discover server timed out.
1. Verify that the DHCP server has addresses available.
2. Verify that the DHCP server configuration file is not overly constrained. An over-constrained file might prevent a server from meeting the configuration requested by the client.
3. Perform the following actions that checkpoint CA00E174 describes:
   a. Verify that:
      • The bootp server is correctly configured; then, retry the operation.
      • The network connections are correct; then, retry the operation.
   b. If the problem persists:
      1) Go to “Checkout procedure” on page 184.
      2) Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA01D050 DHCP::discover no good offer.
DHCP discovery did not receive any DHCP offers from the servers that meet the client requirements.
Verify that the DHCP server configuration file is not overly constrained. An over-constrained file might prevent a server from meeting the configuration requested by the client.
BA01D051 DHCP::discover DHCP request timed out.
DHCP discovery did receive a DHCP offer from a server that met the client requirements, but the server did not send the DHCP acknowledgement (DHCP ack) to the client DHCP request.
Another client might have used the address that was served.
Verify that the DHCP server has addresses available.
BA01D052 DHCP::discover: 10 incapable servers were found.
Ten DHCP servers have sent DHCP offers, none of which met the requirements of the client. Check the compatibility of the configuration that the client is requesting and the server DHCP configuration files.
BA01D053 DHCP::discover received a reply, but without a message type. Verify that the DHCP server is properly configured.
BA01D054 DHCP::discover: DHCP nak received.
DHCP discovery did receive a DHCP offer from a server that meets the client requirements, but the server sent a DHCP not acknowledged (DHCP nak) to the client DHCP request.
Another client might be using the address that was served.
This situation can occur when there are multiple DHCP servers on the same network, and server A does not know the subnet configuration of server B, and vice-versa.
This situation can also occur when the pool of addresses is not truly divided.
Set the DHCP server configuration file to "authoritative".
Verify that the DHCP server is functioning properly.
BA01D055 DHCP::discover: DHCP decline.
DHCP discovery did receive a DHCP offer from one or more servers that meet the client requirements. However, the client performed an ARP test on the address and found that another client was using the address.
The client sent a DHCP decline to the server, but the client did not receive an additional DHCP offer from a server. The client still does not have a valid address.
Verify that the DHCP server is functioning properly.
BA01D056 DHCP::discover: unknown DHCP message.
DHCP discovery received an unknown DHCP message type. Verify that the DHCP server is functioning properly.
BA01D0FF Closing the DHCP node failed.
1. Reboot the blade server.
2. If the problem persists:
   a. Go to “Checkout procedure” on page 184.
   b. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA030011 RTAS attempt to allocate memory failed.
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA04000F Self test failed on device; no error or location code information available.
1. If a location code is identified with the error, replace the device specified by the location code.
2. If the problem persists:
   a. Go to “Checkout procedure” on page 184.
   b. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA040010 Self test failed on device; can't locate package.
1. If a location code is identified with the error, replace the device specified by the location code.
2. If the problem persists:
   a. Go to “Checkout procedure” on page 184.
   b. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA040020 The machine type and model are not recognized by the blade server firmware.
1. Reboot the blade server.
2. If the problem persists:
   a. Go to “Checkout procedure” on page 184.
   b. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA040030 The firmware was not able to build the UID properly for this system. As a result, problems may occur with the licensing of the AIX operating system.
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA040035 The firmware was unable to find the “plant of manufacture” in the VPD. This may cause problems with the licensing of the AIX operating system.
Verify that the machine type, model, and serial number are correct for this server. If this is a new server, check for server firmware updates; then, install the updates if available.
BA040040 Setting the machine type, model, and serial number failed.
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA040050 The h-call to switch off the boot watchdog timer failed.
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA040060 Setting the firmware boot side for the
next boot failed.
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and chassis assembly” on page 260.
BA050001 Failed to reboot a partition in logical
partition mode
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and chassis assembly” on page 260.
BA050004 Failed to locate service processor device
tree node.
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and chassis assembly” on page 260.
BA05000A Failed to send boot failed message
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and chassis assembly” on page 260.
BA060008 No configurable adapters found by the
Remote IPL menu in the SMS utilities
This error occurs when the firmware cannot locate any LAN adapters that are supported by the remote IPL function. Verify that the devices in the remote IPL device list are correct using the SMS menus.
BA06000B The system was not able to find an operating system on the devices in the boot list.
Go to “Boot problem resolution” on page 190.
BA06000C A pointer to the operating system was
found in non-volatile storage.
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and chassis assembly” on page 260.
BA060020 The environment variable “boot-device”
exceeded the allowed character limit.
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and chassis assembly” on page 260.
BA060021 The environment variable “boot-device”
contained more than five entries.
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and chassis assembly” on page 260.
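The limits behind the “boot-device” SRCs in this table (BA060020, BA060021, and BA060022) can be illustrated with a small pre-check. This is only a sketch, not the partition firmware's logic: the function name `check_boot_device` is hypothetical, the assumption that entries are separated by white space is ours, and the overall character limit behind BA060020 is not stated in this guide, so it is left as an optional parameter.

```python
# Illustrative sketch only; not the partition firmware implementation.
MAX_ENTRIES = 5         # BA060021: more than five entries is reported as an error
MAX_ENTRY_LENGTH = 255  # BA060022: an entry longer than 255 characters is an error


def check_boot_device(value, max_total_length=None):
    """Return the SRCs that a boot-device string would plausibly raise.

    max_total_length is an assumption: this guide does not state the overall
    character limit behind BA060020, so the check is skipped unless the
    caller supplies a value.
    """
    srcs = []
    if max_total_length is not None and len(value) > max_total_length:
        srcs.append("BA060020")
    entries = value.split()  # assumed: entries are separated by white space
    if len(entries) > MAX_ENTRIES:
        srcs.append("BA060021")
    if any(len(entry) > MAX_ENTRY_LENGTH for entry in entries):
        srcs.append("BA060022")
    return srcs
```

An empty result means that none of the three limits is exceeded. Resetting the boot list to the default through the SMS menus, as described for BA060022, remains the documented remedy.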
BA060022 The environment variable “boot-device” contained an entry that exceeded 255 characters in length
1. Using the SMS menus, set the boot list to the default boot list.
2. Shut down; then, start up the blade server.
3. Use SMS menus to customize the boot list as required.
4. If the problem persists:
   a. Go to “Checkout procedure” on page 184.
   b. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA060030 Logical partitioning with shared processors is enabled and the operating system does not support it.
1. Install or boot a level of the operating system that supports shared processors.
2. Disable logical partitioning with shared processors in the operating system.
3. If the problem remains:
   a. Go to “Checkout procedure” on page 184.
   b. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA060060 The operating system expects an IOSP partition, but it failed to make the transition to alpha mode.
1. Verify that:
   • The alpha-mode operating system image is intended for this partition.
   • The configuration of the partition supports an alpha-mode operating system.
2. If the problem remains:
   a. Go to “Checkout procedure” on page 184.
   b. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA060061 The operating system expects a non-IOSP partition, but it failed to make the transition to MGC mode.
1. Verify that:
   • The alpha-mode operating system image is intended for this partition.
   • The configuration of the partition supports an alpha-mode operating system.
2. If the problem remains:
   a. Go to “Checkout procedure” on page 184.
   b. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA060070 The operating system does not support this system's processor(s)
Boot a supported version of the operating system.
BA060071 An invalid number of vectors was received from the operating system
Boot a supported version of the operating system.
BA060072 Client-arch-support hcall error
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA060075 Client-arch-support firmware error
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and chassis assembly” on page 260.
BA060200 Failed to set the operating system boot
list from the management module boot list
1. Using the SMS menus, set the boot list to the default boot list.
2. Shut down; then, start up the blade server.
3. Use SMS menus to customize the boot list
as required.
4. If the problem persists: a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board and chassis assembly” on page 260.
BA060201 Failed to read the VPD "boot path" field
value
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and chassis assembly” on page 260.
BA060202 Failed to update the VPD with the new
"boot path" field value
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and chassis assembly” on page 260.
BA060300 An I/O error on the adapter from which
the boot was attempted prevented the operating system from being booted.
1. Using the SMS menus, select another adapter from which to boot the operating system, and reboot the system.
2. Attempt to reboot the system.
3. Go to “Boot problem resolution” on page
190.
BA07xxxx Self-configuring SCSI device (SCSD) controller failure
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and chassis assembly” on page 260.
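Entries such as BA07xxxx cover a family of codes rather than a single SRC. When looking up a reported code in this table, a small matcher can help; the sketch below assumes that each lowercase `x` (or `y`, as used in other SRC tables in this guide) stands for any single hexadecimal digit. The function name `src_matches` is hypothetical and is not part of any IBM tool.

```python
import re


def src_matches(pattern, src):
    """Return True if an 8-character SRC fits a table pattern such as 'BA07xxxx'.

    Assumption: a lowercase 'x' or 'y' in the pattern is a placeholder for
    exactly one hexadecimal digit; every other character must match literally.
    """
    regex = "".join(
        "[0-9A-F]" if ch in "xy" else re.escape(ch) for ch in pattern
    )
    return re.fullmatch(regex, src.upper()) is not None
```

For example, `src_matches("BA07xxxx", "BA070123")` is true, while `src_matches("BA07xxxx", "BA060300")` is false, so the latter code must be looked up under its own entry.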
BA090001 SCSD DASD: test unit ready failed; hardware error
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA090002 SCSD DASD: test unit ready failed; sense data available
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA090003 SCSD DASD: send diagnostic failed; sense data available
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA090004 SCSD DASD: send diagnostic failed: devofl cmd
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA09000A There was a vendor specification error.
1. Check the vendor specification for additional information.
2. If the problem persists:
   a. Go to “Checkout procedure” on page 184.
   b. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA09000B Generic SCSD sense error
1. Verify that the SCSD cables and devices are properly plugged.
2. Correct any problems that are found.
3. If the problem persists:
   a. Go to “Checkout procedure” on page 184.
   b. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA09000C The media is write-protected
1. Change the setting of the media to allow writing, then retry the operation.
2. Insert new media of the correct type.
3. If the problem persists:
   a. Go to “Checkout procedure” on page 184.
   b. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA09000D The media is unsupported or not
recognized.
1. Insert new media of the correct type.
2. If the problem persists: a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board and chassis assembly” on page 260.
BA09000E The media is not formatted correctly.
1. Insert the media.
2. Insert new media of the correct type.
3. If the problem persists: a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board and chassis assembly” on page 260.
BA09000F Media is not present
1. Insert new media with the correct format.
2. If the problem persists: a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board and chassis assembly” on page 260.
BA090010 The request sense command failed.
1. Troubleshoot the SCSD devices.
2. Verify that the SCSD cables and devices are
properly plugged. Correct any problems that are found.
3. Replace the SCSD cables and devices.
4. If the problem persists: a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board and chassis assembly” on page 260.
BA090011 The retry limit has been exceeded.
1. Troubleshoot the SCSD devices.
2. Verify that the SCSD cables and devices are
properly plugged. Correct any problems that are found.
3. Replace the SCSD cables and devices.
4. If the problem persists: a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board and chassis assembly” on page 260.
BA090012 There is a SCSD device that is not supported.
1. Replace the SCSD device that is not supported with a supported device.
2. If the problem persists:
   a. Troubleshoot the SCSD devices.
   b. Verify that the SCSD cables and devices are properly plugged. Correct any problems that are found.
   c. Replace the SCSD cables and devices.
   d. If the problem persists:
      1) Go to “Checkout procedure” on page 184.
      2) Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA120001 On an undetermined SCSD device, test unit ready failed; hardware error
1. Troubleshoot the SCSD devices.
2. Verify that the SCSD cables and devices are properly plugged. Correct any problems that are found.
3. Replace the SCSD cables and devices.
4. If the problem persists:
   a. Go to “Checkout procedure” on page 184.
   b. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA120002 On an undetermined SCSD device, test unit ready failed; sense data available
1. Troubleshoot the SCSD devices.
2. Verify that the SCSD cables and devices are properly plugged. Correct any problems that are found.
3. Replace the SCSD cables and devices.
4. If the problem persists:
   a. Go to “Checkout procedure” on page 184.
   b. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA120003 On an undetermined SCSD device, send
diagnostic failed; sense data available
1. Troubleshoot the SCSD devices.
2. Verify that the SCSD cables and devices are
properly plugged. Correct any problems that are found.
3. Replace the SCSD cables and devices.
4. If the problem persists: a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board and chassis assembly” on page 260.
BA120004 On an undetermined SCSD device, send
diagnostic failed; devofl command
1. Troubleshoot the SCSD devices.
2. Verify that the SCSD cables and devices are
properly plugged. Correct any problems that are found.
3. Replace the SCSD cables and devices.
4. If the problem persists: a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board and chassis assembly” on page 260.
BA120010 Failed to generate the SAS device
physical location code. The event log entry has the details.
1. Reboot the blade server.
2. If the problem persists: a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board and chassis assembly” on page 260.
BA130010 USB CD-ROM in the media tray: device
remained busy longer than the time-out period
1. Retry the operation.
2. Reboot the blade server.
3. Troubleshoot the media tray and CD-ROM
drive.
4. Replace the USB CD or DVD drive.
5. If the problem persists: a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board and chassis assembly” on page 260.
BA130011 USB CD-ROM in the media tray: execution of ATA/ATAPI command was not completed within the allowed time.
1. Retry the operation.
2. Reboot the blade server.
3. Troubleshoot the media tray and CD-ROM drive.
4. Replace the USB CD or DVD drive.
5. If the problem persists:
   a. Go to “Checkout procedure” on page 184.
   b. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA130012 USB CD-ROM in the media tray: execution of ATA/ATAPI command failed.
1. Retry the operation.
2. Reboot the blade server.
3. Troubleshoot the media tray and CD-ROM drive.
4. Replace the USB CD or DVD drive.
5. If the problem persists:
   a. Go to “Checkout procedure” on page 184.
   b. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA130013 USB CD-ROM in the media tray: bootable media is missing from the drive
1. Insert a bootable CD in the drive and retry the operation.
2. If the problem persists:
   a. Retry the operation.
   b. Reboot the blade server.
   c. Troubleshoot the media tray and CD-ROM drive.
   d. Replace the USB CD or DVD drive.
   e. If the problem persists:
      1) Go to “Checkout procedure” on page 184.
      2) Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA130014 USB CD-ROM in the media tray: the
media in the USB CD-ROM drive has been changed.
1. Retry the operation.
2. Reboot the blade server.
3. Troubleshoot the media tray and CD-ROM
drive.
4. Replace the USB CD or DVD drive.
5. If the problem persists: a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board and chassis assembly” on page 260.
BA130015 USB CD-ROM in the media tray:
ATA/ATAPI packet command execution failed.
1. Remove the CD or DVD in the drive and replace it with a known-good disk.
2. If the problem persists:
   a. Retry the operation.
   b. Reboot the blade server.
   c. Troubleshoot the media tray and CD-ROM drive.
   d. Replace the USB CD or DVD drive.
   e. If the problem persists:
      1) Go to “Checkout procedure” on page 184.
      2) Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA131010 The USB keyboard has been removed.
1. Reseat the keyboard cable in the management module USB port.
2. Check for server firmware updates; then, install the updates if available.
BA140001 The SCSD read/write optical test unit
ready failed; hardware error.
1. Troubleshoot the SCSD devices.
2. Verify that the SCSD cables and devices are
properly plugged. Correct any problems that are found.
3. Replace the SCSD cables and devices.
4. If the problem persists: a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board and chassis assembly” on page 260.
BA140002 The SCSD read/write optical test unit ready failed; sense data available.
1. Troubleshoot the SCSD devices.
2. Verify that the SCSD cables and devices are properly plugged. Correct any problems that are found.
3. Replace the SCSD cables and devices.
4. If the problem persists:
   a. Go to “Checkout procedure” on page 184.
   b. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA140003 The SCSD read/write optical send diagnostic failed; sense data available.
1. Troubleshoot the SCSD devices.
2. Verify that the SCSD cables and devices are properly plugged. Correct any problems that are found.
3. Replace the SCSD cables and devices.
4. If the problem persists:
   a. Go to “Checkout procedure” on page 184.
   b. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA140004 The SCSD read/write optical send diagnostic failed; devofl command.
1. Troubleshoot the SCSD devices.
2. Verify that the SCSD cables and devices are properly plugged. Correct any problems that are found.
3. Replace the SCSD cables and devices.
4. If the problem persists:
   a. Go to “Checkout procedure” on page 184.
   b. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA150001 PCI Ethernet BNC/RJ-45 or PCI Ethernet AUI/RJ-45 adapter: internal wrap test failure
Replace the adapter specified by the location code.
BA151001 10/100 Mbps Ethernet PCI adapter: internal wrap test failure
Replace the adapter specified by the location code.
BA151002 10/100 Mbps Ethernet card failure
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA153002 Gigabit Ethernet adapter failure
Verify that the MAC address programmed in the FLASH/EEPROM is correct.
BA153003 Gigabit Ethernet adapter failure
1. Check for server firmware updates; then, install the updates if available.
2. Replace the Gigabit Ethernet adapter.
BA154010 HEA software error
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and chassis assembly” on page 260.
BA154020 The required open firmware property
was not found.
1. Reboot the blade server.
2. If the problem persists: a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board and chassis assembly” on page 260.
BA154030 Invalid parameters were passed to the
HEA device driver.
1. Reboot the blade server.
2. If the problem persists: a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board and chassis assembly” on page 260.
BA154040 The TFTP package open failed
1. Reboot the blade server.
2. If the problem persists: a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board and chassis assembly” on page 260.
BA154050 The transmit operation failed.
1. Reboot the blade server.
2. If the problem persists: a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board and chassis assembly” on page 260.
BA154060 Failed to initialize the HEA port or
queue
1. Reboot the blade server.
2. If the problem persists: a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board and chassis assembly” on page 260.
BA154070 The receive operation failed.
1. Reboot the blade server.
2. If the problem persists:
   a. Go to “Checkout procedure” on page 184.
   b. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA170000 NVRAMRC initialization failed; device test failed
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA170100 NVRAM data validation check failed
1. Shut down the blade server; then, restart it.
2. If the problem persists:
   a. Go to “Checkout procedure” on page 184.
   b. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA170201 The firmware was unable to expand target partition - saving configuration variable
1. Reboot the blade server.
2. If the problem persists:
   a. Go to “Checkout procedure” on page 184.
   b. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA170202 The firmware was unable to expand target partition - writing event log entry
1. Reboot the blade server.
2. If the problem persists:
   a. Go to “Checkout procedure” on page 184.
   b. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA170203 The firmware was unable to expand target partition - writing VPD data
1. Reboot the blade server.
2. If the problem persists:
   a. Go to “Checkout procedure” on page 184.
   b. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA170210 Setenv/$Setenv parameter error - name contains a null character
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA170211 Setenv/$Setenv parameter error - value
contains a null character
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and chassis assembly” on page 260.
BA170220 Unable to write a variable value to
NVRAM due to lack of free memory in NVRAM.
1. Reduce the number of partitions, if possible, to add more NVRAM memory to this partition.
2. If the problem persists: a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board and chassis assembly” on page 260.
BA170221 Setenv/$setenv had to delete stored firmware network boot settings to free memory in NVRAM.
Enter the adapter and network parameters again for the network boot or network installation.
BA170998 NVRAMRC script evaluation error - command line execution error.
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA180008 PCI device Fcode evaluation error
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and chassis assembly” on page 260.
BA180009 The Fcode on a PCI adapter left a data
stack imbalance
1. Reseat the PCI adapter card.
2. Check for adapter firmware updates; then,
install the updates if available.
3. Check for server firmware updates; then, install the updates if available.
4. Replace the PCI adapter card.
5. If the problem persists: a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board and chassis assembly” on page 260.
BA180010 PCI probe error, bridge in freeze state
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and chassis assembly” on page 260.
BA180011 PCI bridge probe error, bridge is not
usable
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and chassis assembly” on page 260.
BA180012 PCI device runtime error, bridge in freeze state
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA180014 MSI software error
1. Reboot the blade server.
2. If the problem persists:
   a. Go to “Checkout procedure” on page 184.
   b. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA180020 No response was received from a slot during PCI probing.
1. Reboot the blade server.
2. If the problem persists:
   a. Go to “Checkout procedure” on page 184.
   b. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA180099 PCI probe error; bridge in freeze state, slot in reset state
1. Reseat the PCI adapter card.
2. Check for adapter firmware updates; then, install the updates if available.
3. Check for server firmware updates; then, install the updates if available.
4. Replace the PCI adapter card.
5. If the problem persists:
   a. Go to “Checkout procedure” on page 184.
   b. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA180100 The FDDI adapter Fcode driver is not supported on this server.
IBM may produce a compatible driver in the future, but does not guarantee one.
BA180101 Stack underflow from fibre-channel adapter
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA190001 Firmware function to get/set time-of-day reported an error
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA201001 The serial interface dropped data packets
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA201002 The serial interface failed to open
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA201003 The firmware failed to handshake properly with the serial interface
1. Reboot the blade server.
2. If the problem persists:
   a. Go to “Checkout procedure” on page 184.
   b. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA210000 Partition firmware reports a default catch
1. Reboot the blade server.
2. If the problem persists:
   a. Go to “Checkout procedure” on page 184.
   b. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA210001 Partition firmware reports a stack underflow was caught
1. Reboot the blade server.
2. If the problem persists:
   a. Go to “Checkout procedure” on page 184.
   b. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA210002 Partition firmware was ready before standout was ready
1. Reboot the blade server.
2. If the problem persists:
   a. Go to “Checkout procedure” on page 184.
   b. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA210003 A data storage error was caught by partition firmware
1. If the location code reported with the error points to an adapter, check for adapter firmware updates.
2. Apply any available updates.
3. Check for server firmware updates.
4. Apply any available updates.
5. If the problem persists:
   a. Go to “Checkout procedure” on page 184.
   b. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA210004 An open firmware stack-depth assert failed.
1. Reboot the blade server.
2. If the problem persists:
   a. Go to “Checkout procedure” on page 184.
   b. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA210010 The transfer of control to the SLIC loader failed
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA210011 The transfer of control to the IO Reporter failed
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA210012 There was an NVRAMRC forced-boot problem; unable to load the previous boot's operating system image
1. Use the SMS menus to verify that the partition firmware can still detect the operating system image.
2. If the problem persists:
   a. Go to “Checkout procedure” on page 184.
   b. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA210013 There was a partition firmware error when in the SMS menus.
1. Reboot the blade server.
2. If the problem persists:
   a. Go to “Checkout procedure” on page 184.
   b. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA210020 I/O configuration exceeded the maximum size allowed by partition firmware.
1. Increase the logical memory block size to 256 MB and restart the blade server.
2. If the problem persists:
   a. Go to “Checkout procedure” on page 184.
   b. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA210100 An error may not have been sent to the management module event log.
1. Reboot the blade server.
2. If the problem persists:
   a. Go to “Checkout procedure” on page 184.
   b. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA210101 The partition firmware event log queue is full
1. Reboot the blade server.
2. If the problem persists:
   a. Go to “Checkout procedure” on page 184.
   b. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA210102 There was a communication failure between partition firmware and the hypervisor. The lpevent that was expected from the hypervisor was not received.
1. Review the event log for errors that occurred around the time of this error.
2. Correct any errors that are found and reboot the blade server.
3. If the problem persists:
   a. Reboot the blade server.
   b. If the problem persists:
      1) Go to “Checkout procedure” on page 184.
      2) Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA210103 There was a communication failure between partition firmware and the hypervisor. There was a failing return code with the lpevent acknowledgement from the hypervisor.
1. Review the event log for errors that occurred around the time of this error.
2. Correct any errors that are found and reboot the blade server.
3. If the problem persists:
   a. Reboot the blade server.
   b. If the problem persists:
      1) Go to “Checkout procedure” on page 184.
      2) Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA220010 There was a partition firmware error during a USB hotplug probing. USB hotplug may not work properly on this partition.
1. Look for EEH-related errors in the event log.
2. Resolve any EEH event log entries that are found.
3. Correct any errors that are found and reboot the blade server.
4. If the problem persists:
   a. Go to “Checkout procedure” on page 184.
   b. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA220020 CRQ registration error; partner vslot may not be valid
Verify that this client virtual slot device has a valid server virtual slot device in a hosting partition.
BA278001 Failed to flash firmware: invalid image file
Download a new firmware update image and retry the update.
BA278002 Flash file is not designed for this platform
Download a new firmware update image and retry the update.
BA278003 Unable to lock the firmware update lid manager
1. Restart the blade server.
2. Verify that the operating system is authorized to update the firmware. If the system is running multiple partitions, verify that this partition has service authority.
BA278004 An invalid firmware update lid was requested
Download a new firmware update image and retry the update.
BA278005 Failed to flash a firmware update lid
Download a new firmware update image and retry the update.
BA278006 Unable to unlock the firmware update lid manager
Restart the blade server.
BA278007 Failed to reboot the system after a firmware flash update
Restart the blade server.
BA278009 The operating system's server firmware update management tools are incompatible with this system.
Go to the IBM download site at www14.software.ibm.com/webapp/set2/sas/f/lopdiags/home.html to download the latest version of the service aids package for Linux.
BA27800A The firmware installation failed due to a hardware error that was reported.
1. Look for hardware errors in the event log.
2. Resolve any hardware errors that are found.
3. If the problem persists:
   a. Go to “Checkout procedure” on page 184.
   b. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA280000 RTAS discovered an invalid operation that may cause a hardware error
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA290000 RTAS discovered an internal stack overflow
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA290001 RTAS low memory corruption was detected
1. Reboot the blade server.
2. If the problem persists:
   a. Go to “Checkout procedure” on page 184.
   b. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA290002 RTAS low memory corruption was detected
1. Reboot the blade server.
2. If the problem persists:
   a. Go to “Checkout procedure” on page 184.
   b. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA310010 Unable to obtain the SRC history
1. Reboot the blade server.
2. If the problem persists:
   a. Go to “Checkout procedure” on page 184.
   b. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA310020 An invalid SRC history was obtained.
1. Reboot the blade server.
2. If the problem persists:
   a. Go to “Checkout procedure” on page 184.
   b. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA310030 Writing the MAC address to the VPD failed.
1. Reboot the blade server.
2. If the problem persists:
   a. Go to “Checkout procedure” on page 184.
   b. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA330000 Memory allocation error.
1. Reboot the blade server.
2. If the problem persists:
   a. Go to “Checkout procedure” on page 184.
   b. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA330001 Memory allocation error.
1. Reboot the blade server.
2. If the problem persists:
   a. Go to “Checkout procedure” on page 184.
   b. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA330002 Memory allocation error.
1. Reboot the blade server.
2. If the problem persists:
   a. Go to “Checkout procedure” on page 184.
   b. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA330003 Memory allocation error.
1. Reboot the blade server.
2. If the problem persists:
   a. Go to “Checkout procedure” on page 184.
   b. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA330004 Memory allocation error.
1. Reboot the blade server.
2. If the problem persists:
   a. Go to “Checkout procedure” on page 184.
   b. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA340001 There was a logical partition event communication failure reading the BladeCenter open fabric manager parameter data structure from the service processor.
1. Reboot the blade server.
2. If the problem persists:
   a. Go to “Checkout procedure” on page 184.
   b. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA340002 There was a logical partition event communication failure reading the BladeCenter open fabric manager location code mapping data from the service processor.
1. Reboot the blade server.
2. If the problem persists:
   a. Go to “Checkout procedure” on page 184.
   b. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA340003 An internal firmware error occurred; unable to allocate memory for the open fabric manager location code mapping data.
1. Reboot the blade server.
2. If the problem persists:
   a. Go to “Checkout procedure” on page 184.
   b. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA340004 An internal firmware error occurred; the open fabric manager parameter data was corrupted.
1. Reboot the blade server.
2. If the problem persists:
   a. Go to “Checkout procedure” on page 184.
   b. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA340005 An internal firmware error occurred; the location code mapping table was corrupted.
1. Reboot the blade server.
2. If the problem persists:
   a. Go to “Checkout procedure” on page 184.
   b. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA340006 An LP event communication failure occurred reading the system initiator capability data from the service processor.
1. Reboot the blade server.
2. If the problem persists:
   a. Go to “Checkout procedure” on page 184.
   b. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA340007 An internal firmware error occurred; the open fabric manager system initiator capability data was corrupted.
1. Reboot the blade server.
2. If the problem persists:
   a. Go to “Checkout procedure” on page 184.
   b. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA340008 An internal firmware error occurred; the open fabric manager system initiator capability data version was not correct.
1. Reboot the blade server.
2. If the problem persists:
   a. Go to “Checkout procedure” on page 184.
   b. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA340009 An internal firmware error occurred; the open fabric manager system initiator capability processing encountered an unexpected error.
1. Reboot the blade server.
2. If the problem persists:
   a. Go to “Checkout procedure” on page 184.
   b. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA340010 An internal firmware error was detected during open fabric manager processing.
1. Reboot the blade server.
2. If the problem persists:
   a. Go to “Checkout procedure” on page 184.
   b. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA340011 Assignment of fabric ID to the I/O adapter failed.
1. Reboot the blade server.
2. If the problem persists:
   a. Go to “Checkout procedure” on page 184.
   b. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA340020 A logical partition event communication failure occurred when writing the BladeCenter open fabric manager parameter data to the service processor.
1. Reboot the blade server.
2. If the problem persists:
   a. Go to “Checkout procedure” on page 184.
   b. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA340021 A logical partition event communication failure occurred when writing the BladeCenter open fabric manager system initiator capabilities data to the service processor.
1. Reboot the blade server.
2. If the problem persists:
   a. Go to “Checkout procedure” on page 184.
   b. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA400001 Informational message: DMA trace buffer full.
1. Reboot the blade server.
2. If the problem persists:
   a. Go to “Checkout procedure” on page 184.
   b. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA400002 Informational message: DMA map-out size mismatch.
1. Reboot the blade server.
2. If the problem persists:
   a. Go to “Checkout procedure” on page 184.
   b. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
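
The rule stated in the notes for Table 16 (perform the listed actions in order and stop as soon as one solves the problem) can be sketched in Python as follows. This is an illustration only: `run_actions` and the action callables are hypothetical names, not part of any IBM tool, and each callable is assumed to return True when the problem is solved.

```python
from typing import Callable, Iterable, Optional, Tuple

# Hypothetical type: an action name plus a callable that performs the
# action and reports whether the problem is now solved.
Action = Tuple[str, Callable[[], bool]]


def run_actions(actions: Iterable[Action]) -> Optional[str]:
    """Perform actions in listed order; stop at the first one that solves
    the problem and return its name. Return None if none of them helps."""
    for name, attempt in actions:
        if attempt():
            return name
    return None
```

Because the loop returns at the first success, later and more invasive actions (such as replacing the system-board) are never attempted once an earlier action has solved the problem.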

POST progress codes (checkpoints)

When you turn on the blade server, the power-on self-test (POST) performs a series of tests to check the operation of the blade server components. Use the management module to view progress codes that offer information about the stages involved in powering on and performing an initial program load (IPL).
Progress codes do not indicate an error, although in some cases the blade server can pause indefinitely (hang) on a progress code. Progress codes for blade servers are 8-digit hexadecimal numbers that start with C or D.
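
As a minimal sketch of the format described above, the following Python check tests whether a string looks like a progress code. It assumes only what the text states (eight hexadecimal digits, the first being C or D); the function name `is_progress_code` is hypothetical.

```python
import re

# Assumption: exactly 8 hexadecimal digits, the first of which is C or D.
_PROGRESS_CODE = re.compile(r"[CD][0-9A-F]{7}")


def is_progress_code(code: str) -> bool:
    """Return True if code has the shape of a POST progress code."""
    return _PROGRESS_CODE.fullmatch(code.strip().upper()) is not None
```

A code such as BA210003 fails this check because it does not begin with C or D, which is consistent with it being listed as a partition firmware SRC rather than a checkpoint.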
Checkpoints are generated by various components. The baseboard management controller (BMC) service processor and the partitioning firmware are key contributors. The service processor provides additional isolation procedure codes for troubleshooting.
A checkpoint might have an associated location code as part of the message. The location code provides information that identifies the failing component when there is a hang condition.
Notes:
1. For checkpoints with no associated location code, see “Light path diagnostics” on page 214 to identify the failing component when there is a hang condition.
2. For checkpoints with location codes, see “Location codes” on page 14 to identify the failing component when there is a hang condition.
3. For eight-digit codes not listed here, see “Checkout procedure” on page 184 for information.
The management module can display the most recent 32 SRCs and time stamps. Manually refresh the list to update it.
Select Blade Service Data > blade_name in the management module to see a list of the 32 most recent SRCs.
Table 17. Management module reference code listing
Unique ID System Reference Code Timestamp
00040001 D1513901 2005-11-13 19:30:20
00000016 D1513801 2005-11-13 19:30:16
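
If the listing is copied out of the management module as text, rows in the three-column layout of Table 17 could be parsed and trimmed to the 32 most recent entries roughly as follows. This is a sketch under assumptions: the row layout is taken from the two sample rows above, and `SrcEntry`, `parse_row`, and `most_recent` are hypothetical names.

```python
from datetime import datetime
from typing import Iterable, List, NamedTuple


class SrcEntry(NamedTuple):
    unique_id: str
    src: str
    timestamp: datetime


def parse_row(row: str) -> SrcEntry:
    """Parse one 'Unique ID, SRC, Timestamp' row (whitespace separated)."""
    unique_id, src, date, time = row.split()
    stamp = datetime.strptime(f"{date} {time}", "%Y-%m-%d %H:%M:%S")
    return SrcEntry(unique_id, src, stamp)


def most_recent(rows: Iterable[str], limit: int = 32) -> List[SrcEntry]:
    """Return at most `limit` entries, newest first."""
    entries = [parse_row(r) for r in rows]
    entries.sort(key=lambda e: e.timestamp, reverse=True)
    return entries[:limit]
```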
Any message with more detail is highlighted as a link in the System Reference Code column. Click the message to cause the management module to present the additional message detail:
D1513901
Created at: 2007-11-13 19:30:20
SRC Version: 0x02
Hex Words 2-5: 020110F0 52298910 C1472000 200000FF
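
The detail text can be split into its fields with a short parser. The sketch below assumes the field labels and ordering seen in the single sample above; the actual management module output may differ, and `parse_src_detail` is a hypothetical name.

```python
import re

# Assumption: fields appear in this order with these labels, separated by
# any whitespace (spaces or line breaks).
_DETAIL = re.compile(
    r"(?P<src>[0-9A-F]{8})\s+"
    r"Created at:\s*(?P<created>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\s+"
    r"SRC Version:\s*(?P<version>0x[0-9A-Fa-f]+)\s+"
    r"Hex Words 2-5:\s*(?P<words>(?:[0-9A-F]{8}\s*){4})"
)


def parse_src_detail(text: str) -> dict:
    """Split an SRC detail message into its labeled fields."""
    match = _DETAIL.search(text)
    if match is None:
        raise ValueError("unrecognized SRC detail format")
    return {
        "src": match["src"],
        "created": match["created"],
        "version": int(match["version"], 16),
        "hex_words": match["words"].split(),
    }
```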
C1001F00 to C1645300 Service processor checkpoints
The C1xx progress codes, or checkpoints, offer information about the initialization of both the service processor and the server. Service processor checkpoints are typical reference codes that occur during the initial program load (IPL) of the server.
Table 18 lists the progress codes that might be displayed during the power-on self-test (POST), along with suggested actions to take if the system hangs on the progress code. Take the actions described for a progress code only if the system hangs on that code.
In the following progress codes, x can be any number or letter.
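
To look up an observed code against table entries that contain the placeholder x, a simple position-by-position comparison is enough. The sketch below treats a lowercase x in the table entry as a wildcard for any single letter or digit; `matches_pattern` is a hypothetical helper name.

```python
def matches_pattern(code: str, pattern: str) -> bool:
    """Return True if code matches a table entry such as 'C1009x01',
    where a lowercase 'x' stands for any single letter or digit."""
    code = code.strip().upper()
    if len(code) != len(pattern):
        return False
    return all(
        (p == "x" and c.isalnum()) or p.upper() == c
        for p, c in zip(pattern, code)
    )
```

For example, an observed code of C1009A01 would match the table entry C1009x01 but not C1009x02.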
Table 18. C1001F00 to C1645300 checkpoints
v If the system hangs on a progress code, follow the suggested actions in the order in which they are listed in
the Action column until the problem is solved. If an action solves the problem, you can stop performing the remaining actions.
v See Chapter 3, “Parts listing, Type 8406,” on page 229 to determine which components are CRUs and which
components are FRUs.
Progress code Description Action
C10010xx Pre-standby
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
C1001F00 Pre-standby: starting initial transition file
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
C1001F0D Pre-standby: discovery completed in initial transition file
While the blade server displays this checkpoint, the service processor reads the system vital product data (VPD). The service processor must complete reading the system VPD before the system displays the next progress code.
1. Wait at least 15 minutes for this checkpoint to change before you decide that the system is hung. Reading the system VPD might take as long as 15 minutes on systems with maximum configurations or many disk drives.
2. Go to “Checkout procedure” on page 184.
3. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
C1001F0F Pre-standby: waiting for standby synchronization from initial transition file
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
C1001FFF Pre-standby: completed initial transition file
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
C1009x01 Hardware object manager (HOM): the cancontinue flag is being cleared
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
C1009x02 Hardware object manager (HOM): erase HOM IPL step in progress
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
C1009x04 Hardware object manager (HOM): build cards IPL step in progress
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
C1009x08 Hardware object manager (HOM): build processors IPL step in progress
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
C1009x0C Hardware object manager (HOM): build chips IPL step in progress
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
C1009x10 Hardware object manager (HOM): initialize HOM
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
C1009x14 Hardware object manager (HOM): validate HOM
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
C1009x18 Hardware object manager (HOM): GARD in progress
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
C1009x1C Hardware object manager (HOM): clock test in progress
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
C1009x20 Frequency control IPL step in progress
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
C1009x24 Asset protection IPL step in progress
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
C1009x28 Memory configuration IPL step in progress
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
C1009x2C Processor CFAM initialization in progress
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
C1009x30 Processor self-synchronization in progress
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
C1009034 Processor mask attentions being initialized
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
C1009x38 Processor check ring IPL step in progress
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
C1009x39 Processor L2 line delete in progress
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
C1009x3A Load processor gptr IPL step in progress
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
C1009x3C Processor ABIST step in progress
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
C1009x40 Processor LBIST step in progress
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
C1009x44 Processor array initialization step in progress
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
C1009x46 Processor AVP initialization step in progress
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
C1009x48 Processor flush IPL step in progress
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
C1009x4C Processor wiretest IPL step in progress
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
C1009x50 Processor long scan IPL step in progress
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
C1009x54 Start processor clocks IPL step in progress
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
C1009x58 Processor SCOM initialization step in progress
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
C1009x5C Processor interface alignment procedure in progress
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
C1009x5E Processor AVP L2 test case in progress
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
C1009x60 Processor random data test in progress
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
C1009x64 Processor enable machine check test in progress
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.