IBM BladeCenter JS12 Problem Determination And Service Manual


BladeCenter JS12 Type 7998
Problem Determination and Service Guide

Note
Before using this information and the product it supports, read the general information in Appendix B, “Notices,” on page 289 and the Warranty and Support Information document for your blade server type on the Documentation CD.
Second Edition (November 2009)
US Government Users Restricted Rights – Use, duplication or disclosure restricted by GSA ADP Schedule Contract with IBM Corp.

Contents

Safety ...............v
Guidelines for trained service technicians ....vi
Inspecting for unsafe conditions ......vi
Guidelines for servicing electrical equipment . . vii
Safety statements ............viii
Chapter 1. Introduction ........1
Related documentation ...........1
Notices and statements in this documentation . . . 2
Features and specifications..........2
Supported DIMMs ............4
Blade server control panel buttons and LEDs . . . 5
Turning on the blade server .........7
Turning off the blade server .........8
System-board layouts ...........9
System-board connectors .........9
System-board LEDs ...........9
Chapter 2. Diagnostics ........11
Diagnostic tools .............12
Collecting dump data ...........13
Location codes .............14
Reference codes .............15
System reference codes (SRCs) .......16
1xxxyyyy SRCs ...........17
6xxxyyyy SRCs ...........22
A1xxyyyy service processor SRCs .....24
AA00E1A8 to AA260005 Partition firmware
attention codes ...........25
Bxxxxxxx Service processor early termination
SRCs ..............28
B200xxxx Logical partition SRCs .....29
B700xxxx Licensed internal code SRCs . . . 39
BA000010 to BA400002 Partition firmware
SRCs ..............48
POST progress codes (checkpoints) .....88
C1001F00 to C1645300 Service processor
checkpoints ............89
C2001000 to C20082FF Virtual service
processor checkpoints .........99
IPL status progress codes .......109
C700xxxx Server firmware IPL status
checkpoints ...........109
CA000000 to CA2799FF Partition firmware
checkpoints ............110
D1001xxx to D1xx3FFF Service processor
dump codes ............131
D1xx3y01 to D1xx3yF2 Service processor
dump codes ...........138
D1xx900C to D1xxC003 Service processor
power-off checkpoints ........140
Service request numbers (SRNs) ......141
Using the SRN tables .........142
101-711 through FFC-725 SRNs .....142
A00-FF0 through A24-xxx SRNs .....159
ssss-102 through ssss-640 SRNs for SCSI
devices .............180
Failing function codes 151 through 2D02 . . 184
Error logs ..............186
Checkout procedure ...........186
About the checkout procedure.......186
Performing the checkout procedure .....187
Verifying the partition configuration......189
Running the diagnostics program ......189
Starting AIX concurrent diagnostics .....189
Starting stand-alone diagnostics from a CD . . 190
Starting stand-alone diagnostics from a NIM
server ...............191
Using the diagnostics program ......192
Boot problem resolution ..........193
Troubleshooting tables ..........194
General problems ...........195
Hard disk drive problems ........195
Intermittent problems .........196
Keyboard problems ..........196
Management module service processor
problems ..............197
Memory problems ...........197
Microprocessor problems ........198
Monitor or video problems ........198
Network connection problems .......200
PCI expansion card (PIOCARD) problem
isolation procedure ..........200
Optional device problems ........201
Power problems ...........202
POWER Hypervisor (PHYP) problems ....203
Service processor problems ........205
Software problems...........217
Universal Serial Bus (USB) port problems . . . 217
Light path diagnostics ..........218
Viewing the light path diagnostic LEDs . . . 218
Light path diagnostics LEDs .......219
Isolating firmware problems ........222
Recovering the system firmware .......222
Starting the PERM image ........222
Starting the TEMP image ........223
Recovering the TEMP image from the PERM
image ...............223
Verifying the system firmware levels ....224
Committing the TEMP system firmware image 224
Solving shared BladeCenter resource problems . . 225
Solving shared keyboard problems .....226
Solving shared media tray problems.....226
Solving shared network connection problems 228
Solving shared power problems ......229
Solving shared video problems ......230
Solving undetermined problems .......231
Calling IBM for service ..........232
Chapter 3. Parts listing, Type 7998 235
© Copyright IBM Corp. 2008, 2009 iii
Chapter 4. Removing and replacing
blade server components ......239
Installation guidelines ..........239
System reliability guidelines .......240
Handling static-sensitive devices ......240
Returning a device or component .....241
Removing the blade server from a BladeCenter
unit ................241
Installing the blade server in a BladeCenter unit 242
Removing and replacing Tier 1 CRUs .....244
Removing the blade server cover ......244
Installing and closing the blade server cover . . 245
Removing the bezel assembly .......246
Installing the bezel assembly .......247
Removing a SAS hard disk drive ......248
Installing a SAS hard disk drive ......249
Removing a memory module .......251
Installing a memory module .......251
Removing the management card ......253
Installing the management card ......254
Entering vital product data ........256
Obtaining a PowerVM Virtualization Engine
system technologies activation code .....257
Removing and installing an I/O expansion card 260
Removing a small-form-factor expansion card 260
Installing a small-form-factor expansion card 261
Removing a standard-form-factor expansion
card..............263
Installing a standard-form-factor expansion
card..............264
Removing a combination-form-factor
expansion card ...........265
Installing a combination-form-factor
expansion card ...........266
Removing the battery .........267
Installing the battery ..........268
Removing the hard disk drive tray .....270
Installing the hard disk drive tray .....271
Removing the expansion bracket ......272
Installing the expansion bracket ......273
Replacing the Tier 2 system-board and chassis
assembly ...............274
Chapter 5. Configuring .......277
Updating the firmware ..........277
Configuring the blade server ........278
Using the SMS utility...........279
Starting the SMS utility .........279
SMS utility menu choices ........280
Creating a CE login ...........280
Configuring the Gigabit Ethernet controllers . . . 281
Blade server Ethernet controller enumeration . . . 282
MAC addresses for host Ethernet adapters . . . 282
Updating IBM Director ..........283
Appendix A. Getting help and
technical assistance ........285
Before you call .............286
Using the documentation .........286
Getting help and information from the Web . . . 287
Software service and support ........287
Hardware service and support .......287
IBM Taiwan product service ........287
Appendix B. Notices ........289
Trademarks ..............290
Important notes ............291
Product recycling and disposal .......291
Battery return program ..........293
Electronic emission notices .........295
Federal Communications Commission (FCC)
statement..............295
Industry Canada Class A emission compliance
statement..............295
Avis de conformité à la réglementation
d’Industrie Canada ..........296
Australia and New Zealand Class A statement 296
United Kingdom telecommunications safety
requirement .............296
European Union EMC Directive conformance
statement..............296
Taiwanese Class A warning statement ....297
Chinese Class A warning statement .....297
Japanese Voluntary Control Council for
Interference (VCCI) statement .......297
Index ...............299

Safety

Before installing this product, read the Safety Information.
Antes de instalar este produto, leia as Informações de Segurança.
Před instalací tohoto produktu si přečtěte příručku bezpečnostních instrukcí.
Læs sikkerhedsforskrifterne, før du installerer dette produkt.
Lees voordat u dit product installeert eerst de veiligheidsvoorschriften.
Ennen kuin asennat tämän tuotteen, lue turvaohjeet kohdasta Safety Information.
Avant d’installer ce produit, lisez les consignes de sécurité.
Vor der Installation dieses Produkts die Sicherheitshinweise lesen.
Prima di installare questo prodotto, leggere le Informazioni sulla Sicurezza.
Les sikkerhetsinformasjonen (Safety Information) før du installerer dette produktet.
Antes de instalar este produto, leia as Informações sobre Segurança.
Antes de instalar este producto, lea la información de seguridad.
Läs säkerhetsinformationen innan du installerar den här produkten.

Guidelines for trained service technicians

Inspect the equipment for unsafe conditions and observe the servicing guidelines.

Inspecting for unsafe conditions

Identify potential unsafe conditions in an IBM® product that you are working on.
Each IBM product, as it was designed and manufactured, has required safety items to protect users and service technicians from injury. This information addresses only those items. Use good judgment to identify potential unsafe conditions that might be caused by non-IBM alterations or attachment of non-IBM features or options that are not addressed in this information. If you identify an unsafe condition, you must determine how serious the hazard is and whether you must correct the problem before you work on the product.
Consider the following conditions and the safety hazards that they present:
v Electrical hazards, especially primary power. Primary voltage on the frame can
cause serious or fatal electrical shock.
v Explosive hazards, such as a damaged CRT face or a bulging capacitor.
v Mechanical hazards, such as loose or missing hardware.
To inspect the product for potential unsafe conditions, complete the following steps:
1. Make sure that the power is off and the power cords are disconnected.
2. Make sure that the exterior cover is not damaged, loose, or broken, and observe
any sharp edges.
3. Check the power cords:
v Make sure that the third-wire ground connector is in good condition. Use a
meter to measure third-wire ground continuity for 0.1 ohm or less between the external ground pin and the frame ground.
v Make sure that the power cords are the correct type.
v Make sure that the insulation is not frayed or worn.
4. Remove the cover.
5. Check for any obvious non-IBM alterations. Use good judgment as to the safety
of any non-IBM alterations.
6. Check inside the computer for any obvious unsafe conditions, such as metal
filings, contamination, water or other liquid, or signs of fire or smoke damage.
7. Check for worn, frayed, or pinched cables.
8. Make sure that the power-supply cover fasteners (screws or rivets) have not
been removed or tampered with.

Guidelines for servicing electrical equipment

Observe the guidelines for servicing electrical equipment.
v Check the area for electrical hazards such as moist floors, nongrounded power
extension cords, and missing safety grounds.
v Use only approved tools and test equipment. Some hand tools have handles that
are covered with a soft material that does not provide insulation from live electrical current.
v Regularly inspect and maintain your electrical hand tools for safe operational
condition. Do not use worn or broken tools or testers.
v Do not touch the reflective surface of a dental mirror to a live electrical circuit.
The surface is conductive and can cause personal injury or equipment damage if it touches a live electrical circuit.
v Some rubber floor mats contain small conductive fibers to decrease electrostatic
discharge. Do not use this type of mat to protect yourself from electrical shock.
v Do not work alone under hazardous conditions or near equipment that has
hazardous voltages.
v Locate the emergency power-off (EPO) switch, disconnecting switch, or electrical
outlet so that you can turn off the power quickly in the event of an electrical accident.
v Disconnect all power before you perform a mechanical inspection, work near
power supplies, or remove or install main units.
v Before you work on the equipment, disconnect the power cord. If you cannot
disconnect the power cord, have the customer power-off the wall box that supplies power to the equipment and lock the wall box in the off position.
v Never assume that power has been disconnected from a circuit. Check it to
make sure that it has been disconnected.
v If you have to work on equipment that has exposed electrical circuits, observe
the following precautions:
– Make sure that another person who is familiar with the power-off controls is
near you and is available to turn off the power if necessary.
– When you are working with powered-on electrical equipment, use only one
hand. Keep the other hand in your pocket or behind your back to avoid creating a complete circuit that could cause an electrical shock.
– When using a tester, set the controls correctly and use the approved probe
leads and accessories for that tester.
– Stand on a suitable rubber mat to insulate you from grounds such as metal
floor strips and equipment frames.
v Use extreme care when measuring high voltages.
v To ensure proper grounding of components such as power supplies, pumps,
blowers, fans, and motor generators, do not service these components outside of their normal operating locations.
v If an electrical accident occurs, use caution, turn off the power, and send another
person to get medical aid.

Safety statements

Important: Each caution and danger statement in this documentation is labeled with a number. This number is used to cross reference an English-language caution or danger statement with translated versions of the caution or danger statement in the Safety Information document.
For example, if a caution statement is labeled Statement 1, translations for that caution statement are in the Safety Information document under Statement 1. Be sure to read all caution and danger statements in this documentation before you perform the procedures. Read any additional safety information that comes with your blade server or optional device before you install the device.
Statement 1
DANGER
Electrical current from power, telephone, and communication cables is hazardous.
To avoid a shock hazard:
v Do not connect or disconnect any cables or perform installation,
maintenance, or reconfiguration of this product during an electrical storm.
v Connect all power cords to a properly wired and grounded electrical outlet.
v Connect to properly wired outlets any equipment that will be attached to
this product.
v When possible, use one hand only to connect or disconnect signal cables.
v Never turn on any equipment when there is evidence of fire, water, or
structural damage.
v Disconnect the attached power cords, telecommunications systems,
networks, and modems before you open the device covers, unless instructed otherwise in the installation and configuration procedures.
v Connect and disconnect cables as described in the following table when
installing, moving, or opening covers on this product or attached devices.
To Connect:
1. Turn everything OFF.
2. First, attach all cables to devices.
3. Attach signal cables to connectors.
4. Attach power cords to outlet.
5. Turn device ON.
To Disconnect:
1. Turn everything OFF.
2. First, remove power cords from outlet.
3. Remove signal cables from connectors.
4. Remove all cables from devices.
Statement 2
CAUTION: When replacing the lithium battery, use only IBM Part Number 16G8095 or an equivalent type battery recommended by the manufacturer. If your system has a module containing a lithium battery, replace it only with the same module type made by the same manufacturer. The battery contains lithium and can explode if not properly used, handled, or disposed of.
Do not:
v Throw or immerse into water
v Heat to more than 100°C (212°F)
v Repair or disassemble
Dispose of the battery as required by local ordinances or regulations.
Statement 3
CAUTION: When laser products (such as CD-ROMs, DVD drives, fiber optic devices, or transmitters) are installed, note the following:
v Do not remove the covers. Removing the covers of the laser product could
result in exposure to hazardous laser radiation. There are no serviceable parts inside the device.
v Use of controls or adjustments or performance of procedures other than those
specified herein might result in hazardous radiation exposure.
DANGER
Some laser products contain an embedded Class 3A or Class 3B laser diode. Note the following.
Laser radiation when open. Do not stare into the beam, do not view directly with optical instruments, and avoid direct exposure to the beam.
Statement 4
18 kg (39.7 lb) 32 kg (70.5 lb) 55 kg (121.2 lb)
CAUTION: Use safe practices when lifting.
Statement 5
CAUTION: The power control button on the device and the power switch on the power supply do not turn off the electrical current supplied to the device. The device also might have more than one power cord. To remove all electrical current from the device, ensure that all power cords are disconnected from the power source.
Statement 8
CAUTION: Never remove the cover on a power supply or any part that has the following label attached.
Hazardous voltage, current, and energy levels are present inside any component that has this label attached. There are no serviceable parts inside these components. If you suspect a problem with one of these parts, contact a service technician.
Statement 10
CAUTION: Do not place any object on top of rack-mounted devices.

Chapter 1. Introduction

This problem determination and service information helps you solve problems that might occur in your IBM BladeCenter® JS12 Type 7998 blade server. The information describes the diagnostic tools that come with the blade server, error codes and suggested actions, and instructions for replacing failing components.
Replaceable components are of three types:
v Tier 1 customer replaceable unit (CRU): Replacement of Tier 1 CRUs is your
responsibility. If IBM installs a Tier 1 CRU at your request, you will be charged for the installation.
v Tier 2 customer replaceable unit: You may install a Tier 2 CRU yourself or
request IBM to install it, at no additional charge, under the type of warranty service that is designated for your blade server.
v Field replaceable unit (FRU): FRUs must be installed only by trained service
technicians.
For information about the terms of the warranty and getting service and assistance, see the Warranty and Support Information document.

Related documentation

Documentation for the JS12 blade server includes documents in Portable Document Format (PDF) on the IBM BladeCenter Documentation CD and the online information center.
The most recent version of all BladeCenter documentation is in the BladeCenter information center.
The online BladeCenter information center is available in the IBM Systems Information Center.
You can find the following documents in PDF on the IBM BladeCenter Documentation CD and in the online information center:
v Installation and User’s Guide
This document contains general information about the blade server, including how to install supported options and how to configure the blade server.
v Safety Information
This document contains translated caution and danger statements. Each caution and danger statement that appears in the documentation has a number that you can use to locate the corresponding statement in your language in the Safety Information document.
v Warranty and Support Information
This document contains information about the terms of the warranty and about getting service and assistance.
Additional documents might be included in the online information center and on the IBM BladeCenter Documentation CD.
The blade server might have features that are not described in the documentation that comes with the blade server. The documentation might be updated occasionally to include information about those features, or technical updates might be available to provide additional information that is not included in the documentation that comes with the blade server.
Review the online information or the Planning Guide and the Installation Guide for your IBM BladeCenter unit. The information can help you prepare for system installation and configuration. The most current version of each document is available in the BladeCenter information center.

Notices and statements in this documentation

The caution and danger statements in this document are also in the multilingual Safety Information. Each statement is numbered for reference to the corresponding statement in your language in the Safety Information document.
The following notices and statements are used in this document:
v Note: These notices provide important tips, guidance, or advice.
v Important: These notices provide information or advice that might help you
avoid inconvenient or problem situations.
v Attention: These notices indicate potential damage to programs, devices, or data.
An attention notice is placed just before the instruction or situation in which damage might occur.
v Caution: These statements indicate situations that can be potentially hazardous
to you. A caution statement is placed just before the description of a potentially hazardous procedure step or situation.
v Danger: These statements indicate situations that can be potentially lethal or
extremely hazardous to you. A danger statement is placed just before the description of a potentially lethal or extremely hazardous procedure step or situation.

Features and specifications

Features and specifications of the IBM BladeCenter JS12 Type 7998 blade server are summarized in this overview.
The JS12 blade server is used in one of the following IBM BladeCenter units: BladeCenter E (8677), BladeCenter H (8852), BladeCenter HT (8740 and 8750), BladeCenter S (8886), and BladeCenter T (8720 and 8730) units.
Notes:
v Power, cooling, removable-media drives, external ports, and advanced system
management are provided by the BladeCenter unit.
v The operating system in the blade server must provide support for the Universal
Serial Bus (USB), to enable the blade server to recognize and communicate internally with the removable-media drives and front-panel USB ports.
Microprocessor:
Support for one dual-core, 64-bit POWER6® microprocessor; 3.8 GHz
Support for Energy Scale thermal management for power management/oversubscription (throttling) and environmental sensing
Memory:
v Dual-channel (DDR2) with 8 slots for very low profile (18.3 mm) DIMMs
v Supports 1 GB, 2 GB, 4 GB, and 8 GB DDR2 DIMMs for a maximum of 64 GB
v Supports 2-way interleaved, DDR2, PC2-4200 or PC2-5300, ECC SDRAM registered x4, memory scrubbing, Chipkill, and bit steering DIMMs
Virtualization:
PowerVM Standard Edition hardware feature supports Integrated Virtualization Manager and Virtual I/O Server
Integrated functions:
v Two 1 Gigabit Ethernet controllers
v Expansion card interface
v The baseboard management controller (BMC) is a flexible service processor with Intelligent Platform Management Interface (IPMI) firmware and SOL support
v ATI RN 50 ES1000 video controller
v SAS RAID controller
v Light path diagnostics
v RS-485 interface for communication with the management module
v Automatic server restart (ASR)
v Serial over LAN (SOL)
v Support for local keyboard and video
v Four Universal Serial Bus (USB) buses for communication with keyboard and removable-media drives
v Transferable Anchor function (Renesas Technology HD651330 microcontroller) in the management card
Storage:
Support for two internal small-form-factor (SFF) Serial Attached SCSI (SAS) drives
Predictive Failure Analysis (PFA) alerts:
v Microprocessor
v Memory
Electrical input: 12 V dc
Environment:
v Air temperature:
– Blade server on: 10° to 35°C (50° to 95°F). Altitude: 0 to 914 m (3000 ft)
– Blade server on: 10° to 32°C (50° to 90°F). Altitude: 914 m to 2133 m (3000 ft to 7000 ft)
– Blade server off: -40° to 60°C (-40° to 140°F)
v Humidity:
– Blade server on: 8% to 80%
– Blade server off: 8% to 80%
Size:
v Height: 24.5 cm (9.7 inches)
v Depth: 44.6 cm (17.6 inches)
v Width: 2.9 cm (1.14 inches)
v Maximum weight: 5.0 kg (11 lb)
See the ServerProven Web site for information about supported operating-system versions and all JS12 blade server optional devices.
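The powered-on environmental limits listed above can be expressed as a simple range check. The following Python sketch is illustrative only: the function name and parameters are hypothetical and are not part of any IBM tool. It assumes that the 35°C limit applies up to and including 914 m and that the 32°C limit applies above 914 m, up to 2133 m.

```python
def within_operating_limits(temp_c, altitude_m, humidity_pct):
    """Return True if a powered-on blade server is within the listed limits.

    Assumption: the boundary altitude of 914 m belongs to the lower band.
    """
    # Humidity limit while the blade server is on: 8% to 80%.
    if not 8 <= humidity_pct <= 80:
        return False
    # No limits are listed below 0 m or above 2133 m.
    if altitude_m < 0 or altitude_m > 2133:
        return False
    # The maximum air temperature drops from 35 C to 32 C above 914 m.
    max_temp_c = 35 if altitude_m <= 914 else 32
    return 10 <= temp_c <= max_temp_c
```

For example, 34°C at 1500 m is outside the limits, although the same temperature at 500 m is acceptable.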

Supported DIMMs

The BladeCenter JS12 Type 7998 blade server contains eight memory connectors for industry-standard registered, dual-inline-memory modules (RDIMMs). The DIMMs are very low profile, which means that each DIMM has a height of 18.3 millimeters (mm). Total memory can range from a minimum of 2 gigabytes (GB) to a maximum of 64 GB.
See Chapter 3, “Parts listing, Type 7998,” on page 235 for memory modules that you can order from IBM.
Memory module rules:
v Install DIMMs in pairs in the following connectors to have a supported (tested) configuration:
Table 1. Supported use of DIMMs
                              Number of DIMMs in use
DIMM connectors               Two    Four   Six    Eight
Pair 1 (DIMM 1 and DIMM 3)    Yes    Yes    Yes    Yes
Pair 2 (DIMM 6 and DIMM 8)    No     Yes    Yes    Yes
Pair 3 (DIMM 2 and DIMM 4)    No     No     Yes    Yes
Pair 4 (DIMM 5 and DIMM 7)    No     No     No     Yes
See “System-board connectors” on page 9 for DIMM connector locations.
v Both DIMMs in a pair must be the same size, speed, type, technology, and
physical design. You can mix compatible DIMMs from different manufacturers. Each DIMM in each of the following sets of four connectors must be the same size:
Size 1 DIMM 1 and DIMM 3 (pair 1) and DIMM 2 and DIMM 4 (pair 3) when
using 6 or 8 DIMMs
Size 2 DIMM 5 and DIMM 7 (pair 4) and DIMM 6 and DIMM 8 (pair 2) when
using 8 DIMMs
v When using 4 DIMMs in DIMM 1 and DIMM 3 (pair 1) and DIMM 6 and
DIMM 8 (pair 2), DIMMs in the second pair can differ in size and speed from the first pair.
v When using 8 GB DIMMs, all of the DIMMs used must be 8 GB.
v Install only supported DIMMs, as described on the ServerProven® Web site. See http://www.ibm.com/servers/eserver/serverproven/compat/us/.
v Installing or removing DIMMs changes the configuration of the blade server.
After you install or remove a DIMM, the blade server is automatically reconfigured, and the new configuration information is stored.
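The memory module rules above can be checked mechanically before a configuration is installed. The following Python sketch is one possible way to do that; the function `check_dimm_layout` and its data layout are hypothetical, and the sketch models only DIMM size. Speed, type, technology, physical design, and ServerProven support are not checked.

```python
# Pair order from Table 1: pair 1, pair 2, pair 3, pair 4.
PAIRS = [(1, 3), (6, 8), (2, 4), (5, 7)]
SUPPORTED_SIZES_GB = {1, 2, 4, 8}


def check_dimm_layout(installed):
    """Return a list of rule violations for {connector number: size in GB}."""
    problems = []
    if any(size not in SUPPORTED_SIZES_GB for size in installed.values()):
        problems.append("unsupported DIMM size")

    # Table 1: n DIMMs must occupy exactly the first n/2 pairs.
    count = len(installed)
    if count not in (2, 4, 6, 8):
        problems.append("DIMM count must be 2, 4, 6, or 8")
        return problems
    expected = {c for pair in PAIRS[: count // 2] for c in pair}
    if set(installed) != expected:
        problems.append("connectors in use do not match Table 1")
        return problems

    # Both DIMMs in a pair must be the same size.
    for a, b in PAIRS[: count // 2]:
        if installed[a] != installed[b]:
            problems.append(f"DIMM {a} and DIMM {b} differ in size")

    # Sets of four connectors must match when 6 or 8 DIMMs are used.
    if count >= 6 and len({installed[c] for c in (1, 3, 2, 4)}) > 1:
        problems.append("DIMMs 1, 2, 3, and 4 must be the same size")
    if count == 8 and len({installed[c] for c in (5, 7, 6, 8)}) > 1:
        problems.append("DIMMs 5, 6, 7, and 8 must be the same size")

    # 8 GB DIMMs cannot be mixed with other sizes.
    sizes = set(installed.values())
    if 8 in sizes and len(sizes) > 1:
        problems.append("8 GB DIMMs cannot be mixed with other sizes")
    return problems
```

An empty list means that no size or placement rule was violated. For example, `check_dimm_layout({1: 2, 3: 2, 6: 4, 8: 4})` returns an empty list, because with four DIMMs the second pair may differ in size from the first.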

Blade server control panel buttons and LEDs

Blade server control panel buttons and LEDs provide operational controls and status indicators.
Note: Figure 1 shows the control-panel door in the closed (normal) position. To access the power-control button, you must open the control-panel door.
Figure 1. Blade server control panel buttons and LEDs
Figure callouts: keyboard/video select button, media-tray select button, information LED, blade-error LED, power-control button, NMI reset, sleep (not used on blade server), power-on LED, activity LED, location LED.
Keyboard/video select button: When you use an operating system that supports a local console and keyboard, press this button to associate the shared BladeCenter unit keyboard and video ports with the blade server.
Notes:
v The operating system in the blade server must provide USB support for the
blade server to recognize and use the keyboard, even if the keyboard has a PS/2-style connector.
v The keyboard and video are available after the partition firmware has loaded and is
running. Power-on self-test (POST) codes and diagnostics are not supported using the keyboard and video. Use the management module to view checkpoints.
The LED on this button flashes while the request is being processed, then is lit when the ownership of the keyboard and video has been transferred to the blade server. It can take approximately 20 seconds to switch control of the keyboard and video to the blade server.
Using a keyboard that is directly attached to the management module, you can press keys in the following sequence to switch keyboard and video control between blade servers:
NumLock NumLock blade_server_number Enter
Where blade_server_number is the two-digit number for the blade bay in which the blade server is installed. When you use some keyboards, such as the 28L3644 (37L0888) keyboard, hold down the Shift key while you enter this key sequence.
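The key sequence above always uses a two-digit bay number, so bay 3 is entered as 0 and then 3. The following Python sketch builds the sequence as a list of key names. The function name is hypothetical, and the check accepts any number that fits in two digits; the bay numbers that actually exist depend on the BladeCenter unit.

```python
def kvm_switch_keys(bay_number):
    """Build the key sequence that switches keyboard and video to a blade bay."""
    # Assumption: only the two-digit format is enforced here, not the
    # number of bays in a particular BladeCenter unit.
    if not 1 <= bay_number <= 99:
        raise ValueError("bay number must fit in two digits")
    # Zero-pad to two digits and send each digit as its own key press.
    return ["NumLock", "NumLock", *f"{bay_number:02d}", "Enter"]
```

On keyboards that require the Shift key, the same sequence is entered while Shift is held down.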
If there is no response when you press the keyboard/video select button, you can use the Web interface of the management module to determine whether local control has been disabled on the blade server.
Media-tray select button: Press this button to associate the shared BladeCenter unit media tray (removable-media drives and front-panel USB ports) with the blade server. The LED on the button flashes while the request is being processed, then is lit when the ownership of the media tray has been transferred to the blade server. It can take approximately 20 seconds for the operating system in the blade server to recognize the media tray.
If there is no response when you press the media-tray select button, use the management module to determine whether local control has been disabled on the blade server.
Note: The operating system in the blade server must provide USB support for the blade server to recognize and use the removable-media drives and USB ports.
Information LED: When this amber LED is lit, it indicates that information about a system error for the blade server has been placed in the management-module event log. The information LED can be turned off through the Web interface of the management module or through IBM Director Console.
Blade-error LED: When this amber LED is lit, it indicates that a system error has occurred in the blade server. The blade-error LED will turn off after one of the following events:
v Correcting the error
v Reseating the blade server in the BladeCenter unit
v Cycling the BladeCenter unit power
Power-control button: This button is behind the control panel door. Press this button to turn on or turn off the blade server.
The power-control button has effect only if local power control is enabled for the blade server. Local power control is enabled and disabled through the Web interface of the management module.
Press the power button for 5 seconds to begin powering down the blade server.
NMI reset (recessed): The nonmaskable interrupt (NMI) reset dumps the partition.
Use this recessed button only as directed by IBM Support.
Power-on LED: This green LED indicates the power status of the blade server in the following manner:
v Flashing rapidly: The service processor (BMC) is initializing the blade server.
v Flashing slowly: The blade server has completed initialization and is waiting for
a power-on command.
v Lit continuously: The blade server has power and is turned on.
Note: The enhanced service processor (BMC) can take as long as three minutes to initialize after you install the BladeCenter JS12 blade server, at which point the LED begins to flash slowly.
Activity LED: When this green LED is lit, it indicates that there is activity on the hard disk drive or network.
Location LED: When this blue LED is lit, it has been turned on by the system administrator to aid in visually locating the blade server. The location LED can be turned off through the Web interface of the management module or through IBM Director Console.
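The three power-on LED states described above can be summarized as a small lookup. The following Python sketch is illustrative only; the enumeration and function names are hypothetical and do not correspond to a management-module interface.

```python
from enum import Enum


class PowerOnLed(Enum):
    """Power-on LED states and what each one indicates."""
    FLASHING_RAPIDLY = "service processor is initializing the blade server"
    FLASHING_SLOWLY = "initialization complete; waiting for a power-on command"
    LIT = "blade server has power and is turned on"


def ready_for_power_on(led):
    """Return True when the LED shows that the blade server is waiting for a power-on command."""
    return led is PowerOnLed.FLASHING_SLOWLY
```

Because the service processor can take as long as three minutes to initialize, a check such as `ready_for_power_on(PowerOnLed.FLASHING_RAPIDLY)` is expected to be false for a while after the blade server is installed.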

Turning on the blade server

After you connect the blade server to power through the BladeCenter unit, wait for the discovery and initialization process to complete. You can then start the blade server.
You can start the blade server in any of the following ways.
v Start the blade server by pressing the power-control button on the front of the
blade server.
The power-control button is behind the control panel door, as described in “Blade server control panel buttons and LEDs” on page 5.
After you push the power-control button, the power-on LED continues to blink slowly for about 15 seconds, then is lit solidly when the power-on process is complete.
Wait until the power-on LED on the blade server flashes slowly before you press the blade server power-control button. If the power-on LED is flashing rapidly, the service processor is initializing the blade server. The power-control button does not respond during initialization.
Note: The enhanced service processor (BMC) can take as long as three minutes to initialize after you install the BladeCenter JS12 blade server, at which point the LED begins to flash slowly.
v Start the blade server automatically when power is restored after a power
failure.
If a power failure occurs, the BladeCenter unit and then the blade server can start automatically when power is restored. You must configure the blade server to restart through the management module.
v Start the blade server remotely using the management module.
After you initiate the power-on process, the power-on LED blinks slowly for about 15 seconds, then is lit solidly when the power-on process is complete.

Turning off the blade server

When you turn off the blade server, it is still connected to power through the BladeCenter unit. The blade server can respond to requests from the service processor, such as a remote request to turn on the blade server. To remove all power from the blade server, you must remove it from the BladeCenter unit.
Shut down the operating system before you turn off the blade server. See the operating-system documentation for information about shutting down the operating system.
You can turn off the blade server in one of the following ways.
v Turn off the blade server by pressing the power-control button for at least 5
seconds.
The power-control button is on the blade server behind the control panel door. See “Blade server control panel buttons and LEDs” on page 5 for the location.
Note: The power-on LED can remain on solidly for up to 1 minute after you push the power-control button. After you turn off the blade server, wait until the power-on LED is blinking slowly before you press the power-control button to turn on the blade server again.
If the operating system stops functioning, press and hold the power-control button for more than 5 seconds to force the blade server to turn off.
v Use the management module to turn off the blade server.
The power-on LED can remain on solidly for up to 1 minute after you initiate the power-off process. After you turn off the blade server, wait until the power-on LED is blinking slowly before you initiate the power-on process from the advanced management module to turn on the blade server again.
Use the management-module Web interface to configure the management module to turn off the blade server if the system is not operating correctly.
For additional information, see the online documentation or the User’s Guide for the management module.

System-board layouts

Illustrations show the connectors and LEDs on the system board. The illustrations might differ slightly from your hardware.

System-board connectors

Blade server components attach to the connectors on the system board.
Figure 2 shows the connectors on the system board in the blade server.
Figure 2. System-board connectors
Figure 2 labels the following connectors:
v Control panel connector
v SAS drive (P1-D1)
v SAS drive (P1-D2)
v DIMM 1 (P1-C1)
v DIMM 2 (P1-C2)
v DIMM 3 (P1-C3)
v DIMM 4 (P1-C4)
v DIMM 5 (P1-C5)
v DIMM 6 (P1-C6)
v DIMM 7 (P1-C7)
v DIMM 8 (P1-C8)
v Management card (P1-C9)
v PCI-X expansion card (P1-C10)
v PCIe high-speed expansion card (P1-C11)
v Battery (P1-E1)

System-board LEDs

Use the illustration of the LEDs on the system board to identify a light emitting diode (LED).
Remove the blade server from the BladeCenter unit, open the cover to see any error LEDs that were turned on during error processing, and use Figure 3 to identify the failing component.
Figure 3. System-board LEDs
Figure 3 labels the following LEDs:
v Power LED (always on when plugged in)
v System board error LED (P1)
v DIMM 1 error LED (P1-C1)
v DIMM 2 error LED (P1-C2)
v DIMM 3 error LED (P1-C3)
v DIMM 4 error LED (P1-C4)
v DIMM 5 error LED (P1-C5)
v DIMM 6 error LED (P1-C6)
v DIMM 7 error LED (P1-C7)
v DIMM 8 error LED (P1-C8)
v Management card error LED (P1-C9)
v PCI-X expansion card error LED (P1-C10)
v PCIe high-speed expansion card error LED (P1-C11)
v Front SAS drive error LED (P1-D1)
v Battery error LED (P1-E1)

Chapter 2. Diagnostics

Use the available diagnostic tools to help solve any problems that might occur in the blade server.
The first and most crucial component of a solid serviceability strategy is the ability to accurately and effectively detect errors when they occur. While not all errors are a threat to system availability, those that go undetected are dangerous because the system does not have the opportunity to evaluate and act if necessary. POWER6 processor-based systems are specifically designed with error-detection mechanisms that extend from processor cores and memory to power supplies and hard drives.
POWER6 processor-based systems contain specialized hardware detection circuitry for detecting erroneous hardware operations. Error checking hardware ranges from parity error detection coupled with processor instruction retry and bus retry, to ECC correction on caches and system buses.
IBM hardware error checkers have these distinct attributes:
v Continuous monitoring of system operations to detect potential calculation
errors
v Attempted isolation of physical faults based on runtime detection of each unique
failure
v Initiation of a wide variety of recovery mechanisms designed to correct a
problem
POWER6 processor-based systems include extensive hardware and firmware recovery logic.
Machine check handling
Machine checks are handled by firmware. When a machine check occurs, the firmware analyzes the error to identify the failing device and creates an error log entry.
If the system degrades to the point that the service processor cannot reach standby state, the ability to analyze the error does not exist. If the error occurs during POWER® hypervisor (PHYP) activities, the PHYP initiates a system reboot.
In partitioned mode, an error that occurs during partition activity is surfaced to the operating system in the partition.
© Copyright IBM Corp. 2008, 2009 11

Diagnostic tools

What to do if you cannot solve a problem
If you cannot locate and correct the problem using the diagnostics tools and information, see Appendix A, “Getting help and technical assistance,” on page 285.
Tools are available to help you diagnose and solve hardware-related problems.
v Power-on self-test (POST) progress codes (checkpoints), error codes, and
isolation procedures
The POST checks out the hardware at system initialization. IPL diagnostic functions test some system components and interconnections. The POST generates eight-digit checkpoints to mark the progress of powering up the blade server.
Use the management module to view progress codes.
The documentation of a progress code includes recovery actions for system hangs. See “POST progress codes (checkpoints)” on page 88 for more information.
If the service processor detects a problem during POST, an error code is logged in the management module event log. Error codes are also logged in the Linux syslog or AIX® diagnostic log, if possible. See “System reference codes (SRCs)” on page 16.
The service processor can generate codes that point to specific isolation procedures. See “Service processor problems” on page 205.
v Light path diagnostics
Use the light path diagnostic LEDs on the system board to identify failing hardware. If the system error LED on the system LED panel on the front or rear of the BladeCenter unit is lit, one or more error LEDs on the BladeCenter unit components also might be lit.
Light path diagnostics help identify failing customer replaceable units (CRUs). CRU location codes are included in error codes and the event log.
LED locations
See “System-board LEDs” on page 9.
Front panel
See “Blade server control panel buttons and LEDs” on page 5.
v Troubleshooting tables
Use the troubleshooting tables to find solutions to problems that have identifiable symptoms.
See “Troubleshooting tables” on page 194.
v Dump data collection
In some circumstances, an error might require a dump to show more data. The Integrated Virtualization Manager (IVM) sets up a dump area. Specific IVM information is included as part of the information that can optionally be sent to IBM support for analysis.
See “Collecting dump data” on page 13 for more information.
v Stand-alone diagnostics
The AIX-based stand-alone Diagnostics CD is in the ship package and is also available from the IBM Web site. Boot the CD from a CD drive or from an AIX network installation manager (NIM) server if the blade server cannot boot to an operating system, no matter which operating system is installed.
Functions provided by the stand-alone diagnostics include:
– Analysis of errors reported by platform, such as microprocessor and memory
– Testing of resources, such as I/O adapters and devices
– Service aids, such as firmware update, format disk, and RAID Manager
v Diagnostic utilities for the AIX operating system
If AIX is functioning, run AIX concurrent diagnostics instead of the stand-alone diagnostics. Functions provided by disk-based AIX diagnostics include:
– Automatic error log analysis
– Analysis of errors reported by platform, such as microprocessor and memory
– Testing of resources, such as I/O adapters and devices
– Service aids, such as firmware update, format disk, and RAID Manager
v Diagnostic utilities for Linux operating systems
Linux on POWER service and productivity tools include hardware diagnostic aids, productivity tools, and installation aids. The installation aids are provided in the IBM Installation Toolkit for Linux on POWER, a set of tools that aids the installation of Linux on IBM servers with POWER architecture. You can also use the tools to update the JS12 blade server firmware.
Diagnostic utilities for the Linux operating system are available from IBM at https://www14.software.ibm.com/webapp/set2/sas/f/lopdiags/home.html.
v Diagnostic utilities for other operating systems
You can use the stand-alone Diagnostics CD to perform diagnostics on the JS12 blade server, no matter which operating system is loaded on the blade server. However, other supported operating systems might have diagnostic tools that are available through the operating system. See the documentation for your operating system for more information.

Collecting dump data

A dump might be critical for fault isolation when the built-in First Failure Data Capture (FFDC) mechanisms are not capturing sufficient fault data. Even when a fault is identified, dump data can provide additional information that is useful in problem determination.
All hardware state information is part of the dump if a hardware checkstop occurs. When a checkstop occurs, the service processor attempts to dump data that is necessary to analyze the error from appropriate parts of the system.
Note: If you power off the blade through the management module while the service processor is performing a dump, platform dump data is lost.
You might be asked to retrieve a dump to send it to IBM Support for analysis. The location of the dump data varies per operating system platform.
v Collect an AIX dump from the /var/adm/platform directory.
v Collect a Linux dump from the /var/log/dump directory.
v Collect an Integrated Virtualization Manager (IVM) dump from the
IVM-managed JS12 blade server through the Manage Dumps task in the IVM console.
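The per-platform dump locations listed above can be captured in a small lookup. The following Python sketch is illustrative only: the names DUMP_DIRECTORIES and list_dump_files are hypothetical, and the sketch covers only the two directory paths that this guide names. An IVM dump is retrieved through the Manage Dumps task in the IVM console, not from a directory, so it is intentionally not included.

```python
from pathlib import Path
from typing import List

# Dump directories named in this guide. The dictionary name and keys are
# illustrative assumptions, not part of any IBM tool.
DUMP_DIRECTORIES = {
    "aix": Path("/var/adm/platform"),
    "linux": Path("/var/log/dump"),
}


def list_dump_files(os_name: str) -> List[Path]:
    """Return the files found in the dump directory for the given platform.

    Raises ValueError for a platform that has no directory listed in the
    guide (for example, IVM, which uses the Manage Dumps console task).
    """
    key = os_name.strip().lower()
    if key not in DUMP_DIRECTORIES:
        raise ValueError(f"no dump directory is documented for {os_name!r}")
    directory = DUMP_DIRECTORIES[key]
    if not directory.is_dir():
        # The directory does not exist on this system, so there is nothing to collect.
        return []
    return sorted(p for p in directory.iterdir() if p.is_file())
```

Run the function on the operating system in question, for example list_dump_files("linux"), and send the resulting files to IBM Support only when you are asked to do so.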

Location codes

Location codes identify components of the blade server. Location codes are displayed with some error codes to identify the blade server component that is causing the error.
See “System-board connectors” on page 9 for component locations.
Notes:
1. Location codes do not indicate the location of the blade server within the
BladeCenter unit. The codes identify components of the blade server only.
2. For checkpoints with no associated location code, see “Light path diagnostics”
on page 218 to identify the failing component when there is a hang condition.
3. For checkpoints with location codes, use the following table to identify the
failing component when there is a hang condition.
4. For 8-digit codes not listed in Table 2, see “Checkout procedure” on page 186.
Table 2. Location codes
Location code Component
Un location codes are for enclosure and VPD locations.
Un = Utttt.mmm.sssssss
tttt = system machine type
mmm = system model number
sssssss = system serial number
Un-P1 System-board and chassis assembly (Planar, FSP, SPCN, CP0, P5IOC2)
Un-P1-C1 DIMM 1 (DIMM1A)
Un-P1-C2 DIMM 2 (DIMM1B)
Un-P1-C3 DIMM 3 (DIMM0A)
Un-P1-C4 DIMM 4 (DIMM0B)
Un-P1-C5 DIMM 5 (DIMM3B)
Un-P1-C6 DIMM 6 (DIMM3A)
Un-P1-C7 DIMM 7 (DIMM2B)
Un-P1-C8 DIMM 8 (DIMM2A)
Un-P1-C9 Management card (MGMT CRD)
Un-P1-C10 PCI-X expansion card (PIOCARD)
Un-P1-C11 PCIe high-speed expansion card (PIOCARD)
Un-P1-D1 Front SAS hard disk drive (SFF0)
Un-P1-D2 Rear SAS hard disk drive (SFF1)
Un-P1-E1 Battery (BATT)
Um codes are for firmware. The format is the same as for a Un location code.
Um = Utttt.mmm.sssssss
Um-Y1 Firmware version
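As an illustration of how the Utttt.mmm.sssssss prefix and the component suffix fit together, the following Python sketch splits a location code and looks up the component name from Table 2. The function name, the regular expression, and the sample code U7998.60X.1234567-P1-C3 are assumptions made for this example. The sketch also assumes that a hyphen separates the prefix from the suffix, as the Un-P1 notation suggests.

```python
import re
from typing import NamedTuple, Optional

# Component names copied from Table 2, keyed by the suffix that follows
# the Un- or Um- prefix.
COMPONENTS = {
    "P1": "System-board and chassis assembly",
    "P1-C1": "DIMM 1",
    "P1-C2": "DIMM 2",
    "P1-C3": "DIMM 3",
    "P1-C4": "DIMM 4",
    "P1-C5": "DIMM 5",
    "P1-C6": "DIMM 6",
    "P1-C7": "DIMM 7",
    "P1-C8": "DIMM 8",
    "P1-C9": "Management card",
    "P1-C10": "PCI-X expansion card",
    "P1-C11": "PCIe high-speed expansion card",
    "P1-D1": "Front SAS hard disk drive",
    "P1-D2": "Rear SAS hard disk drive",
    "P1-E1": "Battery",
    "Y1": "Firmware version",
}

# Assumed layout: "U", then machine type, model, and serial number separated
# by periods, then a hyphen and the component suffix.
_LOCATION = re.compile(
    r"^U(?P<mtype>[^.\-]+)\.(?P<model>[^.\-]+)\.(?P<serial>[^.\-]+)-(?P<suffix>.+)$"
)


class Location(NamedTuple):
    machine_type: str
    model: str
    serial: str
    suffix: str
    component: Optional[str]  # None when the suffix is not in Table 2


def parse_location_code(code: str) -> Optional[Location]:
    """Split a location code into its parts; return None if it does not match."""
    match = _LOCATION.match(code.strip())
    if match is None:
        return None
    suffix = match.group("suffix")
    return Location(
        match.group("mtype"),
        match.group("model"),
        match.group("serial"),
        suffix,
        COMPONENTS.get(suffix),
    )
```

With the made-up code U7998.60X.1234567-P1-C3, the sketch reports machine type 7998 and component DIMM 3. A suffix that is not in Table 2 yields a component of None, which is the cue to go to the checkout procedure.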

Reference codes

Reference codes are diagnostic aids that help you determine the source of a hardware or operating system problem. To use reference codes effectively, use them in conjunction with other service and support procedures.
The BladeCenter JS12 Type 7998 blade server produces several types of codes.
Progress codes: The power-on self-test (POST) generates eight-digit status codes that are known as checkpoints or progress codes, which are recorded in the management-module event log. The checkpoints indicate which blade server resource is initializing.
Error codes: The First Failure Data Capture (FFDC) error checkers capture fault data, which the baseboard management controller (BMC) service processor then analyzes. For unrecoverable errors (UEs), for recoverable events that meet or exceed their service thresholds, and for fatal system errors, an unrecoverable checkstop service event triggers the service processor to analyze the error, log the system reference code (SRC), and turn on the system attention LED.
The service processor logs the nine-word, eight-digit per word error code in the BladeCenter management-module event log. Error codes are either system reference codes (SRCs) or service request numbers (SRNs). A location code might also be included.
Isolation procedures: If the fault analysis does not determine a definitive cause, the service processor might indicate a fault isolation procedure that you can use to isolate the failing component.
Viewing the codes
The JS12 blade server does not display checkpoints or error codes on the remote console. The shared BladeCenter unit video also does not display the codes.
If the POST detects a problem, a 9-word, 8-digit error code is logged in the BladeCenter management-module event log. A location code that identifies a component might also be included. See “Error logs” on page 186 for information about viewing the management-module event log.
Service request numbers can be viewed using the AIX diagnostics CD, or various operating system utilities, such as AIX diagnostics or the Linux service aid “diagela”, if it is installed.

System reference codes (SRCs)

System reference codes indicate a server hardware or software problem that can originate in hardware, in firmware, or in the operating system.
A blade server component generates an error code when it detects a problem. An SRC identifies the component that generated the error code and describes the error. Use the SRC information to identify a list of possibly failing items and to find information about any additional isolation procedures.
The following table shows the syntax of a nine-word B700xxxx SRC as it might be displayed in the event log of the management module.
The first word of the SRC in this example is the message identifier, B7001111. This example numbers each word after the first word to show relative word positions. The seventh word is the direct select address, which is 77777777 in the example.
Table 3. Nine-word system reference code in the management-module event log
Index Sev Source Date/Time Text
1 E Blade_05 01/21/2008, 17:15:14 (JS12-BC1BLD5E) SYS F/W: Error. Replace UNKNOWN (5008FECF B7001111 22222222 33333333 44444444 55555555 66666666 77777777 88888888 99999999)
Depending on your operating system and the utilities you have installed, error messages might also be stored in an operating system log. See the documentation that comes with the operating system for more information.
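The word positions in the Table 3 example can be picked out programmatically. The following Python sketch is a minimal illustration, not a management-module interface: the function name is hypothetical, and the sketch assumes that the SRC is the last nine eight-digit hexadecimal words inside the final pair of parentheses. This guide does not describe the value that precedes the SRC in the example (5008FECF), so the sketch returns it separately as a prefix.

```python
import re

# One SRC word: exactly eight hexadecimal digits.
_HEX_WORD = re.compile(r"\b[0-9A-Fa-f]{8}\b")


def extract_src_words(event_text: str) -> dict:
    """Extract the nine SRC words from the text of an event-log entry."""
    start = event_text.rfind("(")
    end = event_text.rfind(")")
    if start == -1 or end < start:
        raise ValueError("no parenthesized code group found")
    words = _HEX_WORD.findall(event_text[start + 1:end])
    if len(words) < 9:
        raise ValueError("fewer than nine 8-digit words found")
    src = words[-9:]  # assumption: the SRC is the last nine words
    return {
        "prefix": words[:-9],             # anything before the SRC, not described here
        "message_id": src[0],             # first word of the SRC
        "direct_select_address": src[6],  # seventh word of the SRC
        "words": src,
    }
```

For the Table 3 text, the sketch returns B7001111 as the message identifier and 77777777 as the direct select address, which matches the description of the example.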
The management module can display the most recent 32 SRCs and time stamps. Manually refresh the list to update it.
Select Blade Service Data > blade_name in the management module to see a list of the 32 most recent SRCs.
Table 4. Management module reference code listing
Unique ID System Reference Code Timestamp
00040001 D1513901 2005-11-13 19:30:20
00000016 D1513801 2005-11-13 19:30:16
Any message with more detail is highlighted as a link in the System Reference Code column. Click the message to cause the management module to present the additional message detail:
D1513901
Created at: 2007-11-13 19:30:20
SRC Version: 0x02
Hex Words 2-5: 020110F0 52298910 C1472000 200000FF
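If you copy the additional message detail into a problem record, the labeled fields can be separated with a short script. The following Python sketch is an assumption-laden illustration: the function name is hypothetical, and the field labels (Created at, SRC Version, Hex Words 2-5) are taken only from the single example in this section, so other entries might be laid out differently.

```python
import re
from datetime import datetime


def parse_src_detail(text: str) -> dict:
    """Parse the detail text of one SRC entry, on one line or on several lines."""
    created = re.search(
        r"Created at:\s*(\d{4}-\d{2}-\d{2}\s+\d{2}:\d{2}:\d{2})", text
    )
    version = re.search(r"SRC Version:\s*(0x[0-9A-Fa-f]+)", text)
    words = re.search(r"Hex Words 2-5:\s*((?:[0-9A-Fa-f]{8}\s*){4})", text)
    if not (created and version and words):
        raise ValueError("detail text does not contain the expected fields")
    # Normalize whitespace between the date and the time before parsing.
    stamp = " ".join(created.group(1).split())
    return {
        "src": text.split()[0],  # the SRC is the first token
        "created_at": datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S"),
        "src_version": int(version.group(1), 16),
        "hex_words_2_to_5": words.group(1).split(),
    }
```

The function accepts the detail text whether it was copied as a single line or as separate lines, because each field is located by its label.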
SRC formats
SRCs are strings of either six or eight alphanumeric characters. The first four characters designate the reference code type and the second four characters designate the unit reference code (URC).
The first character indicates the type of error. In a few cases, the first two characters indicate the type of error:
v 1xxxxxxx - System power control network (SPCN) error
v 6xxxxxxx - Virtual optical device error
v A1xxxxxx - Attention required (Service processor)
v AAxxxxxx - Attention required (Partition firmware)
v B1xxxxxx - Service processor error, such as a boot problem
v BAxxxxxx - Partition firmware error
v Cxxxxxxx - Checkpoint (must hang to indicate an error)
v Dxxxxxxx - Dump checkpoint (must hang to indicate an error)
To find a description of an SRC that is not listed in this JS12 blade server documentation, refer to the POWER6 Reference Code Lookup page at https://www-01.ibm.com/servers/resourcelink/lib03030.nsf/pages/eClipzML/$file/refCode.html.
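The prefix rules in the preceding list can be expressed as a short lookup. The following Python sketch is illustrative only, and the names SRC_TYPES and classify_src are hypothetical. It checks the two-character prefixes before the one-character prefixes and covers only the types in the list above, so an SRC with any other prefix is reported as not classified.

```python
from typing import Optional

# Prefixes and descriptions copied from the list above.
# Two-character prefixes come first so that they are matched first.
SRC_TYPES = [
    ("A1", "Attention required (Service processor)"),
    ("AA", "Attention required (Partition firmware)"),
    ("B1", "Service processor error, such as a boot problem"),
    ("BA", "Partition firmware error"),
    ("1", "System power control network (SPCN) error"),
    ("6", "Virtual optical device error"),
    ("C", "Checkpoint (must hang to indicate an error)"),
    ("D", "Dump checkpoint (must hang to indicate an error)"),
]


def classify_src(src: str) -> Optional[str]:
    """Return the error type for an SRC, or None if the prefix is not in the list."""
    code = src.strip().upper()
    for prefix, description in SRC_TYPES:
        if code.startswith(prefix):
            return description
    return None
```

For example, classify_src("BA000010") returns the partition firmware description, and a B700xxxx code returns None because that prefix is not in the list above; look up such codes in the section for that SRC range.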
1xxxyyyy SRCs
The 1xxxyyyy system reference codes are system power control network (SPCN) reference codes.
Look for the rightmost 4 characters (yyyy in 1xxxyyyy) in the error code; this is the reference code. Find the reference code in Table 5.
Perform all actions before exchanging failing items.
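The rightmost-four-characters rule can be sketched in a few lines of Python. The function names and the sample SRC 11002629 are made up for this illustration, and the dictionary holds only three of the Table 5 descriptions as examples; it is not a complete copy of the table.

```python
from typing import Optional

# A few descriptions from Table 5, keyed by the yyyy reference code.
# This is a partial sample for illustration, not the full table.
SPCN_REFERENCE_CODES = {
    "2600": "pGood master fault",
    "2629": "1.5V reg_pgood fault",
    "8400": "Invalid configuration decode",
}


def spcn_reference_code(src: str) -> str:
    """Return yyyy, the rightmost four characters of a 1xxxyyyy SRC."""
    code = src.strip().upper()
    if len(code) != 8 or not code.startswith("1"):
        raise ValueError(f"not a 1xxxyyyy SRC: {src!r}")
    return code[-4:]


def describe_spcn_src(src: str) -> Optional[str]:
    """Look up the description; None means the code is not in the sample dictionary."""
    return SPCN_REFERENCE_CODES.get(spcn_reference_code(src))
```

The lookup returns only the description. Always take the action steps from Table 5 itself, and perform them in the listed order.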
Table 5. 1xxxyyyy SRCs
v Follow the suggested actions in the order in which they are listed in the Action column until the problem is
solved. If an action solves the problem, then you can stop performing the remaining actions.
v See Chapter 3, “Parts listing, Type 7998,” on page 235 to determine which components are CRUs and which
components are FRUs.
1xxxyyyy Error Codes
Description Action
00AC Informational message: AC loss was reported
No action is required.
00AD Informational message: A service processor reset caused the blade server to power off
No action is required.
1F02 Informational message: The trace logs reached 1K of data.
No action is required.
2600 pGood master fault
1. Go to “Checkout procedure” on page 186.
2. Replace the system-board and chassis assembly, as described in
“Replacing the Tier 2 system-board and chassis assembly” on page 274.
2610 Power good (pGood) fault
1. Go to “Checkout procedure” on page 186.
2. Replace the system-board and chassis assembly, as described in
“Replacing the Tier 2 system-board and chassis assembly” on page 274.
2620 12V dc pGood input fault
1. Go to “Checkout procedure” on page 186.
2. Replace the system-board and chassis assembly, as described in
“Replacing the Tier 2 system-board and chassis assembly” on page 274.
2622 SMP expansion_comp_pgood fault
1. Go to “Checkout procedure” on page 186.
2. Replace the system-board and chassis assembly, as described in
“Replacing the Tier 2 system-board and chassis assembly” on page 274.
2623 mezzanine_comp_pgood fault
1. Go to “Checkout procedure” on page 186.
2. Replace the system-board and chassis assembly, as described in
“Replacing the Tier 2 system-board and chassis assembly” on page 274.
2624 mezzanine_12V_pgood fault
1. Go to “Checkout procedure” on page 186.
2. Replace the system-board and chassis assembly, as described in
“Replacing the Tier 2 system-board and chassis assembly” on page 274.
2625 PCIE_A0_PGOOD fault
Perform the DTRCARD Symbolic CRU isolation procedure by completing the following steps:
1. Reseat the PCIe expansion card.
2. If the problem persists, replace the expansion card.
3. If the problem persists, go to “Checkout procedure” on page 186.
4. If the problem persists, replace the system board and chassis
assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
The DTRCARD Symbolic CRU isolation procedure is in “Service processor problems” on page 205.
2626 PCIE_A1_PGOOD fault
Perform the DTRCARD Symbolic CRU isolation procedure by completing the following steps:
1. Reseat the PCIe expansion card.
2. If the problem persists, replace the expansion card.
3. If the problem persists, go to “Checkout procedure” on page 186.
4. If the problem persists, replace the system board and chassis
assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
The DTRCARD Symbolic CRU isolation procedure is in “Service processor problems” on page 205.
2627 PCIE_B_PGOOD fault
Perform the DTRCARD Symbolic CRU isolation procedure by completing the following steps:
1. Reseat the PCIe expansion card.
2. If the problem persists, replace the expansion card.
3. If the problem persists, go to “Checkout procedure” on page
186.
4. If the problem persists, replace the system board and chassis
assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
The DTRCARD Symbolic CRU isolation procedure is in “Service
processor problems” on page 205.
2629 1.5V reg_pgood fault
1. Go to “Checkout procedure” on page 186.
2. Replace the system-board and chassis assembly, as described in
“Replacing the Tier 2 system-board and chassis assembly” on page 274.
262B 1.8V reg_pgood fault
1. Go to “Checkout procedure” on page 186.
2. Replace the system-board and chassis assembly, as described in
“Replacing the Tier 2 system-board and chassis assembly” on page 274.
262C 5V reg_pgood fault
1. Go to “Checkout procedure” on page 186.
2. Replace the system-board and chassis assembly, as described in
“Replacing the Tier 2 system-board and chassis assembly” on page 274.
262D 3.3V reg_pgood fault
1. Go to “Checkout procedure” on page 186.
2. Replace the system-board and chassis assembly, as described in
“Replacing the Tier 2 system-board and chassis assembly” on page 274.
262E 2.5V reg_pgood fault
1. Go to “Checkout procedure” on page 186.
2. Replace the system-board and chassis assembly, as described in
“Replacing the Tier 2 system-board and chassis assembly” on page 274.
2630 VRM CP0 core pGood fault
1. Go to “Checkout procedure” on page 186.
2. Replace the system-board and chassis assembly, as described in
“Replacing the Tier 2 system-board and chassis assembly” on page 274.
2632 VRM CP0 cache pGood fault
1. Go to “Checkout procedure” on page 186.
2. Replace the system-board and chassis assembly, as described in
“Replacing the Tier 2 system-board and chassis assembly” on page 274.
2633 1.2V reg_pgood fault
1. Go to “Checkout procedure” on page 186.
2. Replace the system-board and chassis assembly, as described in
“Replacing the Tier 2 system-board and chassis assembly” on page 274.
2640 VRM CP1 core pGood fault
1. Go to “Checkout procedure” on page 186.
2. Replace the system-board and chassis assembly, as described in
“Replacing the Tier 2 system-board and chassis assembly” on page 274.
2642 VRM CP1 cache pGood fault
1. Go to “Checkout procedure” on page 186.
2. Replace the system-board and chassis assembly, as described in
“Replacing the Tier 2 system-board and chassis assembly” on page 274.
2643 1.2V power good signal fault
1. Go to “Checkout procedure” on page 186.
2. Replace the system-board and chassis assembly, as described in
“Replacing the Tier 2 system-board and chassis assembly” on page 274.
2647 No 12V dc coming to the blade server from the BladeCenter midplane
1. Check the management-module event log for errors that
indicate a power problem with the BladeCenter.
2. Resolve any problems that are found.
3. Reboot the blade server.
4. If the problem is not resolved, replace the system-board and
chassis assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
2648 Blade power latch fault
1. Go to “Checkout procedure” on page 186.
2. Replace the system-board and chassis assembly, as described in
“Replacing the Tier 2 system-board and chassis assembly” on page 274.
2649 Blade power fault
1. Go to “Checkout procedure” on page 186.
2. Replace the system-board and chassis assembly, as described in
“Replacing the Tier 2 system-board and chassis assembly” on page 274.
2670 The BladeCenter encountered a problem, and the blade server was automatically shut down as a result
1. Check the management-module event log for entries that were
made around the time that the JS12 blade server shut down.
2. Resolve any problems that are found.
3. Reboot the blade server.
4. If the problem is not resolved, replace the system-board and
chassis assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
2671 12V power fault in the blade server
1. Go to “Checkout procedure” on page 186.
2. Replace the system-board and chassis assembly, as described in
“Replacing the Tier 2 system-board and chassis assembly” on page 274.
2672 Blades PEU3 voltage alert
Perform the DTRCARD Symbolic CRU isolation procedure by completing the following steps:
1. Reseat the PCIe expansion card.
2. If the problem persists, replace the expansion card.
3. If the problem persists, go to “Checkout procedure” on page
186.
4. If the problem persists, replace the system board and chassis
assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
The DTRCARD Symbolic CRU isolation procedure is in “Service
processor problems” on page 205.
3134 I2C problem found with the
PEU3 hardware monitoring chip
Perform the DTRCARD Symbolic CRU isolation procedure by
completing the following steps:
1. Reseat the PCIe expansion card.
2. If the problem persists, replace the expansion card.
3. If the problem persists, go to “Checkout procedure” on page
186.
4. If the problem persists, replace the system board and chassis
assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
The DTRCARD Symbolic CRU isolation procedure is in “Service
processor problems” on page 205.
8400 Invalid configuration decode
1. Check for server firmware updates.
2. Apply any available updates.
3. If the problem persists:
a. Go to “Checkout procedure” on page 186.
b. Replace the system-board and chassis assembly, as described
in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
8402 Unable to get VPD from the
concentrator
1. Go to “Checkout procedure” on page 186.
2. Replace the system-board and chassis assembly, as described in
“Replacing the Tier 2 system-board and chassis assembly” on page 274.
8413 Invalid processor 1 VPD
1. Go to “Checkout procedure” on page 186.
2. Replace the system-board and chassis assembly, as described in
“Replacing the Tier 2 system-board and chassis assembly” on page 274.
8414 Invalid processor 2 VPD
1. Go to “Checkout procedure” on page 186.
2. Replace the system-board and chassis assembly, as described in
“Replacing the Tier 2 system-board and chassis assembly” on page 274.
8423 No processor VPD was found
1. Go to “Checkout procedure” on page 186.
2. Replace the system-board and chassis assembly, as described in
“Replacing the Tier 2 system-board and chassis assembly” on page 274.
84A0 No backplane VPD was found
1. Go to “Checkout procedure” on page 186.
2. Replace the system-board and chassis assembly, as described in
“Replacing the Tier 2 system-board and chassis assembly” on page 274.
6xxxyyyy SRCs
The 6xxxyyyy system reference codes are virtual optical reference codes.
Look for the rightmost 4 characters (yyyy in 6xxxyyyy) in the error code; this is the reference code. Find the reference code in Table 6.
Table 6. 6xxxyyyy SRCs
• Follow the suggested actions in the order in which they are listed in the Action column until the problem is solved. If an action solves the problem, then you can stop performing the remaining actions.
• See Chapter 3, “Parts listing, Type 7998,” on page 235 to determine which components are CRUs and which components are FRUs.
6xxxyyyy Error Codes Description Action
632Byyyy codes are Network File System (NFS) virtual optical SRCs
632BCFC1 A virtual optical device cannot access the file containing the list of volumes.
On this partition and on the Network File System server, verify that the proper file is specified and that the proper authority is granted.
632BCFC2 A non-recoverable error was detected while reading the list of volumes.
Resolve any errors on the Network File System server.
632BCFC3 The data in the list of volumes is not valid.
On the Network File System server, verify that the proper file is specified, that all files are entered correctly, that there are no blank lines, and that the character set used is valid.
632BCFC4 A virtual optical device cannot access the file containing the specified optical volume.
On the Network File System server, verify that the proper file is specified in the list of volumes, and that the proper authority is granted.
632BCFC5 A non-recoverable error was detected while reading a virtual optical volume.
Resolve any errors on the Network File System server.
632BCFC6 The file specified does not contain data that can be processed as a virtual optical volume.
On the Network File System server, verify that all the files specified in the list of optical volumes are correct.
632BCFC7 A virtual optical device detected an error reported by the Network File System server that cannot be recovered.
Resolve any errors on the Network File System server.
632BCFC8 A virtual optical device encountered a non-recoverable error.
Install any available operating system updates.
632Cyyyy codes are virtual optical SRCs
632CC000 Informational system log entry only.
No corrective action is required.
632CC002 SCSI selection or reselection timeout occurred.
Refer to the hosting partition for problem analysis.
632CC010 Undefined sense key returned by device.
Refer to the hosting partition for problem analysis.
632CC020 Configuration error.
Refer to the hosting partition for problem analysis.
632CC100 SCSI bus error occurred.
Refer to the hosting partition for problem analysis.
632CC110 SCSI command timeout occurred.
Refer to the hosting partition for problem analysis.
632CC210 Informational system log entry only.
No corrective action is required.
632CC300 Media or device error occurred.
Refer to the hosting partition for problem analysis.
632CC301 Media or device error occurred.
Refer to the hosting partition for problem analysis.
632CC302 Media or device error occurred.
Refer to the hosting partition for problem analysis.
632CC303 Media has an unknown format.
No corrective action is required.
632CC333 Incompatible media.
1. Verify that the disk has a supported format.
2. If the format is supported, clean the disk and attempt the failing operation again.
3. If the operation fails again with the same system reference code, ask your media source for a replacement disk.
632CC400 Physical link error detected by device.
Refer to the hosting partition for problem analysis.
632CC402 An internal program error occurred.
Install any available operating system updates.
632CCFF2 Informational system log entry only.
No corrective action is required.
632CCFF4 Internal device error occurred.
Refer to the hosting partition for problem analysis.
632CCFF6 Informational system log entry only.
No corrective action is required.
632CCFF7 Informational system log entry only.
No corrective action is required.
632CCFFE Informational system log entry only.
No corrective action is required.
632CFF3D Informational system log entry only.
No corrective action is required.
632CFF6D Informational system log entry only.
No corrective action is required.
A1xxyyyy service processor SRCs
An A1xxyyyy system reference code (SRC) is an attention code that offers information about a platform or service processor dump or confirms a control panel function request.
Table 7 shows A1xxyyyy SRCs.
Table 7. A1xxyyyy service processor SRCs
Attention code Description Action
A1xxyyyy Attention code
1. Go to “Checkout procedure” on page 186.
2. Replace the system-board and chassis assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
A200yyyy Logical partition SRCs
An A200yyyy SRC is a logical partition reference code that is deprecated in favor of a corresponding B2xx SRC. B2xx SRCs are described in “B200xxxx Logical partition SRCs” on page 29.
Table 8. A200yyyy Logical partition SRCs
Attention code Description Action
A200yyyy See the description for the B200yyyy error code with the same yyyy value.
Perform the action described in the B200yyyy error code with the same yyyy value.
A700yyyy Licensed internal code SRCs
An A7xx SRC is a licensed internal code SRC that is deprecated in favor of a corresponding B7xx SRC. B7xx SRCs are described in “B700xxxx Licensed internal code SRCs” on page 39.
Table 9. A700yyyy Licensed internal code SRCs
Attention code Description Action
A7003000 A user-initiated platform dump occurred.
No service action required.
A700yyyy See the description for the B700yyyy error code with the same yyyy value.
Perform the action in the B700yyyy error code with the same yyyy value.
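The redirection that Table 8 and Table 9 describe (look up the B200yyyy or B700yyyy code with the same yyyy value) can be sketched as follows. The function and constant names are hypothetical and exist only for this illustration; the sketch assumes 8-character SRC strings and treats A7003000 as a stand-alone entry because it has its own row in Table 9.

```python
# Deprecated families named in Table 8 and Table 9, and the families that replace them.
DEPRECATED_PREFIXES = {"A200": "B200", "A700": "B700"}

# A7003000 has its own description and action, so it is not redirected.
STANDALONE_CODES = {"A7003000"}


def successor_src(src: str):
    """Return the B200yyyy or B700yyyy code to look up, or None if no redirect applies."""
    code = src.strip().upper()
    if len(code) != 8 or code in STANDALONE_CODES:
        return None
    replacement = DEPRECATED_PREFIXES.get(code[:4])
    if replacement is None:
        return None
    # Keep the yyyy part unchanged and swap only the family prefix.
    return replacement + code[4:]


print(successor_src("A2001130"))  # B2001130
```

Returning `None` for codes outside the two deprecated families keeps the helper from suggesting a lookup that the tables do not describe.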
AA00E1A8 to AA260005 Partition firmware attention codes
AAxx attention codes provide information about the next target state for the platform firmware. These codes might indicate that you need to perform an action.
Table 10 describes the partitioning firmware codes that may be displayed if POST detects a problem. Each message description includes a suggested action to correct the problem.
Table 10. AA00E1A8 to AA260005 Partition firmware attention codes
• Follow the suggested actions in the order in which they are listed in the Action column until the problem is solved. If an action solves the problem, then you can stop performing the remaining actions.
• See Chapter 3, “Parts listing, Type 7998,” on page 235 to determine which components are CRUs and which components are FRUs.
Attention code Description Action
AA00E1A8 The system is booting to the open firmware prompt.
At the open firmware prompt, type dev /packages/gui obe and press Enter; then, type 1 to select SMS Menu.
AA00E1A9 The system is booting to the System Management Services (SMS) menus.
1. If the system or partition returns to the SMS menus after a boot attempt failed, use the SMS menus to check the progress indicator history for a BAxx xxxx error, which may indicate why the boot attempt failed. Follow the actions for that error code to resolve the boot problem.
2. Use the SMS menus to establish the boot list and restart the blade server.
AA00E1B0 Waiting for the user to select the language and keyboard. The menu should be visible on the console.
1. Check for server firmware updates.
2. Apply any available updates.
3. If the problem persists:
a. Go to “Checkout procedure” on page 186.
b. Replace the system-board and chassis assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
AA00E1B1 Waiting for the user to accept or decline the license
1. Check for server firmware updates.
2. Apply any available updates.
3. If the problem persists:
a. Go to “Checkout procedure” on page 186.
b. Replace the system-board and chassis assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
AA060007 A keyboard was not found.
Verify that a keyboard is attached to the USB port that is assigned to the partition.
AA06000B The system or partition was not able to find an operating system on any of the devices in the boot list.
1. Use the SMS menus to modify the boot list so that it includes devices that have a known-good operating system and restart the blade server.
2. If the problem remains, go to “Boot problem resolution” on page 193.
AA06000C The media in a device in the boot list was not bootable.
1. Replace the media in the device with known-good media or modify the boot list to boot from another bootable device.
2. If the problem remains, go to “Boot problem resolution” on page 193.
AA06000D The media in the device in the bootlist was not found under the I/O adapter specified by the bootlist.
1. Verify that the media from which you are trying to boot is bootable or modify the boot list to boot from another bootable device.
2. If the problem remains, go to “Boot problem resolution” on page 193.
AA06000E The adapter specified in the boot list is not present or is not functioning.
• For an AIX operating system:
1. Try booting the blade server from another bootable device; then, run AIX online diagnostics against the failing adapter.
2. If AIX cannot be booted from another device, boot the blade server using the stand-alone Diagnostics CD or a NIM server; then, run diagnostics against the failing adapter.
• For a Linux operating system, boot the blade server using the stand-alone Diagnostics CD or a NIM server; then, run diagnostics against the failing adapter.
AA060011 The firmware did not find an operating system image and at least one hard disk in the boot list was not detected by the firmware. The firmware is retrying the entries in the boot list.
This might occur if a disk enclosure that contains the boot disk is not fully initialized or if the boot disk belongs to another partition. Verify that:
• The boot disk belongs to the partition from which you are trying to boot.
• The boot list in the SMS menus is correct.
AA130013 Bootable media is missing from a USB CD-ROM.
Verify that a bootable CD is properly inserted in the CD or DVD drive and retry the boot operation.
AA130014 The media in a USB CD-ROM has changed.
1. Retry the operation.
2. Check for server firmware updates; then, install the updates if available and retry the operation.
AA170210 Setenv/$setenv parameter error - the name contains a null character.
1. Go to “Checkout procedure” on page 186.
2. Replace the system-board and chassis assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
AA170211 Setenv/$setenv parameter error - the value contains a null character.
1. Go to “Checkout procedure” on page 186.
2. Replace the system-board and chassis assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
AA190001 The hypervisor function to get or set the time-of-day clock reported an error.
1. Use the operating system to set the system clock.
2. If the problem persists, check for server firmware updates.
3. Install any available updates and retry the operation.
AA260001 Enter the Type Model Number (Must be 8 characters)
Enter the machine type and model of the blade server at the prompt.
AA260002 Enter the Serial Number (Must be 7 characters)
Enter the serial number of the blade server at the prompt.
AA260003 Enter System Unique ID (Must be 12 characters)
Enter the system unique ID number at the prompt.
AA260004 Enter WorldWide Port Number (Must be 12 characters)
Enter the worldwide port number of the blade server at the prompt.
AA260005 Enter Brand (Must be 2 characters)
Enter the brand number of the blade server at the prompt.
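The AA260001 to AA260005 prompts each state a required length. A minimal sketch of checking a value against those lengths before typing it at the prompt is shown below. The names `REQUIRED_LENGTHS` and `check_entry` are hypothetical, the sample values are placeholders rather than real identifiers, and the sketch checks length only because Table 10 states no other format rule.

```python
# Field names and required lengths copied from the AA260001 to AA260005 rows of Table 10.
REQUIRED_LENGTHS = {
    "AA260001": ("Type Model Number", 8),
    "AA260002": ("Serial Number", 7),
    "AA260003": ("System Unique ID", 12),
    "AA260004": ("WorldWide Port Number", 12),
    "AA260005": ("Brand", 2),
}


def check_entry(code: str, value: str) -> bool:
    """Return True if value has the length that the prompt for this code requires."""
    _field, length = REQUIRED_LENGTHS[code.strip().upper()]
    return len(value) == length


# Placeholder values, chosen only to exercise the length check.
print(check_entry("AA260002", "1234567"))  # True
print(check_entry("AA260005", "ABC"))      # False
```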
Bxxxxxxx Service processor early termination SRCs
A Bxxxxxxx system reference code (SRC) is an error code that is related to an event or exception that occurred in the service processor firmware.
To find a description of an SRC that is not listed in this JS12 blade server documentation, refer to the POWER6 Reference Code Lookup page at https://www-01.ibm.com/servers/resourcelink/lib03030.nsf/pages/eClipzML/$file/refCode.html.
Table 11 describes error codes that might occur if POST detects a problem. The description also includes suggested actions to correct the problem.
Note: For problems persisting after completing the suggested actions, see “Solving undetermined problems” on page 231.
Table 11. B181xxxx Service processor early termination SRCs
B181 xxxx Error Code Description Action
7200 Invalid boot request
7201 Service processor failure
7202 The permanent and temporary firmware sides are both marked invalid
7203 Error setting boot parameters
7204 Error reading boot parameters
7205 Boot code error
7206 Unit check timer was reset
7207 Error reading from NVRAM
7208 Error writing to NVRAM
7209 The service processor boot watchdog timer expired and forced the service processor to attempt a boot from the other firmware image in the service processor flash memory
720A Power-off reset occurred. FipsDump should be analyzed: Possible software problem
Action for the codes in this table: Go to “Checkout procedure” on page 186.
B200xxxx Logical partition SRCs
A B200xxxx system reference code (SRC) is an error code that is related to logical partitioning.
Table 12 describes error codes that might be displayed if POST detects a problem. The description also includes suggested actions to correct the problem.
Note: For problems persisting after completing the suggested actions, see “Checkout procedure” on page 186 and “Solving undetermined problems” on page 231.
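B200xxxx is one of several SRC families covered in this chapter. As a rough aid for deciding which table to consult, the family can be guessed from the leading characters of the code. The sketch below is hypothetical: `SRC_FAMILIES` and `classify_src` are names invented for the illustration, the descriptions paraphrase the section headings, and only the families named in this part of the guide are included.

```python
# (prefix, family) pairs taken from the section headings in this chapter.
SRC_FAMILIES = [
    ("632B", "Network File System (NFS) virtual optical SRC"),
    ("632C", "virtual optical SRC"),
    ("A200", "logical partition SRC (deprecated; see B200)"),
    ("A700", "licensed internal code SRC (deprecated; see B700)"),
    ("B181", "service processor early termination SRC"),
    ("B200", "logical partition SRC"),
    ("B700", "licensed internal code SRC"),
    ("AA", "partition firmware attention code"),
    ("A1", "service processor attention code"),
    ("6", "virtual optical reference code"),
]


def classify_src(src: str) -> str:
    """Return the family whose prefix matches the code, preferring the longest prefix."""
    code = src.strip().upper()
    for prefix, family in sorted(SRC_FAMILIES, key=lambda item: len(item[0]), reverse=True):
        if code.startswith(prefix):
            return family
    return "not covered by this sketch"


print(classify_src("B2001130"))  # logical partition SRC
```

Matching the longest prefix first lets specific families such as 632B take precedence over the general 6xxxyyyy description.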
Table 12. B200xxxx Logical partition SRCs
• Follow the suggested actions in the order in which they are listed in the Action column until the problem is solved. If an action solves the problem, then you can stop performing the remaining actions.
• See Chapter 3, “Parts listing, Type 7998,” on page 235 to determine which components are CRUs and which components are FRUs.
B200 xxxx Error Code Description Action
1130 A problem occurred during the migration of a partition. You attempted to migrate a partition to a system that has a power or thermal problem. The migration will not continue.
Look for and fix power or thermal problems and then retry the migration.
1131 A problem occurred during the migration of a partition. The migration of a partition did not complete.
Check for server firmware updates; then, install the updates if available.
1132 A problem occurred during the startup of a partition. A platform firmware error occurred while it was trying to allocate memory. The startup will not continue.
Collect a platform dump and then go to “Isolating firmware problems” on page 222.
1133 A problem occurred during the migration of a partition. The migration of a partition did not complete.
Check for server firmware updates; then, install the updates if available.
1134 A problem occurred during the migration of a partition. The migration of a partition did not complete.
Check for server firmware updates; then, install the updates if available.
1140 A problem occurred during the migration of a partition. The migration of a partition did not complete.
Check for server firmware updates; then, install the updates if available.
1141 A problem occurred during the migration of a partition. The migration of a partition did not complete.
Check for server firmware updates; then, install the updates if available.
1142 A problem occurred during the migration of a partition. The migration of a partition did not complete.
Check for server firmware updates; then, install the updates if available.
1143 A problem occurred during the migration of a partition. The migration of a partition did not complete.
Check for server firmware updates; then, install the updates if available.
1144 A problem occurred during the migration of a partition. The migration of a partition did not complete.
Check for server firmware updates; then, install the updates if available.
1148 A problem occurred during the migration of a partition. The migration of a partition did not complete.
Check for server firmware updates; then, install the updates if available.
1150 During the startup of a partition, a partitioning configuration problem occurred.
Go to “Verifying the partition configuration” on page 189.
1151 A problem occurred during the migration of a partition. The migration of a partition did not complete.
Check for server firmware updates; then, install the updates if available.
1170 During the startup of a partition, a failure occurred due to a validation error.
Go to “Verifying the partition configuration” on page 189.
1225 A problem occurred during the startup of a partition. The partition attempted to start up prior to the platform fully initializing. Restart the partition after the platform has fully completed and the platform is not in standby mode.
Restart the partition.
1230 During the startup of a partition, a partitioning configuration problem occurred; the partition is lacking the necessary resources to start up.
Go to “Verifying the partition configuration” on page 189.
1260 A problem occurred during the startup of a partition. The partition could not start at the Timed Power On setting because the partition was not set to Normal.
Set the partition to Normal.
1265 The partition could not start up. An operating system Main Storage Dump startup was attempted with the startup side on D-mode, which is not a valid operating system startup scenario. The startup will be halted. This SRC can occur when a D-mode SLIC installation fails and attempts a Main Storage Dump.
Correct the startup settings.
1266 The partition could not start up. You are attempting to start up an operating system that is not supported.
Install a supported operating system and restart the partition.
1280 A problem occurred during a partition Main Storage Dump. A mainstore dump startup did not complete due to a configuration mismatch.
Go to “Isolating firmware problems” on page 222.
1281 A partition memory error occurred. The failed memory will no longer be used.
Restart the partition.
1282 A problem occurred during the startup of a partition.
Go to “Isolating firmware problems” on page 222.
1320 A problem occurred during the startup of a partition. No default load source was selected. The startup will attempt to continue, but there may not be enough information to find the correct load source.
Configure a load source for the partition. Then restart the partition.
1321 A problem occurred during the startup of a partition.
Verify that the correct slot is specified for the load source. Then restart the partition.
1322 In the partition startup, code failed during a check of the load source path.
Verify that the path for the load source is specified correctly. Then restart the partition.
2048 A problem occurred during a partition Main Storage Dump. A mainstore dump startup did not complete due to a copy error.
Go to “Isolating firmware problems” on page 222.
2054 A problem occurred during a partition Main Storage Dump. A mainstore dump IPL did not complete due to a configuration mismatch.
Go to “Isolating firmware problems” on page 222.
2058 A problem occurred during a partition Main Storage Dump. A mainstore dump startup did not complete due to a copy error.
Go to “Isolating firmware problems” on page 222.
2210 Informational system log entry only.
No corrective action is required.
2220 Informational system log entry only.
No corrective action is required.
2250 During the startup of a partition, an attempt to toggle the power state of a slot has failed.
Check for server firmware updates; then, install the updates if available.
2260 During the startup of a partition, the partition firmware attempted an operation that failed.
Go to “Isolating firmware problems” on page 222.
2300 During the startup of a partition, an attempt to toggle the power state of a slot has failed.
Check for server firmware updates; then, install the updates if available.
2310 During the startup of a partition, the partition firmware attempted an operation that failed.
Go to “Isolating firmware problems” on page 222.
2320 During the startup of a partition, the partition firmware attempted an operation that failed.
Go to “Isolating firmware problems” on page 222.
2425 During the startup of a partition, the partition firmware attempted an operation that failed.
Go to “Isolating firmware problems” on page 222.
2426 During the startup of a partition, the partition firmware attempted an operation that failed.
Go to “Isolating firmware problems” on page 222.
2475 During the startup of a partition, a slot that was needed for the partition was either empty or the device in the slot has failed.
Check for server firmware updates; then, install the updates if available.
2485 During the startup of a partition, the partition firmware attempted an operation that failed.
Go to “Isolating firmware problems” on page 222.
3000 Informational system log entry only.
No corrective action is required.
3081 During the startup of a partition, the startup did not complete due to a copy error.
Check for server firmware updates; then, install the updates if available.
3084 A problem occurred during the startup of a partition. The adapter type might not be supported.
Verify that the adapter type is supported.
3088 Informational system log entry only.
No corrective action is required.
308C A problem occurred during the startup of a partition. The adapter type cannot be determined.
Verify that a valid I/O Load Source is tagged.
3090 A problem occurred during the startup of a partition.
Go to “Isolating firmware problems” on page 222.
3110 A problem occurred during the startup of a partition.
Go to “Isolating firmware problems” on page 222.
3113 A problem occurred during the startup of a partition.
Look for B7xx xxxx errors and resolve them.
3114 A problem occurred during the startup of a partition.
Look for other errors and resolve them.
3120 Informational system log entry only.
No corrective action is required.
3123 Informational system log entry only.
No corrective action is required.
3125 During the startup of a partition, the blade server firmware could not obtain a segment of main storage within the blade server to use for managing the creation of a partition.
Check for server firmware updates; then, install the updates if available.
3128 A problem occurred during the startup of a partition. A return code for an unexpected failure was returned when attempting to query the load source path.
Look for and resolve B700 69xx errors.
3130 A problem occurred during the startup of a partition.
Check for server firmware updates; then, install the updates if available.
3135 A problem occurred during the startup of a partition.
Check for server firmware updates; then, install the updates if available.
3140 A problem occurred during the startup of a partition. This is a configuration problem in the partition.
Reconfigure the partition to include the intended load source path.
3141 Informational system log entry only.
No corrective action is required.
3142 Informational system log entry only.
No corrective action is required.
3143 Informational system log entry only.
No corrective action is required.
3144 Informational system log entry only.
No corrective action is required.
3145 Informational system log entry only.
No corrective action is required.
3200 Informational system log entry only.
No corrective action is required.
4158 Informational system log entry only.
No corrective action is required.
4400 A problem occurred during the startup of a partition.
Check for server firmware updates; then, install the updates if available.
5106 A problem occurred during the startup of a partition. There is not enough space to contain the partition main storage dump. The startup will not continue.
Verify that there is sufficient memory available to start the partition as it is configured. If there is already enough memory, then go to “Isolating firmware problems” on page 222.
5109 A problem occurred during the startup of a partition. There was a partition main storage dump problem. The startup will not continue.
Go to “Isolating firmware problems” on page 222.
5114 A problem occurred during the startup of a partition. There is not enough space to contain the partition main storage dump. The startup will not continue.
Go to “Isolating firmware problems” on page 222.
5115 A problem occurred during the startup of a partition. There was an error reading the partition main storage dump from the partition load source into main storage. The startup will attempt to continue.
If the startup does not continue, look for and resolve other errors.
5117 A problem occurred during the startup of a partition. A partition main storage dump has occurred but cannot be written to the load source device because a valid dump already exists.
Use the Main Storage Dump Manager to rename or copy the current main storage dump.
5121 A problem occurred during the startup of a partition. There was an error writing the partition main storage dump to the partition load source. The startup will not continue.
Look for related errors in the Product Activity Log and fix any problems found. Use virtual control panel function 34 to retry the current Main Store Dump startup while the partition is still in the failed state.
5122 Informational system log entry only.
No corrective action is required.
5123 Informational system log entry only.
No corrective action is required.
5135 A problem occurred during the startup of a partition. There was an error writing the partition main storage dump to the partition load source. The main store dump startup will continue.
Look for other errors and resolve them.
5137 A problem occurred during the startup of a partition. There was an error writing the partition main storage dump to the partition load source. The main store dump startup will continue.
Look for other errors and resolve them.
5145 A problem occurred during the startup of a partition. There was an error writing the partition main storage dump to the partition load source. The main store dump startup will continue.
Look for other errors and resolve them.
5148 A problem occurred during the startup of a partition. An error occurred while doing a main storage dump that would have caused another main storage dump. The startup will not continue.
Go to “Isolating firmware problems” on page 222.
5149 A problem occurred during the startup of a partition while doing a Firmware Assisted Dump that would have caused another Firmware Assisted Dump.
Check for server firmware updates; then, install the updates if available.
514A A Firmware Assisted Dump did not complete due to a copy error.
Check for server firmware updates; then, install the updates if available.
542A A Firmware Assisted Dump did not complete due to a read error.
Check for server firmware updates; then, install the updates if available.
542B A Firmware Assisted Dump did not complete due to a copy error.
Check for server firmware updates; then, install the updates if available.
543A A Firmware Assisted Dump did not complete due to a copy error.
Check for server firmware updates; then, install the updates if available.
543B A Firmware Assisted Dump did not complete due to a copy error.
Check for server firmware updates; then, install the updates if available.
543C Informational system log entry only.
No corrective action is required.
543D A Firmware Assisted Dump did not complete due to a copy error.
Check for server firmware updates; then, install the updates if available.
6006 During the startup of a partition, a system firmware error occurred when the partition memory was being initialized; the startup will not continue.
Go to “Isolating firmware problems” on page 222.
600A A problem occurred during the startup of a partition. The partition could not reserve the memory required for IPL.
Contact IBM support, as described in Appendix A, “Getting help and technical assistance,” on page 285.
6012 During the startup of a partition, the partition LID failed to completely load into the partition main storage area.
Go to “Isolating firmware problems” on page 222.
6015 A problem occurred during the startup of a partition. The load source media is corrupted or not valid.
Replace the load source media.
6025 A problem occurred during the startup of a partition. This is a problem with the load source media being corrupt or not valid.
Replace the load source media.
6027 During the startup of a partition, a failure occurred when allocating memory for an internal object used for firmware module load operations.
1. Make sure that enough main storage was allocated to the partition.
2. Retry the operation.
6110 A problem occurred during the startup of a partition. There was an error on the load source device. The startup will attempt to continue.
Look for other errors and resolve them.
690A During the startup of a partition, an error occurred while copying open firmware into the partition load area.
Go to “Isolating firmware problems” on page 222.
7200 Informational system log entry only. No corrective action is required.
8080 Informational system log entry only. No corrective action is required.
8081 During the startup of a partition, an internal firmware time-out occurred; the partition might continue to start up but it can experience problems while running.
Check for server firmware updates; then, install the updates if available.
8105 During the startup of a partition, there was a failure loading the VPD areas of the partition; the load source media has been corrupted or is unsupported on this server.
Check for server firmware updates; then, install the updates if available.
8106 A problem occurred during the startup of a partition. The startup will not continue.
Replace the load source media.
8107 During the startup of a partition, there was a problem getting a segment of main storage in the blade server main storage.
Check for server firmware updates; then, install the updates if available.
8109 During the startup of a partition, a
failure occurred. The startup will not continue.
1. Make sure that there is enough memory to start up
the partition.
2. Check for server firmware updates; then, install the
updates if available.
8111 A problem occurred during the startup of a partition.
Check for server firmware updates; then, install the updates if available.
8112 During the startup of a partition, a failure occurred; the startup will not continue.
Check for server firmware updates; then, install the updates if available.
8113 During the startup of a partition, an error occurred while mapping memory for the partition startup.
Check for server firmware updates; then, install the updates if available.
8114 During the startup of a partition, there was a failure verifying the VPD for the partition resources during startup.
Check for server firmware updates; then, install the updates if available.
8115 During the startup of a partition, there was a low level partition-to-partition communication failure.
Check for server firmware updates; then, install the updates if available.
8117 During the startup of a partition, the partition did not start up due to a system firmware error.
Check for server firmware updates; then, install the updates if available.
8121 During the startup of a partition, the partition did not start up due to a system firmware error.
Go to “Isolating firmware problems” on page 222.
8123 During the startup of a partition, the partition did not start up due to a system firmware error.
Go to “Isolating firmware problems” on page 222.
8125 During the startup of a partition, the partition did not start up due to a system firmware error.
Go to “Isolating firmware problems” on page 222.
8127 During the startup of a partition, the partition did not start up due to a system firmware error.
Go to “Isolating firmware problems” on page 222.
8129 During the startup of a partition, the partition did not start up due to a system firmware error.
Go to “Isolating firmware problems” on page 222.
813A There was a problem establishing a console.
Go to “Isolating firmware problems” on page 222.
8140 Informational system log entry only. No corrective action is required.
8141 Informational system log entry only. No corrective action is required.
8142 Informational system log entry only. No corrective action is required.
8143 Informational system log entry only. No corrective action is required.
8144 Informational system log entry only. No corrective action is required.
8145 Informational system log entry only. No corrective action is required.
8150 System firmware detected an error. Collect a platform dump and then go to “Isolating
firmware problems” on page 222.
8151 System firmware detected an error. Use the Integrated Virtualization Manager (IVM) to increase the
Logical Memory Block (LMB) size, and to reduce the number of virtual devices for the partition.
8152 No active system processor. Verify that processor resources are assigned to the
partition.
8160 A problem occurred during the migration of a partition.
Contact IBM support, as described in Appendix A, “Getting help and technical assistance,” on page 285.
8161 A problem occurred during the migration of a partition.
Contact IBM support, as described in Appendix A, “Getting help and technical assistance,” on page 285.
A100 A partition ended abnormally; the partition could not stay running and shut itself down.
1. Check the error logs and take the actions for the error codes that are found.
2. Go to “Isolating firmware problems” on page 222.
A101 A partition ended abnormally; the partition could not stay running and shut itself down.
1. Check the error logs and take the actions for the error codes that are found.
2. Go to “Isolating firmware problems” on page 222.
A140 A lower priority partition lost a usable processor to supply it to a higher priority partition with a bad processor.
Evaluate the entire LPAR configuration. Adjust partition profiles with the new number of processors available in the system.
B07B Informational system log entry only. No corrective action is required.
B215 A problem occurred after a partition ended abnormally. There was a communications problem between this partition’s service processor and the platform’s service processor.
Restart the platform.
C1F0 An internal system firmware error occurred during a partition shutdown or a restart.
Go to “Isolating firmware problems” on page 222.
D150 A partition ended abnormally; there was a communications problem between this partition and the code that handles resource allocation.
Check for server firmware updates; then, install the updates if available.
E0AA A problem occurred during the power off of a partition.
Go to “Isolating firmware problems” on page 222.
F001 A problem occurred during the startup of a partition. An operation has timed out.
Look for other errors and resolve them.
F003 During the startup of a partition, the partition processor(s) did not start the firmware within the time-out window.
Collect the partition dump information; then, go to “Isolating firmware problems” on page 222.
F004 Informational system log entry only. No corrective action is required.
F005 Informational system log entry only. No corrective action is required.
F006 During the startup of a partition, the code load operation for the partition startup timed out.
1. Check the error logs and take the actions for the error codes that are found.
2. Go to “Isolating firmware problems” on page 222.
F007 During a shutdown of the partition, a time-out occurred while trying to stop a partition.
Check for server firmware updates; then, install the updates if available.
F008 Informational system log entry only. No corrective action is required.
F009 Informational system log entry only. No corrective action is required.
F00A Informational system log entry only. No corrective action is required.
F00B Informational system log entry only. No corrective action is required.
F00C Informational system log entry only. No corrective action is required.
F00D Informational system log entry only. No corrective action is required.
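The notes at the head of Table 12 say to follow the suggested actions in the listed order and to stop once one of them solves the problem. A minimal Python sketch of that rule follows. The function name and the callback are hypothetical and exist only for this illustration; the two actions are the ones listed for SRC B2008109.

```python
from typing import Callable, Iterable, Optional

# Actions for B2008109, in the order listed in the Action column.
ACTIONS_8109 = [
    "Make sure that there is enough memory to start up the partition.",
    "Check for server firmware updates; then, install the updates if available.",
]


def run_actions_in_order(
    actions: Iterable[str],
    problem_solved_after: Callable[[str], bool],
) -> Optional[str]:
    """Try each action in listed order and stop at the first one that works.

    problem_solved_after is a hypothetical callback: it performs (or asks the
    technician to perform) the action and reports whether the problem is gone.
    Returns the action that solved the problem, or None if none did.
    """
    for action in actions:
        if problem_solved_after(action):
            return action  # Remaining actions are skipped.
    return None
```

If the function returns None, every listed action has been tried without success, and the problem should be handled as unresolved.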
B700xxxx Licensed internal code SRCs
A B700xxxx system reference code (SRC) is an error code that is related to licensed internal code.
Table 13 describes the error codes that may be displayed if POST detects a problem. Suggested actions to correct the problem are also described.
Note: If a problem persists after you complete the suggested actions, see “Checkout procedure” on page 186 and “Solving undetermined problems” on page 231.
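As an illustration only, the split of an eight-character SRC into its four-character family prefix (for example, B700) and the four-character error code that keys the table can be sketched in Python. The function names, the validation pattern, and the three-row excerpt of Table 13 are assumptions made for this sketch; they are not part of any IBM tool.

```python
import re
from typing import Optional, Tuple

# Assumption for this sketch: an SRC is exactly eight hexadecimal characters.
SRC_PATTERN = re.compile(r"^[0-9A-F]{8}$")

# Small excerpt of Table 13, keyed by the last four characters of the SRC.
B700_ACTIONS = {
    "0201": "No immediate action is necessary.",
    "0611": "Use the operating system to set the system clock.",
    "6944": "Informational system log entry only. No corrective action is required.",
}


def split_src(src: str) -> Tuple[str, str]:
    """Return (family prefix, error code) for an eight-character SRC."""
    code = src.strip().upper()
    if not SRC_PATTERN.match(code):
        raise ValueError("not an eight-character hexadecimal SRC: %r" % src)
    return code[:4], code[4:]


def lookup_b700(src: str) -> Optional[str]:
    """Look up the first listed action for a B700xxxx SRC in the excerpt.

    Returns None when the SRC is not a B700 code or is not in the excerpt.
    """
    prefix, suffix = split_src(src)
    if prefix != "B700":
        return None
    return B700_ACTIONS.get(suffix)
```

For example, `lookup_b700("B7000611")` returns the clock-setting action, and an SRC from another family, such as B2008150, returns None so that the caller can consult the matching table instead.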
Table 13. B700xxxx Licensed internal code SRCs
v Follow the suggested actions in the order in which they are listed in the Action column until the problem is
solved. If an action solves the problem, then you can stop performing the remaining actions.
v See Chapter 3, “Parts listing, Type 7998,” on page 235 to determine which components are CRUs and which
components are FRUs.
B700 xxxx Error Codes Description Action
0102 System firmware detected an error. A
machine check occurred during startup.
1. Collect the event log information.
2. Go to “Isolating firmware problems” on
page 222.
0103 System firmware detected a failure.
1. Collect the event log information.
2. Collect the platform dump information.
3. Go to “Isolating firmware problems” on page 222.
0104 System firmware failure. Machine check, undefined error occurred.
1. Check for server firmware updates.
2. Update the firmware.
0105 System firmware detected an error. More than one request to terminate the system was issued.
Go to “Isolating firmware problems” on page 222.
0106 System firmware failure.
1. Collect the event log information.
2. Collect the platform dump information.
3. Go to “Isolating firmware problems” on page 222.
0107 System firmware failure. The system detected an unrecoverable machine check condition.
1. Collect the event log information.
2. Collect the platform dump information.
3. Go to “Isolating firmware problems” on page 222.
0200 System firmware has experienced a low storage condition.
No immediate action is necessary. Continue running the system normally. At the earliest convenient time or service window, work with IBM Support to collect a platform dump and restart the system; then, go to “Isolating firmware problems” on page 222.
0201 System firmware detected an error.
No immediate action is necessary. Continue running the system normally. At the earliest convenient time or service window, work with IBM Support to collect a platform dump and restart the system; then, go to “Isolating firmware problems” on page 222.
0302 System firmware failure.
1. Collect the platform dump information.
2. Go to “Isolating firmware problems” on page 222.
0441 Service processor failure. The platform encountered an error early in the startup or termination process.
Replace the system-board and chassis assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
0443 Service processor failure.
Replace the system-board and chassis assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
0601 Informational system log entry only. No corrective action is required.
Note: This code and associated data can be used to determine why the time of day for a partition was lost.
0602 System firmware detected an error
condition.
1. Collect the event log information.
2. Go to “Isolating firmware problems” on
page 222.
0611 There is a problem with the system hardware clock; the clock time is invalid.
Use the operating system to set the system clock.
0621 Informational system log entry only. No corrective action is required.
0641 System firmware detected an error.
1. Collect the platform dump information.
2. Go to “Isolating firmware problems” on
page 222.
0650 System firmware detected an error.
Resource management was unable to allocate main storage. A platform dump was initiated.
1. Collect the event log.
2. Collect the platform dump data.
3. Collect the partition configuration
information.
4. Go to “Isolating firmware problems” on
page 222.
0651 The system detected an error in the
system clock hardware
Replace the system-board and chassis assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page
274.
0803 Informational system log entry only. No corrective action is required.
0804 Informational system log entry only. No corrective action is required.
0A00 Informational system log entry only. No corrective action is required.
0A01 Informational system log entry only. No corrective action is required.
0A10 Informational system log entry only. No corrective action is required.
1150 Informational system log entry only. No corrective action is required.
1151 Informational system log entry only. No corrective action is required.
1152 Informational system log entry only. No corrective action is required.
1160 Service processor failure
1. Go to “Isolating firmware problems” on
page 222.
2. Replace the system-board and chassis
assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
1161 Informational system log entry only. No corrective action is required.
1730 The VPD for the system is not what was expected at startup.
Replace the management card, as described in “Removing the management card” on page 253 and “Installing the management card” on page 254.
1731 The VPD on a memory DIMM is not correct and the memory on the DIMM cannot be used, resulting in reduced memory.
Replace the MEMDIMM symbolic CRU, as described in “Service processor problems” on page 205.
1732 The VPD on a processor card is not correct and the processor card cannot be used, resulting in reduced processing power.
Replace the system-board and chassis assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
1733 System firmware failure. The startup will not continue.
Look for and correct B1xxxxxx errors. If there are no serviceable B1xxxxxx errors, or if correcting the errors does not correct the problem, contact IBM support to reset the server firmware settings.
Attention: Resetting the server firmware settings results in the loss of all of the partition data that is stored on the service processor. Before continuing with this operation, manually record all settings that you intend to preserve.
The service processor reboots after IBM Support resets the server firmware settings.
If the problem persists, replace the system-board and chassis assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
173A A VPD collection overflow occurred.
1. Look for and resolve other errors.
2. If there are no other errors:
a. Update the firmware to the current level, as described in “Updating the firmware” on page 277.
b. You might also have to update the management module firmware to a compatible level.
173B A system firmware failure occurred during VPD collection.
Look for and correct other B1xxxxxx errors.
4091 Informational system log entry only. No corrective action is required.
4400 There is a platform dump to collect.
1. Collect the platform dump information.
2. Go to “Isolating firmware problems” on page 222.
4401 System firmware failure. The system firmware detected an internal problem.
Go to “Isolating firmware problems” on page 222.
4402 A system firmware error occurred while attempting to allocate the memory necessary to create a platform dump.
Go to “Isolating firmware problems” on page 222.
4705 System firmware failure. A problem occurred when initializing, reading, or using the system VPD. The Capacity on Demand function is not available.
Restart the system.
4710 Informational system log entry only. No corrective action is required.
4714 Informational system log entry only. No corrective action is required.
4788 Informational system log entry only. No corrective action is required.
5120 System firmware detected an error If the system is not exhibiting problematic
behavior, you can ignore this error. Otherwise,
go to “Isolating firmware problems” on page
222.
5121 System firmware detected a
programming problem for which a platform dump may have been initiated.
1. Collect the event log information.
2. Collect the platform dump information.
3. Go to “Isolating firmware problems” on
page 222.
5122 An error occurred during a search for
the load source.
If the partition fails to start up, go to “Isolating firmware problems” on page 222. Otherwise, no corrective action is required.
5123 Informational system log entry only. No corrective action is required.
5190 Operating system error. The server firmware detected a problem in an operating system.
Check for error codes in the partition that is reporting the error and take the appropriate actions for those error codes.
5191 System firmware detected a virtual I/O configuration error.
1. Use the Integrated Virtualization Manager (IVM) to verify or reconfigure the invalid virtual I/O configuration.
2. Check for server firmware updates; then, install the updates if available.
5209 Informational system log entry only. No corrective action is required.
5219 Informational system log entry only. No corrective action is required.
5300 System firmware detected a failure while partitioning resources. The platform partitioning code encountered an error.
Check the management-module event log for error codes; then, take the actions associated with those error codes.
5301 User intervention required. The system detected a problem with the partition configuration.
Use the Integrated Virtualization Manager (IVM) to reallocate the system resources.
5302 An unsupported Preferred Operating System was detected. The Preferred Operating System specified is not supported. The IPL will not continue.
Work with IBM support to select a supported Preferred Operating System; then, re-IPL the system.
5303 An unsupported Preferred Operating System was detected. The Preferred Operating System specified is not supported. The IPL will continue.
Work with IBM support to select a supported Preferred Operating System; then, re-IPL the system.
5400 System firmware detected a problem with a processor.
Replace the system-board and chassis assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
5442 System firmware detected an error.
Replace the system-board and chassis assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
54DD Informational system log entry only. No corrective action is required.
5600 Informational system log entry only. No corrective action is required.
5601 System firmware failure. There was a problem initializing, reading, or using system location codes.
Go to “Isolating firmware problems” on page 222.
6900 PCI host bridge failure.
1. Replace the system-board and chassis assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
2. If the problem persists, use the “PCI expansion card (PIOCARD) problem isolation procedure” on page 200 to determine the failing component.
6906 System bus error.
Replace the system-board and chassis assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
6907 System bus error.
1. Replace the system-board and chassis assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
2. Go to “Isolating firmware problems” on page 222.
6908 System bus error.
Replace the system-board and chassis assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
6909 System bus error
1. Replace the system-board and chassis
assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
2. Go to “Isolating firmware problems” on
page 222.
6944 Informational system log entry only. No corrective action is required.
6950 A platform dump has occurred.
1. Collect the platform dump information.
2. Go to “Isolating firmware problems” on
page 222.
6951 An error occurred because a partition needed more NVRAM than was available.
Use the Integrated Virtualization Manager (IVM) to delete one or more partitions.
6952 Informational system log entry only. No corrective action is required.
6953 PHYP NVRAM is unavailable after a
service processor reset and reload.
Go to “Isolating firmware problems” on page
222.
6954 Informational system log entry only. No corrective action is required.
6955 Informational system log entry only. No corrective action is required.
6956 An NVRAM failure was detected. Go to “Isolating firmware problems” on page
222.
6965 Informational system log entry only. No corrective action is required.
6970 PCI host bridge failure
1. Replace the system-board and chassis
assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
2. If the problem persists, use the “PCI
expansion card (PIOCARD) problem isolation procedure” on page 200 to determine the failing component.
6971 PCI bus failure
1. Use the “PCI expansion card (PIOCARD)
problem isolation procedure” on page 200 to determine the failing component.
2. If the problem persists, replace the
system-board and chassis assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
6972 System bus error Replace the system-board and chassis
assembly, as described in “Replacing the Tier 2
system-board and chassis assembly” on page
274.
6973 System bus error.
1. Use the “PCI expansion card (PIOCARD) problem isolation procedure” on page 200 to determine the failing component.
2. If the problem persists, replace the system-board and chassis assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
6974 Informational system log entry only. No corrective action is required.
6978 Informational system log entry only. No corrective action is required.
6979 Informational system log entry only. No corrective action is required.
697C Connection from service processor to system processor failed.
Replace the system-board and chassis assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
6980 RIO, HSL or 12X controller failure.
Replace the system-board and chassis assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
6981 System bus error.
Replace the system-board and chassis assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
6984 Informational system log entry only. No corrective action is required.
6985 Remote I/O (RIO), high-speed link (HSL), or 12X loop status message.
Replace the system-board and chassis assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
6987 Remote I/O (RIO), high-speed link (HSL), or 12X connection failure.
Replace the system-board and chassis assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
6990 Service processor failure.
Replace the system-board and chassis assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
6991 System firmware failure.
Go to “Isolating firmware problems” on page 222.
6993 Service processor failure.
1. Replace the system-board and chassis assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
2. Go to “Isolating firmware problems” on page 222.
6994 Service processor failure. Replace the system-board and chassis
assembly, as described in “Replacing the Tier 2
system-board and chassis assembly” on page
274.
6995 Informational system log entry only. No corrective action is required.
69C2 Informational system log entry only. No corrective action is required.
69C3 Informational system log entry only. No corrective action is required.
69D9 Host Ethernet Adapter (HEA) failure. Replace the system-board and chassis
assembly, as described in “Replacing the Tier 2
system-board and chassis assembly” on page
274.
69DA Informational system log entry only. No corrective action is required.
69DB System firmware failure.
1. Collect the platform dump information.
2. Go to “Isolating firmware problems” on
page 222.
BAD1 The platform firmware detected an error.
Go to “Isolating firmware problems” on page 222.
F103 System firmware failure.
1. Collect the event log information.
2. Collect the platform dump information.
3. Go to “Isolating firmware problems” on page 222.
F104 Operating system error. System
firmware terminated a partition.
Check the management-module event log for
partition firmware error codes (especially
BA00F104); then, take the appropriate actions
for those error codes.
F105 System firmware detected an internal
error
1. Collect the event log information.
2. Collect the platform dump information.
3. Go to “Isolating firmware problems” on
page 222.
F106 System firmware detected an error Replace the system-board and chassis
assembly, as described in “Replacing the Tier 2
system-board and chassis assembly” on page
274.
F10A System firmware detected an error Look for and correct B1xxxxxx errors.
F10B A processor resource has been disabled
due to hardware problems
Replace the system-board and chassis
assembly, as described in “Replacing the Tier 2
system-board and chassis assembly” on page
274.
F120 Informational system log entry only. No corrective action is required.
BA000010 to BA400002 Partition firmware SRCs
The power-on self-test (POST) might display an error code that the partition firmware detects. Try to correct the problem with the suggested action.
Table 14 describes error codes that might be displayed if POST detects a problem. The description also includes suggested actions to correct the problem.
Note: If a problem persists after you complete the suggested actions, see “Checkout procedure” on page 186 and “Solving undetermined problems” on page 231.
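As a rough aid for sorting log entries, the following Python sketch checks whether an SRC falls inside the BA000010 to BA400002 range named in this section's title. It assumes that the range is inclusive and that SRCs can be compared as hexadecimal numbers; the function name is hypothetical. Not every value inside the range is a defined code, so a match only indicates that Table 14 is the table to consult.

```python
# Range endpoints taken from the section title; treated as inclusive (assumption).
PARTITION_FW_FIRST = 0xBA000010
PARTITION_FW_LAST = 0xBA400002


def is_partition_firmware_src(src: str) -> bool:
    """Return True if src is an eight-character hex SRC in the BA range."""
    text = src.strip().upper()
    if len(text) != 8:
        return False
    try:
        value = int(text, 16)
    except ValueError:
        return False  # Not a hexadecimal string.
    return PARTITION_FW_FIRST <= value <= PARTITION_FW_LAST
```

A caller might use this check first and fall back to the B200xxxx or B700xxxx tables when it returns False.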
Table 14. BA000010 to BA400002 Partition firmware SRCs
• Follow the suggested actions in the order in which they are listed in the Action column until the problem is solved. If an action solves the problem, then you can stop performing the remaining actions.
• See Chapter 3, “Parts listing, Type 7998,” on page 235 to determine which components are CRUs and which components are FRUs.
Error code Description Action
BA000010 The device data structure is corrupted
1. Go to “Checkout procedure” on page 186.
2. Replace the system-board and chassis assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
BA000020 Incompatible firmware levels were found
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page 186.
b. Replace the system-board and chassis assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
BA000030 An lpevent communication failure occurred
1. Go to “Checkout procedure” on page 186.
2. Replace the system-board and chassis assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
BA000031 An lpevent communication failure occurred
1. Go to “Checkout procedure” on page 186.
2. Replace the system-board and chassis assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
BA000032 The firmware failed to register the lpevent queues
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page 186.
b. Replace the system-board and chassis assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
BA000034 The firmware failed to exchange
capacity and allocate lpevents
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page
186.
b. Replace the system-board and chassis
assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
BA000038 The firmware failed to exchange virtual
continuation events
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page
186.
b. Replace the system-board and chassis
assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
BA000040 The firmware was unable to obtain the
RTAS code lid details
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page
186.
b. Replace the system-board and chassis
assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
BA000050 The firmware was unable to load the
RTAS code lid
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page
186.
b. Replace the system-board and chassis
assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
BA000060 The firmware was unable to obtain the
open firmware code lid details
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page
186.
b. Replace the system-board and chassis
assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
BA000070 The firmware was unable to load the open firmware code lid
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page 186.
b. Replace the system-board and chassis assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
BA000080 The user did not accept the license agreement
Accept the license agreement and restart the blade server.
If the problem persists:
1. Go to “Checkout procedure” on page 186.
2. Replace the system-board and chassis assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
BA000081 Failed to get the firmware license policy
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page 186.
b. Replace the system-board and chassis assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
BA000082 Failed to set the firmware license policy
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page 186.
b. Replace the system-board and chassis assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
BA000091 Unable to load a firmware code update module
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page 186.
b. Replace the system-board and chassis assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
BA00E820 An lpevent communication failure
occurred
1. Go to “Checkout procedure” on page 186.
2. Replace the system-board and chassis
assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
BA00E830 Failure when initializing ibm,event-scan
1. Go to “Checkout procedure” on page 186.
2. Replace the system-board and chassis
assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
BA00E840 Failure when initializing PCI hot-plug
1. Go to “Checkout procedure” on page 186.
2. Replace the system-board and chassis
assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
BA00E843 Failure when initializing the interface to
AIX or Linux
1. Go to “Checkout procedure” on page 186.
2. Replace the system-board and chassis
assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
BA00E850 Failure when initializing dynamic
reconfiguration
1. Go to “Checkout procedure” on page 186.
2. Replace the system-board and chassis
assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
BA00E860 Failure when initializing sensors
1. Go to “Checkout procedure” on page 186.
2. Replace the system-board and chassis
assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
BA010000 There is insufficient information to boot
the systems
1. Go to “Checkout procedure” on page 186.
2. Replace the system-board and chassis
assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
BA010001 The client IP address is already in use
by another network device
Verify that all of the IP addresses on the network are unique; then, retry the operation.
BA010002 Cannot get gateway IP address
Perform the following actions that checkpoint CA00E174 describes:
1. Verify that:
• The bootp server is correctly configured; then, retry the operation.
• The network connections are correct; then, retry the operation.
2. If the problem persists:
a. Go to “Checkout procedure” on page 186.
b. Replace the system-board and chassis assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
BA010003 Cannot get server hardware address
Perform the following actions that checkpoint CA00E174 describes:
1. Verify that:
• The bootp server is correctly configured; then, retry the operation.
• The network connections are correct; then, retry the operation.
2. If the problem persists:
a. Go to “Checkout procedure” on page 186.
b. Replace the system-board and chassis assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
BA010004 Bootp failed
Perform the following actions that checkpoint CA00E174 describes:
1. Verify that:
• The bootp server is correctly configured; then, retry the operation.
• The network connections are correct; then, retry the operation.
2. If the problem persists:
a. Go to “Checkout procedure” on page 186.
b. Replace the system-board and chassis assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
BA010005 File transmission (TFTP) failed
Perform the following actions that checkpoint CA00E174 describes:
1. Verify that:
• The bootp server is correctly configured; then, retry the operation.
• The network connections are correct; then, retry the operation.
2. If the problem persists:
a. Go to “Checkout procedure” on page 186.
b. Replace the system-board and chassis assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
BA010006 The boot image is too large
Start up from another device with a bootable image.
BA010007 The device does not have the required
device_type property.
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page
186.
b. Replace the system-board and chassis
assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
BA010008 The device_type property for this device
is not supported by the iSCSI initiator configuration specification.
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page
186.
b. Replace the system-board and chassis
assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
BA010009 The arguments specified for the ping
function are invalid.
The embedded host Ethernet adapters (HEAs) help provide iSCSI, which is supported by iSCSI software device drivers on either AIX or Linux. Verify that all of the iSCSI configuration arguments on the operating system comply with the configuration for the iSCSI Host Bus Adapter (HBA), which is the iSCSI initiator.
BA01000A The itname parameter string exceeds the
maximum length allowed.
The embedded host Ethernet adapters (HEAs) help provide iSCSI, which is supported by iSCSI software device drivers on either AIX or Linux. Verify that all of the iSCSI configuration arguments on the operating system comply with the configuration for the iSCSI Host Bus Adapter (HBA), which is the iSCSI initiator.
BA01000B The ichapid parameter string exceeds the maximum length allowed.
The embedded host Ethernet adapters (HEAs) help provide iSCSI, which is supported by iSCSI software device drivers on either AIX or Linux. Verify that all of the iSCSI configuration arguments on the operating system comply with the configuration for the iSCSI Host Bus Adapter (HBA), which is the iSCSI initiator.
BA01000C The ichappw parameter string exceeds the maximum length allowed.
The embedded host Ethernet adapters (HEAs) help provide iSCSI, which is supported by iSCSI software device drivers on either AIX or Linux. Verify that all of the iSCSI configuration arguments on the operating system comply with the configuration for the iSCSI Host Bus Adapter (HBA), which is the iSCSI initiator.
BA01000D The iname parameter string exceeds the maximum length allowed.
The embedded host Ethernet adapters (HEAs) help provide iSCSI, which is supported by iSCSI software device drivers on either AIX or Linux. Verify that all of the iSCSI configuration arguments on the operating system comply with the configuration for the iSCSI Host Bus Adapter (HBA), which is the iSCSI initiator.
BA01000E The LUN specified is not valid.
The embedded host Ethernet adapters (HEAs) help provide iSCSI, which is supported by iSCSI software device drivers on either AIX or Linux. Verify that all of the iSCSI configuration arguments on the operating system comply with the configuration for the iSCSI Host Bus Adapter (HBA), which is the iSCSI initiator.
BA01000F The chapid parameter string exceeds the maximum length allowed.
The embedded host Ethernet adapters (HEAs) help provide iSCSI, which is supported by iSCSI software device drivers on either AIX or Linux. Verify that all of the iSCSI configuration arguments on the operating system comply with the configuration for the iSCSI Host Bus Adapter (HBA), which is the iSCSI initiator.
BA010010 The chappw parameter string exceeds the maximum length allowed.
The embedded host Ethernet adapters (HEAs) help provide iSCSI, which is supported by iSCSI software device drivers on either AIX or Linux. Verify that all of the iSCSI configuration arguments on the operating system comply with the configuration for the iSCSI Host Bus Adapter (HBA), which is the iSCSI initiator.
BA010011 SET-ROOT-PROP could not find / (root)
package
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page
186.
b. Replace the system-board and chassis
assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
BA010013 The information in the error log entry for this SRC provides network trace data.
Informational message. No action is required.
BA010014 The information in the error log entry for this SRC provides network trace data.
Informational message. No action is required.
BA010015 The information in the error log entry for this SRC provides network trace data.
Informational message. No action is required.
BA010020 A trace entry addition failed because of
a bad trace type.
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page
186.
b. Replace the system-board and chassis
assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
BA012010 Opening the TCP node failed.
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page
186.
b. Replace the system-board and chassis
assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
BA012011 TCP failed to read from the network
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page
186.
b. Replace the system-board and chassis
assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
BA012012 TCP failed to write to the network.
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page 186.
b. Replace the system-board and chassis assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
BA012013 Closing TCP failed.
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page 186.
b. Replace the system-board and chassis assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
BA017020 Failed to open the TFTP package
Verify that the Trivial File Transfer Protocol (TFTP) parameters are correct.
BA017021 Failed to load the TFTP file
Verify that the TFTP server and network connections are correct.
BA01B010 Opening the BOOTP node failed.
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page 186.
b. Replace the system-board and chassis assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
BA01B011 BOOTP failed to read from the network
Perform the following actions that checkpoint CA00E174 describes:
1. Verify that:
• The bootp server is correctly configured; then, retry the operation.
• The network connections are correct; then, retry the operation.
2. If the problem persists:
a. Go to “Checkout procedure” on page 186.
b. Replace the system-board and chassis assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
BA01B012 BOOTP failed to write to the network
Perform the following actions that checkpoint CA00E174 describes:
1. Verify that:
• The bootp server is correctly configured; then, retry the operation.
• The network connections are correct; then, retry the operation.
2. If the problem persists:
a. Go to “Checkout procedure” on page 186.
b. Replace the system-board and chassis assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
BA01B013 The discover mode is invalid
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page
186.
b. Replace the system-board and chassis
assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
BA01B014 Closing the BOOTP node failed
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page
186.
b. Replace the system-board and chassis
assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
BA01B015 The BOOTP discover server timed out
Perform the following actions that checkpoint CA00E174 describes:
1. Verify that:
• The bootp server is correctly configured; then, retry the operation.
• The network connections are correct; then, retry the operation.
2. If the problem persists:
a. Go to “Checkout procedure” on page 186.
b. Replace the system-board and chassis assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
BA01D001 Opening the DHCP node failed
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page 186.
b. Replace the system-board and chassis assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
BA01D020 DHCP failed to read from the network
1. Verify that the network cable is connected, and that the network is active.
2. If the problem persists:
a. Go to “Checkout procedure” on page 186.
b. Replace the system-board and chassis assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
BA01D030 DHCP failed to write to the network
1. Verify that the network cable is connected, and that the network is active.
2. If the problem persists:
a. Go to “Checkout procedure” on page 186.
b. Replace the system-board and chassis assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
BA01D040 The DHCP discover server timed out
1. Verify that the DHCP server has addresses available.
2. Verify that the DHCP server configuration file is not overly constrained. An over-constrained file might prevent a server from meeting the configuration requested by the client.
3. Perform the following actions that checkpoint CA00E174 describes:
a. Verify that:
• The bootp server is correctly configured; then, retry the operation.
• The network connections are correct; then, retry the operation.
b. If the problem persists:
1) Go to “Checkout procedure” on page 186.
2) Replace the system-board and chassis assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
BA01D050 DHCP::discover no good offer
DHCP discovery did not receive any DHCP offers from the servers that meet the client requirements.
Verify that the DHCP server configuration file is not overly constrained. An over-constrained file might prevent a server from meeting the configuration requested by the client.
BA01D051 DHCP::discover DHCP request timed out
DHCP discovery did receive a DHCP offer from a server that met the client requirements, but the server did not send the DHCP acknowledgement (DHCP ack) to the client DHCP request.
Another client might have used the address that was served.
Verify that the DHCP server has addresses available.
BA01D052 DHCP::discover: 10 incapable servers were found
Ten DHCP servers have sent DHCP offers, none of which met the requirements of the client. Check the compatibility of the configuration that the client is requesting and the server DHCP configuration files.
BA01D053 DHCP::discover received a reply, but without a message type
Verify that the DHCP server is properly configured.
BA01D054 DHCP::discover: DHCP nak received
DHCP discovery did receive a DHCP offer from a server that meets the client requirements, but the server sent a DHCP not acknowledged (DHCP nak) to the client DHCP request.
Another client might be using the address that was served.
This situation can occur when there are multiple DHCP servers on the same network, and server A does not know the subnet configuration of server B, and vice-versa.
This situation can also occur when the pool of addresses is not truly divided.
Set the DHCP server configuration file to authoritative.
Verify that the DHCP server is functioning properly.
BA01D055 DHCP::discover: DHCP decline
DHCP discovery did receive a DHCP offer from one or more servers that meet the client requirements. However, the client performed an ARP test on the address and found that another client was using the address.
The client sent a DHCP decline to the server, but the client did not receive an additional DHCP offer from a server. The client still does not have a valid address.
Verify that the DHCP server is functioning properly.
BA01D056 DHCP::discover: unknown DHCP message
DHCP discovery received an unknown DHCP message type. Verify that the DHCP server is functioning properly.
BA01D0FF Closing the DHCP node failed.
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page 186.
b. Replace the system-board and chassis assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
BA030011 RTAS attempt to allocate memory failed
1. Go to “Checkout procedure” on page 186.
2. Replace the system-board and chassis
assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
BA04000F Self test failed on device; no error or
location code information available
1. If a location code is identified with the
error, replace the device specified by the location code.
2. If the problem persists:
a. Go to “Checkout procedure” on page
186.
b. Replace the system-board and chassis
assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
BA040010 Self test failed on device; can’t locate
package.
1. If a location code is identified with the
error, replace the device specified by the location code.
2. If the problem persists:
a. Go to “Checkout procedure” on page
186.
b. Replace the system-board and chassis
assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
BA040020 The machine type and model are not
recognized by the blade server firmware.
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page
186.
b. Replace the system-board and chassis
assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
BA040030 The firmware was not able to build the
UID properly for this system. As a result, problems may occur with the licensing of the AIX operating system.
1. Go to “Checkout procedure” on page 186.
2. Replace the system-board and chassis
assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
BA040035 The firmware was unable to find the
“plant of manufacture” in the VPD. This may cause problems with the licensing of the AIX operating system.
Verify that the machine type, model, and serial
number are correct for this server. If this is a
new server, check for server firmware updates;
then, install the updates if available.
BA040040 Setting the machine type, model, and serial number failed.
1. Go to “Checkout procedure” on page 186.
2. Replace the system-board and chassis assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
BA040050 The h-call to switch off the boot watchdog timer failed.
1. Go to “Checkout procedure” on page 186.
2. Replace the system-board and chassis assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
BA040060 Setting the firmware boot side for the next boot failed.
1. Go to “Checkout procedure” on page 186.
2. Replace the system-board and chassis assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
BA050001 Failed to reboot a partition in logical partition mode
1. Go to “Checkout procedure” on page 186.
2. Replace the system-board and chassis assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
BA050004 Failed to locate service processor device tree node.
1. Go to “Checkout procedure” on page 186.
2. Replace the system-board and chassis assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
BA05000A Failed to send boot failed message
1. Go to “Checkout procedure” on page 186.
2. Replace the system-board and chassis assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
BA060008 No configurable adapters found by the Remote IPL menu in the SMS utilities
This error occurs when the firmware cannot locate any LAN adapters that are supported by the remote IPL function. Verify that the devices in the remote IPL device list are correct using the SMS menus.
BA06000B The system was not able to find an operating system on the devices in the boot list.
Go to “Boot problem resolution” on page 193.
BA06000C A pointer to the operating system was found in non-volatile storage.
1. Go to “Checkout procedure” on page 186.
2. Replace the system-board and chassis assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
BA060020 The environment variable “boot-device”
exceeded the allowed character limit.
1. Go to “Checkout procedure” on page 186.
2. Replace the system-board and chassis
assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
BA060021 The environment variable “boot-device”
contained more than five entries.
1. Go to “Checkout procedure” on page 186.
2. Replace the system-board and chassis
assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
BA060022 The environment variable “boot-device”
contained an entry that exceeded 255 characters in length
1. Using the SMS menus, set the boot list to
the default boot list.
2. Shut down; then, start up the blade server.
3. Use SMS menus to customize the boot list
as required.
4. If the problem persists:
a. Go to “Checkout procedure” on page
186.
b. Replace the system-board and chassis
assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
BA060030 Logical partitioning with shared
processors is enabled and the operating system does not support it.
1. Install or boot a level of the operating
system that supports shared processors.
2. Disable logical partitioning with shared
processors in the operating system.
3. If the problem remains:
a. Go to “Checkout procedure” on page
186.
b. Replace the system-board and chassis
assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
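The boot-device limits reported by BA060020, BA060021, and BA060022 earlier in this table can be checked before a boot list is saved. The following Python function is a minimal sketch, not firmware code. The function name `check_boot_device` is hypothetical, the entries are assumed to be separated by whitespace, and the overall character limit behind BA060020 is not stated in this guide, so it is left as an optional parameter.

```python
MAX_ENTRIES = 5         # BA060021: more than five entries is an error
MAX_ENTRY_LENGTH = 255  # BA060022: an entry longer than 255 characters is an error


def check_boot_device(value, max_total_length=None):
    """Return the SRCs that a boot-device string would be expected to raise."""
    problems = []
    # BA060020: the overall limit is not given in the table, so the caller supplies it.
    if max_total_length is not None and len(value) > max_total_length:
        problems.append("BA060020")
    # Assumption: entries in the variable are separated by whitespace.
    entries = value.split()
    if len(entries) > MAX_ENTRIES:
        problems.append("BA060021")
    if any(len(entry) > MAX_ENTRY_LENGTH for entry in entries):
        problems.append("BA060022")
    return problems
```

An empty result means that none of the three limits is exceeded. It does not mean that the listed devices exist or are bootable.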
BA060060 The operating system expects an IOSP partition, but it failed to make the transition to alpha mode.
1. Verify that:
• The alpha-mode operating system image is intended for this partition.
• The configuration of the partition supports an alpha-mode operating system.
2. If the problem remains:
a. Go to “Checkout procedure” on page 186.
b. Replace the system-board and chassis assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
BA060061 The operating system expects a non-IOSP partition, but it failed to make the transition to MGC mode.
1. Verify that:
• The alpha-mode operating system image is intended for this partition.
• The configuration of the partition supports an alpha-mode operating system.
2. If the problem remains:
a. Go to “Checkout procedure” on page 186.
b. Replace the system-board and chassis assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
BA060070 The operating system does not support this system’s processor(s).
Boot a supported version of the operating system.
BA060071 An invalid number of vectors was received from the operating system.
Boot a supported version of the operating system.
BA060072 Client-arch-support hcall error
1. Go to “Checkout procedure” on page 186.
2. Replace the system-board and chassis assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
BA060075 Client-arch-support firmware error
1. Go to “Checkout procedure” on page 186.
2. Replace the system-board and chassis assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
BA060200 Failed to set the operating system boot
list from the management module boot list
1. Using the SMS menus, set the boot list to
the default boot list.
2. Shut down; then, start up the blade server.
3. Use SMS menus to customize the boot list
as required.
4. If the problem persists:
a. Go to “Checkout procedure” on page
186.
b. Replace the system-board and chassis
assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
BA060201 Failed to read the VPD “boot path” field value
1. Go to “Checkout procedure” on page 186.
2. Replace the system-board and chassis
assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
BA060202 Failed to update the VPD with the new “boot path” field value
1. Go to “Checkout procedure” on page 186.
2. Replace the system-board and chassis
assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
BA060300 An I/O error on the adapter from which
the boot was attempted prevented the operating system from being booted.
1. Using the SMS menus, select another
adapter from which to boot the operating system, and reboot the system.
2. Attempt to reboot the system.
3. Go to “Boot problem resolution” on page
193.
BA07xxxx SCSI controller failure
1. Go to “Checkout procedure” on page 186.
2. Replace the system-board and chassis
assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
BA090001 SCSI DASD: test unit ready failed;
hardware error
1. Go to “Checkout procedure” on page 186.
2. Replace the system-board and chassis
assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
BA090002 SCSI DASD: test unit ready failed; sense
data available
1. Go to “Checkout procedure” on page 186.
2. Replace the system-board and chassis
assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
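Some codes in this table, such as BA07xxxx, stand for a whole range of SRCs. The following Python function is a minimal sketch of looking up a reported SRC against such entries, with each lowercase `x` in a table code treated as a wildcard for one character. The names `SRC_TABLE`, `matches`, and `lookup` are hypothetical, and the dictionary holds only a few rows copied from this table for illustration.

```python
# A few rows from Table 14; 'x' in a key matches any single character.
SRC_TABLE = {
    "BA060200": "Failed to set the operating system boot list from the management module boot list",
    "BA07xxxx": "SCSI controller failure",
    "BA090001": "SCSI DASD: test unit ready failed; hardware error",
    "BA090002": "SCSI DASD: test unit ready failed; sense data available",
}


def matches(pattern, src):
    """Return True if the SRC fits the table code, honoring 'x' wildcards."""
    if len(pattern) != len(src):
        return False
    return all(p == "x" or p == s for p, s in zip(pattern, src.upper()))


def lookup(src):
    """Return the description for the first matching table code, or None."""
    for pattern, description in SRC_TABLE.items():
        if matches(pattern, src):
            return description
    return None
```

For example, `lookup("BA071234")` returns the BA07xxxx description, and a code that is not in the dictionary returns `None`.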
BA090003 SCSI DASD: send diagnostic failed; sense data available
1. Go to “Checkout procedure” on page 186.
2. Replace the system-board and chassis assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
BA090004 SCSI DASD: send diagnostic failed: devofl cmd
1. Go to “Checkout procedure” on page 186.
2. Replace the system-board and chassis assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
BA09000A There was a vendor specification error.
1. Check the vendor specification for additional information.
2. If the problem persists:
a. Go to “Checkout procedure” on page 186.
b. Replace the system-board and chassis assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
BA09000B Generic SCSI sense error
1. Verify that the SCSI cables and devices are properly plugged.
2. Correct any problems that are found.
3. If the problem persists:
a. Go to “Checkout procedure” on page 186.
b. Replace the system-board and chassis assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
BA09000C The media is write-protected
1. Change the setting of the media to allow writing, then retry the operation.
2. Insert new media of the correct type.
3. If the problem persists:
a. Go to “Checkout procedure” on page 186.
b. Replace the system-board and chassis assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
BA09000D The media is unsupported or not
recognized.
1. Insert new media of the correct type.
2. If the problem persists:
a. Go to “Checkout procedure” on page
186.
b. Replace the system-board and chassis
assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
BA09000E The media is not formatted correctly.
1. Insert the media.
2. Insert new media of the correct type.
3. If the problem persists:
a. Go to “Checkout procedure” on page
186.
b. Replace the system-board and chassis
assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
BA09000F Media is not present
1. Insert new media with the correct format.
2. If the problem persists:
a. Go to “Checkout procedure” on page
186.
b. Replace the system-board and chassis
assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
BA090010 The request sense command failed.
1. Troubleshoot the SCSI devices.
2. Verify that the SCSI cables and devices are
properly plugged. Correct any problems that are found.
3. Replace the SCSI cables and devices.
4. If the problem persists:
a. Go to “Checkout procedure” on page
186.
b. Replace the system-board and chassis
assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
BA090011 The retry limit has been exceeded.
1. Troubleshoot the SCSI devices.
2. Verify that the SCSI cables and devices are properly plugged. Correct any problems that are found.
3. Replace the SCSI cables and devices.
4. If the problem persists:
a. Go to “Checkout procedure” on page 186.
b. Replace the system-board and chassis assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
BA090012 There is a SCSI device that is not supported.
1. Replace the SCSI device that is not supported with a supported device.
2. If the problem persists:
a. Troubleshoot the SCSI devices.
b. Verify that the SCSI cables and devices are properly plugged. Correct any problems that are found.
c. Replace the SCSI cables and devices.
d. If the problem persists:
1) Go to “Checkout procedure” on page 186.
2) Replace the system-board and chassis assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
BA120001 On an undetermined SCSI device, test unit ready failed; hardware error
1. Troubleshoot the SCSI devices.
2. Verify that the SCSI cables and devices are properly plugged. Correct any problems that are found.
3. Replace the SCSI cables and devices.
4. If the problem persists:
a. Go to “Checkout procedure” on page 186.
b. Replace the system-board and chassis assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
BA120002 On an undetermined SCSI device, test
unit ready failed; sense data available
1. Troubleshoot the SCSI devices.
2. Verify that the SCSI cables and devices are
properly plugged. Correct any problems that are found.
3. Replace the SCSI cables and devices.
4. If the problem persists:
a. Go to “Checkout procedure” on page
186.
b. Replace the system-board and chassis
assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
BA120003 On an undetermined SCSI device, send
diagnostic failed; sense data available
1. Troubleshoot the SCSI devices.
2. Verify that the SCSI cables and devices are
properly plugged. Correct any problems that are found.
3. Replace the SCSI cables and devices.
4. If the problem persists:
a. Go to “Checkout procedure” on page
186.
b. Replace the system-board and chassis
assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
BA120004 On an undetermined SCSI device, send
diagnostic failed; devofl command
1. Troubleshoot the SCSI devices.
2. Verify that the SCSI cables and devices are
properly plugged. Correct any problems that are found.
3. Replace the SCSI cables and devices.
4. If the problem persists:
a. Go to “Checkout procedure” on page
186.
b. Replace the system-board and chassis
assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
BA120010 Failed to generate the SAS device
physical location code. The event log entry has the details.
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page
186.
b. Replace the system-board and chassis
assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
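The notes for this table say to perform the actions in the listed order and to stop as soon as one of them solves the problem. The following Python function is a minimal sketch of that rule. The names `run_actions` and `is_resolved` are hypothetical, and the callback stands in for whatever check the technician uses to confirm that the problem is gone.

```python
def run_actions(actions, is_resolved):
    """Perform actions in listed order and stop after the first one that helps.

    actions: action descriptions in the order given in the Action column.
    is_resolved: callable that reports whether the problem is solved after
    the given action has been performed (an assumption of this sketch).
    Returns the list of actions that were actually performed.
    """
    performed = []
    for action in actions:
        performed.append(action)
        if is_resolved(action):
            break  # the remaining actions are not needed
    return performed


# Actions for BA120010, flattened from the table entry above.
BA120010_ACTIONS = [
    "Reboot the blade server",
    "Go to Checkout procedure",
    "Replace the system-board and chassis assembly",
]
```

If the reboot clears the error, only the first action is performed. If nothing helps, every action is performed in order.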
BA130010 USB CD-ROM in the media tray: device remained busy longer than the time-out period
1. Retry the operation.
2. Reboot the blade server.
3. Troubleshoot the media tray and CD-ROM drive.
4. Replace the USB CD or DVD drive.
5. If the problem persists:
a. Go to “Checkout procedure” on page 186.
b. Replace the system-board and chassis assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
BA130011 USB CD-ROM in the media tray: execution of ATA/ATAPI command was not completed within the allowed time.
1. Retry the operation.
2. Reboot the blade server.
3. Troubleshoot the media tray and CD-ROM drive.
4. Replace the USB CD or DVD drive.
5. If the problem persists:
a. Go to “Checkout procedure” on page 186.
b. Replace the system-board and chassis assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
BA130012 USB CD-ROM in the media tray: execution of ATA/ATAPI command failed.
1. Retry the operation.
2. Reboot the blade server.
3. Troubleshoot the media tray and CD-ROM drive.
4. Replace the USB CD or DVD drive.
5. If the problem persists:
a. Go to “Checkout procedure” on page 186.
b. Replace the system-board and chassis assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
BA130013 USB CD-ROM in the media tray:
bootable media is missing from the drive
1. Insert a bootable CD in the drive and retry
the operation.
2. If the problem persists:
a. Retry the operation.
b. Reboot the blade server.
c. Troubleshoot the media tray and
CD-ROM drive.
d. Replace the USB CD or DVD drive.
e. If the problem persists:
1) Go to “Checkout procedure” on
page 186.
2) Replace the system-board and
chassis assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
BA130014 USB CD-ROM in the media tray: the
media in the USB CD-ROM drive has been changed.
1. Retry the operation.
2. Reboot the blade server.
3. Troubleshoot the media tray and CD-ROM
drive.
4. Replace the USB CD or DVD drive.
5. If the problem persists:
a. Go to “Checkout procedure” on page
186.
b. Replace the system-board and chassis
assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
BA130015 USB CD-ROM in the media tray:
ATA/ATAPI packet command execution failed.
1. Remove the CD or DVD in the drive and
replace it with a known-good disk.
2. If the problem persists:
a. Retry the operation.
b. Reboot the blade server.
c. Troubleshoot the media tray and
CD-ROM drive.
d. Replace the USB CD or DVD drive.
e. If the problem persists:
1) Go to “Checkout procedure” on
page 186.
2) Replace the system-board and
chassis assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
BA131010 The USB keyboard has been removed.
1. Reseat the keyboard cable in the management module USB port.
2. Check for server firmware updates; then, install the updates if available.
BA140001 The SCSI read/write optical test unit ready failed; hardware error.
1. Troubleshoot the SCSI devices.
2. Verify that the SCSI cables and devices are properly plugged. Correct any problems that are found.
3. Replace the SCSI cables and devices.
4. If the problem persists:
a. Go to “Checkout procedure” on page 186.
b. Replace the system-board and chassis assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
BA140002 The SCSI read/write optical test unit ready failed; sense data available.
1. Troubleshoot the SCSI devices.
2. Verify that the SCSI cables and devices are properly plugged. Correct any problems that are found.
3. Replace the SCSI cables and devices.
4. If the problem persists:
a. Go to “Checkout procedure” on page 186.
b. Replace the system-board and chassis assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
BA140003 The SCSI read/write optical send diagnostic failed; sense data available.
1. Troubleshoot the SCSI devices.
2. Verify that the SCSI cables and devices are properly plugged. Correct any problems that are found.
3. Replace the SCSI cables and devices.
4. If the problem persists:
a. Go to “Checkout procedure” on page 186.
b. Replace the system-board and chassis assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
BA140004 The SCSI read/write optical send
diagnostic failed; devofl command.
1. Troubleshoot the SCSI devices.
2. Verify that the SCSI cables and devices are
properly plugged. Correct any problems that are found.
3. Replace the SCSI cables and devices.
4. If the problem persists:
a. Go to “Checkout procedure” on page
186.
b. Replace the system-board and chassis
assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
BA150001 PCI Ethernet BNC/RJ-45 or PCI Ethernet AUI/RJ-45 adapter: internal wrap test failure
Replace the adapter specified by the location code.
BA151001 10/100 Mbps Ethernet PCI adapter: internal wrap test failure
Replace the adapter specified by the location code.
BA151002 10/100 Mbps Ethernet card failure
1. Go to “Checkout procedure” on page 186.
2. Replace the system-board and chassis assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
BA153002 Gigabit Ethernet adapter failure
Verify that the MAC address programmed in the FLASH/EEPROM is correct.
BA153003 Gigabit Ethernet adapter failure
1. Check for server firmware updates; then,
install the updates if available.
2. Replace the Gigabit Ethernet adapter.
BA154010 HEA software error
1. Go to “Checkout procedure” on page 186.
2. Replace the system-board and chassis
assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
BA154020 The required open firmware property
was not found.
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page
186.
b. Replace the system-board and chassis
assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
BA154030 Invalid parameters were passed to the HEA device driver.
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page 186.
b. Replace the system-board and chassis assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
BA154040 The TFTP package open failed
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page 186.
b. Replace the system-board and chassis assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
BA154050 The transmit operation failed.
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page 186.
b. Replace the system-board and chassis assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
BA154060 Failed to initialize the HEA port or queue
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page 186.
b. Replace the system-board and chassis assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
BA154070 The receive operation failed.
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page 186.
b. Replace the system-board and chassis assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
BA170000 NVRAMRC initialization failed; device
test failed
1. Go to “Checkout procedure” on page 186.
2. Replace the system-board and chassis
assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
BA170100 NVRAM data validation check failed
1. Shut down the blade server; then, restart it.
2. If the problem persists:
a. Go to “Checkout procedure” on page
186.
b. Replace the system-board and chassis
assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
BA170201 The firmware was unable to expand
target partition - saving configuration variable
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page
186.
b. Replace the system-board and chassis
assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
BA170202 The firmware was unable to expand
target partition - writing event log entry
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page
186.
b. Replace the system-board and chassis
assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
BA170203 The firmware was unable to expand
target partition - writing VPD data
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page
186.
b. Replace the system-board and chassis
assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
BA170210 Setenv/$Setenv parameter error - name
contains a null character
1. Go to “Checkout procedure” on page 186.
2. Replace the system-board and chassis
assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
BA170211 Setenv/$Setenv parameter error - value contains a null character
1. Go to “Checkout procedure” on page 186.
2. Replace the system-board and chassis assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
BA170220 Unable to write a variable value to NVRAM due to lack of free memory in NVRAM.
1. Reduce the number of partitions, if possible, to add more NVRAM memory to this partition.
2. If the problem persists:
a. Go to “Checkout procedure” on page 186.
b. Replace the system-board and chassis assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
BA170221 Setenv/$setenv had to delete stored firmware network boot settings to free memory in NVRAM.
Enter the adapter and network parameters again for the network boot or network installation.
BA170998 NVRAMRC script evaluation error - command line execution error.
1. Go to “Checkout procedure” on page 186.
2. Replace the system-board and chassis assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
BA180008 PCI device Fcode evaluation error
1. Go to “Checkout procedure” on page 186.
2. Replace the system-board and chassis assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
BA180009 The Fcode on a PCI adapter left a data stack imbalance
1. Reseat the PCI adapter card.
2. Check for adapter firmware updates; then, install the updates if available.
3. Check for server firmware updates; then, install the updates if available.
4. Replace the PCI adapter card.
5. If the problem persists:
a. Go to “Checkout procedure” on page 186.
b. Replace the system-board and chassis assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
BA180010 PCI probe error, bridge in freeze state
1. Go to “Checkout procedure” on page 186.
2. Replace the system-board and chassis
assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
BA180011 PCI bridge probe error, bridge is not
usable
1. Go to “Checkout procedure” on page 186.
2. Replace the system-board and chassis
assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
BA180012 PCI device runtime error, bridge in
freeze state
1. Go to “Checkout procedure” on page 186.
2. Replace the system-board and chassis
assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
BA180014 MSI software error
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page
186.
b. Replace the system-board and chassis
assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
BA180020 No response was received from a slot
during PCI probing.
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page
186.
b. Replace the system-board and chassis
assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
BA180099 PCI probe error; bridge in freeze state,
slot in reset state
1. Reseat the PCI adapter card.
2. Check for adapter firmware updates; then,
install the updates if available.
3. Check for server firmware updates; then,
install the updates if available.
4. Replace the PCI adapter card.
5. If the problem persists:
a. Go to “Checkout procedure” on page
186.
b. Replace the system-board and chassis
assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
BA180100 The FDDI adapter Fcode driver is not supported on this server.
IBM may produce a compatible driver in the future, but does not guarantee one.
BA180101 Stack underflow from fibre-channel adapter
1. Go to “Checkout procedure” on page 186.
2. Replace the system-board and chassis assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
BA190001 Firmware function to get/set time-of-day reported an error
1. Go to “Checkout procedure” on page 186.
2. Replace the system-board and chassis assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
BA201001 The serial interface dropped data packets
1. Go to “Checkout procedure” on page 186.
2. Replace the system-board and chassis assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
BA201002 The serial interface failed to open
1. Go to “Checkout procedure” on page 186.
2. Replace the system-board and chassis assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
BA201003 The firmware failed to handshake properly with the serial interface
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page 186.
b. Replace the system-board and chassis assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
BA210000 Partition firmware reports a default catch
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page 186.
b. Replace the system-board and chassis assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
BA210001 Partition firmware reports a stack underflow was caught
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page 186.
b. Replace the system-board and chassis assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
BA210002 Partition firmware was ready before standout was ready
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page 186.
b. Replace the system-board and chassis assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
BA210003 A data storage error was caught by partition firmware
1. If the location code reported with the error points to an adapter, check for adapter firmware updates.
2. Apply any available updates.
3. Check for server firmware updates.
4. Apply any available updates.
5. If the problem persists:
a. Go to “Checkout procedure” on page 186.
b. Replace the system-board and chassis assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
BA210004 An open firmware stack-depth assert failed.
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page 186.
b. Replace the system-board and chassis assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
BA210010 The transfer of control to the SLIC loader failed
1. Go to “Checkout procedure” on page 186.
2. Replace the system-board and chassis assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
BA210011 The transfer of control to the IO Reporter failed
1. Go to “Checkout procedure” on page 186.
2. Replace the system-board and chassis assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
BA210012 There was an NVRAMRC forced-boot problem; unable to load the previous boot’s operating system image
1. Use the SMS menus to verify that the partition firmware can still detect the operating system image.
2. If the problem persists:
a. Go to “Checkout procedure” on page 186.
b. Replace the system-board and chassis assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
BA210013 There was a partition firmware error when in the SMS menus.
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page 186.
b. Replace the system-board and chassis assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
BA210020 I/O configuration exceeded the maximum size allowed by partition firmware.
1. Increase the logical memory block size to 256 MB and restart the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page 186.
b. Replace the system-board and chassis assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
BA210100 An error may not have been sent to the management module event log.
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page 186.
b. Replace the system-board and chassis assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
BA210101 The partition firmware event log queue is full
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page 186.
b. Replace the system-board and chassis assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
BA210102 There was a communication failure between partition firmware and the hypervisor. The lpevent that was expected from the hypervisor was not received.
1. Review the event log for errors that occurred around the time of this error.
2. Correct any errors that are found and reboot the blade server.
3. If the problem persists:
a. Reboot the blade server.
b. If the problem persists:
1) Go to “Checkout procedure” on page 186.
2) Replace the system-board and chassis assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
BA210103 There was a communication failure between partition firmware and the hypervisor. There was a failing return code with the lpevent acknowledgement from the hypervisor.
1. Review the event log for errors that occurred around the time of this error.
2. Correct any errors that are found and reboot the blade server.
3. If the problem persists:
a. Reboot the blade server.
b. If the problem persists:
1) Go to “Checkout procedure” on page 186.
2) Replace the system-board and chassis assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
BA220010 There was a partition firmware error during a USB hotplug probing. USB hotplug may not work properly on this partition.
1. Look for EEH-related errors in the event log.
2. Resolve any EEH event log entries that are found.
3. Correct any errors that are found and reboot the blade server.
4. If the problem persists:
a. Go to “Checkout procedure” on page 186.
b. Replace the system-board and chassis assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
BA220020 CRQ registration error; partner vslot may not be valid
Verify that this client virtual slot device has a valid server virtual slot device in a hosting partition.
BA278001 Failed to flash firmware: invalid image file
Download a new firmware update image and retry the update.
BA278002 Flash file is not designed for this platform
Download a new firmware update image and retry the update.
BA278003 Unable to lock the firmware update lid manager
1. Restart the blade server.
2. Verify that the operating system is authorized to update the firmware. If the system is running multiple partitions, verify that this partition has service authority.
BA278004 An invalid firmware update lid was requested
Download a new firmware update image and retry the update.
BA278005 Failed to flash a firmware update lid
Download a new firmware update image and retry the update.
BA278006 Unable to unlock the firmware update lid manager
Restart the blade server.
BA278007 Failed to reboot the system after a firmware flash update
Restart the blade server.
BA278009 The operating system’s server firmware update management tools are incompatible with this system.
Go to the IBM download site at www14.software.ibm.com/webapp/set2/sas/f/lopdiags/home.html to download the latest version of the service aids package for Linux.
BA27800A The firmware installation failed due to a hardware error that was reported.
1. Look for hardware errors in the event log.
2. Resolve any hardware errors that are found.
3. If the problem persists:
a. Go to “Checkout procedure” on page 186.
b. Replace the system-board and chassis assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
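As an illustration only, the firmware-update codes in this table (BA278001 through BA27800A) can be kept in a small lookup for quick triage. The Python sketch below is not part of any IBM tool. The names `FLASH_SRC_ACTIONS` and `flash_action` are hypothetical, and the action text is abbreviated from the Action column, so it returns only the first suggested step.

```python
# Hypothetical triage helper; codes and actions are abbreviated from Table 14.
RETRY = "Download a new firmware update image and retry the update."
RESTART = "Restart the blade server."

# First suggested action for each firmware-update SRC listed in the table.
FLASH_SRC_ACTIONS = {
    "BA278001": RETRY,
    "BA278002": RETRY,
    "BA278003": RESTART,  # then verify the OS is authorized to update firmware
    "BA278004": RETRY,
    "BA278005": RETRY,
    "BA278006": RESTART,
    "BA278007": RESTART,
    "BA278009": "Download the latest service aids package for Linux.",
    "BA27800A": "Look for hardware errors in the event log.",
}


def flash_action(src: str) -> str:
    """Return the first suggested action for a firmware-update SRC."""
    code = src.strip().upper()
    if code not in FLASH_SRC_ACTIONS:
        raise KeyError(f"{code} is not a firmware-update SRC in this sketch")
    return FLASH_SRC_ACTIONS[code]


print(flash_action("ba278005"))
```

Codes outside this group raise `KeyError` rather than returning a guess, so an unknown SRC is not mistaken for a known one.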
BA280000 RTAS discovered an invalid operation that may cause a hardware error
1. Go to “Checkout procedure” on page 186.
2. Replace the system-board and chassis assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
BA290000 RTAS discovered an internal stack overflow
1. Go to “Checkout procedure” on page 186.
2. Replace the system-board and chassis assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
BA290001 RTAS low memory corruption was detected
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page 186.
b. Replace the system-board and chassis assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
BA290002 RTAS low memory corruption was detected
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page 186.
b. Replace the system-board and chassis assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
BA310010 Unable to obtain the SRC history
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page 186.
b. Replace the system-board and chassis assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
BA310020 An invalid SRC history was obtained.
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page 186.
b. Replace the system-board and chassis assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
BA310030 Writing the MAC address to the VPD failed.
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page 186.
b. Replace the system-board and chassis assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
BA330000 Memory allocation error.
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page 186.
b. Replace the system-board and chassis assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
BA330001 Memory allocation error.
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page 186.
b. Replace the system-board and chassis assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
BA330002 Memory allocation error.
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page 186.
b. Replace the system-board and chassis assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
BA330003 Memory allocation error.
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page 186.
b. Replace the system-board and chassis assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
BA330004 Memory allocation error.
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page 186.
b. Replace the system-board and chassis assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
BA340001 There was a logical partition event communication failure reading the BladeCenter open fabric manager parameter data structure from the service processor.
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page 186.
b. Replace the system-board and chassis assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
BA340002 There was a logical partition event communication failure reading the BladeCenter open fabric manager location code mapping data from the service processor.
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page 186.
b. Replace the system-board and chassis assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
BA340003 An internal firmware error occurred; unable to allocate memory for the open fabric manager location code mapping data.
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page 186.
b. Replace the system-board and chassis assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
BA340004 An internal firmware error occurred; the open fabric manager parameter data was corrupted.
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page 186.
b. Replace the system-board and chassis assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
BA340005 An internal firmware error occurred; the location code mapping table was corrupted.
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page 186.
b. Replace the system-board and chassis assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
BA340006 An LP event communication failure occurred reading the system initiator capability data from the service processor.
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page 186.
b. Replace the system-board and chassis assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
BA340007 An internal firmware error occurred; the open fabric manager system initiator capability data was corrupted.
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page 186.
b. Replace the system-board and chassis assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
BA340008 An internal firmware error occurred; the open fabric manager system initiator capability data version was not correct.
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page 186.
b. Replace the system-board and chassis assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 274.
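The table note (follow the suggested actions in order, and stop as soon as one solves the problem) can be sketched as a short loop. This Python sketch is illustrative only: `perform_in_order`, `problem_solved`, and the abbreviated action strings are assumptions made for the example, not part of any IBM utility.

```python
from typing import Callable, List

# Abbreviated from the Action column shared by most rows on these pages.
REBOOT_THEN_ESCALATE = [
    "Reboot the blade server.",
    "Go to the Checkout procedure.",
    "Replace the system-board and chassis assembly.",
]


def perform_in_order(actions: List[str],
                     problem_solved: Callable[[str], bool]) -> List[str]:
    """Perform actions in listed order; stop after the first that solves the problem.

    problem_solved is a caller-supplied (hypothetical) check run after each action.
    Returns the actions that were actually performed.
    """
    performed = []
    for action in actions:
        performed.append(action)
        if problem_solved(action):
            break  # the remaining actions are skipped
    return performed


# Example: the problem clears after the checkout procedure, so nothing is replaced.
done = perform_in_order(REBOOT_THEN_ESCALATE, lambda a: a.startswith("Go to"))
print(done)
```

Keeping the actions as an ordered list reflects the intent of the table: the least invasive step comes first, and hardware replacement is reached only when every earlier step has failed.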