IBM BladeCenter JS21 Types 8844, BladeCenter JS21 Types 7988 Service Manual

BladeCenter JS21 Types 7988 and 8844

Problem Determination and Service Guide
Note: Before using this information and the product it supports, read the general information in Appendix B, “Notices,” on page 173, and the Warranty and Support Information document on the IBM BladeCenter Documentation CD.
Sixth Edition (November 2010)
© Copyright IBM Corporation 2007.
Contents
Safety ............................vii
Guidelines for trained service technicians ...............viii
Inspecting for unsafe conditions .................viii
Guidelines for servicing electrical equipment .............viii
Safety statements ........................ix
Chapter 1. Introduction......................1
Related documentation ......................1
Notices and statements in this document................2
Features and specifications .....................3
Blade server control panel buttons and LEDs ..............4
Turning on the blade server.....................6
Turning off the blade server.....................7
System-board layouts .......................7
System-board connectors ....................7
System-board jumpers .....................8
System-board LEDs ......................8
Chapter 2. Diagnostics ......................9
Diagnostic tools .........................9
POST checkpoint codes ......................9
Progress codes........................10
Attention codes ........................34
Error codes .........................37
Location codes ........................66
Error logs ..........................66
Service request numbers .....................67
Using the SRN tables .....................67
SRN tables .........................67
Failing function codes .....................104
Checkout procedure ......................106
About the checkout procedure ..................106
Performing the checkout procedure ................106
Verifying the partition configuration .................108
Running the diagnostics program ..................108
Starting AIX concurrent diagnostics ................108
Starting standalone diagnostics from a CD .............109
Starting standalone diagnostics from a NIM server ..........110
Using the diagnostics program ..................111
Boot problem resolution .....................112
Troubleshooting tables......................113
CD or DVD drive problems ...................114
Diskette drive problems ....................115
General problems ......................115
Hard disk drive problems ....................116
Intermittent problems .....................116
Keyboard problems ......................117
Memory problems ......................118
Microprocessor problems ....................118
Monitor or video problems ...................119
Network connection problems ..................120
Optional device problems ...................121
Power problems .......................122
Service processor problems...................123
Software problems ......................123
Universal Serial Bus (USB) port problems .............123
Light path diagnostics ......................124
Viewing the light path diagnostics LEDs ..............124
Light path diagnostics LEDs ..................125
Firmware problem isolation ....................127
Recovering the system firmware ..................127
Starting the PERM image ...................127
Recovering the TEMP image from the PERM image..........128
Verifying the system firmware levels ...............129
Committing the TEMP system firmware image ............129
Solving shared BladeCenter resource problems ............130
Keyboard problems ......................130
Media tray problems .....................131
Network connection problems ..................133
Power problems .......................133
Video problems .......................134
Solving undetermined problems ..................135
Calling IBM for service .....................136
Chapter 3. Parts listing, Types 7988 and 8844 ............137
Chapter 4. Removing and replacing blade server components .....141
Installation guidelines ......................141
System reliability guidelines ...................142
Handling static-sensitive devices .................142
Returning a device or component ................142
Removing the blade server from a BladeCenter unit ...........143
Installing the blade server in a BladeCenter unit ............144
Removing and replacing Tier 1 CRUs ................145
Removing the blade server cover.................145
Installing the blade server cover .................146
Removing the bezel assembly ..................147
Installing the bezel assembly ..................148
Removing a SAS hard disk drive .................149
Installing a SAS hard disk drive .................150
Removing a memory module ..................151
Installing a memory module ...................152
Removing and installing an I/O expansion card ...........153
Removing the battery .....................157
Installing the battery .....................157
Removing a hard disk drive tray .................159
Installing a hard disk drive tray .................160
Removing the expansion bracket .................161
Installing the expansion bracket .................162
Removing and replacing Tier 2 CRUs ................163
Replacing the system-board and chassis assembly ..........163
Chapter 5. Configuration information and instructions ........165
Updating the firmware ......................165
Configuring the blade server ...................165
Using the SMS utility ......................166
Starting the SMS utility ....................166
SMS utility menu choices ...................166
Configuring the Gigabit Ethernet controllers ..............167
Creating a CE login .......................168
Blade server Ethernet controller enumeration .............168
Configuring a SAS RAID array...................169
Updating IBM Director ......................169
Checking the status of the media tray ................170
Appendix A. Getting help and technical assistance ..........171
Before you call ........................171
Using the documentation .....................171
Getting help and information from the World Wide Web .........171
Software service and support ...................172
Hardware service and support ...................172
IBM Taiwan product service ....................172
Appendix B. Notices ......................173
Trademarks..........................174
Important notes ........................174
Product recycling and disposal ..................175
Battery return program .....................176
Electronic emission notices ....................178
Federal Communications Commission (FCC) statement ........178
Industry Canada Class A emission compliance statement ........178
Avis de conformité à la réglementation d'Industrie Canada .......178
Australia and New Zealand Class A statement ............178
United Kingdom telecommunications safety requirement ........178
European Union EMC Directive conformance statement ........178
Taiwanese Class A warning statement ...............179
Chinese Class A warning statement ................179
Japanese Voluntary Control Council for Interference (VCCI) statement 179
Index ............................181
Safety
Before installing this product, read the Safety Information.
Antes de instalar este produto, leia as Informações de Segurança.
Před instalací tohoto produktu si přečtěte příručku bezpečnostních instrukcí.
Læs sikkerhedsforskrifterne, før du installerer dette produkt.
Lees voordat u dit product installeert eerst de veiligheidsvoorschriften.
Ennen kuin asennat tämän tuotteen, lue turvaohjeet kohdasta Safety Information.
Avant d'installer ce produit, lisez les consignes de sécurité.
Vor der Installation dieses Produkts die Sicherheitshinweise lesen.
Prima di installare questo prodotto, leggere le Informazioni sulla Sicurezza.
Les sikkerhetsinformasjonen (Safety Information) før du installerer dette produktet.
Antes de instalar este produto, leia as Informações sobre Segurança.
Antes de instalar este producto, lea la información de seguridad.
Läs säkerhetsinformationen innan du installerar den här produkten.
Guidelines for trained service technicians
This section contains information for trained service technicians.
Inspecting for unsafe conditions
Use the information in this section to help you identify potential unsafe conditions in an IBM product that you are working on. Each IBM product, as it was designed and manufactured, has required safety items to protect users and service technicians from injury. The information in this section addresses only those items. Use good judgment to identify potential unsafe conditions that might be caused by non-IBM alterations or attachment of non-IBM features or options that are not addressed in this section. If you identify an unsafe condition, you must determine how serious the hazard is and whether you must correct the problem before you work on the product.
Consider the following conditions and the safety hazards that they present:
• Electrical hazards, especially primary power. Primary voltage on the frame can cause serious or fatal electrical shock.
• Explosive hazards, such as a damaged CRT face or a bulging capacitor.
• Mechanical hazards, such as loose or missing hardware.
To inspect the product for potential unsafe conditions, complete the following steps:
1. Make sure that the power is off and the power cord is disconnected.
2. Make sure that the exterior cover is not damaged, loose, or broken, and observe any sharp edges.
3. Check the power cord:
   • Make sure that the third-wire ground connector is in good condition. Use a meter to measure third-wire ground continuity for 0.1 ohm or less between the external ground pin and the frame ground.
   • Make sure that the power cord is the correct type, as specified in the documentation for your BladeCenter unit type.
   • Make sure that the insulation is not frayed or worn.
4. Remove the cover.
5. Check for any obvious non-IBM alterations. Use good judgment as to the safety of any non-IBM alterations.
6. Check inside the blade server for any obvious unsafe conditions, such as metal filings, contamination, water or other liquid, or signs of fire or smoke damage.
7. Check for worn, frayed, or pinched cables.
8. Make sure that the power-supply cover fasteners (screws or rivets) have not been removed or tampered with.
Guidelines for servicing electrical equipment
Observe the following guidelines when servicing electrical equipment:
• Check the area for electrical hazards such as moist floors, nongrounded power extension cords, and missing safety grounds.
• Use only approved tools and test equipment. Some hand tools have handles that are covered with a soft material that does not provide insulation from live electrical current.
• Regularly inspect and maintain your electrical hand tools for safe operational condition. Do not use worn or broken tools or testers.
• Do not touch the reflective surface of a dental mirror to a live electrical circuit. The surface is conductive and can cause personal injury or equipment damage if it touches a live electrical circuit.
• Some rubber floor mats contain small conductive fibers to decrease electrostatic discharge. Do not use this type of mat to protect yourself from electrical shock.
• Do not work alone under hazardous conditions or near equipment that has hazardous voltages.
• Locate the emergency power-off (EPO) switch, disconnecting switch, or electrical outlet so that you can turn off the power quickly in the event of an electrical accident.
• Disconnect all power before you perform a mechanical inspection, work near power supplies, or remove or install main units.
• Before you work on the equipment, disconnect the power cord. If you cannot disconnect the power cord, have the customer power off the wall box that supplies power to the equipment and lock the wall box in the off position.
• Never assume that power has been disconnected from a circuit. Check it to make sure that it has been disconnected.
• If you have to work on equipment that has exposed electrical circuits, observe the following precautions:
  – Make sure that another person who is familiar with the power-off controls is near you and is available to turn off the power if necessary.
  – When you are working with powered-on electrical equipment, use only one hand. Keep the other hand in your pocket or behind your back to avoid creating a complete circuit that could cause an electrical shock.
  – When using a tester, set the controls correctly and use the approved probe leads and accessories for that tester.
  – Stand on a suitable rubber mat to insulate you from grounds such as metal floor strips and equipment frames.
• Use extreme care when measuring high voltages.
• To ensure proper grounding of components such as power supplies, pumps, blowers, fans, and motor generators, do not service these components outside of their normal operating locations.
• If an electrical accident occurs, use caution, turn off the power, and send another person to get medical aid.
Safety statements
Important:
Each caution and danger statement in this documentation begins with a number. This number is used to cross reference an English-language caution or danger statement with translated versions of the caution or danger statement in the Safety Information document.
For example, if a caution statement begins with a number 1, translations for that caution statement appear in the Safety Information document under statement 1.
Be sure to read all caution and danger statements in this documentation before performing the instructions. Read any additional safety information that comes with your blade server or optional device before you install the device.
Statement 1:
DANGER
Electrical current from power, telephone, and communication cables is hazardous.
To avoid a shock hazard:
• Do not connect or disconnect any cables or perform installation, maintenance, or reconfiguration of this product during an electrical storm.
• Connect all power cords to a properly wired and grounded electrical outlet.
• Connect to properly wired outlets any equipment that will be attached to this product.
• When possible, use one hand only to connect or disconnect signal cables.
• Never turn on any equipment when there is evidence of fire, water, or structural damage.
• Disconnect the attached power cords, telecommunications systems, networks, and modems before you open the device covers, unless instructed otherwise in the installation and configuration procedures.
• Connect and disconnect cables as described in the following table when installing, moving, or opening covers on this product or attached devices.
To Connect:
1. Turn everything OFF.
2. First, attach all cables to devices.
3. Attach signal cables to connectors.
4. Attach power cords to outlet.
5. Turn device ON.
To Disconnect:
1. Turn everything OFF.
2. First, remove power cords from outlet.
3. Remove signal cables from connectors.
4. Remove all cables from devices.
Statement 2:
CAUTION: When replacing the lithium battery, use only IBM Part Number 33F8354 or an equivalent type battery recommended by the manufacturer. If your system has a module containing a lithium battery, replace it only with the same module type made by the same manufacturer. The battery contains lithium and can explode if not properly used, handled, or disposed of.
Do not:
• Throw or immerse into water
• Heat to more than 100°C (212°F)
• Repair or disassemble
Dispose of the battery as required by local ordinances or regulations.
Statement 3:
CAUTION: When laser products (such as CD-ROMs, DVD drives, fiber optic devices, or transmitters) are installed, note the following:
• Do not remove the covers. Removing the covers of the laser product could result in exposure to hazardous laser radiation. There are no serviceable parts inside the device.
• Use of controls or adjustments or performance of procedures other than those specified herein might result in hazardous radiation exposure.
DANGER
Some laser products contain an embedded Class 3A or Class 3B laser diode. Note the following.
Laser radiation when open. Do not stare into the beam, do not view directly with optical instruments, and avoid direct exposure to the beam.
Statement 4:
[Illustration: lifting symbols labeled 18 kg (39.7 lb), 32 kg (70.5 lb), and 55 kg (121.2 lb)]
CAUTION: Use safe practices when lifting.
Statement 5:
CAUTION: The power control button on the device and the power switch on the power supply do not turn off the electrical current supplied to the device. The device also might have more than one power cord. To remove all electrical current from the device, ensure that all power cords are disconnected from the power source.
Statement 8:
CAUTION: Never remove the cover on a power supply or any part that has the following label attached.
Hazardous voltage, current, and energy levels are present inside any component that has this label attached. There are no serviceable parts inside these components. If you suspect a problem with one of these parts, contact a service technician.
Statement 10:
CAUTION: Do not place any object on top of rack-mounted devices.
Chapter 1. Introduction
This Problem Determination and Service Guide contains information to help you solve problems that might occur in your IBM BladeCenter JS21 Type 7988 or 8844 blade server. It describes the diagnostic tools that come with the blade server, error codes and suggested actions, and instructions for replacing failing components.
Replaceable components are of three types:
• Tier 1 customer replaceable unit (CRU): Replacement of Tier 1 CRUs is your responsibility. If IBM installs a Tier 1 CRU at your request, you will be charged for the installation.
• Tier 2 customer replaceable unit: You may install a Tier 2 CRU yourself or request IBM to install it, at no additional charge, under the type of warranty service that is designated for your blade server.
• Field replaceable unit (FRU): FRUs must be installed only by trained service technicians.
For information about the terms of the warranty and getting service and assistance, see the Warranty and Support Information document.
Related documentation
In addition to this document, the following documentation also comes with the blade server:
• Installation and User’s Guide
  This printed document contains general information about the blade server, including how to install supported options and how to configure the blade server.
• Safety Information
  This document is in Portable Document Format (PDF) on the Documentation CD. It contains translated caution and danger statements. Each caution and danger statement that appears in the documentation has a number that you can use to locate the corresponding statement in your language in the Safety Information document.
• Warranty and Support Information
  This document is in PDF on the Documentation CD. It contains information about the terms of the warranty and about service and assistance.
Depending on the blade server model, additional documentation might be included on the Documentation CD.
The blade server might have features that are not described in the documentation that comes with the blade server. The documentation might be updated occasionally to include information about those features, or technical updates might be available to provide additional information that is not included in the blade server documentation. The most recent versions of all BladeCenter documentation are at http://www.ibm.com/systems/support/.
In addition to the documentation in this library, be sure to review the IBM BladeCenter Planning and Installation Guide for your BladeCenter unit type for information to help you prepare for system installation and configuration. This document is also available at http://www.ibm.com/systems/support/.
Notices and statements in this document
The caution and danger statements that appear in this document are also in the multilingual Safety Information document, which is on the Documentation CD. Each statement is numbered for reference to the corresponding statement in the Safety Information document.
The following notices and statements are used in this document:
• Note: These notices provide important tips, guidance, or advice.
• Important: These notices provide information or advice that might help you avoid inconvenient or problem situations.
• Attention: These notices indicate potential damage to programs, devices, or data. An attention notice is placed just before the instruction or situation in which damage could occur.
• Caution: These statements indicate situations that can be potentially hazardous to you. A caution statement is placed just before the description of a potentially hazardous procedure step or situation.
• Danger: These statements indicate situations that can be potentially lethal or extremely hazardous to you. A danger statement is placed just before the description of a potentially lethal or extremely hazardous procedure step or situation.
Features and specifications
The following table is a summary of the features and specifications of the JS21 Types 7988 and 8844 blade servers operating in a non-NEBS/ETSI (a non-Network Equipment Building System/European Telecommunications Standards Institute) environment.
Notes:
• Power, cooling, removable-media drives, external ports, and advanced system management are provided by the BladeCenter unit.
• The operating system in the blade server must provide USB support for the blade server to recognize and use the removable-media drives and front-panel USB ports. The BladeCenter unit uses USB for internal communications with these devices.
Microprocessor:
Support for:
• Two single-core, 64-bit, IBM PowerPC 970MP microprocessors (2.7 GHz in BladeCenter H unit, 2.6 GHz in other BladeCenter units) or
• Two dual-core, 64-bit, IBM PowerPC 970MP microprocessors (2.5 GHz in BladeCenter H unit, 2.3 GHz in other BladeCenter units)
Memory:
• Dual-channel (DDR2) with 4 DIMM slots
• Supports 512 MB, 1 GB, 2 GB, and 4 GB DIMMs, for a maximum of 16 GB (as of the date of this publication)
• Supports 2-way interleaved, DDR2, PC2-3200 or PC2-4200, ECC SDRAM registered x4 (Chipkill) DIMMs
Drives: Support for two internal small-form-factor Serial Attached SCSI (SAS) drives
Integrated functions:
• Two 1 Gigabit Ethernet controllers
• Expansion card interface
• Intelligent Platform Management Interface (IPMI)
• Baseboard management controller (BMC) with IPMI firmware
• ATI RN50 ES1000 video controller
• SAS RAID controller
• Light path diagnostics
• Local service processor (BMC)
• RS-485 interface for communication with the management module
• Automatic server restart (ASR)
• Serial over LAN (SOL)
• Four Universal Serial Bus (USB) buses for communication with keyboard, diskette drive, and CD drive
Predictive Failure Analysis (PFA) alerts:
• Microprocessor
• Memory
Electrical input: 12 V dc
Environment:
• Air temperature:
  – Blade server on: 10° to 35°C (50° to 95°F). Altitude: 0 to 914 m (3000 ft)
  – Blade server on: 10° to 32°C (50° to 90°F). Altitude: 914 m to 2133 m (3000 ft to 7000 ft)
  – Blade server off: -40° to 60°C (-40° to 140°F)
• Humidity:
  – Blade server on: 8% to 80%
  – Blade server off: 5% to 80%
Size:
• Height: 24.5 cm (9.7 inches)
• Depth: 44.6 cm (17.6 inches)
• Width: 2.9 cm (1.14 inches)
• Maximum weight: 5.0 kg (11 lb)
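The powered-on environmental limits listed above can be expressed as a simple range check. The following Python sketch is illustrative only: the function name and parameters are hypothetical and are not part of any IBM tool. It also assumes that an altitude of exactly 914 m falls under the lower-altitude temperature range, because the specification lists 914 m in both ranges.

```python
def within_operating_limits(temp_c: float, altitude_m: float, humidity_pct: float) -> bool:
    """Return True if a powered-on blade server is inside the listed limits.

    Hypothetical helper; the thresholds come from the specification summary.
    """
    # Humidity while the blade server is on: 8% to 80%.
    if not 8 <= humidity_pct <= 80:
        return False
    # 0 to 914 m: 10 to 35 degrees C (914 m itself is assumed to belong here).
    if 0 <= altitude_m <= 914:
        return 10 <= temp_c <= 35
    # Above 914 m and up to 2133 m: 10 to 32 degrees C.
    if 914 < altitude_m <= 2133:
        return 10 <= temp_c <= 32
    # Altitudes outside 0 to 2133 m are not covered by the specification.
    return False
```

For example, 34°C is inside the limits at 500 m but outside them at 1500 m, where the upper limit drops to 32°C.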
Blade server control panel buttons and LEDs
This section describes the blade server control panel buttons and LEDs.
Note: The control panel door is shown in the closed (normal) position in the following illustration. To access the power-control button, you must open the control panel door.
[Illustration: blade server control panel. Callouts: activity LED, location LED, keyboard/video select button, information LED, blade-error LED, media-tray select button, power-control button, power-on LED.]
Keyboard/video select button: When using a supported Linux operating system, press this button to associate the shared BladeCenter unit keyboard and video ports with the blade server.
Notes:
• The use of a mouse or pointing device is not supported by the JS21 blade server.
• The Linux operating system in the blade server must provide USB support for the blade server to recognize and use the keyboard, even if the keyboard has a PS/2-style connector.
• The keyboard and video are available after the Linux operating system loads. Power-on self-test (POST) codes and diagnostics are not supported using the keyboard and video.
• For information about supported Linux operating systems, see http://www.ibm.com/servers/eserver/serverproven/compat/us/.
The LED on this button flashes while the request is being processed, then is lit when the ownership of the keyboard and video has been transferred to the blade server. It can take approximately 20 seconds to switch the keyboard and video control to the blade server.
Using a keyboard that is directly attached to the management module, you can press keyboard keys in the following sequence to switch keyboard and video control between blade servers:
NumLock NumLock blade_server_number Enter
Where blade_server_number is the two-digit number for the blade bay in which the blade server is installed. When using some keyboards, such as the 28L3644 (37L0888) keyboard, you will need to hold down the Shift key while entering this key sequence.
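The key sequence above can be sketched as a list of key presses. The following Python example is illustrative only: the function name, the `hold_shift` flag, and the `Shift+` notation are hypothetical, and the accepted bay range of 1 to 99 is an assumption that reflects only the two-digit format, not the number of bays in any particular BladeCenter unit.

```python
from __future__ import annotations


def kvm_switch_sequence(blade_bay: int, hold_shift: bool = False) -> list[str]:
    """Build the key presses that switch keyboard and video to a blade bay.

    Hypothetical helper: NumLock, NumLock, two-digit bay number, Enter.
    """
    if not 1 <= blade_bay <= 99:
        raise ValueError("blade bay must fit in two digits (assumed range 1-99)")
    # The bay number is always entered as two digits, for example 3 -> "0", "3".
    digits = list(f"{blade_bay:02d}")
    keys = ["NumLock", "NumLock", *digits, "Enter"]
    if hold_shift:
        # Some keyboards require Shift to be held for the whole sequence.
        keys = [f"Shift+{key}" for key in keys]
    return keys
```

Calling `kvm_switch_sequence(3)` yields the presses NumLock, NumLock, 0, 3, Enter.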
If there is no response when you press the keyboard/video select button, you can use the management-module Web interface to determine whether local control has been disabled on the blade server.
Activity LED: When this green LED is lit, it indicates that there is activity on the hard disk drive or network.
Location LED: When this blue LED is lit, it has been turned on by the system administrator to aid in visually locating the blade server. The location LED can be turned off through the management-module Web interface or through IBM Director Console.
Information LED: When this amber LED is lit, it indicates that information about a system error for the blade server has been placed in the Management Module Event Log. The information LED can be turned off through the management-module Web interface or through IBM Director Console.
Blade-error LED: When this amber LED is lit, it indicates that a system error has occurred in the blade server. The blade-error LED will turn off only after the error is corrected.
Media-tray select button: Press this button to associate the shared BladeCenter unit media tray (removable-media drives and front-panel USB ports) with the blade server. The LED on the button flashes while the request is being processed, then is lit when the ownership of the media tray has been transferred to the blade server. It can take approximately 20 seconds for the operating system in the blade server to recognize the media tray.
If there is no response when you press the media-tray select button, you can use the management-module Web interface to determine whether local control has been disabled on the blade server.
Note: The operating system in the blade server must provide USB support for the blade server to recognize and use the removable-media drives and USB ports.
Power-control button: This button is behind the control panel door. Press this button to turn on or turn off the blade server.
Note: The power-control button has effect only if local power control is enabled for the blade server. Local power control is enabled and disabled through the management-module Web interface.
Power-on LED: This green LED indicates the power status of the blade server in the following manner:
• Flashing rapidly: The service processor (BMC) on the blade server is communicating with the management module.
• Flashing slowly: The blade server has power but is not turned on.
• Lit continuously: The blade server has power and is turned on.
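For monitoring scripts or checklists, the three power-on LED states can be captured in a small lookup table. The following Python sketch is illustrative only; the enum and function names are hypothetical, and the meanings are copied from the list above.

```python
from enum import Enum


class PowerLedState(Enum):
    """Observed states of the green power-on LED (hypothetical names)."""
    FLASHING_RAPIDLY = "flashing rapidly"
    FLASHING_SLOWLY = "flashing slowly"
    LIT_CONTINUOUSLY = "lit continuously"


# Meaning of each state, as described in this section.
POWER_LED_MEANING = {
    PowerLedState.FLASHING_RAPIDLY: "BMC is communicating with the management module",
    PowerLedState.FLASHING_SLOWLY: "blade server has power but is not turned on",
    PowerLedState.LIT_CONTINUOUSLY: "blade server has power and is turned on",
}


def is_turned_on(state: PowerLedState) -> bool:
    """Only a continuously lit LED indicates that the blade server is on."""
    return state is PowerLedState.LIT_CONTINUOUSLY
```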
Turning on the blade server
After you connect the blade server to power through the BladeCenter unit, the blade server can start in any of the following ways:
• You can press the power-control button on the front of the blade server (behind the control panel door, see “Blade server control panel buttons and LEDs” on page 4) to start the blade server.
  Notes:
  1. Wait until the power-on LED on the blade server flashes slowly before pressing the blade server power-control button. If the power-on LED is flashing rapidly, the service processor in the management module is initializing; therefore, the power-control button on the blade server does not respond.
  2. While the blade server is starting, the power-on LED on the front of the blade server is lit. See “Blade server control panel buttons and LEDs” on page 4 for the power-on LED states.
• If a power failure occurs, the BladeCenter unit and then the blade server can start automatically when power is restored (if the blade server is configured through the management module to do so).
• You can turn on the blade server remotely by using the management module.
• If the blade server is connected to power (the power-on LED is flashing slowly), the operating system supports the Wake on LAN feature, and the Wake on LAN feature has not been disabled through the management module, the Wake on LAN feature can turn on the blade server. However, the blade server can receive the Wake on LAN command only through the Ethernet ports that are integrated into the system board, not through the Ethernet ports on an installed I/O expansion card.
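The Wake on LAN conditions in the last item combine into a single yes-or-no test. The following Python sketch is illustrative only; the function and parameter names are hypothetical and do not correspond to any management-module interface.

```python
def wake_on_lan_can_start(
    led_flashing_slowly: bool,
    os_supports_wol: bool,
    wol_disabled_in_management_module: bool,
    command_on_integrated_port: bool,
) -> bool:
    """Return True if a Wake on LAN command can turn on the blade server.

    Hypothetical helper that mirrors the conditions listed in this section.
    """
    return (
        led_flashing_slowly  # connected to power but not turned on
        and os_supports_wol
        and not wol_disabled_in_management_module
        # Commands arriving on an I/O expansion card port are not honored.
        and command_on_integrated_port
    )
```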
Turning off the blade server
When you turn off the blade server, it is still connected to power through the BladeCenter unit. The blade server can respond to requests from the service processor, such as a remote request to turn on the blade server. To remove all power from the blade server, you must remove it from the BladeCenter unit.
Shut down the operating system before you turn off the blade server. See the operating-system documentation for information about shutting down the operating system.
The blade server can be turned off in any of the following ways:
• You can press the power-control button on the blade server (behind the control panel door, see “Blade server control panel buttons and LEDs” on page 4). This also starts an orderly shutdown of the operating system, if this feature is supported by the operating system.
  Note: After turning off the blade server, wait at least 5 seconds before you press the power-control button to turn on the blade server again.
• If the operating system stops functioning, you can press and hold the power-control button for more than 4 seconds to turn off the blade server.
• The management module can turn off the blade server.
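The two timing rules for the power-control button (hold for more than 4 seconds to force the blade server off, and wait at least 5 seconds before turning it on again) can be sketched as follows. This Python example is illustrative only; the function names and return strings are hypothetical.

```python
FORCE_OFF_HOLD_SECONDS = 4.0    # hold longer than this to force power-off
MIN_WAIT_BEFORE_ON_SECONDS = 5.0  # minimum wait after turning off


def power_button_action(hold_seconds: float) -> str:
    """Classify a press of the power-control button on a running blade server."""
    if hold_seconds > FORCE_OFF_HOLD_SECONDS:
        return "forced power-off"
    # A normal press turns off the blade server and, if the operating
    # system supports it, starts an orderly shutdown first.
    return "power-off with orderly shutdown if supported"


def may_turn_on_again(seconds_since_power_off: float) -> bool:
    """Return True once the recommended wait after power-off has elapsed."""
    return seconds_since_power_off >= MIN_WAIT_BEFORE_ON_SECONDS
```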
System-board layouts
The following illustrations show the connectors, jumpers, and LEDs on the system board. The illustrations in this document might differ slightly from your hardware.
System-board connectors
The following illustration shows the connectors on the system board.
[Illustration: system-board connectors. Callouts: I/O expansion option (J18), I/O expansion option (J22), blade expansion option (J200), hard disk drive 0 (J500), hard disk drive 1 (J501), DIMM 1 (J400), DIMM 2 (J401), DIMM 3 (J402), DIMM 4 (J403), control panel (J4), battery (BH1).]
System-board jumpers
The following illustration shows the jumpers on the system board.
[Illustration: system-board jumpers. Callout: BIOS code page jumper (J14), pins 3 2 1]
System-board LEDs
The following illustration shows the LEDs on the system board. You have to remove the blade server from the BladeCenter unit, open the cover, and press the light path diagnostics switch to light any error LEDs that were turned on during processing.
[Illustration: system-board LEDs. Callouts: DIMM 1 error LED (CR40); DIMM 2 error LED (CR45); DIMM 3 error LED (CR46); DIMM 4 error LED (CR53); I/O expansion option error LED (CR34); System-management processor error LED (CR27); NMI error LED (CR17); Temperature error LED (CR16); System board error LED (CR20); Microprocessor 1 error LED (CR19); Microprocessor 0 error LED (CR58); Light path diagnostics switch (SW1); Hard disk drive 1 error LED (CR3); Hard disk drive 0 error LED (CR4)]
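When recording which error LEDs are lit after pressing the light path diagnostics switch, a small lookup from board designator to LED name can help. The Python sketch below is illustrative only: the designators and names are copied from the illustration callouts above, while the dictionary and function names are assumptions.

```python
# Board designator -> component, as labeled in the system-board LED illustration.
ERROR_LEDS = {
    "CR40": "DIMM 1",
    "CR45": "DIMM 2",
    "CR46": "DIMM 3",
    "CR53": "DIMM 4",
    "CR34": "I/O expansion option",
    "CR27": "System-management processor",
    "CR17": "NMI",
    "CR16": "Temperature",
    "CR20": "System board",
    "CR19": "Microprocessor 1",
    "CR58": "Microprocessor 0",
    "CR3": "Hard disk drive 1",
    "CR4": "Hard disk drive 0",
}


def describe_lit_leds(designators):
    """Return one readable line per lit LED designator, in the order given."""
    lines = []
    for designator in designators:
        key = designator.upper()
        if key in ERROR_LEDS:
            lines.append(f"{key}: {ERROR_LEDS[key]} error LED")
        else:
            lines.append(f"{key}: unknown designator")
    return lines
```

For example, `describe_lit_leds(["CR40", "CR4"])` returns `["CR40: DIMM 1 error LED", "CR4: Hard disk drive 0 error LED"]`.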
Chapter 2. Diagnostics
This chapter describes the diagnostic tools that are available to help you solve problems that might occur in the blade server.
If you cannot locate and correct the problem using the information in this chapter, see Appendix A, “Getting help and technical assistance,” on page 171 for more information.
Diagnostic tools
The following tools are available to help you diagnose and solve hardware-related problems:
• POST checkpoints
  The power-on self-test (POST) in the firmware generates eight-digit checkpoint codes. If the firmware detects a problem during POST, an eight-digit error code will be displayed. See “POST checkpoint codes” for more information.
• Troubleshooting tables
  These tables list problem symptoms and actions to correct the problems. See “Troubleshooting tables” on page 113 for more information.
• Light path diagnostics
  Use the light path diagnostics to diagnose system errors quickly. See “Light path diagnostics” on page 124 for more information.
POST checkpoint codes
When you turn on the blade server, it performs a series of tests to check the operation of the blade server components. This series of tests is called the power-on self-test, or POST. During POST, a series of eight-digit progress codes (also known as checkpoints) is displayed on the console to indicate that the blade server is initializing system resources.
Note: You must establish an SOL session with the blade server to view the codes described in this section; the shared BladeCenter unit video cannot display these codes.
If POST is completed without detecting any problems, the firmware displays a checkpoint indicating that an operating system is being loaded. Location code information might also be displayed on the operator panel during this time (see “Location codes” on page 66).
If POST detects a problem, an eight-digit error code will be displayed and logged in the BladeCenter management module event log. See “Attention codes” on page 34 and “Error codes” on page 37 for more information. A location code might be displayed at the same time on the second line (see “Location codes” on page 66).
Note: Some POST codes might not be displayed on the operator panel. You can view these codes by using the Progress Indicator History option in the SMS utility (see “Using the SMS utility” on page 166).
Progress codes enable users and service personnel to know what the system is doing as it initializes. These codes are not intended to be error indicators, but in some cases a system could hang at one of the progress codes without displaying an eight-digit error code. Any actions associated with the progress codes should be taken only if the system hangs.
Progress codes
The following table lists the progress codes that may be displayed by the POST, and the suggested actions to take if the system hangs on the progress code.
In the following progress codes, X can be any number or letter.
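The wildcard convention can be sketched in Python as follows. This is an illustration only: the function name and the sample pattern `C200XXXX` are hypothetical, and the sketch assumes that every X in a listed code stands for exactly one digit or letter.

```python
import re


def progress_code_matches(pattern: str, code: str) -> bool:
    """Return True if an eight-character code matches a pattern in which X is a wildcard."""
    if len(pattern) != 8 or len(code) != 8:
        return False
    # Each X matches any single digit or letter; other characters match literally.
    regex = "".join(
        "[0-9A-Z]" if ch == "X" else re.escape(ch) for ch in pattern.upper()
    )
    return re.fullmatch(regex, code.upper()) is not None
```

With the hypothetical pattern `C200XXXX`, the code `C2001000` matches and the code `C1001000` does not.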
Notes:
1. For checkpoints with no associated location code, see “Light path diagnostics” on page 124 to identify the failing component.
2. For checkpoints with location codes, see “Location codes” on page 66.
3. For problems persisting after completing the suggested actions, see “Checkout procedure” on page 106 and “Solving undetermined problems” on page 135.
4. For eight-digit codes not listed here, see “Checkout procedure” on page 106.
• If the system hangs on a progress code, follow the suggested actions in the order in which they are listed in the Action column until the problem is resolved.
• See Chapter 3, “Parts listing, Types 7988 and 8844,” on page 137 to determine which components are CRUs and which components are FRUs.
• If an action step is preceded by “(Trained service technician only),” that step must be performed only by a trained service technician.
Progress code Description Action
C2001000 Partition auto-startup during a platform startup
1. Go to “Recovering the system firmware” on page 127.
2. Replace the system-board and chassis assembly.
C2001010 Startup source
1. Go to “Recovering the system firmware” on page 127.
2. Replace the system-board and chassis assembly.
C2001100 Adding partition resources to the secondary configuration
1. Go to “Recovering the system firmware” on page 127.
2. Replace the system-board and chassis assembly.
C20011FF Partition resources added successfully
1. Go to “Recovering the system firmware” on page 127.
2. Replace the system-board and chassis assembly.
C2001200 Checking if startup is allowed
1. Go to “Recovering the system firmware” on page 127.
2. Replace the system-board and chassis assembly.
C20012FF Partition startup is allowed to proceed
1. Go to “Recovering the system firmware” on page 127.
2. Replace the system-board and chassis assembly.
C2001300 Initializing ISL roadmap
1. Go to “Recovering the system firmware” on page 127.
2. Replace the system-board and chassis assembly.
C20013FF ISL roadmap initialized successfully
1. Go to “Recovering the system firmware” on page 127.
2. Replace the system-board and chassis assembly.
C2001400 Initializing SP Communication Area #1
1. Go to “Recovering the system firmware” on page 127.
2. Replace the system-board and chassis assembly.
C2001410 Initializing startup parameters
1. Go to “Recovering the system firmware” on page 127.
2. Replace the system-board and chassis assembly.
C20014FF Startup parameters initialized successfully
1. Go to “Recovering the system firmware” on page 127.
2. Replace the system-board and chassis assembly.
C2002100 Power on racks
1. Go to “Recovering the system firmware” on page 127.
2. Replace the system-board and chassis assembly.
C2002110 Issuing a power on command
1. Go to “Recovering the system firmware” on page 127.
2. Replace the system-board and chassis assembly.
C200211F Power on command successful
1. Go to “Recovering the system firmware” on page 127.
2. Replace the system-board and chassis assembly.
C20021FF Power on phase complete
1. Go to “Recovering the system firmware” on page 127.
2. Replace the system-board and chassis assembly.
C2002200 Begin acquiring slot locks
1. Go to “Recovering the system firmware” on page 127.
2. Replace the system-board and chassis assembly.
C20022FF End acquiring slot locks
1. Go to “Recovering the system firmware” on page 127.
2. Replace the system-board and chassis assembly.
C2002300 Begin acquiring VIO slot locks
1. Go to “Recovering the system firmware” on page 127.
2. Replace the system-board and chassis assembly.
C20023FF End acquiring VIO slot locks
1. Go to “Recovering the system firmware” on page 127.
2. Replace the system-board and chassis assembly.
C2002400 Begin powering on slots
1. Go to “Recovering the system firmware” on page 127.
2. Replace the system-board and chassis assembly.
C2002450 Waiting for power on of slots to complete
1. Go to “Recovering the system firmware” on page 127.
2. Replace the system-board and chassis assembly.
C20024FF End powering on slots
1. Go to “Recovering the system firmware” on page 127.
2. Replace the system-board and chassis assembly.
C2002500 Begin power on VIO slots
1. Go to “Recovering the system firmware” on page 127.
2. Replace the system-board and chassis assembly.
C20025FF End powering on VIO slots
1. Go to “Recovering the system firmware” on page 127.
2. Replace the system-board and chassis assembly.
C2003100 Validating ISL command parameters
1. Go to “Recovering the system firmware” on page 127.
2. Replace the system-board and chassis assembly.
C2003111 Waiting for bus object to become operational
1. Go to “Recovering the system firmware” on page 127.
2. Replace the system-board and chassis assembly.
C2003112 Waiting for bus unit to become disabled
1. Go to “Recovering the system firmware” on page 127.
2. Replace the system-board and chassis assembly.
C2003115 Waiting for creation of bus object
1. Go to “Recovering the system firmware” on page 127.
2. Replace the system-board and chassis assembly.
C2003150 Sending ISL command to bus unit
1. Go to “Recovering the system firmware” on page 127.
2. Replace the system-board and chassis assembly.
C20031FF Waiting for ISL command completion
1. Go to “Recovering the system firmware” on page 127.
2. Replace the system-board and chassis assembly.
C20032FF ISL command completed successfully
1. Go to “Recovering the system firmware” on page 127.
2. Replace the system-board and chassis assembly.
C2003300 Start SoftPOR of a failed ISL slot
1. Go to “Recovering the system firmware” on page 127.
2. Replace the system-board and chassis assembly.
C2003350 Waiting for SoftPOR of a failed ISL slot
1. Go to “Recovering the system firmware” on page 127.
2. Replace the system-board and chassis assembly.
C20033FF Finish SoftPOR of a failed ISL slot
1. Go to “Recovering the system firmware” on page 127.
2. Replace the system-board and chassis assembly.
C2004100 Waiting for load source device to enlist
1. Go to “Recovering the system firmware” on page 127.
2. Replace the system-board and chassis assembly.
C2004200 Load source device has enlisted
1. Go to “Recovering the system firmware” on page 127.
2. Replace the system-board and chassis assembly.
C2004300 Preparing connection to load source device
1. Go to “Recovering the system firmware” on page 127.
2. Replace the system-board and chassis assembly.
C20043FF Load source device is connected
1. Go to “Recovering the system firmware” on page 127.
2. Replace the system-board and chassis assembly.
C2006000 Locating first LID information on the load source
1. Go to “Recovering the system firmware” on page 127.
2. Replace the system-board and chassis assembly.
C2006005 Clearing all partition main store
1. Go to “Recovering the system firmware” on page 127.
2. Replace the system-board and chassis assembly.
C2006010 Locating next LID information on the load source
1. Go to “Recovering the system firmware” on page 127.
2. Replace the system-board and chassis assembly.
C2006020 Verifying LID information
1. Go to “Recovering the system firmware” on page 127.
2. Replace the system-board and chassis assembly.
C2006030 Priming LP configuration LID
1. Go to “Recovering the system firmware” on page 127.
2. Replace the system-board and chassis assembly.
C2006040 Preparing to initiate LID load from load source
1. Go to “Recovering the system firmware” on page 127.
2. Replace the system-board and chassis assembly.
C2006050 LP configuration LID primed successfully
1. Go to “Recovering the system firmware” on page 127.
2. Replace the system-board and chassis assembly.
C2006060 Waiting for LID load to complete
1. Go to “Recovering the system firmware” on page 127.
2. Replace the system-board and chassis assembly.
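Because every progress code listed so far carries the same two suggested actions, the table lends itself to a simple lookup. The Python sketch below is a hypothetical helper, not an IBM tool: it holds only a small excerpt of the codes above, and a code that is absent from the excerpt is not necessarily absent from the full table.

```python
# Suggested actions shared by the progress codes in this part of the table.
HANG_ACTIONS = (
    'Go to "Recovering the system firmware" on page 127.',
    "Replace the system-board and chassis assembly.",
)

# Excerpt only; descriptions are copied from the table above.
PROGRESS_CODES = {
    "C2001000": "Partition auto-startup during a platform startup",
    "C2002100": "Power on racks",
    "C2004100": "Waiting for load source device to enlist",
    "C2006060": "Waiting for LID load to complete",
}


def lookup_progress_code(code: str):
    """Return (description, ordered actions) for a code in the excerpt, else None."""
    description = PROGRESS_CODES.get(code.upper())
    if description is None:
        return None  # not in this excerpt; consult the full table
    return description, list(HANG_ACTIONS)
```

The actions are returned in the order in which they should be tried, and apply only if the system hangs on the progress code.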