IBM 7378, 7379, System x3400 M3 Problem Determination And Service Manual

IBM System x3400 M3 Types 7378 and 7379

Problem Determination and Service Guide
Note: Before using this information and the product it supports, read the general information in Appendix B, “Notices,” on page 327, and the IBM Safety Information, Environmental Notices and User Guide documents on the IBM Documentation CD, and the Warranty Information document that comes with the server.
Thirteenth Edition (June 2014)
© Copyright IBM Corporation 2014.

Contents

Safety ............................vii
Guidelines for trained service technicians ...............viii
Inspecting for unsafe conditions .................viii
Guidelines for servicing electrical equipment .............ix
Safety statements ........................x
Chapter 1. Start here.......................1
Diagnosing a problem .......................1
Undocumented problems .....................4
Chapter 2. Introduction ......................5
Related documentation ......................5
Notices and statements in this document................6
Features and specifications.....................7
Server controls, LEDs, and connectors ................9
Rear view ..........................11
Power-supply LEDs ......................12
Internal LEDs, connectors, and jumpers................15
System board internal connectors .................15
System board switches and jumpers ................17
System board LEDs ......................19
System board external connectors .................20
Hard disk drive backplane connectors ...............20
Chapter 3. Diagnostics .....................23
Diagnostic tools ........................23
Event logs ..........................23
Viewing event logs through the Setup utility .............24
Viewing event logs without restarting the server ............24
POST error codes........................26
System-event log ........................37
Integrated management module error messages ............37
Checkout procedure .......................73
About the checkout procedure ..................73
Performing the checkout procedure ................74
Troubleshooting tables ......................75
DVD drive problems ......................75
General problems .......................76
Hard disk drive problems ....................76
Hypervisor problems ......................77
Intermittent problems......................78
Keyboard, mouse, or pointing-device problems ............78
Memory problems .......................80
Microprocessor problems ....................82
Monitor problems .......................83
Optional-device problems ....................85
Power problems .......................86
Serial port problems ......................87
ServerGuide problems .....................88
Software problems ......................89
Universal Serial Bus (USB) port problems ..............89
Light path diagnostics ......................90
Power-supply LEDs ......................96
Diagnostic programs, messages, and error codes ............97
Running the diagnostic programs .................97
Diagnostic text messages ....................97
Viewing the test log ......................98
Diagnostic messages .....................98
Recovering the server firmware ..................134
Automated boot recovery (ABR) ..................136
Nx boot failure ........................136
Solving power problems .....................137
Solving Ethernet controller problems ................137
Solving undetermined problems ..................138
Problem determination tips ....................139
Chapter 4. Parts listing, System x3400 M3 Types 7378 and 7379 .....141
Replaceable server components ..................142
Power cords .........................145
Chapter 5. Removing and replacing server components ........149
Installation guidelines ......................149
System reliability guidelines...................150
Working inside the server with the power on ............150
Handling static-sensitive devices .................151
Returning a device or component ................151
Opening the bezel media door...................151
Closing the bezel media door ...................154
Opening the power-supply cage ..................154
Closing the power-supply cage ..................156
Removing a ServeRAID adapter battery ...............157
Installing a ServeRAID adapter battery................158
Removing the battery ......................160
Installing the battery ......................162
Internal cable routing and connectors ................164
Tape drive cable connection ..................164
DVD drive cable connection...................167
Operator information panel cable connection ............168
Hard disk drive cable connection .................170
Removing and replacing Tier 1 CRUs ................181
Removing the left-side cover ..................181
Installing the left-side cover ...................181
Removing and installing drives .................183
Removing a 2.5-inch hot-swap hard disk drive ............186
Installing a 2.5-inch hot-swap hard disk drive ............188
Removing a 3.5-inch hot-swap hard disk drive ............190
Installing a 3.5-inch hot-swap hard disk drive ............192
Removing a simple-swap hard disk drive ..............194
Installing a simple-swap hard disk drive ..............196
Removing a hot-swap fan ...................197
Installing a hot-swap fan ....................197
Removing a DVD drive ....................199
Installing a DVD drive .....................203
Removing the air baffle ....................212
Installing the air baffle .....................213
Removing an adapter .....................214
Installing an adapter .....................215
Removing the rear adapter-retention bracket ............218
Installing the rear adapter-retention bracket .............219
Removing and replacing Tier 2 CRUs ................220
Removing the operator information panel assembly ..........220
Installing the operator information panel assembly ..........222
Removing a voltage regulator module ...............223
Installing a voltage regulator module ...............224
Installing memory ......................226
Removing a USB embedded hypervisor flash device .........234
Installing a USB embedded hypervisor flash device ..........236
Removing an optional ServeRAID adapter advanced feature key .....236
Installing an optional ServeRAID adapter advanced feature key .....238
Removing the bezel .....................239
Installing the bezel ......................242
Removing the fan cage assembly ................243
Installing the fan cage assembly .................244
Removing an optional tape drive .................246
Installing an optional tape drive .................248
Removing the USB cable and light path diagnostics assembly ......250
Installing the USB cable and light path diagnostics assembly ......252
Removing a 2.5-inch disk drive backplane .............254
Installing a 2.5-inch disk drive backplane ..............256
Removing the 3.5-inch hot-swap hard disk drive backplane .......257
Installing the 3.5-inch hard disk drive backplane ...........259
Removing the simple-swap backplate ...............261
Installing the simple-swap backplate ...............263
Removing the 2.5-inch disk drive cage...............265
Installing the 2.5-inch disk drive cage ...............267
Removing and replacing FRUs ..................268
Removing the upper 3.5-inch disk drive cage ............268
Installing the upper 3.5-inch disk drive cage .............270
Turning the stabilizing feet ...................272
Removing a hot-swap power supply................273
Installing a hot-swap power supply ................274
Removing the power-supply cage ................276
Installing the power-supply cage .................278
Removing an extender card...................281
Installing an extender card ...................283
Removing a microprocessor and heat sink .............284
Installing a microprocessor and heat sink..............286
Removing a heat-sink retention module ..............292
Installing a heat-sink retention module ...............293
Removing a microprocessor retention module ............294
Installing a microprocessor retention module ............295
Removing the system board ..................296
Installing the system board ...................298
Chapter 6. Configuration information and instructions ........301
Updating the firmware ......................302
Using the Setup utility ......................303
Starting the Setup utility ....................303
Setup utility menu choices ...................303
Passwords .........................306
Using the Boot Selection Menu program ...............308
Starting the backup server firmware.................308
Using the ServerGuide Setup and Installation CD............308
ServerGuide features .....................309
Setup and configuration overview ................309
Typical operating-system installation ...............310
Installing your operating system without using ServerGuide .......310
Changing the Power Policy option to the default settings after loading UEFI defaults ..........................310
Using the integrated management module ..............311
Using the embedded hypervisor ..................312
Using the remote presence capability and blue-screen capture .......313
Enabling the remote presence feature ...............313
Obtaining the IP address for the Web interface access .........314
Logging on to the Web interface .................314
Enabling the Broadcom Gigabit Ethernet Utility.............315
Configuring the Gigabit Ethernet controller ..............315
Using the LSI Configuration Utility .................315
Starting the LSI Configuration Utility program ............316
Formatting a hard disk drive ..................317
Creating a RAID array of hard disk drives .............317
IBM Advanced Settings Utility ...................317
Updating IBM Systems Director ..................318
Updating the Universal Unique Identifier (UUID) ............319
Updating the DMI/SMBIOS data ..................321
Appendix A. Getting help and technical assistance ..........325
Before you call ........................325
Using the documentation.....................325
Getting help and information from the World Wide Web .........325
Software service and support ...................326
Hardware service and support ...................326
IBM Taiwan product service ....................326
Appendix B. Notices ......................327
Trademarks..........................327
Important notes ........................328
Particulate contamination.....................329
Documentation format ......................329
Telecommunication regulatory statement ...............330
Electronic emission notices ....................330
Federal Communications Commission (FCC) statement ........330
Industry Canada Class A emission compliance statement ........330
Avis de conformité à la réglementation d'Industrie Canada .......330
Australia and New Zealand Class A statement ............330
European Union EMC Directive conformance statement ........331
Germany Class A statement ..................331
VCCI Class A statement ....................332
Japan Electronics and Information Technology Industries Association (JEITA) statement ........................332
Korean Class A warning statement ................332
Russia Electromagnetic Interference (EMI) Class A statement ......333
People's Republic of China Class A electronic emission statement ....333
Taiwan Class A compliance statement ...............333
Index ............................335

Safety

Before installing this product, read the Safety Information.
Antes de instalar este produto, leia as Informações de Segurança.
Læs sikkerhedsforskrifterne, før du installerer dette produkt.
Lees voordat u dit product installeert eerst de veiligheidsvoorschriften.
Ennen kuin asennat tämän tuotteen, lue turvaohjeet kohdasta Safety Information.
Avant d'installer ce produit, lisez les consignes de sécurité.
Vor der Installation dieses Produkts die Sicherheitshinweise lesen.
Prima di installare questo prodotto, leggere le Informazioni sulla Sicurezza.
Les sikkerhetsinformasjonen (Safety Information) før du installerer dette produktet.
Antes de instalar este produto, leia as Informações sobre Segurança.
Antes de instalar este producto, lea la información de seguridad.
Läs säkerhetsinformationen innan du installerar den här produkten.

Guidelines for trained service technicians

This section contains information for trained service technicians.

Inspecting for unsafe conditions

Use the information in this section to help you identify potential unsafe conditions in an IBM product that you are working on. Each IBM product, as it was designed and manufactured, has required safety items to protect users and service technicians from injury. The information in this section addresses only those items. Use good judgment to identify potential unsafe conditions that might be caused by non-IBM alterations or attachment of non-IBM features or options that are not addressed in this section. If you identify an unsafe condition, you must determine how serious the hazard is and whether you must correct the problem before you work on the product.
Consider the following conditions and the safety hazards that they present:
• Electrical hazards, especially primary power. Primary voltage on the frame can cause serious or fatal electrical shock.
• Explosive hazards, such as a damaged CRT face or a bulging capacitor.
• Mechanical hazards, such as loose or missing hardware.
To inspect the product for potential unsafe conditions, complete the following steps:
1. Make sure that the power is off and the power cord is disconnected.
2. Make sure that the exterior cover is not damaged, loose, or broken, and observe any sharp edges.
3. Check the power cord:
• Make sure that the third-wire ground connector is in good condition. Use a meter to measure third-wire ground continuity for 0.1 ohm or less between the external ground pin and the frame ground.
• Make sure that the power cord is the correct type, as specified in “Power cords” on page 145.
• Make sure that the insulation is not frayed or worn.
4. Remove the cover.
5. Check for any obvious non-IBM alterations. Use good judgment as to the safety of any non-IBM alterations.
6. Check inside the server for any obvious unsafe conditions, such as metal filings, contamination, water or other liquid, or signs of fire or smoke damage.
7. Check for worn, frayed, or pinched cables.
8. Make sure that the power-supply cover fasteners (screws or rivets) have not been removed or tampered with.
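The ground continuity limit in step 3 can be expressed as a simple pass/fail check. The following is a minimal Python sketch for recording meter readings during an inspection; the function name and constant are hypothetical, and only the 0.1 ohm limit comes from this section.

```python
# Limit taken from step 3 of the inspection procedure: 0.1 ohm or less
# between the external ground pin and the frame ground.
MAX_GROUND_RESISTANCE_OHMS = 0.1


def ground_continuity_ok(measured_ohms: float) -> bool:
    """Return True when a measured third-wire ground resistance passes.

    Hypothetical helper for logging inspection results; it does not
    replace the meter measurement itself.
    """
    if measured_ohms < 0:
        raise ValueError("resistance cannot be negative")
    return measured_ohms <= MAX_GROUND_RESISTANCE_OHMS
```

A reading that fails this check indicates that the power cord or ground path should be treated as suspect before work continues.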

Guidelines for servicing electrical equipment

Observe the following guidelines when servicing electrical equipment:
• Check the area for electrical hazards such as moist floors, nongrounded power extension cords, power surges, and missing safety grounds.
• Use only approved tools and test equipment. Some hand tools have handles that are covered with a soft material that does not provide insulation from live electrical currents.
• Regularly inspect and maintain your electrical hand tools for safe operational condition. Do not use worn or broken tools or testers.
• Do not touch the reflective surface of a dental mirror to a live electrical circuit. The surface is conductive and can cause personal injury or equipment damage if it touches a live electrical circuit.
• Some rubber floor mats contain small conductive fibers to decrease electrostatic discharge. Do not use this type of mat to protect yourself from electrical shock.
• Do not work alone under hazardous conditions or near equipment that has hazardous voltages.
• Locate the emergency power-off (EPO) switch, disconnecting switch, or electrical outlet so that you can turn off the power quickly in the event of an electrical accident.
• Disconnect all power before you perform a mechanical inspection, work near power supplies, or remove or install main units.
• Before you work on the equipment, disconnect the power cord. If you cannot disconnect the power cord, have the customer power-off the wall box that supplies power to the equipment and lock the wall box in the off position.
• Never assume that power has been disconnected from a circuit. Check it to make sure that it has been disconnected.
• If you have to work on equipment that has exposed electrical circuits, observe the following precautions:
– Make sure that another person who is familiar with the power-off controls is near you and is available to turn off the power if necessary.
– When you are working with powered-on electrical equipment, use only one hand. Keep the other hand in your pocket or behind your back to avoid creating a complete circuit that could cause an electrical shock.
– When you use a tester, set the controls correctly and use the approved probe leads and accessories for that tester.
– Stand on a suitable rubber mat to insulate you from grounds such as metal floor strips and equipment frames.
• Use extreme care when you measure high voltages.
• To ensure proper grounding of components such as power supplies, pumps, blowers, fans, and motor generators, do not service these components outside of their normal operating locations.
• If an electrical accident occurs, use caution, turn off the power, and send another person to get medical aid.

Safety statements

Important:
Each caution and danger statement in this document is labeled with a number. This number is used to cross reference an English-language caution or danger statement with translated versions of the caution or danger statement in the Safety Information document.
For example, if a caution statement is labeled "Statement 1," translations for that caution statement are in the Safety Information document under "Statement 1."
Be sure to read all caution and danger statements in this document before you perform the procedures. Read any additional safety information that comes with the server or optional device before you install the device.
Attention: Use No. 26 AWG or larger UL-listed or CSA certified telecommunication line cord.
Statement 1:
DANGER
Electrical current from power, telephone, and communication cables is hazardous.
To avoid a shock hazard:
• Do not connect or disconnect any cables or perform installation, maintenance, or reconfiguration of this product during an electrical storm.
• Connect all power cords to a properly wired and grounded electrical outlet.
• Connect to properly wired outlets any equipment that will be attached to this product.
• When possible, use one hand only to connect or disconnect signal cables.
• Never turn on any equipment when there is evidence of fire, water, or structural damage.
• Disconnect the attached power cords, telecommunications systems, networks, and modems before you open the device covers, unless instructed otherwise in the installation and configuration procedures.
• Connect and disconnect cables as described in the following table when installing, moving, or opening covers on this product or attached devices.
To Connect:
1. Turn everything OFF.
2. First, attach all cables to devices.
3. Attach signal cables to connectors.
4. Attach power cords to outlet.
5. Turn device ON.

To Disconnect:
1. Turn everything OFF.
2. First, remove power cords from outlet.
3. Remove signal cables from connectors.
4. Remove all cables from devices.
Statement 2:
CAUTION: When replacing the lithium battery, use only IBM Part Number 33F8354 or an equivalent type battery recommended by the manufacturer. If your system has a module containing a lithium battery, replace it only with the same module type made by the same manufacturer. The battery contains lithium and can explode if not properly used, handled, or disposed of.
Do not:
• Throw or immerse into water
• Heat to more than 100°C (212°F)
• Repair or disassemble
Dispose of the battery as required by local ordinances or regulations.
Statement 3:
CAUTION: When laser products (such as CD-ROMs, DVD drives, fiber optic devices, or transmitters) are installed, note the following:
• Do not remove the covers. Removing the covers of the laser product could result in exposure to hazardous laser radiation. There are no serviceable parts inside the device.
• Use of controls or adjustments or performance of procedures other than those specified herein might result in hazardous radiation exposure.
DANGER
Some laser products contain an embedded Class 3A or Class 3B laser diode. Note the following.
Laser radiation when open. Do not stare into the beam, do not view directly with optical instruments, and avoid direct exposure to the beam.
Class 1 Laser Product Laser Klasse 1 Laser Klass 1 Luokan 1 Laserlaite Appareil A Laser de Classe 1
Statement 4:
18 kg (39.7 lb) 32 kg (70.5 lb) 55 kg (121.2 lb)
CAUTION: Use safe practices when lifting.
Statement 5:
CAUTION: The power control button on the device and the power switch on the power supply do not turn off the electrical current supplied to the device. The device also might have more than one power cord. To remove all electrical current from the device, ensure that all power cords are disconnected from the power source.
Statement 8:
CAUTION: Never remove the cover on a power supply or any part that has the following label attached.
Hazardous voltage, current, and energy levels are present inside any component that has this label attached. There are no serviceable parts inside these components. If you suspect a problem with one of these parts, contact a service technician.
Statement 11:
CAUTION: The following label indicates sharp edges, corners, or joints nearby.
Statement 12:
CAUTION: The following label indicates a hot surface nearby.
Statement 13:
DANGER
Overloading a branch circuit is potentially a fire hazard and a shock hazard under certain conditions. To avoid these hazards, ensure that your system electrical requirements do not exceed branch circuit protection requirements. Refer to the information that is provided with your device for electrical specifications.
Statement 15:
CAUTION: Make sure that the rack is secured properly to avoid tipping when the server unit is extended.
Statement 17:
CAUTION: The following label indicates moving parts nearby.
Statement 26:
CAUTION: Do not place any object on top of rack-mounted devices.
Attention: This server is suitable for use on an IT power distribution system whose maximum phase-to-phase voltage is 240 V under any distribution fault condition.

Chapter 1. Start here

You can solve many problems without outside assistance by following the troubleshooting procedures in this Problem Determination and Service Guide and on the IBM Web site. This document describes the diagnostic tests that you can perform, troubleshooting procedures, and explanations of error messages and error codes. The documentation that comes with your operating system and software also contains troubleshooting information.

Diagnosing a problem

Before you contact IBM or an approved warranty service provider, follow these procedures in the order in which they are presented to diagnose a problem with your server:
1. Determine what has changed.
Determine whether any of the following items were added, removed, replaced, or updated before the problem occurred:
• IBM System x Server Firmware (server firmware)
• Device drivers
• Firmware
• Hardware components
• Software
If possible, return the server to the condition it was in before the problem occurred.
2. Collect data.
Thorough data collection is necessary for diagnosing hardware and software problems.
a. Document error codes and system board LEDs.
• System error codes: See “Viewing the test log” on page 98 for information about error codes.
• Software or operating-system error codes: See the documentation for the software or operating system for information about a specific error code. See the manufacturer's Web site for documentation.
• Light path diagnostics LEDs: See “Light path diagnostics” on page 90 for information about light path diagnostics LEDs that are lit.
• System board LEDs: See “System board LEDs” on page 19 for information about system board LEDs that are lit.
b. Collect system data.
Run Dynamic System Analysis (DSA) to collect information about the hardware, firmware, software, and operating system. Have this information available when you contact IBM or an approved warranty service provider. For instructions for running the DSA program, see “Running the diagnostic programs” on page 97.
If you have to download the latest version of DSA, go to http://www.ibm.com/systems/support/supportsite.wss/docdisplay?brandind=5000008&lndocid=SERV-DSA or complete the following steps.
Note: Changes are made periodically to the IBM Web site. The actual procedure might vary slightly from what is described in this document.
1) Go to http://www.ibm.com/systems/support/.
2) Under Product support, click System x.
3) Under Popular links, click Software and device drivers.
4) Under Related downloads, click Dynamic System Analysis (DSA).
For information about DSA command-line options, go to http://publib.boulder.ibm.com/infocenter/toolsctr/v1r0/index.jsp?topic=/com.ibm.xseries.tools.doc/erep_tools_dsa.html or complete the following steps:
1) Go to http://publib.boulder.ibm.com/infocenter/toolsctr/v1r0/index.jsp.
2) In the navigation pane, click IBM System x and BladeCenter Tools Center.
3) Click Tools reference > Error reporting and analysis tools > IBM Dynamic System Analysis.
3. Follow the problem-resolution procedures.
The four problem-resolution procedures are presented in the order in which they are most likely to solve your problem. Follow these procedures in the order in which they are presented:
a. Check for and apply code updates.
Most problems that appear to be caused by faulty hardware are actually caused by IBM System x Server Firmware (server firmware), system firmware, device firmware, or device drivers that are not at the latest levels.
Important: Some cluster solutions require specific code levels or coordinated code updates. If the device is part of a cluster solution, verify that the latest level of code is supported for the cluster solution before you update the code.
1) Determine the existing code levels.
In DSA, click Firmware/VPD to view system firmware levels, or click Software to view operating-system levels.
2) Download and install updates of code that is not at the latest level.
To display a list of available updates for your server, go to http://www.ibm.com/systems/support/supportsite.wss/docdisplay?brandind=5000008&lndocid=MIGR-4JT or complete the following steps.
Note: Changes are made periodically to the IBM Web site. The actual procedure might vary slightly from what is described in this document.
a) Go to http://www.ibm.com/systems/support/.
b) Under Product support, click System x.
c) Under Popular links, click Software and device drivers.
d) Click System x3400 M3 to display the list of downloadable files for the server.
You can install code updates that are packaged as an UpdateXpress System Pack or UpdateXpress CD image. An UpdateXpress System Pack contains an integration-tested bundle of online firmware and device-driver updates for your server. Use UpdateXpress System Pack Installer to acquire and apply UpdateXpress System Packs and individual firmware and device-driver updates. For additional information and to download the UpdateXpress System Pack Installer, go to the System x and BladeCenter Tools Center at http://publib.boulder.ibm.com/infocenter/toolsctr/v1r0/index.jsp and click UpdateXpress System Pack Installer.
Be sure to separately install any listed critical updates that have release dates that are later than the release date of the UpdateXpress System Pack or UpdateXpress image.
When you click an update, an information page is displayed, including a list of the problems that the update fixes. Review this list for your specific problem; however, even if your problem is not listed, installing the update might solve the problem.
b. Check for and correct an incorrect configuration.
If the server is incorrectly configured, a system function can fail to work when you enable it; if you make an incorrect change to the server configuration, a system function that has been enabled can stop working.
1) Make sure that all installed hardware and software are supported.
See http://www.ibm.com/servers/eserver/serverproven/compat/us/ to verify that the server supports the installed operating system, optional devices, and software levels. If any hardware or software component is not supported, uninstall it to determine whether it is causing the problem. You must remove nonsupported hardware before you contact IBM or an approved warranty service provider for support.
2) Make sure that the server, operating system, and software are installed and configured correctly.
Many configuration problems are caused by loose power or signal cables or incorrectly seated adapters. You might be able to solve the problem by turning off the server, reconnecting cables, reseating adapters, and turning the server back on. For information about performing the checkout procedure, see “Checkout procedure” on page 73.
If the problem is associated with a specific function (for example, if a RAID hard disk drive is marked offline in the RAID array), see the documentation for the associated adapter and management or controlling software to verify that the adapter is correctly configured.
Problem determination information is available for many devices such as RAID and network adapters.
For problems with operating systems or IBM software or devices, complete the following steps.
Note: Changes are made periodically to the IBM Web site. The actual procedure might vary slightly from what is described in this document.
a) Go to http://www.ibm.com/systems/support/.
b) Under Product support, click System x.
c) From the Product family list, select System x3400 M3.
d) Under Support & downloads, click Documentation, Install, and Use to search for related documentation.
c. Check for troubleshooting procedures and RETAIN tips.
Troubleshooting procedures and RETAIN tips document known problems and suggested solutions. To search for troubleshooting procedures and RETAIN tips, complete the following steps.
Note: Changes are made periodically to the IBM Web site. The actual procedure might vary slightly from what is described in this document.
1) Go to http://www.ibm.com/systems/support/.
2) Under Product support, click System x.
3) From the Product family list, select System x3400 M3.
4) Under Support & downloads, click Troubleshoot.
5) Select the troubleshooting procedure or RETAIN tip that applies to your problem:
• Troubleshooting procedures are under Diagnostic.
• RETAIN tips are under Troubleshoot.
d. Check for and replace defective hardware.
If a hardware component is not operating within specifications, it can cause unpredictable results. Most hardware failures are reported as error codes in a system or operating-system log. For more information, see “Troubleshooting tables” on page 75 and Chapter 5, “Removing and replacing server components,” on page 149. Hardware errors are also indicated by light path diagnostics LEDs.
A single problem might cause multiple symptoms. Follow the troubleshooting procedure for the most obvious symptom. If that procedure does not diagnose the problem, use the procedure for another symptom, if possible.
If the problem remains, contact IBM or an approved warranty service provider for assistance with additional problem determination and possible hardware replacement. To open an online service request, go to http://www.ibm.com/support/electronic/. Be prepared to provide information about any error codes and collected data.

Undocumented problems

If you have completed the diagnostic procedure and the problem remains, the problem might not have been previously identified by IBM. After you have verified that all code is at the latest level, all hardware and software configurations are valid, and no light path diagnostics LEDs or log entries indicate a hardware component failure, contact IBM or an approved warranty service provider for assistance. To open an online service request, go to http://www.ibm.com/support/electronic/. Be prepared to provide information about any error codes and collected data and the problem determination procedures that you have used.

Chapter 2. Introduction

This Problem Determination and Service Guide contains information to help you solve problems that might occur in your IBM server. It describes the diagnostic tools that come with the server, error codes and suggested actions, and instructions for replacing failing components.
Replaceable components are of four types:
v Consumable parts: Purchase and replacement of consumable parts (components, such as batteries and printer cartridges, that have depletable life) is your responsibility. If IBM acquires or installs a consumable part at your request, you will be charged for the service.
v Tier 1 customer replaceable unit (CRU): Replacement of Tier 1 CRUs is your responsibility. If IBM installs a Tier 1 CRU at your request, you will be charged for the installation.
v Tier 2 customer replaceable unit: You may install a Tier 2 CRU yourself or request IBM to install it, at no additional charge, under the type of warranty service that is designated for your server.
v Field replaceable unit (FRU): FRUs must be installed only by trained service technicians.
For information about the terms of the warranty and getting service and assistance, see the Warranty Information document.

Related documentation

In addition to this document, the following documentation also comes with the server:
v Environmental Notices and User's Guide
This document is in PDF on the IBM Documentation CD. It contains translated environmental notices.
v IBM License Agreement for Machine Code
This document is in PDF on the IBM Documentation CD. It provides translated versions of the IBM License Agreement for Machine Code for your product.
v Warranty Information
This is a document that comes with the server. It contains information about the terms of the warranty and getting service and assistance.
v Installation and User's Guide
This document is in Portable Document Format (PDF) on the IBM Documentation CD. It provides general information about setting up and cabling the server, including information about features, and how to configure the server. It also contains detailed instructions for installing, removing, and connecting optional devices that the server supports.
v Licenses and Attributions Documents
This document is in PDF. It contains information about the open-source notices.
v Rack Installation Instructions
This printed document contains instructions for installing the server in a rack.
v Safety Information
This document is in PDF on the IBM Documentation CD. It contains translated caution and danger statements. Each caution and danger statement that appears in the documentation has a number that you can use to locate the corresponding statement in your language in the Safety Information document.
The System x and xSeries Tools Center is an online information center that contains information about tools for updating, managing, and deploying firmware, device drivers, and operating systems. The System x and xSeries Tools Center is at http://publib.boulder.ibm.com/infocenter/toolsctr/v1r0/index.jsp
Depending on the server model, additional documentation might be included on the IBM Documentation CD.
The server might have features that are not described in the documentation that comes with the server. The documentation might be updated occasionally to include information about those features, or technical updates might be available to provide additional information that is not included in the server documentation. These updates are available from the IBM Web site. To check for updated documentation and technical updates, complete the following steps.
Note: Changes are made periodically to the IBM Web site. The actual procedure might vary slightly from what is described in this document.
1. Go to http://www.ibm.com/support/.
2. Under Product support, click System x.
3. Under Popular links, click Publications lookup.
4. From the Product family menu, select System x3400 and click Continue.

Notices and statements in this document

The caution and danger statements in this document are also in the multilingual Safety Information document, which is on the IBM System x Documentation CD. Each statement is numbered for reference to the corresponding statement in the Safety Information document.
The following notices and statements are used in this document:
v Note: These notices provide important tips, guidance, or advice.
v Important: These notices provide information or advice that might help you avoid inconvenient or problem situations.
v Attention: These notices indicate potential damage to programs, devices, or data. An attention notice is placed just before the instruction or situation in which damage might occur.
v Caution: These statements indicate situations that can be potentially hazardous to you. A caution statement is placed just before the description of a potentially hazardous procedure step or situation.
v Danger: These statements indicate situations that can be potentially lethal or extremely hazardous to you. A danger statement is placed just before the description of a potentially lethal or extremely hazardous procedure step or situation.

Features and specifications

The following information is a summary of the features and specifications of the server. Depending on the server model, some features might not be available, or some specifications might not apply.
Table 1. Features and specifications
Microprocessor:
v Intel Xeon up to six-core with integrated memory controller and Quick Path Interconnect (QPI) architecture
v Designed for LGA 1366 socket
v Scalable up to twelve cores
v 32 KB instruction cache, 32 KB data cache, and 4 MB, 8 MB, and 12 MB cache that is shared among the cores
v Support for up to two microprocessors, second microprocessor with pluggable VRM
v Support for Intel Extended Memory 64 Technology (EM64T)
Note:
v Use the Setup utility to determine the type and speed of the microprocessors. For a list of supported microprocessors, see http://www.ibm.com/servers/eserver/serverproven/compat/us/.
v Do not install an Intel Xeon 5500 series microprocessor and an Xeon 5600 series microprocessor in the same server.
Video controller:
v Matrox G200eV video on system board
v Compatible with SVGA and VGA
Power supply:
v Standard: One 670-watt (100 - 240 V AC)
Note: On models with eight 3.5-inch or sixteen 2.5-inch hard disk drives, you must upgrade the power supply to 920 watts.
Memory:
v Sixteen DIMM connectors (eight per microprocessor)
v Minimum: 1 GB
Note: If you install a ServeRAID-M1015 SAS/SATA adapter, make sure that at least 2 GB of memory is installed in the server before you run DSA from a bootable CD.
v Maximum: 128 GB
  – 48 GB using unbuffered DIMMs (UDIMMs)
  – 128 GB using registered DIMMs (RDIMMs)
v Type: Registered or unbuffered ECC double-data-rate 3 (DDR3) 800, 1066, and 1333 MHz DIMMs only
v RDIMM sizes: 1 GB, 2 GB, 4 GB, and 8 GB single-rank or dual-rank
v UDIMM sizes: 1 GB, 2 GB, and 4 GB single-rank or dual-rank
v Chipkill supported
Drives:
v SATA:
  – DVD (standard)
  – DVD/CD-RW (optional)
  – Maximum of two devices can be installed
v Diskette (optional): External USB 1.44 MB
v Supported hard disk drives:
  – Serial Attached SCSI (SAS)
Expansion bays:
v Sixteen 2.5-inch HDD bays (three optical DVD drive bays)
v Four 3.5-inch simple-swap SATA drives
v Eight 3.5-inch HDD bays (one UltraSlim DVD drive)
v Three half-high 5.25-inch bays (one DVD drive installed)
Note:
  – SAS expander card does not support 3 GB RAID adapters.
  – If the server is configured for RAID operation using a ServeRAID adapter, you might have to reconfigure your disk arrays after you install drives. See the ServeRAID adapter documentation for additional information about RAID operation and complete instructions for using the ServeRAID adapter.
  – Full-high devices such as an optional tape drive will occupy two half-high 5.25-inch bays.
PCI and PCI-X expansion slots:
v Six PCI expansion slots on the system board:
  – Four PCI Express x8 (2x8 link, 2x4 link)
  – One PCI Express x16 (x8 link)
  – One PCI 32-bit
v One or two expansion slots on the PCI extender card:
  – Optional - One PCI Express x8 (x4 link) on the PCI-Express extender card
  – Optional - Two PCI-X 64/133 slots on the PCI-X extender card
Hot-swap fans:
v Three (maximum)
Size:
v Tower
  – Height: 440 mm (17.3 in.)
  – Depth: 767 mm (30.2 in.)
  – Width: 218 mm (8.6 in.)
  – Weight: approximately 37.85 kg (83.4 lb) when fully configured or 27.1 kg (59.7 lb) minimum
v Rack
  – 5U
  – Height: 218 mm (8.6 in.)
  – Depth: 702 mm (27.6 in.)
  – Width: 424 mm (16.7 in.)
  – Weight: approximately 36 kg (79.3 lb) when fully configured or 25.8 kg (56.9 lb) minimum
Racks are marked in vertical increments of 4.45 cm (1.75 inches). Each increment is referred to as a unit, or “U.” A 1-U-high device is 4.45 cm (1.75 inches) tall.
Integrated functions:
v Integrated Management Module (IMM), which provides service processor control and monitoring functions, video controller, and (when the optional virtual media key is installed) remote keyboard, video, mouse, and remote hard disk drive capabilities
v Dedicated or shared management network connections
v Six-port Serial ATA (SATA) controller embedded
v Serial over LAN (SOL) and serial redirection over Telnet or Secure Shell (SSH)
v USB flash device with embedded hypervisor software
v Support for remote management presence
v One systems-management RJ-45 for connection to a dedicated systems-management network. This system management connector is dedicated to the IMM functions.
v Six Universal Serial Bus (USB) ports standard (v2.0 supporting v1.1)
  – Four on rear of server
  – Two on front of server
v One internal USB tape connector
v One Broadcom dual-port 10/100/1000 Ethernet controller with Wake on LAN support
v One serial connector, shared with the IMM
Note: In messages and documentation, the term service processor refers to the integrated management module (IMM).
ServeRAID SAS adapter:
v ServeRAID-BR10i SAS/SATA adapter that supports RAID levels 0, 1 and 1E (standard)
v ServeRAID-BR10il SAS/SATA adapter that supports RAID levels 0, 1 and 1E (standard)
v Upgradeable to ServeRAID-MR10i SAS/SATA adapter, which supports RAID levels 0, 1, 5, 6, 10
v Optional ServeRAID-MR10is SAS/SATA adapter, which supports RAID levels 0, 1, 5, 6, 10
v Optional ServeRAID-M1015 SAS/SATA adapter, which supports RAID levels 0, 1 and 1E
v Optional ServeRAID-M5014 SAS/SATA adapter, which supports RAID levels 0, 1, 5, 10, 50
v Optional ServeRAID-M5015 SAS/SATA adapter, which supports RAID levels 0, 1, 5, 10, 50
Note: If the server is configured for RAID operation using a ServeRAID adapter, you might have to reconfigure your disk arrays after you install drives. See the ServeRAID adapter documentation for additional information about RAID operation and complete instructions for using the ServeRAID adapter.
Acoustical noise emissions:
v Sound power, idle: 5.5 bel declared
v Sound power, operating: 6.0 bel declared
Environment:
v Air temperature:
  – Server on: 10°C to 35°C (50.0°F to 95.0°F); altitude: 0 to 915 m (3000 ft)
  – Server on: 10°C to 32°C (50.0°F to 90.0°F); altitude: 915 m (3000 ft) to 2134 m (7000 ft)
  – Server on: 10°C to 28°C (50.0°F to 83.0°F); altitude: 2134 m (7000 ft) to 3050 m (10000 ft)
  – Server off: 5°C to 45°C (41°F to 113°F)
  – Shipping: -40°C to 60°C (-40.0°F to 140°F)
v Humidity:
  – Server on: 20% to 80%, maximum dew point 21°C, maximum rate of change 5°C/hour
  – Server off: 8% to 80%, maximum dew point 27°C
Heat output:
Approximate heat output:
v Minimum configuration: 2013 Btu per hour (590 watts)
v Maximum configuration: 3610 Btu per hour (1058 watts)
Electrical input:
v Sine-wave input (50-60 Hz) required
v Input voltage low range:
  – Minimum: 100 V AC
  – Maximum: 127 V AC
v Input voltage high range:
  – Minimum: 200 V AC
  – Maximum: 240 V AC
v Approximate input kilovolt-amperes (kVA):
  – Minimum: 0.60 kVA
  – Maximum: 1.10 kVA
Notes:
1. Power consumption and heat output vary depending on the number and type of optional features that are installed and the power-management optional features that are in use.
2. These levels were measured in controlled acoustical environments according to the procedures that are specified by the American National Standards Institute (ANSI) S12.10 and ISO 7779 and are reported in accordance with ISO 9296. Actual sound-pressure levels in a given location might exceed the average stated values because of room reflections and other nearby noise sources. The declared sound-power levels indicate an upper limit, below which a large number of computers will operate.
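The operating temperature limits and heat-output figures in Table 1 can be checked with a short calculation. The following Python sketch is illustrative only and is not part of any IBM tool. The function names are hypothetical, the conversion factor of 3.412 Btu per hour per watt is the standard value, and the sketch assumes that a boundary altitude such as 915 m uses the warmer of the two adjoining limits.

```python
# Server-on limits transcribed from Table 1.
# Each entry is (maximum altitude in metres, maximum air temperature in °C).
# Assumption: a boundary altitude such as 915 m uses the warmer limit.
OPERATING_LIMITS = [
    (915, 35.0),
    (2134, 32.0),
    (3050, 28.0),
]
MIN_OPERATING_TEMP_C = 10.0
BTU_PER_HOUR_PER_WATT = 3.412  # standard conversion factor


def max_operating_temp_c(altitude_m: float) -> float:
    """Return the highest permitted air temperature at this altitude."""
    if altitude_m < 0:
        raise ValueError("altitude below 0 m is outside the published range")
    for max_altitude_m, max_temp_c in OPERATING_LIMITS:
        if altitude_m <= max_altitude_m:
            return max_temp_c
    raise ValueError("altitude above 3050 m is outside the published range")


def within_operating_range(altitude_m: float, air_temp_c: float) -> bool:
    """Check an air temperature against the server-on limits."""
    return MIN_OPERATING_TEMP_C <= air_temp_c <= max_operating_temp_c(altitude_m)


def watts_to_btu_per_hour(watts: float) -> int:
    """Convert heat output in watts to Btu per hour, rounded."""
    return round(watts * BTU_PER_HOUR_PER_WATT)


print(max_operating_temp_c(1500))        # 32.0
print(within_operating_range(2500, 30))  # False
print(watts_to_btu_per_hour(590))        # 2013
print(watts_to_btu_per_hour(1058))       # 3610
```

The two conversions reproduce the minimum and maximum heat-output values that are listed in the table.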

Server controls, LEDs, and connectors

This section describes the controls, light-emitting diodes (LEDs), and connectors on the front and rear of the server.
Power control button and power-on LED
Press this button to turn the server on and off manually or to wake the server from a reduced-power state. The states of the power-on LED are as follows:
Off: AC power is not present, or the power supply or the LED itself has failed.
Flashing rapidly (4 times per second): The server is turned off and is not ready to be turned on. The power-control button is disabled. This will last approximately 20 to 40 seconds.
Note: Approximately 20 seconds after the server is connected to ac power, the power-control button becomes active.
Flashing slowly (once per second): The server is turned off and is ready to be turned on. You can press the power-control button to turn on the server.
Lit: The server is turned on.
Fading on and off: The server is in a reduced-power state. To wake the server, press the power-control button or use the IMM Web interface. See “Logging on to the Web interface” on page 314 for information on logging on to the IMM Web interface.
Hard disk drive activity LED
When this LED is flashing, it indicates that a hard disk drive is in use.
System-error LED
When this amber LED is lit, it indicates that a system error has occurred. An LED on the system board might also be lit to help isolate the error. See Chapter 3, “Diagnostics,” on page 23 for additional information.
USB connectors
Connect USB devices to these connectors.
DVD-eject button
Press this button to release a CD or DVD from the DVD drive.
DVD drive activity LED
When this LED is lit, it indicates that the DVD drive is in use.
Hot-swap hard disk drive activity LED (some models)
On some server models, each hot-swap drive has a hard disk drive activity LED. When this green LED is flashing, it indicates that the drive is in use.
When the drive is removed, this LED also is visible on the SAS/SATA backplane, next to the drive connector. The backplane is the printed circuit board behind drive bays 4 through 7 on 3.5-inch hard disk drive models and bays 4 through 11 on 2.5-inch hard disk drive models.
Hot-swap hard disk drive status LED (some models)
On some server models, each hot-swap hard disk drive has an amber status LED. If this amber status LED for a drive is lit, it indicates that the associated hard disk drive has failed.
If an optional ServeRAID adapter is installed in the server and the LED flashes slowly (one flash per second), the drive is being rebuilt. If the LED flashes rapidly (three flashes per second), the adapter is identifying the drive.
When the drive is removed, this LED also is visible on the SAS/SATA backplane, below the hot-swap hard disk drive activity LED.
Please see “Event logs” on page 23 for more information.
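The hot-swap drive status LED behavior that is described above can be summarized as a small decision function. The following Python sketch is an illustration only; the function name, its parameters, and the 0.5 flashes-per-second tolerance are assumptions and do not come from IBM documentation.

```python
RATE_TOLERANCE = 0.5  # assumed tolerance for an observed flash rate


def drive_status_led_meaning(lit_steadily: bool,
                             flashes_per_second: float = 0.0,
                             serveraid_installed: bool = False) -> str:
    """Interpret the amber hot-swap drive status LED."""
    if lit_steadily:
        return "drive has failed"
    if flashes_per_second == 0.0:
        return "no condition indicated"
    if serveraid_installed:
        # Slow flashing is one flash per second.
        if abs(flashes_per_second - 1.0) <= RATE_TOLERANCE:
            return "drive is being rebuilt"
        # Rapid flashing is three flashes per second.
        if abs(flashes_per_second - 3.0) <= RATE_TOLERANCE:
            return "adapter is identifying the drive"
    return "flash pattern not described in this section"


print(drive_status_led_meaning(False, 1.0, serveraid_installed=True))
```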

Rear view

The following illustration shows the connectors and LEDs on the rear of the server.
Illustration callouts: AC power LED, DC power LED, fault (error) LED, serial 1 (COM 1), video, system management Ethernet connector, NMI button, Ethernet 1 10/100/1000, Ethernet 2 10/100/1000, USB 1 through USB 4, power cord connector, Ethernet transmit/receive activity LEDs, Ethernet link status LEDs.
Power-cord connector
Connect the power cord to this connector.
AC power LED
This green LED provides status information about the power supply. During typical operation, both the AC and DC power LEDs are lit.
DC power LED
This green LED provides status information about the power supply. During typical operation, both the AC and DC power LEDs are lit.
Power-error (Fault) LED
When this amber LED is lit, it indicates that the power supply has failed.
Video connector
Connect a monitor to this connector.
Note: The maximum video resolution is 1600 x 1200 at 85 Hz.
Serial connector
Connect a 9-pin serial device to this connector.
Systems-management Ethernet connector
Use this connector to manage the server, using a dedicated management network. If you use this connector, the IMM cannot be accessed directly from a production network. A dedicated management network provides additional security by physically separating the management network traffic from the production network. You can use the Setup utility to configure the server to use a dedicated systems management network or a shared network.
USB connectors
Connect USB devices to these connectors.
Ethernet connectors
Use either of these connectors to connect the server to a network. When you use the Ethernet 1 connector, the network can be shared with the IMM through a single network cable.
Ethernet transmit/receive activity LED
This LED is on the Ethernet connector. When this LED is lit, it indicates that there is activity between the server and the network.
Ethernet link status LED
This LED is on the Ethernet connector. When this LED is lit, it indicates that there is an active connection on the Ethernet port.

Power-supply LEDs

The following illustration shows the locations of the 670-watt power supply LEDs.
Illustration callouts: AC power LED, DC power LED, fault (error) LED, power cord connector.
The following table describes the problems that are indicated by various combinations of the power-supply LEDs. For more information about solving power-supply problems, see “Power-supply LEDs” on page 96.
Table 2. Power-supply LEDs
AC Off, DC Off, Error Off
    Description: No AC power to the server or a problem with the AC power source
    Action:
    1. Check the AC power to the server.
    2. Make sure that the power cord is connected to a functioning power source.
    3. Turn the server off and then turn the server back on.
    4. If the problem remains, replace the power supply.
    Notes: This is a normal condition when no AC power is present.
AC Off, DC Off, Error On
    Description: No AC power to the server or a problem with the AC power source and the power supply has detected an internal problem
    Action:
    1. Replace the power supply.
    2. Make sure that the power cord is connected to a functioning power source.
    Notes: This happens only when a second power supply is providing power to the server.
AC Off, DC On, Error Off
    Description: Faulty power supply
    Action: Replace the power supply.
AC Off, DC On, Error On
    Description: Faulty power supply
    Action: Replace the power supply.
AC On, DC Off, Error Off
    Description: Power supply not fully seated, faulty system board, or faulty power supply
    Action:
    1. If the system board error (fault) LED is not lit, replace the power supply.
    2. If the system board error (fault) LED is lit, (Trained service technician only) replace the system board.
    Notes: Typically indicates that a power supply is not fully seated.
AC On, DC Off or Flashing, Error On
    Description: Faulty power supply
    Action: Replace the power supply.
AC On, DC On, Error Off
    Description: Normal operation
AC On, DC On, Error On
    Description: Power supply is faulty but still operational
    Action: Replace the power supply.
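The LED combinations in Table 2 lend themselves to a simple lookup. The following Python sketch is illustrative only; the dictionary and function names are hypothetical, and the descriptions are transcribed from the table. A combination that the table does not list returns None.

```python
from typing import Optional

FAULTY = "Faulty power supply"
NO_AC = "No AC power to the server or a problem with the AC power source"

# Keys are (AC LED, DC LED, error LED) states as read on the power supply.
# The table row "Off or Flashing" for the DC LED is entered as two keys.
POWER_SUPPLY_LED_TABLE = {
    ("off", "off", "off"): NO_AC,
    ("off", "off", "on"): NO_AC + " and the power supply has detected an internal problem",
    ("off", "on", "off"): FAULTY,
    ("off", "on", "on"): FAULTY,
    ("on", "off", "off"): "Power supply not fully seated, faulty system board, or faulty power supply",
    ("on", "off", "on"): FAULTY,
    ("on", "flashing", "on"): FAULTY,
    ("on", "on", "off"): "Normal operation",
    ("on", "on", "on"): "Power supply is faulty but still operational",
}


def describe_power_supply_leds(ac: str, dc: str, error: str) -> Optional[str]:
    """Return the Table 2 description, or None for an unlisted combination."""
    key = (ac.strip().lower(), dc.strip().lower(), error.strip().lower())
    return POWER_SUPPLY_LED_TABLE.get(key)


print(describe_power_supply_leds("On", "On", "Off"))  # Normal operation
```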
Note: On models with eight 3.5-inch or sixteen 2.5-inch hard disk drives, you must upgrade the power supply to 920 watts.
The following illustration shows the 920-watt power-supply LEDs on the rear of the server.
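Table 3. Power-supply LEDs
AC power Off, DC power Off, Power error Off
    Description: No AC power to the server or a problem with the AC power source
AC power Off, DC power Off, Power error On
    Description: No AC power to the server or a problem with the AC power source and the power supply has detected an internal problem
AC power Off, DC power On, Power error Off
    Description: Faulty power supply
AC power Off, DC power On, Power error On
    Description: Faulty power supply
AC power On, DC power Off, Power error Off
    Description: Power supply not fully seated, faulty system board, or faulty power supply
AC power On, DC power Off or flashing, Power error On
    Description: Faulty power supply
AC power On, DC power On, Power error Off
    Description: Normal operation
AC power On, DC power On, Power error On
    Description: Power supply is faulty but still operational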

Internal LEDs, connectors, and jumpers

The illustrations in this section show the LEDs, connectors, and jumpers on the internal boards. The illustrations might differ slightly from your hardware.

System board internal connectors

The following illustration shows the internal connectors on the system board.
The following illustration shows one additional PCI Express expansion slot that is available on the PCI Express extender card, if equipped.
The following illustration shows two additional PCI-X expansion slots that are available on the PCI-X extender card, if equipped.

System board switches and jumpers

The following tables show the settings of the switches and the jumpers.
See Table 4 and Table 5 for information about the switch and jumper settings.
Table 4. System board jumpers
JP1: CMOS clear
    v Pins 1 and 2: Normal operation (default).
    v Pins 2 and 3: Clears CMOS memory.
JP6: UEFI boot recovery
    v Pins 1 and 2: Normal operation (default).
    v Pins 2 and 3: Enable the UEFI recovery mode.
JP7: Trusted Platform Module (TPM)
    v Pins 1 and 2: TPM physical presence is asserted.
    v Pins 2 and 3: TPM physical presence is not asserted (default).
    Note: The physical presence requires manual setting on the server to change the TPM configuration. The TPM is enabled and physical presence is not asserted by default. The physical presence needs to be asserted to activate, deactivate, clear or change ownership of the TPM.
Note: If no jumper is present, the server responds as if the jumper were in the default position.
Table 5. System board switch 6
SW6 switch 1: Reserved (default off)
SW6 switch 2: Power-on password override when on (default off)
SW6 switch 3: Reserved (default off)
SW6 switch 4: When this switch is off, the primary IMM firmware ROM page is loaded. When this switch is on, the secondary (backup) IMM firmware ROM page is loaded. (default off)
Notes:
1. Before you change any switch settings or move any jumpers, turn off the server; then, disconnect all power cords and external cables. (Review the information in “Safety” on page vii, “Installation guidelines” on page 149, and “Handling static-sensitive devices” on page 151.)

System board LEDs

The following illustration shows the LEDs on the system board.
The system board is equipped with a PCI extender card that provides either one or two additional expansion slots. The following illustration shows the LEDs on the PCI Express extender card, if equipped.
The following illustration shows the LEDs on the PCI-X extender card, if equipped.

System board external connectors

The following illustration shows the external input/output connectors on the system board.
Illustration callouts: NMI button, video port, USB ports, serial port, Ethernet, system management.

Hard disk drive backplane connectors

The following illustrations show the connectors on the 2.5-inch and 3.5-inch hard disk drive backplanes.
Figure 1. Connectors on the 3.5-inch hard disk drive backplane
Figure 2. Connectors on the 2.5-inch hard disk drive backplane

Chapter 3. Diagnostics

This chapter describes the diagnostic tools that are available to help you solve problems that might occur in the server.
If you cannot diagnose and correct a problem by using the information in this chapter, see Appendix A, “Getting help and technical assistance,” on page 325 for more information.

Diagnostic tools

The following tools are available to help you diagnose and solve hardware-related problems:
v POST error messages
The power-on self-test (POST) generates messages to indicate successful test completion or the detection of a problem. See “POST error codes” on page 26 for more information.
v Event logs
For information about the POST event log, the system-event log, the integrated management module (IMM) event log, and the DSA log, see “Event logs” and “System-event log” on page 37.
v Troubleshooting tables
These tables list problem symptoms and actions to correct the problems. See “Troubleshooting tables” on page 75.
v Light path diagnostics
Use the light path diagnostics to diagnose system errors quickly. See “Light path diagnostics” on page 90 for more information.
v Diagnostic programs, messages, and error codes
The diagnostic programs are the primary method of testing the major components of the server. See “Diagnostic programs, messages, and error codes” on page 97 for more information.

Event logs

Error codes and messages are displayed in the following types of event logs:
v POST event log: This log contains the three most recent error codes and messages that were generated during POST. You can view the POST event log through the Setup utility.
v System-event log: This log contains all IMM, POST, and system management interrupt (SMI) events. You can view the system-event log through the Setup utility and through the Dynamic System Analysis (DSA) program (as the IPMI event log).
The system-event log is limited in size. When it is full, new entries will not overwrite existing entries; therefore, you must periodically save and then clear the system-event log through the Setup utility when the IMM logs an event that indicates that the log is more than 75% full. When you are troubleshooting, you might have to save and then clear the system-event log to make the most recent events available for analysis.
Messages are listed on the left side of the screen, and details about the selected message are displayed on the right side of the screen. To move from one entry to the next, use the Up Arrow (↑) and Down Arrow (↓) keys.
Some IMM sensors cause assertion events to be logged when their setpoints are reached. When a setpoint condition no longer exists, a corresponding deassertion event is logged. However, not all events are assertion-type events.
v Integrated management module (IMM) event log: This log contains a filtered subset of all IMM, POST, and system management interrupt (SMI) events. You can view the IMM event log through the IMM Web interface and through the Dynamic System Analysis (DSA) program (as the ASM event log).
v DSA log: This log is generated by the Dynamic System Analysis (DSA) program, and it is a chronologically ordered merge of the system-event log (as the IPMI event log), the IMM event log (as the ASM event log), and the operating-system event logs. You can view the DSA log through the DSA program.
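The idea of a chronologically ordered merge, as described for the DSA log, can be sketched in a few lines of Python. This is a conceptual illustration only and does not represent how the DSA program is implemented. The Event type, the merge_logs function, and all sample entries are hypothetical, and each input log is assumed to be sorted by time already.

```python
import heapq
from datetime import datetime
from typing import Iterable, Iterator, NamedTuple


class Event(NamedTuple):
    timestamp: datetime
    source: str   # label for the log that the entry came from
    message: str


def merge_logs(*logs: Iterable[Event]) -> Iterator[Event]:
    """Merge logs that are each already in chronological order."""
    return heapq.merge(*logs, key=lambda event: event.timestamp)


# Hypothetical sample entries, for illustration only.
ipmi_log = [
    Event(datetime(2014, 6, 1, 9, 0), "IPMI", "sample entry A"),
    Event(datetime(2014, 6, 1, 9, 5), "IPMI", "sample entry D"),
]
asm_log = [
    Event(datetime(2014, 6, 1, 9, 2), "ASM", "sample entry C"),
]
os_log = [
    Event(datetime(2014, 6, 1, 9, 1), "OS", "sample entry B"),
    Event(datetime(2014, 6, 1, 9, 6), "OS", "sample entry E"),
]

merged = list(merge_logs(ipmi_log, asm_log, os_log))
for event in merged:
    print(event.timestamp.isoformat(), event.source, event.message)
```

Because every input is already ordered, the merge reads each log once and never needs to sort the combined set.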

Viewing event logs through the Setup utility

To view the POST event log or system-event log, complete the following steps:
1. Turn on the server.
2. When the prompt <F1> Setup is displayed, press F1. If you have set both a power-on password and an administrator password, you must type the administrator password to view the event logs.
3. Select System Event Logs and use one of the following procedures:
v To view the POST event log, select POST Event Viewer.
v To view the system-event log, select System Event Log.
Attention: If you set an administrator password and then forget it, there is no way to change, override, or remove it. You must replace the system board.

Viewing event logs without restarting the server

If the server is not hung, methods are available for you to view one or more event logs without having to restart the server.
If you have installed Portable or Installable Dynamic System Analysis (DSA), you can use it to view the system-event log (as the IPMI event log), the IMM event log (as the ASM event log), or the merged DSA log. You can also use DSA Preboot to view these logs, although you must restart the server to use DSA Preboot. To install Portable DSA, Installable DSA, or DSA Preboot or to download a DSA Preboot CD image, go to http://www.ibm.com/systems/support/supportsite.wss/docdisplay?lndocid=SERV-DSA&brandind=5000008 or complete the following steps.
Note: Changes are made periodically to the IBM Web site. The actual procedure might vary slightly from what is described in this document.
1. Go to http://www.ibm.com/systems/support/.
2. Under Product support, click System x.
3. Under Popular links, click Software and device drivers.
4. Under Related downloads, click Dynamic System Analysis (DSA) to display the matrix of downloadable DSA files.
If IPMItool is installed in the server, you can use it to view the system-event log. Most recent versions of the Linux operating system come with a current version of IPMItool. For information about IPMItool, see http://publib.boulder.ibm.com/infocenter/toolsctr/v1r0/index.jsp?topic=/com.ibm.xseries.tools.doc/config_tools_ipmitool.html or complete the following steps.
Note: Changes are made periodically to the IBM Web site. The actual procedure might vary slightly from what is described in this document.
1. Go to http://publib.boulder.ibm.com/infocenter/toolsctr/v1r0/index.jsp.
2. In the navigation pane, click IBM System x and BladeCenter Tools Center.
3. Expand Tools reference, expand Configuration tools, expand IPMI tools, and click IPMItool.
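The following sketch shows one way to assemble an IPMItool command line for listing the system-event log, either locally or over the network. It is an illustration only: the function name, the host address, and the user name are hypothetical, and the use of the lanplus interface for remote access is an assumption that you should check against your IMM configuration.

```python
import shlex
from typing import List, Optional


def sel_list_command(host: Optional[str] = None,
                     user: Optional[str] = None) -> List[str]:
    """Build an argument list for `ipmitool sel list` (hypothetical helper)."""
    cmd = ["ipmitool"]
    if host is not None:
        # Remote access: assumes the management controller accepts
        # IPMI-over-LAN sessions through the lanplus interface.
        cmd += ["-I", "lanplus", "-H", host]
        if user is not None:
            cmd += ["-U", user]
    # With no host, ipmitool talks to the local controller through the
    # operating-system IPMI driver.
    cmd += ["sel", "list"]
    return cmd


# Local query, run on the server itself.
print(shlex.join(sel_list_command()))
# Remote query; the address and user name are placeholders.
print(shlex.join(sel_list_command(host="192.0.2.10", user="operator")))
```

The helper only builds the command; pass the resulting list to subprocess.run on a system where IPMItool is installed to run it.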
For an overview of IPMI, go to http://publib.boulder.ibm.com/infocenter/systems/index.jsp?topic=/liaai/ipmi/liaaiipmi.htm or complete the following steps:
1. Go to http://publib.boulder.ibm.com/infocenter/systems/index.jsp.
2. In the navigation pane, click IBM Systems Information Center.
3. Expand Operating systems, expand Linux information, expand Blueprints for Linux on IBM systems, and click Using Intelligent Platform Management Interface (IPMI) on IBM Linux platforms.
You can view the IMM event log through the Event Log link in the integrated management module (IMM) Web interface.
The following table describes the methods that you can use to view the event logs, depending on the condition of the server. The first two conditions generally do not require that you restart the server.
Table 6. Methods for viewing event logs
Condition: The server is not hung and is connected to a network.
Action: Use any of the following methods:
• Run Portable or Installable DSA to view the event logs or create an output file that you can send to IBM service and support.
• Type the IP address of the IMM and go to the Event Log page.
• Use IPMItool to view the system-event log.
Condition: The server is not hung and is not connected to a network.
Action: Use IPMItool locally to view the system-event log.
Condition: The server is hung.
Action:
• If DSA Preboot is installed, restart the server and press F2 to start DSA Preboot and view the event logs.
• If DSA Preboot is not installed, insert the DSA Preboot CD and restart the server to start DSA Preboot and view the event logs.
• Alternatively, you can restart the server and press F1 to start the Setup utility and view the POST event log or system-event log. For more information, see “Viewing event logs through the Setup utility” on page 24.
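The choice of method in Table 6 can be sketched as a small lookup. This is a minimal illustration, not an IBM tool; the function name and the wording of the returned strings are hypothetical.

```python
from typing import List


def viewing_methods(server_hung: bool, on_network: bool) -> List[str]:
    """Return the Table 6 methods that apply to a server condition."""
    if server_hung:
        # A hung server must be restarted, whether or not it is networked.
        return [
            "DSA Preboot (restart and press F2, or start from the DSA Preboot CD)",
            "Setup utility (restart and press F1)",
        ]
    if on_network:
        return [
            "Portable or Installable DSA",
            "IMM Web interface, Event Log page",
            "IPMItool",
        ]
    return ["IPMItool (local)"]


for hung, networked in [(False, True), (False, False), (True, False)]:
    print(hung, networked, viewing_methods(hung, networked))
```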

POST error codes

When you turn on the server, it performs a series of tests to check the operation of the server components and some optional devices in the server. This series of tests is called the power-on self-test, or POST.
If a power-on password is set, you must type the password and press Enter, when you are prompted, for POST to run.
If POST is completed without detecting any problems, the server startup is completed.
If POST detects a problem, an error message is sent to the POST event log.
The following table describes the POST error codes and suggested actions to correct the detected problems. These errors can appear as severe, warning, or informational.
• Follow the suggested actions in the order in which they are listed in the Action column until the problem is solved.
• See Chapter 4, “Parts listing, System x3400 M3 Types 7378 and 7379,” on page 141 to determine which components are customer replaceable units (CRU) and which components are field replaceable units (FRU).
• If an action step is preceded by “(Trained service technician only),” that step must be performed only by a trained service technician.
Error code Description Action
0010002 Microprocessor not supported
1. Reseat the following components one at a time, in the order shown, restarting the server each time:
a. (Trained service technician only) Microprocessor 1
b. (Trained service technician only) Microprocessor 2 (if one is installed)
2. (Trained service technician only) Remove microprocessor 2 and restart the server.
3. (Trained service technician only) Remove microprocessor 1 and install microprocessor 2 in the microprocessor 1 connector. Restart the server. If the error is corrected, microprocessor 1 is bad and must be replaced.
4. Replace the following components one at a time, in the order shown, restarting the server each time:
a. (Trained service technician only) Microprocessor 1
b. (Trained service technician only) Microprocessor 2
c. (Trained service technician only) System board
0011000 Invalid microprocessor type
1. Update the firmware (see “Updating the firmware” on page 302).
2. (Trained service technician only) Remove and replace the affected microprocessor (error LED is lit) with a supported type.
0011002 Microprocessor mismatch
1. Run the Setup utility and view the microprocessor information to compare the installed microprocessor specifications.
2. (Trained service technician only) Remove and replace one of the microprocessors so that they both match.
0011004 Microprocessor failed BIST
1. Update the firmware (see “Updating the firmware” on page 302).
2. (Trained service technician only) Reseat microprocessor 2.
3. Replace the following components one at a time, in the order shown, restarting the server each time:
a. (Trained service technician only) Microprocessor
b. (Trained service technician only) System board
001100A Microcode update failed
1. Update the server firmware (see “Updating the firmware” on page 302).
2. (Trained service technician only) Replace the microprocessor.
0050001 DIMM disabled
Note: Each time you install or remove a DIMM, you must disconnect the server from the power source; then, wait 10 seconds before restarting the server.
1. Make sure the DIMM is installed correctly (see “Installing a memory module” on page 231).
2. If the DIMM was disabled because of a memory fault, follow the suggested actions for that error event and restart the server.
3. Check the IBM support website for an applicable RETAIN tip or firmware update that applies to this memory event. If no memory fault is recorded in the logs and no DIMM connector error LED is lit, you can re-enable the DIMM through the Setup utility or the Advanced Settings Utility (ASU).
0051003 Uncorrectable DIMM error
Note: Each time you install or remove a DIMM, you must disconnect the server from the power source; then, wait 10 seconds before restarting the server.
1. Check the IBM support website for an applicable RETAIN tip or firmware update that applies to this memory error.
2. Manually re-enable all affected DIMMs if the server firmware version is older than UEFI v1.10. If the server firmware version is UEFI v1.10 or newer, disconnect and reconnect the server to the power source and restart the server.
3. If the problem remains, replace the failing DIMM (see “Removing a memory module” on page 230 and “Installing a memory module” on page 231).
4. (Trained service technician only) If the problem occurs on the same DIMM connector, check the DIMM connector. If the connector contains any foreign material or is damaged, replace the system board (see “Removing the system board” on page 296 and “Installing the system board” on page 298).
5. (Trained service technician only) Remove the affected microprocessor and check the microprocessor socket pins for any damaged pins. If damage is found, replace the system board (see “Removing the system board” on page 296 and “Installing the system board” on page 298).
6. (Trained service technician only) Replace the affected microprocessor (see “Removing a microprocessor and heat sink” on page 284 and “Installing a microprocessor and heat sink” on page 286).
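Step 2 of this entry depends on whether the server firmware is older than UEFI v1.10. The following sketch shows why the comparison should be numeric, component by component: as plain text or as a decimal fraction, 1.9 would sort after 1.10. The function names are hypothetical, and the sketch assumes that firmware versions are written as dot-separated integers.

```python
from typing import Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn a string such as 'v1.10' into (1, 10) for numeric comparison."""
    return tuple(int(part) for part in text.strip().lstrip("vV").split("."))


def dimm_recovery_step(uefi_version: str) -> str:
    """Pick the step 2 action for the installed UEFI level."""
    if parse_version(uefi_version) < parse_version("v1.10"):
        return "Manually re-enable all affected DIMMs."
    return "Disconnect and reconnect the power source, then restart the server."


print(dimm_recovery_step("v1.9"))
print(dimm_recovery_step("v1.10"))
```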
0051006 DIMM mismatch detected
Make sure that the DIMMs match and are installed in the correct sequence (see “Installing a memory module” on page 231).
0051009 No memory detected
1. Make sure that the server contains DIMMs.
2. Reseat the DIMMs (see “Removing a memory module” on page 230 and “Installing a memory module” on page 231).
3. Install DIMMs in the correct sequence (see “Installing a memory module” on page 231).
4. (Trained service technician only) Replace the failing microprocessor (see “Removing a microprocessor and heat sink” on page 284 and “Installing a microprocessor and heat sink” on page 286).
5. (Trained service technician only) Replace the system board (see “Removing the system board” on page 296 and “Installing the system board” on page 298).
0600369 No memory detected
1. Make sure that the server contains DIMMs.
2. Reseat the DIMMs.
3. Install DIMMs in the correct sequence (see “Installing a memory module” on page 231).
4. (Trained service technician only) Replace the failing microprocessor.
5. (Trained service technician only) Replace the system board.
005100A No usable memory detected
1. Make sure that the server contains DIMMs.
2. Reseat the DIMMs (see “Removing a memory module” on page 230 and “Installing a memory module” on page 231).
3. Install DIMMs in the correct sequence (see “Installing a memory module” on page 231).
4. Clear CMOS memory to re-enable all the memory connectors (see “System board switches and jumpers” on page 17). Note that all firmware settings will be reset to the default settings.
0058001 PFA threshold exceeded
1. Check the IBM support website for an applicable RETAIN tip or firmware update that applies to this memory error.
2. Swap the affected DIMMs (as indicated by the error LEDs on the system board or the event logs) to a different memory channel or microprocessor (see “Installing a memory module” on page 231 for memory population).
3. If the error still occurs on the same DIMM, replace the affected DIMM.
4. (Trained service technician only) If the problem occurs on the same DIMM connector, check the DIMM connector. If the connector contains any foreign material or is damaged, replace the system board (see “Removing the system board” on page 296 and “Installing the system board” on page 298).
5. (Trained service technician only) Remove the affected microprocessor and check the microprocessor socket pins for any damaged pins. If damage is found, replace the system board (see “Removing the system board” on page 296 and “Installing the system board” on page 298).
6. (Trained service technician only) Replace the affected microprocessor (see “Removing a microprocessor and heat sink” on page 284 and “Installing a microprocessor and heat sink” on page 286).
0058007 DIMM population is unsupported
1. Reseat the DIMMs, and then restart the server (see “Removing a memory module” on page 230 and “Installing a memory module” on page 231).
2. Make sure that the DIMMs are installed in the proper sequence (see “Installing a memory module” on page 231).
0058008 DIMM failed memory test
1. Check the IBM support website for an applicable RETAIN tip or firmware update that applies to this memory error.
2. Manually re-enable all affected DIMMs if the server firmware version is older than UEFI v1.10. If the server firmware version is UEFI v1.10 or newer, disconnect and reconnect the server to the power source and restart the server.
3. Swap the affected DIMMs (as indicated by the error LEDs on the system board or the event logs) to a different memory channel or microprocessor (see “Installing a memory module” on page 231 for memory population).
4. If the problem is related to a DIMM, replace the failing DIMM (see “Removing a memory module” on page 230 and “Installing a memory module” on page 231).
5. (Trained service technician only) If the problem occurs on the same DIMM connector, check the DIMM connector. If the connector contains any foreign material or is damaged, replace the system board (see “Removing the system board” on page 296 and “Installing the system board” on page 298).
6. (Trained service technician only) Remove the affected microprocessor and check the microprocessor socket pins for any damaged pins. If damage is found, replace the system board (see “Removing the system board” on page 296 and “Installing the system board” on page 298).
7. (Trained service technician only) If the problem is related to microprocessor socket pins, replace the system board (see “Removing the system board” on page 296 and “Installing the system board” on page 298).
8. (Trained service technician only) Replace the affected microprocessor (see “Removing a microprocessor and heat sink” on page 284 and “Installing a microprocessor and heat sink” on page 286).
0058015 Start to Activate Spare Memory Channel
Information only. A failed DIMM has been detected, and the memory online-spare feature is being activated. Check the event log for uncorrected DIMM failure events.
Note: The memory online-spare feature is supported on server models with an Intel Xeon 5600 series microprocessor.
00580A1 Invalid DIMM population for mirroring mode
1. If a fault LED is lit, resolve the failure.
2. Install the DIMMs in the correct sequence (see “Installing a memory module” on page 231).
00580A4 Memory population changed
Information only. Memory has been added, moved, or changed.
00580A5 Mirror failover complete
Information only. Memory redundancy has been lost. Check the event log for uncorrected DIMM failure events (see “Event logs” on page 23).
00580A6 Spare Memory Channel Activated
Information only. Memory online-spare channel has been activated to back up a failed DIMM. Check the event log for uncorrected DIMM failure events.
Note: The memory online-spare feature is supported on server models with an Intel Xeon 5600 series microprocessor.
0068002 CMOS battery cleared
1. Reseat the battery.
2. Clear the CMOS memory (see “System board switches and jumpers” on page 17).
3. Replace the following components one at a time, in the following order, restarting the server after each one:
a. Battery
b. (Trained service technician only) System board
2011000 PCI-X PERR
1. Check the extender card LEDs.
2. Reseat all affected adapters and extender cards.
3. Update the PCI device firmware.
4. Remove the adapters from the extender card.
5. Replace the following components one at a time, in the order shown, restarting the server each time:
a. Extender card
b. (Trained service technician only) System board
2011001 PCI-X SERR
1. Check the extender-card LEDs.
2. Reseat all affected adapters and extender cards.
3. Update the PCI device firmware.
4. Remove the adapters from the extender card.
5. Replace the following components one at a time, in the order shown, restarting the server each time:
a. Extender card
b. (Trained service technician only) System board
2018001 PCI Express uncorrected error
1. Check the extender-card LEDs.
2. Reseat all affected adapters and extender cards.
3. Update the PCI device firmware.
4. Remove both adapters from the extender card.
5. Replace the following components one at a time, in the order shown, restarting the server each time:
a. Extender card
b. (Trained service technician only) System board
2018002 Option ROM resource allocation failure
Informational message that some devices might not be initialized.
1. If possible, rearrange the order of the adapters in the PCI slots to change the load order of the optional-device ROM code.
2. Run the Setup utility, select Start Options, and change the boot priority to change the load order of the optional-device ROM code.
3. Run the Setup utility and disable some other resources, if their functions are not being used, to make more space available. Select Devices and I/O Ports to disable any of the integrated devices.
4. Replace the following components one at a time, in the order shown, restarting the server each time:
a. Each adapter
b. (Trained service technician only) System board
3xx0007 (xx can be 00 - 19) Firmware fault detected, system halted
1. Recover the server firmware to the latest level.
2. Undo any recent configuration changes, or clear CMOS memory to restore the settings to the default values.
3. Remove any recently installed hardware.
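Because 3xx0007 stands for a range of codes rather than one fixed value, a lookup of POST error codes needs a pattern match in addition to an exact match. The following sketch illustrates the idea with a few codes copied from this table. The function name is hypothetical, and the sketch assumes that xx is a two-digit decimal value from 00 through 19, which this document does not state explicitly.

```python
import re

# A few fixed codes and descriptions copied from the POST error code table.
POST_CODES = {
    "0011002": "Microprocessor mismatch",
    "0051009": "No memory detected",
    "2011000": "PCI-X PERR",
}

# 3xx0007: xx is assumed here to be decimal, 00 through 19.
FIRMWARE_FAULT = re.compile(r"3(\d\d)0007")


def describe_post_code(code: str) -> str:
    """Return the description for a POST error code, or 'Unknown code'."""
    code = code.strip().upper()
    if code in POST_CODES:
        return POST_CODES[code]
    match = FIRMWARE_FAULT.fullmatch(code)
    if match and 0 <= int(match.group(1)) <= 19:
        return "Firmware fault detected, system halted"
    return "Unknown code"


for sample in ("0051009", "3050007", "3250007"):
    print(sample, "->", describe_post_code(sample))
```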
3038003 Firmware corrupted
1. Run the Setup utility, select Load Default Settings, and save the settings to recover the server firmware.
2. (Trained service technician only) Replace the system board.
3048005 Booted secondary (backup) server firmware image
Information only. The backup switch was used to boot the secondary bank.
3048006 Booted secondary (backup) server firmware image because of ABR
1. Run the Setup utility, select Load Default Settings, and save the settings to recover the primary server firmware settings.
2. Turn off the server and remove it from the power source.
3. Reconnect the server to the power source, and then turn on the server.
305000A RTC date/time is incorrect
1. Adjust the date and time settings in the Setup utility, and then restart the server.
2. Reseat the battery.
3. Replace the following components one at a time, in the order shown, restarting the server each time:
a. Battery
b. (Trained service technician only) System board
3058001 System configuration invalid
1. Run the Setup utility, and select Save Settings.
2. Run the Setup utility, select Load Default Settings, and save the settings.
3. Reseat the following components one at a time in the order shown, restarting the server each time:
a. Battery
b. Failing device (if the device is a FRU, it must be reseated by a trained service technician only)
4. Replace the following components one at a time, in the order shown, restarting the server each time:
a. Battery
b. Failing device (if the device is a FRU, it must be replaced by a trained service technician only)
c. (Trained service technician only) System board
3058004 Three boot failures
1. Undo any recent system changes, such as new settings or newly installed devices.
2. Make sure that the server is attached to a reliable power source.
3. Remove all hardware that is not listed on the ServerProven Web site.
4. Make sure that the operating system is not corrupted.
5. Run the Setup utility, save the configuration, and then restart the server.
3108007 System configuration restored to default settings
Information only. This message is usually associated with the CMOS battery clear event.
3138002 Boot configuration error
1. Remove any recent configuration changes that you made in the Setup utility.
2. Run the Setup utility, select Load Default Settings, and save the settings.
3808000 IMM communication failure
1. Remove power from the server for 30 seconds, and then reconnect the server to power and restart it.
2. Update the IMM firmware (see “Updating the firmware” on page 302).
3. Make sure that the virtual media key is seated and not damaged.
4. (Trained service technician only) Replace the system board.
3808002 Error updating system configuration to IMM
1. Remove power from the server, and then reconnect the server to power and restart it.
2. Run the Setup utility and select Save Settings.
3. Update the firmware.
3808003 Error retrieving system configuration from
IMM
1. Remove power from the server, and then reconnect the server to power and restart it.
2. Run the Setup utility and select Save Settings.
3. Update the IMM firmware.
3808004 IMM system-event log full
• When out-of-band, use the IMM Web interface or IPMItool to clear the logs from the operating system.
• When using the local console:
1. Run the Setup utility.
2. Select System Event Logs.
3. Select Clear System Event Log.
4. Restart the server.
3818001 Core Root of Trust Measurement (CRTM) update failed
1. Run the Setup utility, select Load Default Settings, and save the settings.
2. (Trained service technician only) Replace the system board.
3818002 Core Root of Trust Measurement (CRTM) update aborted
1. Run the Setup utility, select Load Default Settings, and save the settings.
2. (Trained service technician only) Replace the system board.
3818003 Core Root of Trust Measurement (CRTM) flash lock failed
1. Run the Setup utility, select Load Default Settings, and save the settings.
2. (Trained service technician only) Replace the system board.
3818004 Core Root of Trust Measurement (CRTM) system error
1. Run the Setup utility, select Load Default Settings, and save the settings.
2. (Trained service technician only) Replace the system board.
3818005 Current Bank Core Root of Trust Measurement (CRTM) capsule signature invalid
1. Run the Setup utility, select Load Default Settings, and save the settings.
2. (Trained service technician only) Replace the system board.
3818006 Opposite bank CRTM capsule signature invalid
1. Switch the firmware bank to the backup bank.
2. Run the Setup utility, select Load Default Settings, and save the settings.
3. Switch the bank back to the current bank.
4. (Trained service technician only) Replace the system board.
3818007 CRTM update capsule signature invalid
1. Run the Setup utility, select Load Default Settings, and save the settings.
2. (Trained service technician only) Replace the system board.
3828004 AEM power capping disabled
1. Check the settings and the event logs.
2. Make sure that the Active Energy Manager feature is enabled in the Setup utility. Select System Settings>Power>Active Energy Manager>Capping Enabled.
3. Update the server firmware.
4. Update the IMM firmware.

System-event log

The system-event log contains messages of three types:
Information
Information messages do not require action; they record significant system-level events, such as when the server is started.
Warning
Warning messages do not require immediate action; they indicate possible problems, such as when the recommended maximum ambient temperature is exceeded.
Error
Error messages might require action; they indicate system errors, such as when a fan is not detected.
Each message contains date and time information, and it indicates the source of the message (POST or the IMM).
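The structure of a system-event log message, as described above, can be sketched as a small record type. This is an illustration only: the class and field names are hypothetical and do not correspond to an IBM data format.

```python
from dataclasses import dataclass
from datetime import datetime
from enum import Enum


class Severity(Enum):
    INFORMATION = "Information"
    WARNING = "Warning"
    ERROR = "Error"


class Source(Enum):
    POST = "POST"
    IMM = "IMM"


@dataclass(frozen=True)
class SystemEvent:
    timestamp: datetime  # date and time information
    source: Source       # POST or the IMM
    severity: Severity
    text: str

    def might_require_action(self) -> bool:
        # Information messages need no action, Warning messages need no
        # immediate action, and Error messages might require action.
        return self.severity is Severity.ERROR


event = SystemEvent(datetime(2014, 6, 1, 12, 0), Source.IMM,
                    Severity.ERROR, "Fan not detected")
print(event.severity.value, event.might_require_action())
```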

Integrated management module error messages

The following table describes the IMM error messages and suggested actions to correct the detected problems. For more information about IMM, see the Integrated Management Module User's Guide at http://www.ibm.com/systems/support/supportsite.wss/docdisplay?lndocid=MIGR-5079770&brandind=5000008.
• Follow the suggested actions in the order in which they are listed in the Action column until the problem is solved.
• See Chapter 4, “Parts listing, System x3400 M3 Types 7378 and 7379,” on page 141 to determine which components are customer replaceable units (CRU) and which components are field replaceable units (FRU).
• If an action step is preceded by “(Trained service technician only),” that step must be performed only by a trained service technician.
Message Severity Description Action
Numeric sensor Ambient Temp going high (upper critical) has asserted.
Error An upper critical sensor going high has asserted.
Reduce the ambient temperature.
Numeric sensor Ambient Temp going high (upper non-recoverable) has asserted.
Error An upper nonrecoverable sensor going high has asserted.
Reduce the ambient temperature.
Numeric sensor Planar 3.3V going low (lower critical) has asserted.
Error A lower critical sensor going low has asserted.
(Trained service technician only) Replace the system board.
Numeric sensor Planar 3.3V going high (upper critical) has asserted.
Error An upper critical sensor going high has asserted.
(Trained service technician only) Replace the system board.
Numeric sensor Planar 5V going low (lower critical) has asserted.
Error A lower critical sensor going low has asserted.
(Trained service technician only) Replace the system board.
Numeric sensor Planar 5V going high (upper critical) has asserted.
Error An upper critical sensor going high has asserted.
(Trained service technician only) Replace the system board.
Numeric sensor Planar VBAT going low (lower critical) has asserted.
Error A lower critical sensor going low has asserted.
Replace the 3 V battery.
Numeric sensor Fan n Tach going low (lower critical) has asserted. (n = fan number)
Error A lower critical sensor going low has asserted.
1. Reseat the failing fan n, which is indicated by a lit LED on the fan.
2. Replace the failing fan.
(n = fan number)
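The numeric-sensor messages in this part of the table share one sentence pattern: a sensor name, a direction, and a threshold. The following sketch pulls those three fields out of a message. It assumes that the text in the log matches the wording in this table exactly; the function name is hypothetical.

```python
import re
from typing import Dict, Optional

SENSOR_EVENT = re.compile(
    r"Numeric sensor (?P<sensor>.+?) going (?P<direction>high|low) "
    r"\((?P<threshold>[^)]+)\) has asserted\."
)


def parse_sensor_event(message: str) -> Optional[Dict[str, str]]:
    """Split a numeric-sensor message into sensor, direction, and threshold."""
    match = SENSOR_EVENT.fullmatch(message.strip())
    return match.groupdict() if match else None


print(parse_sensor_event(
    "Numeric sensor Planar 3.3V going low (lower critical) has asserted."
))
```

A message that does not follow the pattern returns None, so the caller can fall back to displaying the raw text.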
The Processor CPU n Status has Failed with IERR. (n = microprocessor number)
Error A processor failed - IERR condition has occurred.
1. Make sure that the latest levels of firmware and device drivers are installed for all adapters and standard devices, such as Ethernet, SCSI, and SAS. Important: Some cluster solutions require specific code levels or coordinated code updates. If the device is part of a cluster solution, verify that the latest level of code is supported for the cluster solution before you update the code.
2. Run the DSA program for the hard disk drives and other I/O devices.
3. (Trained service technician only) Replace microprocessor n.
(n = microprocessor number)
An Over-Temperature Condition has been detected on the Processor CPU n Status. (n = microprocessor number)
Error An overtemperature condition has occurred for microprocessor n. (n = microprocessor number)
1. Make sure that the fans are operating, that there are no obstructions to the airflow, that the air baffle is in place and correctly installed, and that the server cover is installed and completely closed.
2. Make sure that the heat sink for microprocessor n is installed correctly.
3. (Trained service technician only) Replace microprocessor n.
(n = microprocessor number)
The Processor CPU n Status has Failed with FRB1/BIST condition. (n = microprocessor number)
Error A processor failed - FRB1/BIST condition has occurred.
1. Check for a server firmware update. Important: Some cluster solutions require specific code levels or coordinated code updates. If the device is part of a cluster solution, verify that the latest level of code is supported for the cluster solution before you update the code.
2. Make sure that the installed microprocessors are compatible with each other (see “Installing a microprocessor and heat sink” on page 286 for information about microprocessor requirements).
3. (Trained service technician only) Reseat microprocessor n.
4. (Trained service technician only) Replace microprocessor n.
(n = microprocessor number)
The Processor CPU n Status has a Configuration Mismatch. (n = microprocessor number)
Error A processor configuration mismatch has occurred.
1. Make sure that the installed microprocessors are compatible with each other (see “Installing a microprocessor and heat sink” on page 286 for information about microprocessor requirements).
2. (Trained service technician only) Replace the incompatible microprocessor.
An SM BIOS Uncorrectable CPU complex error for Processor CPU n Status has asserted. (n = microprocessor number)
Error An SMBIOS uncorrectable CPU complex error has asserted.
1. Check for a server firmware update. Important: Some cluster solutions require specific code levels or coordinated code updates. If the device is part of a cluster solution, verify that the latest level of code is supported for the cluster solution before you update the code.
2. Make sure that the installed microprocessors are compatible with each other (see “Installing a microprocessor and heat sink” on page 286 for information about microprocessor requirements).
3. (Trained service technician only) Reseat microprocessor n.
4. (Trained service technician only) Replace microprocessor n.
(n = microprocessor number)
Sensor CPU n OverTemp has transitioned to critical from a less severe state. (n = microprocessor number)
Error A sensor has changed to Critical state from a less severe state.
1. Make sure that the fans are operating, that there are no obstructions to the airflow, that the air baffle is in place and correctly installed, and that the server cover is installed and completely closed.
2. Make sure that the heat sink for microprocessor n is installed correctly.
3. (Trained service technician only) Replace microprocessor n.
(n = microprocessor number)
Sensor CPU n OverTemp has transitioned to non-recoverable from a less severe state. (n = microprocessor number)
Error A sensor has changed to Nonrecoverable state from a less severe state.
1. Make sure that the fans are operating, that there are no obstructions to the airflow, that the air baffle is in place and correctly installed, and that the server cover is installed and completely closed.
2. Make sure that the heat sink for microprocessor n is installed correctly.
3. (Trained service technician only) Replace microprocessor n.
(n = microprocessor number)
Sensor CPU n OverTemp has transitioned to critical from a non-recoverable state. (n = microprocessor number)
Error A sensor has changed to Critical state from Nonrecoverable state.
1. Make sure that the fans are operating, that there are no obstructions to the airflow, that the air baffle is in place and correctly installed, and that the server cover is installed and completely closed.
2. Make sure that the heat sink for microprocessor n is installed correctly.
3. (Trained service technician only) Replace microprocessor n.
(n = microprocessor number)
Sensor CPU n OverTemp has transitioned to non-recoverable. (n = microprocessor number)
Error A sensor has changed to Nonrecoverable state.
1. Make sure that the fans are operating, that there are no obstructions to the airflow, that the air baffle is in place and correctly installed, and that the server cover is installed and completely closed.
2. Make sure that the heat sink for microprocessor n is installed correctly.
3. (Trained service technician only) Replace microprocessor n.
(n = microprocessor number)
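The CPU OverTemp messages above differ only in the microprocessor number and the state transition. As a minimal sketch, assuming the logged text matches the wording shown in this table exactly, the following Python pattern extracts both. The names OVERTEMP and parse_overtemp are illustrative and are not part of any IBM tool.

```python
import re

# Hypothetical pattern built from the message texts listed in this table.
OVERTEMP = re.compile(
    r"Sensor CPU (?P<cpu>\d+) OverTemp has transitioned to "
    r"(?P<state>critical|non-recoverable)"
    r"(?: from (?P<previous>a less severe state|a non-recoverable state))?\."
)

def parse_overtemp(message):
    """Return (cpu number, new state, previous state or None), or None if no match."""
    match = OVERTEMP.search(message)
    if match is None:
        return None
    return int(match.group("cpu")), match.group("state"), match.group("previous")
```

All four transitions lead to the same action list (airflow, heat sink, then microprocessor replacement), so the parsed microprocessor number is the only value needed to carry out the steps.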
A diagnostic interrupt has occurred on system %1. (%1 = CIM_ComputerSystem.ElementName)
Error An operator information panel NMI/diagnostic interrupt has occurred.
If the NMI button on the system board has not been pressed, complete the following steps:
1. Make sure that the NMI button is not pressed.
2. Replace the operator information panel cable.
3. Replace the operator information panel.
A bus timeout has occurred on system %1. (%1 = CIM_ComputerSystem.ElementName)
Error A bus timeout has occurred.
1. Remove the adapter from the PCI slot that is indicated by a lit LED.
2. Replace the extender card.
3. Remove all PCI adapters.
4. (Trained service technician only) Replace the system board.
A software NMI has occurred on system %1. (%1 = CIM_ComputerSystem.ElementName)
Error A software NMI has occurred.
1. Check the device driver.
2. Reinstall the device driver.
The System %1 encountered a POST Error. (%1 = CIM_ComputerSystem.ElementName)
Error A POST error has occurred. (Sensor = ABR Status)
1. Recover the server firmware from the backup page (see “Recovering the server firmware” on page 134).
2. Update the server firmware to the latest level. Important: Some cluster solutions require specific code levels or coordinated code updates. If the device is part of a cluster solution, verify that the latest level of code is supported for the cluster solution before you update the code.
The System %1 encountered a POST Error. (%1 = CIM_ComputerSystem.ElementName)
Error A POST error has occurred. (Sensor = Firmware Error)
1. Make sure that the server contains DIMMs.
2. Reseat the DIMMs.
3. Install DIMMs in the correct sequence (see “Installing a memory module” on page 231).
4. Update the server firmware on the primary page. Important: Some cluster solutions require specific code levels or coordinated code updates. If the device is part of a cluster solution, verify that the latest level of code is supported for the cluster solution before you update the code.
5. (Trained service technician only) Replace the failing microprocessor.
6. (Trained service technician only) Replace the system board.
An Uncorrectable Bus Error has occurred on system %1. (%1 = CIM_ComputerSystem.ElementName)
Error A bus uncorrectable error has occurred. (Sensor = Critical Int PCI)
1. Check the system-event log.
2. Check the PCI error LEDs.
3. Remove the adapter from the indicated PCI slot.
4. Check for a server firmware update. Important: Some cluster solutions require specific code levels or coordinated code updates. If the device is part of a cluster solution, verify that the latest level of code is supported for the cluster solution before you update the code.
5. (Trained service technician only) Replace the system board.
An Uncorrectable Bus Error has occurred on system %1. (%1 = CIM_ComputerSystem.ElementName)
Error A bus uncorrectable error has occurred. (Sensor = Critical Int CPU)
1. Check the system-event log.
2. Check the microprocessor error LEDs.
3. Remove the failing microprocessor from the system board.
4. Check for a server firmware update. Important: Some cluster solutions require specific code levels or coordinated code updates. If the device is part of a cluster solution, verify that the latest level of code is supported for the cluster solution before you update the code.
5. Make sure that the two microprocessors are matching.
6. (Trained service technician only) Replace the system board.
An Uncorrectable Bus Error has occurred on system %1. (%1 = CIM_ComputerSystem.ElementName)
Error A bus uncorrectable error has occurred. (Sensor = Critical Int DIM)
1. Check the system-event log.
2. Check the DIMM error LEDs.
3. Remove the failing DIMM from the system board.
4. Check for a server firmware update. Important: Some cluster solutions require specific code levels or coordinated code updates. If the device is part of a cluster solution, verify that the latest level of code is supported for the cluster solution before you update the code.
5. Make sure that the installed DIMMs are supported and configured correctly.
6. (Trained service technician only) Replace the system board.
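The Uncorrectable Bus Error entries share one message text and are told apart only by the sensor named in the description (Critical Int PCI, Critical Int CPU, or Critical Int DIM). The following sketch shows one way to express that dispatch as a lookup table. The names BUS_ERROR_CHECKS and first_checks are hypothetical, and the step wording is condensed from the action lists in this table.

```python
# Keyed by the sensor name shown in the description column.
# Values: (LEDs to check, component to remove), condensed from the action lists.
BUS_ERROR_CHECKS = {
    "Critical Int PCI": ("PCI error LEDs", "adapter from the indicated PCI slot"),
    "Critical Int CPU": ("microprocessor error LEDs", "failing microprocessor"),
    "Critical Int DIM": ("DIMM error LEDs", "failing DIMM"),
}

def first_checks(sensor):
    """Return the first three actions for an Uncorrectable Bus Error on this sensor."""
    leds, component = BUS_ERROR_CHECKS[sensor]
    return [
        "Check the system-event log.",
        f"Check the {leds}.",
        f"Remove the {component}.",
    ]
```

The later steps (firmware update, compatibility check, and system board replacement by a trained service technician) are left out because they do not depend on the sensor in the same simple way.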
Sensor Sys Board Fault has transitioned to critical from a less severe state.
Error A sensor has changed to Critical state from a less severe state.
1. Check the system-event log.
2. Check for an error LED on the system board.
3. Replace any failing device.
4. Check for a server firmware update. Important: Some cluster solutions require specific code levels or coordinated code updates. If the device is part of a cluster solution, verify that the latest level of code is supported for the cluster solution before you update the code.
5. (Trained service technician only) Replace the system board.
The Power Supply (Power Supply: n) has Failed. (n = power supply number)
Error Power supply n has failed. (n = power supply number)
1. If the power-on LED is lit, complete the following steps:
a. Reduce the server to the minimum configuration.
b. Reinstall the components one at a time, restarting the server each time.
c. If the error recurs, replace the component that you just reinstalled.
2. Reseat power supply n.
3. Replace power supply n.
(n = power supply number)
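Steps a through c above describe a linear isolation search: start from the minimum configuration, add one component back per restart, and stop at the first component whose return brings the error back. The following is a minimal sketch of that loop. The function name and the error_recurs callback are hypothetical stand-ins for the manual restart-and-observe step, not an IBM interface.

```python
def isolate_failing_component(components, error_recurs):
    """Reinstall components one at a time and return the first that triggers the error.

    components: items removed to reach the minimum configuration, in reinstall order.
    error_recurs: callable taking the list of reinstalled components and returning
        True if the error comes back after the restart (a stand-in for the manual check).
    """
    installed = []
    for component in components:
        installed.append(component)
        if error_recurs(installed):
            return component  # step c: replace the component just reinstalled
    return None  # error did not recur; continue with the power supply steps
```

If the function returns None, no reinstalled component reproduced the fault, which corresponds to moving on to reseating and then replacing the power supply.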
Sensor PS n Fan Fault has transitioned to critical from a less severe state. (n = power supply number)
Error A sensor has changed to Critical state from a less severe state.
1. Make sure that there are no obstructions, such as bundled cables, to the airflow from the power-supply fan.
2. Replace power supply n.
(n = power supply number)
Sensor Pwr Rail A Fault has transitioned to non-recoverable.
Error A sensor has changed to Nonrecoverable state.
1. Turn off the server and disconnect it from power.
2. (Trained service technician only) Remove the PCI adapter and microprocessor 1. Reinstall the microprocessor in socket 1 and restart the server.
3. Restart the server.
4. Reinstall each device, one at a time, starting the server each time to isolate the failing device.
5. Replace the failing device.
6. (Trained service technician only) Replace the system board.
Sensor Pwr Rail B Fault has transitioned to non-recoverable.
Error A sensor has changed to Nonrecoverable state.
1. Turn off the server and disconnect it from power.
2. (Trained service technician only) Remove the PCI adapter and microprocessor 2.
3. Restart the server.
4. Reinstall each device, one at a time, starting the server each time to isolate the failing device.
5. Replace the failing device.
6. (Trained service technician only) Replace the system board.
Sensor Pwr Rail C Fault has transitioned to non-recoverable.
Error A sensor has changed to Nonrecoverable state.
1. Turn off the server and disconnect it from power.
2. Remove the hard disk drives, hard disk drive backplanes, and DIMMs in connectors 1 through 8.
3. Restart the server.
4. Reinstall each device, one at a time, starting the server each time to isolate the failing device.
5. Replace the failing device.
6. (Trained service technician only) Replace the system board.
Sensor Pwr Rail D Fault has transitioned to non-recoverable.
Error A sensor has changed to Nonrecoverable state.
1. Turn off the server and disconnect it from power.
2. Remove the optical drive and the DIMMs in connectors 9 through 16.
3. Restart the server.
4. Reinstall the microprocessor in socket 1 and restart the server.
5. (Trained service technician only) Replace the failing microprocessor.
6. (Trained service technician only) Replace the system board.
Sensor Pwr Rail E Fault has transitioned to non-recoverable.
Error A sensor has changed to Nonrecoverable state.
1. Turn off the server and disconnect it from power.
2. (Trained service technician only) Remove the optical drive and the PCI adapter.
3. Restart the server.
4. Reinstall each device, one at a time, starting the server each time to isolate the failing device.
5. Replace the failing device.
6. (Trained service technician only) Replace the system board.
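The Pwr Rail A through E entries above follow one pattern: the rail letter determines which components to remove in step 2 before restarting. The following sketch collects that mapping in one place. The names RAIL_COMPONENTS, RAIL_MESSAGE, and components_to_remove are hypothetical, the component lists are condensed from step 2 of each entry, and the pattern assumes the logged text matches the wording shown here.

```python
import re

# Components removed in step 2 for each rail, condensed from the entries above.
RAIL_COMPONENTS = {
    "A": ["PCI adapter", "microprocessor 1"],
    "B": ["PCI adapter", "microprocessor 2"],
    "C": ["hard disk drives", "hard disk drive backplanes", "DIMMs in connectors 1 through 8"],
    "D": ["optical drive", "DIMMs in connectors 9 through 16"],
    "E": ["optical drive", "PCI adapter"],
}

RAIL_MESSAGE = re.compile(
    r"Sensor Pwr Rail ([A-E]) Fault has transitioned to non-recoverable\."
)

def components_to_remove(message):
    """Return the step 2 component list for a rail fault message, or None if no match."""
    match = RAIL_MESSAGE.search(message)
    if match is None:
        return None
    return RAIL_COMPONENTS[match.group(1)]
```

Removing a microprocessor remains a step for trained service technicians only; the table is a reading aid and does not change who may perform each step.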
Sensor Pwr Rail F Fault has transitioned to non-recoverable.
Error A sensor has changed to Nonrecoverable state.
1. Turn off the server and disconnect it from power.
2. Remove the hard disk drives and the hard disk drive backplanes.
3. Restart the server.
4. Reinstall each device, one at a time, starting the server each time to isolate the failing device.
5. Replace the failing device.
6. (Trained service technician only) Replace the system board.
Sensor PS n Therm Fault has transitioned to critical from a less severe state. (n = power supply number)
Error A sensor has changed to Critical state from a less severe state.
1. Make sure that there are no obstructions, such as bundled cables, to the airflow from the power-supply fan.
2. Replace power supply n.
(n = power supply number)
Sensor PS n 12V OV Fault has transitioned to non-recoverable. (n = power supply number)
Error A sensor has changed to Nonrecoverable state.
1. Remove the power supplies.
2. Replace power supply n.
3. (Trained service technician only) Replace the system board.
(n = power supply number)
Sensor PS n 12V UV Fault has transitioned to non-recoverable. (n = power supply number)
Error A sensor has changed to Nonrecoverable state.
1. Remove the power supplies.
2. Replace power supply n.
3. (Trained service technician only) Replace the system board.
(n = power supply number)
Sensor PS n 12V OC Fault has transitioned to non-recoverable. (n = power supply number)
Error A sensor has changed to Nonrecoverable state.
1. Remove the power supplies.
2. Replace power supply n.
3. (Trained service technician only) Replace the system board.
(n = power supply number)
Sensor PS n VCO Fault has transitioned to non-recoverable. (n = power supply number)
Error A sensor has changed to Nonrecoverable state.
1. Replace the failing power supply.
(n = power supply number)
Redundancy Power Unit has been reduced.
Error Redundancy has been lost and is insufficient to continue operation.
1. Check the LEDs for both power supplies.
2. Follow the actions in “Power-supply LEDs” on page 96.
Sensor RAID Error has transitioned to critical from a less severe state.
Error A sensor has changed to Critical state from a less severe state.
1. Check the hard disk drive LEDs.
2. Reseat the hard disk drive for which the status LED is lit.
3. Replace the defective hard disk drive.
The Drive n Status has been removed from unit Drive 0 Status. (n = hard disk drive number)
Error A drive has been removed.
Reseat hard disk drive n.
(n = hard disk drive number)
The Drive n Status has been disabled due to a detected fault. (n = hard disk drive number)
Error A drive has been disabled because of a fault.
1. Run the hard disk drive diagnostic test on drive n.
2. Reseat the following components:
a. Hard disk drive
b. Cable from the system board to the backplane
3. Replace the following components one at a time, in the order shown, restarting the server each time:
a. Hard disk drive
b. Cable from the system board to the backplane
c. Hard disk drive backplane
(n = hard disk drive number)
Array %1 is in critical condition. (%1 = CIM_ComputerSystem.ElementName)
Error An array is in Critical state. (Sensor = Drive n Status) (n = hard disk drive number)
Replace the hard disk drive that is indicated by a lit status LED.
Array %1 has failed. (%1 = CIM_ComputerSystem.ElementName)
Error An array is in Failed state. (Sensor = Drive n Status) (n = hard disk drive number)
Replace the hard disk drive that is indicated by a lit status LED.
Memory uncorrectable error detected for DIMM All DIMMs on Memory Subsystem All DIMMs.
Error A memory uncorrectable error has occurred.
1. Check the IBM support website for an applicable retain tip or firmware update that applies to this memory error.
2. Manually re-enable all affected DIMMs if the server firmware version is older than UEFI v1.10. If the server firmware version is UEFI v1.10 or newer, disconnect and reconnect the server to the power source and restart the server.
3. Swap the affected DIMMs (as indicated by the error LEDs on the system board or the event logs) to a different memory channel or microprocessor (see “Installing a memory module” on page 231 for memory population).
4. If the problem follows the DIMM, replace the failing DIMM (see “Removing a memory module” on page 230 and “Installing a memory module” on page 231).
5. (Trained service technician only) If the problem occurs on the same DIMM connector, check the DIMM connector. If the connector contains any foreign material or is damaged, replace the system board (see “Removing the system board” on page 296 and “Installing the system board” on page 298).
6. (Trained service technician only) Remove the affected microprocessor and check the microprocessor socket for damaged pins. If damage is found, replace the system board (see “Removing the system board” on page 296 and “Installing the system board” on page 298).
7. (Trained service technician only) Replace the affected microprocessor (see “Removing a microprocessor and heat sink” on page 284 and “Installing a microprocessor and heat sink” on page 286).
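Step 2 of this procedure branches on whether the server firmware is older than UEFI v1.10. Version strings must be compared numerically, because a plain text comparison would rank "1.9" above "1.10". The following is a minimal sketch of that check, assuming the version is available as a string of the form "v1.10". The function names are hypothetical and no IBM utility is called.

```python
def parse_version(text):
    """Convert a string such as 'v1.10' into a tuple such as (1, 10)."""
    return tuple(int(part) for part in text.lstrip("vV").split("."))

def recovery_action(uefi_version):
    """Return the step 2 action for the given UEFI firmware version string."""
    if parse_version(uefi_version) < (1, 10):
        return "manually re-enable all affected DIMMs"
    return "disconnect and reconnect the server to the power source, then restart"
```

For example, recovery_action("v1.9") selects the manual re-enable path, while recovery_action("v1.10") selects the power-cycle path.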
Memory Logging Limit Reached for DIMM All DIMMs on Memory Subsystem All DIMMs.
Error The memory logging limit has been reached.
1. Check the IBM support website for an applicable retain tip or firmware update that applies to this memory error.
2. Swap the affected DIMMs (as indicated by the error LEDs on the system board or the event logs) to a different memory channel or microprocessor (see “Installing a memory module” on page 231 for memory population).
3. If the error still occurs on the same DIMM, replace the affected DIMM.
4. (Trained service technician only) If the problem occurs on the same DIMM connector, check the DIMM connector. If the connector contains any foreign material or is damaged, replace the system board (see “Removing the system board” on page 296 and “Installing the system board” on page 298).
5. (Trained service technician only) Remove the affected microprocessor and check the microprocessor socket for damaged pins. If damage is found, replace the system board (see “Removing the system board” on page 296 and “Installing the system board” on page 298).
6. (Trained service technician only) Replace the affected microprocessor (see “Removing a microprocessor and heat sink” on page 284 and “Installing a microprocessor and heat sink” on page 286).
Memory DIMM Configuration Error for All DIMMs on Memory Subsystem All DIMMs.
Error A DIMM configuration error has occurred.
Make sure that DIMMs are installed in the correct sequence and have the same size, type, speed, and technology.
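The DIMM configuration check asks that all installed DIMMs share the same size, type, speed, and technology. The following sketch compares those four attributes across an inventory that you supply by hand. The Dimm class, its field names, and the sample values are illustrative assumptions, and the sketch does not verify the installation sequence.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Dimm:
    size_gb: int
    dimm_type: str
    speed_mhz: int
    technology: str

def mismatched_fields(dimms):
    """Return the attribute names that are not identical across all DIMMs."""
    fields = ("size_gb", "dimm_type", "speed_mhz", "technology")
    return [name for name in fields if len({getattr(d, name) for d in dimms}) > 1]
```

An empty result means the four attributes match; a result such as ["size_gb"] points to the attribute to correct before you continue.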
Memory DIMM disabled for All DIMMs on Memory Subsystem All DIMMs.
Info DIMM disabled
1. Make sure the DIMM is installed correctly (see “Installing a memory module” on page 231).
2. If the DIMM was disabled because of a memory fault (memory uncorrectable error or memory logging limit reached), follow the suggested actions for that error event and restart the server.
3. Check the IBM support website for an applicable retain tip or firmware update that applies to this memory event. If no memory fault is recorded in the logs and no DIMM connector error LED is lit, you can re-enable the DIMM through the Setup utility or the Advanced Settings Utility (ASU).
Memory uncorrectable error detected for DIMM One of the DIMMs on Memory Subsystem One of the DIMMs.
Error A memory uncorrectable error has occurred.
1. Check the IBM support website for an applicable retain tip or firmware update that applies to this memory error.
2. Manually re-enable all affected DIMMs if the server firmware version is older than UEFI v1.10. If the server firmware version is UEFI v1.10 or newer, disconnect and reconnect the server to the power source and restart the server.
3. Swap the affected DIMMs (as indicated by the error LEDs on the system board or the event logs) to a different memory channel or microprocessor (see “Installing a memory module” on page 231 for memory population).
4. If the problem follows the DIMM, replace the failing DIMM (see “Removing a memory module” on page 230 and “Installing a memory module” on page 231).
5. (Trained service technician only) If the problem occurs on the same DIMM connector, check the DIMM connector. If the connector contains any foreign material or is damaged, replace the system board (see “Removing the system board” on page 296 and “Installing the system board” on page 298).
6. (Trained service technician only) Remove the affected microprocessor and check the microprocessor socket for damaged pins. If damage is found, replace the system board (see “Removing the system board” on page 296 and “Installing the system board” on page 298).
7. (Trained service technician only) Replace the affected microprocessor (see “Removing a microprocessor and heat sink” on page 284 and “Installing a microprocessor and heat sink” on page 286).
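Steps 3 through 5 of this procedure form a swap test: move the suspect DIMM to another connector and see whether the error moves with it. The following is a minimal sketch of the decision, assuming connectors are identified by number. The function name and return labels are hypothetical.

```python
def diagnose_after_swap(original_connector, new_connector, error_connector_after_swap):
    """Interpret where the error is reported after the suspect DIMM has been moved.

    Returns 'dimm' if the error followed the DIMM to its new connector,
    'connector' if it stayed on the original connector, else 'inconclusive'.
    """
    if error_connector_after_swap == new_connector:
        return "dimm"  # step 4: replace the failing DIMM
    if error_connector_after_swap == original_connector:
        return "connector"  # step 5: trained service technician checks the connector
    return "inconclusive"
```

A result of "connector" leads into the steps reserved for trained service technicians, beginning with inspection of the DIMM connector.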
Memory Logging Limit Reached for DIMM One of the DIMMs on Memory Subsystem One of the DIMMs.
Error The memory logging limit has been reached.
1. Check the IBM support website for an applicable retain tip or firmware update that applies to this memory error.
2. Swap the affected DIMMs (as indicated by the error LEDs on the system board or the event logs) to a different memory channel or microprocessor (see “Installing a memory module” on page 231 for memory population).
3. If the error still occurs on the same DIMM, replace the affected DIMM.
4. (Trained service technician only) If the problem occurs on the same DIMM connector, check the DIMM connector. If the connector contains any foreign material or is damaged, replace the system board (see “Removing the system board” on page 296 and “Installing the system board” on page 298).
5. (Trained service technician only) Remove the affected microprocessor and check the microprocessor socket for damaged pins. If damage is found, replace the system board (see “Removing the system board” on page 296 and “Installing the system board” on page 298).
6. (Trained service technician only) Replace the affected microprocessor (see “Removing a microprocessor and heat sink” on page 284 and “Installing a microprocessor and heat sink” on page 286).
Memory DIMM Configuration Error for One of the DIMMs on Memory Subsystem One of the DIMMs.
Error A DIMM configuration error has occurred.
Make sure that DIMMs are installed in the correct sequence and have the same size, type, speed, and technology.
Message: Memory DIMM disabled for One of the DIMMs on Memory Subsystem One of the DIMMs.
Severity: Info
Description: DIMM disabled.
Action:
1. Make sure the DIMM is installed correctly (see “Installing a memory module” on page 231).
2. If the DIMM was disabled because of a memory fault (memory uncorrectable error or memory logging limit reached), follow the suggested actions for that error event and restart the server.
3. Check the IBM support website for an applicable RETAIN tip or firmware update that applies to this memory event. If no memory fault is recorded in the logs and no DIMM connector error LED is lit, you can re-enable the DIMM through the Setup utility or the Advanced Settings Utility (ASU).
Message: Memory DIMM scrub failure for DIMM n Status on Memory Subsystem DIMM n Status. (n = DIMM number)
Severity: Error
Description: DIMM scrub failure.
Action:
1. Check the IBM support website for an applicable RETAIN tip or firmware update that applies to this memory error.
2. Manually re-enable all affected DIMMs if the server firmware version is older than UEFI v1.10. If the server firmware version is UEFI v1.10 or newer, disconnect and reconnect the server to the power source and restart the server.
3. Swap the affected DIMMs (as indicated by the error LEDs on the system board or the event logs) to a different memory channel or microprocessor (see “Installing a memory module” on page 231 for memory population).
4. If the problem is related to a DIMM, replace the failing DIMM (see “Removing a memory module” on page 230 and “Installing a memory module” on page 231).
5. (Trained service technician only) If the problem occurs on the same DIMM connector, check the DIMM connector. If the connector contains any foreign material or is damaged, replace the system board (see “Removing the system board” on page 296 and “Installing the system board” on page 298).
6. (Trained service technician only) Remove the affected microprocessor and check the microprocessor socket pins for any damaged pins. If damage is found, replace the system board (see “Removing the system board” on page 296 and “Installing the system board” on page 298).
7. (Trained service technician only) If the problem is related to microprocessor socket pins, replace the system board (see “Removing the system board” on page 296 and “Installing the system board” on page 298).
8. (Trained service technician only) Replace the affected microprocessor (see “Removing a microprocessor and heat sink” on page 284 and “Installing a microprocessor and heat sink” on page 286).
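Step 2 of this procedure branches on the server firmware level. The following Python fragment is only a sketch of that comparison: it assumes the UEFI level is available as a string such as "v1.10" and that version parts compare numerically (so 1.9 is older than 1.10). The function names are hypothetical, and nothing here queries the server.

```python
def parse_uefi_version(text):
    """Convert a version string such as 'v1.10' into a tuple of ints, e.g. (1, 10)."""
    cleaned = text.strip().lower().lstrip("v")
    return tuple(int(part) for part in cleaned.split("."))


def step_two_action(uefi_version):
    """Return the step 2 action for the given firmware level (assumed string format)."""
    if parse_uefi_version(uefi_version) < (1, 10):
        return "manually re-enable all affected DIMMs"
    return "disconnect and reconnect the power source, then restart the server"
```

Comparing tuples of integers instead of raw strings avoids treating "1.9" as newer than "1.10".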
Message: Memory uncorrectable error detected for DIMM n Status on Memory Subsystem DIMM n Status. (n = DIMM number)
Severity: Error
Description: A memory uncorrectable error has occurred.
Action:
1. Check the IBM support website for an applicable RETAIN tip or firmware update that applies to this memory error.
2. Manually re-enable all affected DIMMs if the server firmware version is older than UEFI v1.10. If the server firmware version is UEFI v1.10 or newer, disconnect and reconnect the server to the power source and restart the server.
3. Swap the affected DIMMs (as indicated by the error LEDs on the system board or the event logs) to a different memory channel or microprocessor (see “Installing a memory module” on page 231 for memory population).
4. If the problem follows the DIMM, replace the failing DIMM (see “Removing a memory module” on page 230 and “Installing a memory module” on page 231).
5. (Trained service technician only) If the problem occurs on the same DIMM connector, check the DIMM connector. If the connector contains any foreign material or is damaged, replace the system board (see “Removing the system board” on page 296 and “Installing the system board” on page 298).
6. (Trained service technician only) Remove the affected microprocessor and check the microprocessor socket pins for any damaged pins. If damage is found, replace the system board (see “Removing the system board” on page 296 and “Installing the system board” on page 298).
7. (Trained service technician only) Replace the affected microprocessor (see “Removing a microprocessor and heat sink” on page 284 and “Installing a microprocessor and heat sink” on page 286).
Message: Memory Logging Limit Reached for DIMM n Status on Memory Subsystem DIMM n Status. (n = DIMM number)
Severity: Error
Description: The memory logging limit has been reached.
Action:
1. Check the IBM support website for an applicable RETAIN tip or firmware update that applies to this memory error.
2. Swap the affected DIMMs (as indicated by the error LEDs on the system board or the event logs) to a different memory channel or microprocessor (see “Installing a memory module” on page 231 for memory population).
3. If the error still occurs on the same DIMM, replace the affected DIMM.
4. (Trained service technician only) If the problem occurs on the same DIMM connector, check the DIMM connector. If the connector contains any foreign material or is damaged, replace the system board (see “Removing the system board” on page 296 and “Installing the system board” on page 298).
5. (Trained service technician only) Remove the affected microprocessor and check the microprocessor socket pins for any damaged pins. If damage is found, replace the system board (see “Removing the system board” on page 296 and “Installing the system board” on page 298).
6. (Trained service technician only) Replace the affected microprocessor (see “Removing a microprocessor and heat sink” on page 284 and “Installing a microprocessor and heat sink” on page 286).

Message: Memory DIMM Configuration Error for DIMM n Status on Memory Subsystem DIMM n Status. (n = DIMM number)
Severity: Error
Description: A DIMM configuration error has occurred.
Action: Make sure that DIMMs are installed in the correct sequence and have the same size, type, speed, and technology.
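The requirement that DIMMs have the same size, type, speed, and technology can be illustrated with a short Python sketch. The `Dimm` record and its field names are hypothetical, chosen only to mirror the four attributes named above; the sketch does not read any inventory data from the server and does not check the installation sequence.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dimm:
    # Hypothetical fields mirroring the four attributes named in the action text.
    size_gb: int
    dimm_type: str
    speed_mhz: int
    technology: str


def mismatched_fields(dimms):
    """Return the names of attributes that differ across the given DIMMs."""
    names = ("size_gb", "dimm_type", "speed_mhz", "technology")
    return [name for name in names
            if len({getattr(dimm, name) for dimm in dimms}) > 1]
```

An empty result means that the listed DIMMs agree on all four attributes.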
Message: Memory DIMM disabled for DIMM n Status on Memory Subsystem DIMM n Status. (n = DIMM number)
Severity: Info
Description: DIMM disabled.
Action:
1. Make sure the DIMM is installed correctly (see “Installing a memory module” on page 231).
2. If the DIMM was disabled because of a memory fault (memory uncorrectable error or memory logging limit reached), follow the suggested actions for that error event and restart the server.
3. Check the IBM support website for an applicable RETAIN tip or firmware update that applies to this memory event. If no memory fault is recorded in the logs and no DIMM connector error LED is lit, you can re-enable the DIMM through the Setup utility or the Advanced Settings Utility (ASU).

Message: Sensor DIMM n Temp has transitioned to critical from a less severe state. (n = DIMM number)
Severity: Error
Description: A sensor has changed to Critical state from a less severe state.
Action:
1. Make sure that the fans are operating, that there are no obstructions to the airflow, that the air baffles are in place and correctly installed, and that the server cover is installed and completely closed.
2. If a fan has failed, complete the action for a fan failure.
3. Replace DIMM n.
(n = DIMM number)
Message: A PCI PERR has occurred on system %1. (%1 = CIM_ComputerSystem.ElementName)
Severity: Error
Description: A PCI PERR has occurred. (Sensor = PCI Slot n; n = PCI slot number)
Action:
1. Check the extender-card LEDs.
2. Reseat the affected adapters and extender card.
3. Update the server and adapter firmware (UEFI and IMM). Important: Some cluster solutions require specific code levels or coordinated code updates. If the device is part of a cluster solution, verify that the latest level of code is supported for the cluster solution before you update the code.
4. Remove the adapter from slot n.
5. Replace the PCIe adapter.
6. Replace extender card n.
(n = PCI slot number)

Message: A PCI SERR has occurred on system %1. (%1 = CIM_ComputerSystem.ElementName)
Severity: Error
Description: A PCI SERR has occurred. (Sensor = PCI Slot n; n = PCI slot number)
Action:
1. Check the extender-card LEDs.
2. Reseat the affected adapters and extender card.
3. Update the server and adapter firmware (UEFI and IMM). Important: Some cluster solutions require specific code levels or coordinated code updates. If the device is part of a cluster solution, verify that the latest level of code is supported for the cluster solution before you update the code.
4. Remove the adapter from slot n.
5. Replace the PCIe adapter.
6. Replace extender card n.
(n = PCI slot number)
Message: A PCI PERR has occurred on system %1. (%1 = CIM_ComputerSystem.ElementName)
Severity: Error
Description: A PCI PERR has occurred. (Sensor = One of PCI Err)
Action:
1. Check the extender-card LEDs.
2. Reseat the affected adapters and riser card.
3. Update the server and adapter firmware (UEFI and IMM). Important: Some cluster solutions require specific code levels or coordinated code updates. If the device is part of a cluster solution, verify that the latest level of code is supported for the cluster solution before you update the code.
4. Remove both adapters.
5. Replace the PCIe adapter.
6. Replace the extender card.
7. (Trained service technician only) Replace the system board.

Message: A PCI SERR has occurred on system %1. (%1 = CIM_ComputerSystem.ElementName)
Severity: Error
Description: A PCI SERR has occurred. (Sensor = One of PCI Err)
Action:
1. Check the extender-card LEDs.
2. Reseat the affected adapters and extender card.
3. Update the server and adapter firmware (UEFI and IMM). Important: Some cluster solutions require specific code levels or coordinated code updates. If the device is part of a cluster solution, verify that the latest level of code is supported for the cluster solution before you update the code.
4. Remove both adapters.
5. Replace the PCIe adapter.
6. Replace the extender card.
7. (Trained service technician only) Replace the system board.
Message: Fault in slot System board on system %1. (%1 = CIM_ComputerSystem.ElementName)
Severity: Error
Action:
1. Check the extender-card LEDs.
2. Reseat the affected adapters and extender card.
3. Update the server and adapter firmware (UEFI and IMM). Important: Some cluster solutions require specific code levels or coordinated code updates. If the device is part of a cluster solution, verify that the latest level of code is supported for the cluster solution before you update the code.
4. Remove both adapters.
5. Replace the PCIe adapter.
6. Replace the extender card.
7. (Trained service technician only) Replace the system board.

Message: Redundancy Bckup Mem Status has been reduced.
Severity: Error
Description: Redundancy has been lost and is insufficient to continue operation.
Action:
1. Check the system-event log for DIMM failure events (uncorrectable or PFA) and correct the failures.
2. Re-enable mirroring in the Setup utility.

Message: IMM Network Initialization Complete.
Severity: Info
Description: An IMM network has completed initialization.
Action: No action; information only.

Message: Certificate Authority %1 has detected a %2 Certificate Error. (%1 = IBM_CertificateAuthority.CADistinguishedName; %2 = CIM_PublicKeyCertificate.ElementName)
Severity: Error
Description: A problem has occurred with the SSL Server, SSL Client, or SSL Trusted CA certificate that has been imported into the IMM. The imported certificate must contain a public key that corresponds to the key pair that was previously generated by the Generate a New Key and Certificate Signing Request link.
Action:
1. Make sure that the certificate that you are importing is correct.
2. Try importing the certificate again.

Message: Ethernet Data Rate modified from %1 to %2 by user %3. (%1 = CIM_EthernetPort.Speed; %2 = CIM_EthernetPort.Speed; %3 = user ID)
Severity: Info
Description: A user has modified the Ethernet port data rate.
Action: No action; information only.
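
The messages in this table use numbered placeholders (%1, %2, and so on) that are replaced by values such as a user ID or a port speed. As an illustration only, the following Python sketch fills such a template; the function name `fill_message` is hypothetical and this is not how the IMM firmware itself builds its log text.

```python
import re


def fill_message(template, *values):
    """Replace %1, %2, ... in a message template with the given values.

    Placeholders with no matching value are left unchanged.
    """
    def substitute(match):
        index = int(match.group(1)) - 1
        if 0 <= index < len(values):
            return str(values[index])
        return match.group(0)

    return re.sub(r"%(\d+)", substitute, template)
```

For example, `fill_message("Ethernet Data Rate modified from %1 to %2 by user %3.", 100, 1000, "admin")` yields a line in which the old rate, the new rate, and the user ID appear in that order.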
Message: Ethernet Duplex setting modified from %1 to %2 by user %3. (%1 = CIM_EthernetPort.FullDuplex; %2 = CIM_EthernetPort.FullDuplex; %3 = user ID)
Severity: Info
Description: A user has modified the Ethernet port duplex setting.
Action: No action; information only.

Message: Ethernet MTU setting modified from %1 to %2 by user %3. (%1 = CIM_EthernetPort.ActiveMaximumTransmissionUnit; %2 = CIM_EthernetPort.ActiveMaximumTransmissionUnit; %3 = user ID)
Severity: Info
Description: A user has modified the Ethernet port MTU setting.
Action: No action; information only.

Message: Ethernet Duplex setting modified from %1 to %2 by user %3. (%1 = CIM_EthernetPort.NetworkAddresses; %2 = CIM_EthernetPort.NetworkAddresses; %3 = user ID)
Severity: Info
Description: A user has modified the Ethernet port MAC address setting.
Action: No action; information only.

Message: Ethernet interface %1 by user %2. (%1 = CIM_EthernetPort.EnabledState; %2 = user ID)
Severity: Info
Description: A user has enabled or disabled the Ethernet interface.
Action: No action; information only.

Message: Hostname set to %1 by user %2. (%1 = CIM_DNSProtocolEndpoint.Hostname; %2 = user ID)
Severity: Info
Description: A user has modified the host name of the IMM.
Action: No action; information only.

Message: IP address of network interface modified from %1 to %2 by user %3. (%1 = CIM_IPProtocolEndpoint.IPv4Address; %2 = CIM_StaticIPAssignmentSettingData.IPAddress; %3 = user ID)
Severity: Info
Description: A user has modified the IP address of the IMM.
Action: No action; information only.

Message: IP subnet mask of network interface modified from %1 to %2 by user %3s. (%1 = CIM_IPProtocolEndpoint.SubnetMask; %2 = CIM_StaticIPAssignmentSettingData.SubnetMask; %3 = user ID)
Severity: Info
Description: A user has modified the IP subnet mask of the IMM.
Action: No action; information only.
Message: IP address of default gateway modified from %1 to %2 by user %3s. (%1 = CIM_IPProtocolEndpoint.GatewayIPv4Address; %2 = CIM_StaticIPAssignmentSettingData.DefaultGatewayAddress; %3 = user ID)
Severity: Info
Description: A user has modified the default gateway IP address of the IMM.
Action: No action; information only.

Message: OS Watchdog response %1 by %2. (%1 = Enabled or Disabled; %2 = user ID)
Severity: Info
Description: A user has enabled or disabled an OS Watchdog.
Action: No action; information only.

Message: DHCP[%1] failure, no IP address assigned. (%1 = IP address, xxx.xxx.xxx.xxx)
Severity: Info
Description: A DHCP server has failed to assign an IP address to the IMM.
Action:
1. Make sure that the network cable is connected.
2. Make sure that there is a DHCP server on the network that can assign an IP address to the IMM.

Message: Remote Login Successful. Login ID: %1 from %2 at IP address %3. (%1 = user ID; %2 = ValueMap(CIM_ProtocolEndpoint.ProtocolIFType); %3 = IP address, xxx.xxx.xxx.xxx)
Severity: Info
Description: A user has successfully logged in to the IMM.
Action: No action; information only.

Message: Attempting to %1 server %2 by user %3. (%1 = Power Up, Power Down, Power Cycle, or Reset; %2 = IBM_ComputerSystem.ElementName; %3 = user ID)
Severity: Info
Description: A user has used the IMM to perform a power function on the server.
Action: No action; information only.

Message: Security: Userid: '%1' had %2 login failures from WEB client at IP address %3. (%1 = user ID; %2 = MaximumSuccessiveLoginFailures (currently set to 5 in the firmware); %3 = IP address, xxx.xxx.xxx.xxx)
Severity: Error
Description: A user has exceeded the maximum number of unsuccessful login attempts from a Web browser and has been prevented from logging in for the lockout period.
Action:
1. Make sure that the correct login ID and password are being used.
2. Have the system administrator reset the login ID or password.

Message: Security: Login ID: '%1' had %2 login failures from CLI at %3. (%1 = user ID; %2 = MaximumSuccessiveLoginFailures (currently set to 5 in the firmware); %3 = IP address, xxx.xxx.xxx.xxx)
Severity: Error
Description: A user has exceeded the maximum number of unsuccessful login attempts from the command-line interface and has been prevented from logging in for the lockout period.
Action:
1. Make sure that the correct login ID and password are being used.
2. Have the system administrator reset the login ID or password.
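
The two security messages above are triggered when a user reaches the MaximumSuccessiveLoginFailures value (5, according to the message text). The Python sketch below shows one way such a counter could behave. It is an illustration under stated assumptions, not the IMM implementation: the class name is hypothetical, a successful login is assumed to reset the count, and the lockout period itself is not modeled because its length is not given here.

```python
from collections import defaultdict

MAX_SUCCESSIVE_LOGIN_FAILURES = 5  # value quoted in the message text


class LoginFailureTracker:
    """Count successive failed logins per user ID (illustrative only)."""

    def __init__(self, limit=MAX_SUCCESSIVE_LOGIN_FAILURES):
        self.limit = limit
        self.failures = defaultdict(int)

    def record(self, user_id, success):
        """Record one login attempt; return True if the limit has been reached."""
        if success:
            # Assumption: a successful login clears the successive-failure count.
            self.failures[user_id] = 0
            return False
        self.failures[user_id] += 1
        return self.failures[user_id] >= self.limit
```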
Message: Remote access attempt failed. Invalid userid or password received. Userid is '%1' from WEB browser at IP address %2. (%1 = user ID; %2 = IP address, xxx.xxx.xxx.xxx)
Severity: Error
Description: A user has attempted to log in from a Web browser by using an invalid login ID or password.
Action:
1. Make sure that the correct login ID and password are being used.
2. Have the system administrator reset the login ID or password.

Message: Remote access attempt failed. Invalid userid or password received. Userid is '%1' from TELNET client at IP address %2. (%1 = user ID; %2 = IP address, xxx.xxx.xxx.xxx)
Severity: Error
Description: A user has attempted to log in from a Telnet session by using an invalid login ID or password.
Action:
1. Make sure that the correct login ID and password are being used.
2. Have the system administrator reset the login ID or password.

Message: The Chassis Event Log (CEL) on system %1 cleared by user %2. (%1 = CIM_ComputerSystem.ElementName; %2 = user ID)
Severity: Info
Description: A user has cleared the IMM event log.
Action: No action; information only.

Message: IMM reset was initiated by user %1. (%1 = user ID)
Severity: Info
Description: A user has initiated a reset of the IMM.
Action: No action; information only.

Message: ENET[0] DHCP-HSTN=%1, DN=%2, IP@=%3, SN=%4, GW@=%5, DNS1@=%6. (%1 = CIM_DNSProtocolEndpoint.Hostname; %2 = CIM_DNSProtocolEndpoint.DomainName; %3 = CIM_IPProtocolEndpoint.IPv4Address; %4 = CIM_IPProtocolEndpoint.SubnetMask; %5 = IP address, xxx.xxx.xxx.xxx; %6 = IP address, xxx.xxx.xxx.xxx)
Severity: Info
Description: The DHCP server has assigned an IMM IP address and configuration.
Action: No action; information only.

Message: ENET[0] IP-Cfg:HstName=%1, IP@%2, NetMsk=%3, GW@=%4. (%1 = CIM_DNSProtocolEndpoint.Hostname; %2 = CIM_StaticIPSettingData.IPv4Address; %3 = CIM_StaticIPSettingData.SubnetMask; %4 = CIM_StaticIPSettingData.DefaultGatewayAddress)
Severity: Info
Description: An IMM IP address and configuration have been assigned using client data.
Action: No action; information only.

Message: LAN: Ethernet[0] interface is no longer active.
Severity: Info
Description: The IMM Ethernet interface has been disabled.
Action: No action; information only.

Message: LAN: Ethernet[0] interface is now active.
Severity: Info
Description: The IMM Ethernet interface has been enabled.
Action: No action; information only.
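
When event logs are saved as text, the DHCP assignment message listed above can be split back into its fields. The following Python sketch assumes the saved line follows the template `ENET[0] DHCP-HSTN=%1, DN=%2, IP@=%3, SN=%4, GW@=%5, DNS1@=%6.` exactly; the function name, the dictionary keys, and the sample values used with it are hypothetical.

```python
import re

# Pattern derived from the message template; the group names are illustrative.
DHCP_PATTERN = re.compile(
    r"ENET\[0\] DHCP-HSTN=(?P<hostname>[^,]*), DN=(?P<domain>[^,]*), "
    r"IP@=(?P<ip>[^,]*), SN=(?P<subnet>[^,]*), GW@=(?P<gateway>[^,]*), "
    r"DNS1@=(?P<dns1>.*?)\.?$"
)


def parse_dhcp_event(line):
    """Return the fields of a DHCP assignment message, or None if it does not match."""
    match = DHCP_PATTERN.search(line.strip())
    return match.groupdict() if match else None
```

Lines that do not follow the template return `None`, so the function can be applied to every line of a saved log without raising an error.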
Message: DHCP setting changed to by user %1. (%1 = user ID)
Severity: Info
Description: A user has changed the DHCP mode.
Action: No action; information only.

Message: IMM: Configuration %1 restored from a configuration file by user %2. (%1 = CIM_ConfigurationData.ConfigurationName; %2 = user ID)
Severity: Info
Description: A user has restored the IMM configuration by importing a configuration file.
Action: No action; information only.

Message: Watchdog %1 Screen Capture Occurred. (%1 = OS Watchdog or Loader Watchdog)
Severity: Error
Description: An operating-system error has occurred, and the screen capture was successful.
Action:
1. Reconfigure the watchdog timer to a higher value.
2. Make sure that the IMM Ethernet over USB interface is enabled.
3. Reinstall the RNDIS or cdc_ether device driver for the operating system.
4. Disable the watchdog.
5. Check the integrity of the installed operating system.

Message: Watchdog %1 Failed to Capture Screen. (%1 = OS Watchdog or Loader Watchdog)
Severity: Error
Description: An operating-system error has occurred, and the screen capture failed.
Action:
1. Reconfigure the watchdog timer to a higher value.
2. Make sure that the IMM Ethernet over USB interface is enabled.
3. Reinstall the RNDIS or cdc_ether device driver for the operating system.
4. Disable the watchdog.
5. Check the integrity of the installed operating system.
6. Update the IMM firmware. Important: Some cluster solutions require specific code levels or coordinated code updates. If the device is part of a cluster solution, verify that the latest level of code is supported for the cluster solution before you update the code.
Message: Running the backup IMM main application.
Severity: Error
Description: The IMM has resorted to running the backup main application.
Action: Update the IMM firmware. Important: Some cluster solutions require specific code levels or coordinated code updates. If the device is part of a cluster solution, verify that the latest level of code is supported for the cluster solution before you update the code.

Message: Please ensure that the IMM is flashed with the correct firmware. The IMM is unable to match its firmware to the server.
Severity: Error
Description: The server does not support the installed IMM firmware version.
Action: Update the IMM firmware to a version that the server supports. Important: Some cluster solutions require specific code levels or coordinated code updates. If the device is part of a cluster solution, verify that the latest level of code is supported for the cluster solution before you update the code.

Message: IMM reset was caused by restoring default values.
Severity: Info
Description: The IMM has been reset because a user has restored the configuration to its default settings.
Action: No action; information only.

Message: IMM clock has been set from NTP server %1. (%1 = IBM_NTPService.ElementName)
Severity: Info
Description: The IMM clock has been set to the date and time that is provided by the Network Time Protocol server.
Action: No action; information only.

Message: SSL data in the IMM configuration data is invalid. Clearing configuration data region and disabling SSL+H25.
Severity: Error
Description: There is a problem with the certificate that has been imported into the IMM. The imported certificate must contain a public key that corresponds to the key pair that was previously generated through the Generate a New Key and Certificate Signing Request link.
Action:
1. Make sure that the certificate that you are importing is correct.
2. Try to import the certificate again.

Message: Flash of %1 from %2 succeeded for user %3. (%1 = CIM_ManagedElement.ElementName; %2 = Web or LegacyCLI; %3 = user ID)
Severity: Info
Description: A user has successfully updated one of the following firmware components:
• IMM main application
• IMM boot ROM
• Server firmware
• Diagnostics
• Integrated service processor
Action: No action; information only.
Message: Flash of %1 from %2 failed for user %3. (%1 = CIM_ManagedElement.ElementName; %2 = Web or LegacyCLI; %3 = user ID)
Severity: Info
Description: An attempt to update a firmware component from the interface and IP address has failed.
Action: Try to update the firmware again.

Message: The Chassis Event Log (CEL) on system %1 is 75% full. (%1 = CIM_ComputerSystem.ElementName)
Severity: Info
Description: The IMM event log is 75% full. When the log is full, older log entries are replaced by newer ones.
Action: To avoid losing older log entries, save the log as a text file and clear the log.

Message: The Chassis Event Log (CEL) on system %1 is 100% full. (%1 = CIM_ComputerSystem.ElementName)
Severity: Info
Description: The IMM event log is full. When the log is full, older log entries are replaced by newer ones.
Action: To avoid losing older log entries, save the log as a text file and clear the log.

Message: %1 Platform Watchdog Timer expired for %2. (%1 = OS Watchdog or Loader Watchdog; %2 = OS Watchdog or Loader Watchdog)
Severity: Error
Description: A Platform Watchdog Timer Expired event has occurred.
Action:
1. Reconfigure the watchdog timer to a higher value.
2. Make sure that the IMM Ethernet over USB interface is enabled.
3. Reinstall the RNDIS or cdc_ether device driver for the operating system.
4. Disable the watchdog.
5. Check the integrity of the installed operating system.

Message: IMM Test Alert Generated by %1. (%1 = user ID)
Severity: Info
Description: A user has generated a test alert from the IMM.
Action: No action; information only.

Message: Security: Userid: '%1' had %2 login failures from an SSH client at IP address %3. (%1 = user ID; %2 = MaximumSuccessiveLoginFailures (currently set to 5 in the firmware); %3 = IP address, xxx.xxx.xxx.xxx)
Severity: Error
Description: A user has exceeded the maximum number of unsuccessful login attempts from SSH and has been prevented from logging in for the lockout period.
Action:
1. Make sure that the correct login ID and password are being used.
2. Have the system administrator reset the login ID or password.
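
The event-log messages in this table describe a log that warns at 75% and 100% capacity and then replaces older entries with newer ones. The Python sketch below models that behavior with a bounded queue. The class name and the capacity are hypothetical (the real log size is not stated here), and the sketch is not the IMM implementation.

```python
from collections import deque


class BoundedEventLog:
    """Fixed-capacity log that drops the oldest entry when full (illustrative)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = deque(maxlen=capacity)

    def append(self, entry):
        """Add an entry and return the resulting fill status."""
        self.entries.append(entry)  # deque discards the oldest entry at capacity
        return self.status()

    def status(self):
        used = len(self.entries) / self.capacity
        if used >= 1.0:
            return "100% full"
        if used >= 0.75:
            return "75% full"
        return "ok"
```

Saving and clearing the log before it reaches capacity, as the Action text recommends, corresponds to copying `entries` elsewhere and calling `entries.clear()` in this model.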

Checkout procedure

The checkout procedure is the sequence of tasks that you should follow to diagnose a problem in the server.

About the checkout procedure

Before you perform the checkout procedure for diagnosing hardware problems, review the following information:
• Read the safety information that begins on page vii.
• The diagnostic programs provide the primary methods of testing the major components of the server, such as the system board, Ethernet controller, keyboard, mouse (pointing device), serial ports, and hard disk drives. You can also use them to test some external devices. If you are not sure whether a problem is caused by the hardware or by the software, you can use the diagnostic programs to confirm that the hardware is working correctly.
• When you run the diagnostic programs, a single problem might cause more than one error message. When this happens, correct the cause of the first error message. The other error messages usually will not occur the next time you run the diagnostic programs.
  Exception: If multiple error codes or light path diagnostics LEDs indicate a microprocessor error, the error might be in a microprocessor or in a microprocessor socket. See “Microprocessor problems” on page 82 for information about diagnosing microprocessor problems.
• Before you run the diagnostic programs, you must determine whether the failing server is part of a shared hard disk drive cluster (two or more servers sharing external storage devices). If it is part of a cluster, you can run all diagnostic programs except the ones that test the storage unit (that is, a hard disk drive in the storage unit) or the storage adapter that is attached to the storage unit. The failing server might be part of a cluster if any of the following conditions is true:
  – You have identified the failing server as part of a cluster (two or more servers sharing external storage devices).
  – One or more external storage units are attached to the failing server and at least one of the attached storage units is also attached to another server or unidentifiable device.
  – One or more servers are located near the failing server.
  Important: If the server is part of a shared hard disk drive cluster, run one test at a time. Do not run any suite of tests, such as “quick” or “normal” tests, because this might enable the hard disk drive diagnostic tests.
• If the server is halted and a POST error code is displayed, see “POST error codes” on page 26. If the server is halted and no error message is displayed, see “Troubleshooting tables” on page 75 and “Solving undetermined problems” on page 138.
• For information about power-supply problems, see “Solving power problems” on page 137 and “Power-supply LEDs” on page 96.
• For intermittent problems, check the system-event log; see “Event logs” on page 23, “System-event log” on page 37, and “Diagnostic programs, messages, and error codes” on page 97.

Performing the checkout procedure

To perform the checkout procedure, complete the following steps:
1. Is the server part of a cluster?
   v No: Go to step 2.
   v Yes: Shut down all failing servers that are related to the cluster. Go to step 2.
2. Complete the following steps:
   a. Turn off the server and all external devices.
   b. Check all cables and power cords.
   c. Check all internal and external devices for compatibility at http://www.ibm.com/servers/eserver/serverproven/compat/us/.
   d. Set all display controls to the middle positions.
   e. Turn on all external devices.
   f. Turn on the server. If the server does not start, see “Troubleshooting tables” on page 75.
   g. Check the system-error LED on the operator information panel (see “Server controls, LEDs, and connectors” on page 9). If it is flashing, check the light path diagnostics LEDs (see “Light path diagnostics” on page 90).
   h. Check for the following results:
      v Successful completion of POST
      v Successful completion of startup, indicated by a readable display of the operating-system desktop
3. Are there readable instructions on the main menu?
   v No: Find the failure symptom in “Troubleshooting tables” on page 75; if necessary, see “Solving undetermined problems” on page 138.
   v Yes: Run the diagnostic programs (see “Running the diagnostic programs” on page 97).
      – If you receive an error, see “Diagnostic messages” on page 98.
      – If the diagnostic programs were completed successfully and you still suspect a problem, see “Solving undetermined problems” on page 138.
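The branching in the checkout steps can be summarized as a small decision function. The sketch below is an illustration only; the function and parameter names are hypothetical, and the returned strings are the section titles that the steps refer to.

```python
from typing import Optional


def checkout_references(server_starts: bool,
                        error_led_flashing: bool,
                        menu_readable: bool,
                        diagnostics_passed: Optional[bool] = None) -> list[str]:
    """Return, in order, the sections of this guide to consult next."""
    if not server_starts:
        # Step 2f: the server does not start.
        return ["Troubleshooting tables"]
    refs = []
    if error_led_flashing:
        # Step 2g: a flashing system-error LED points to light path diagnostics.
        refs.append("Light path diagnostics")
    if not menu_readable:
        # Step 3, "No" branch.
        refs += ["Troubleshooting tables", "Solving undetermined problems"]
        return refs
    # Step 3, "Yes" branch: run the diagnostic programs.
    refs.append("Running the diagnostic programs")
    if diagnostics_passed is False:
        refs.append("Diagnostic messages")
    elif diagnostics_passed is True:
        # Applies only if you still suspect a problem after a clean run.
        refs.append("Solving undetermined problems")
    return refs
```

Leaving `diagnostics_passed` as `None` models the point at which the diagnostic programs have not been run yet.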

Troubleshooting tables

Use the troubleshooting tables to find solutions to problems that have identifiable symptoms.
If you cannot find a problem in these tables, see “Running the diagnostic programs” on page 97 for information about testing the server.
If you have just added new software or a new optional device and the server is not working, complete the following steps before you use the troubleshooting tables:
1. Check the operator information panel and the light path diagnostics LEDs (see “Light path diagnostics” on page 90).
2. Remove the software or device that you just added.
3. Run the diagnostic tests to determine whether the server is running correctly.
4. Reinstall the new software or new device.

DVD drive problems

v Follow the suggested actions in the order in which they are listed in the Action column until the problem is solved.
v See Chapter 4, “Parts listing, System x3400 M3 Types 7378 and 7379,” on page 141 to determine which components are customer replaceable units (CRU) and which components are field replaceable units (FRU).
v If an action step is preceded by “(Trained service technician only),” that step must be performed only by a trained service technician.

Symptom: The DVD drive is not recognized.
Action:
1. Make sure that:
   v The SATA channel to which the DVD drive is attached (primary or secondary) is enabled in the Setup utility.
   v All cables and jumpers are installed correctly.
   v The signal cable and connector are not damaged and the connector pins are not bent.
   v The correct device driver is installed for the DVD drive.
2. Run the DVD drive diagnostic programs.
3. Reseat the following components:
   a. DVD drive
   b. DVD drive cables
4. Replace the following components one at a time, in the order shown, restarting the server each time:
   a. DVD drive
   b. DVD drive and cables
   c. (Trained service technician only) System board

Symptom: A DVD is not working correctly.
Action:
1. Clean the DVD.
2. Run the DVD drive diagnostic programs.
3. Reseat the DVD drive.
4. Replace the DVD drive.

Symptom: The DVD drive tray is not working.
Action:
1. Make sure that the server is turned on.
2. Insert the end of a straightened paper clip into the manual tray-release opening.
3. Reseat the DVD drive.
4. Replace the DVD drive.

General problems

v Follow the suggested actions in the order in which they are listed in the Action column until the problem is solved.
v See Chapter 4, “Parts listing, System x3400 M3 Types 7378 and 7379,” on page 141 to determine which components are customer replaceable units (CRU) and which components are field replaceable units (FRU).
v If an action step is preceded by “(Trained service technician only),” that step must be performed only by a trained service technician.

Symptom: A cover lock is broken, an LED is not working, or a similar problem has occurred.
Action: If the part is a CRU, replace it. If the part is a FRU, the part must be replaced by a trained service technician.

Symptom: The server is hung while the screen is on. Cannot start the Setup utility by pressing F1.
Action:
1. See “Nx boot failure” on page 136 for more information.
2. See “Recovering the server firmware” on page 134 for more information.

Hard disk drive problems

v Follow the suggested actions in the order in which they are listed in the Action column until the problem is solved.
v See Chapter 4, “Parts listing, System x3400 M3 Types 7378 and 7379,” on page 141 to determine which components are customer replaceable units (CRU) and which components are field replaceable units (FRU).
v If an action step is preceded by “(Trained service technician only),” that step must be performed only by a trained service technician.

Symptom: Not all drives are recognized by the hard disk drive diagnostic tests.
Action: Remove the drive that is indicated by the diagnostic tests; then, run the hard disk drive diagnostic tests again. If the remaining drives are recognized, replace the drive that you removed with a new one.

Symptom: The server stops responding during the hard disk drive diagnostic test.
Action: Remove the hard disk drive that was being tested when the server stopped responding, and run the diagnostic test again. If the hard disk drive diagnostic test runs successfully, replace the drive that you removed with a new one.

Symptom: A hard disk drive was not detected while the operating system was being started.
Action: Reseat all hard disk drives and cables; then, run the hard disk drive diagnostic tests again.

Symptom: A hard disk drive passes the diagnostic Fixed Disk Test, but the problem remains.
Action: Run the diagnostic SCSI Fixed Disk Test (see “Running the diagnostic programs” on page 97).
Note: This test is not available on servers that have RAID arrays or servers that have SATA hard disk drives.

Hypervisor problems

v Follow the suggested actions in the order in which they are listed in the Action column until the problem is solved.
v See Chapter 4, “Parts listing, System x3400 M3 Types 7378 and 7379,” on page 141 to determine which components are customer replaceable units (CRU) and which components are field replaceable units (FRU).
v If an action step is preceded by “(Trained service technician only),” that step must be performed only by a trained service technician.
v Go to the IBM support Web site at http://www.ibm.com/systems/support/ to check for technical information, hints, tips, and new device drivers or to submit a request for information.

Symptom: An optional embedded hypervisor flash device is not listed in the expected boot order, does not appear in the list of boot devices, or a similar problem has occurred.
Action:
1. Make sure that the optional embedded hypervisor flash device is selected on the boot manager (<F12> Select Boot Device) at startup.
2. Make sure that the embedded hypervisor flash device is seated in the connector correctly (see “Removing a USB embedded hypervisor flash device” on page 234 and “Installing a USB embedded hypervisor flash device” on page 236).
3. See the documentation that comes with the optional embedded hypervisor flash device for setup and configuration information.
4. Make sure that other software works on the server.

Intermittent problems

v Follow the suggested actions in the order in which they are listed in the Action column until the problem is solved.
v See Chapter 4, “Parts listing, System x3400 M3 Types 7378 and 7379,” on page 141 to determine which components are customer replaceable units (CRU) and which components are field replaceable units (FRU).
v If an action step is preceded by “(Trained service technician only),” that step must be performed only by a trained service technician.

Symptom: A problem occurs only occasionally and is difficult to diagnose.
Action:
1. Make sure that:
   v All cables and cords are connected securely to the rear of the server and attached devices.
   v When the server is turned on, air is flowing from the fan grille. If there is no airflow, the fan is not working. This can cause the server to overheat and shut down.
2. Check the system-event log or IMM log (see “Event logs” on page 23).
3. See “Solving undetermined problems” on page 138.

Keyboard, mouse, or pointing-device problems

v Follow the suggested actions in the order in which they are listed in the Action column until the problem is solved.
v See Chapter 4, “Parts listing, System x3400 M3 Types 7378 and 7379,” on page 141 to determine which components are customer replaceable units (CRU) and which components are field replaceable units (FRU).
v If an action step is preceded by “(Trained service technician only),” that step must be performed only by a trained service technician.

Symptom: All or some keys on the keyboard do not work.
Action:
1. Make sure that:
   v The keyboard cable is securely connected.
   v The server and the monitor are turned on.
2. See http://www.ibm.com/servers/eserver/serverproven/compat/us/ for keyboard compatibility.
3. If you are using a USB keyboard, run the Setup utility and enable keyboardless operation to prevent the 301 POST error message from being displayed during startup.
4. If you are using a USB keyboard and it is connected to a USB hub, disconnect the keyboard from the hub and connect it directly to the server.
5. Replace the following components one at a time, in the order shown, restarting the server each time:
   a. Keyboard
   b. (Trained service technician only) System board

Symptom: The mouse or pointing device does not work.
Action:
1. Make sure that:
   v The mouse or pointing device is compatible with the server. See http://www.ibm.com/servers/eserver/serverproven/compat/us/.
   v The mouse or pointing-device cable is securely connected to the server.
   v The mouse or pointing-device device drivers are installed correctly.
   v The server and the monitor are turned on.
   v The mouse is enabled in the Setup utility.
2. If you are using a USB mouse or pointing device and it is connected to a USB hub, disconnect the mouse or pointing device from the hub and connect it directly to the server.
3. Replace the following components one at a time, in the order shown, restarting the server each time:
   a. Mouse or pointing device
   b. (Trained service technician only) System board
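
Several action lists in these tables end with the instruction to replace components one at a time, in the order shown, restarting the server each time. The sketch below illustrates that loop under stated assumptions: `fixed_after_replacing` is a hypothetical callback standing in for the manual check you perform after each restart, and the return strings are illustrative only.

```python
from typing import Callable, Iterable, Tuple


def replace_one_at_a_time(steps: Iterable[Tuple[str, bool]],
                          fixed_after_replacing: Callable[[str], bool],
                          is_trained_technician: bool = False) -> str:
    """Walk the replacement list in order and stop at the first fix.

    steps: (component name, technician_only) pairs in the order shown.
    fixed_after_replacing: hypothetical check made after each restart.
    """
    for name, technician_only in steps:
        if technician_only and not is_trained_technician:
            # Steps marked "(Trained service technician only)" are not
            # performed by the customer; hand the problem over instead.
            return f"escalate: {name} requires a trained service technician"
        if fixed_after_replacing(name):
            return f"resolved by replacing {name}"
    return "unresolved"


# Example with the mouse replacement order from the table above.
mouse_steps = [("Mouse or pointing device", False), ("System board", True)]
print(replace_one_at_a_time(mouse_steps, lambda name: name == "System board"))
```

Stopping at the first component that clears the symptom keeps the replaced part count to a minimum, which is the point of replacing one component at a time.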

Memory problems

v Follow the suggested actions in the order in which they are listed in the Action column until the problem is solved.
v See Chapter 4, “Parts listing, System x3400 M3 Types 7378 and 7379,” on page 141 to determine which components are customer replaceable units (CRU) and which components are field replaceable units (FRU).
v If an action step is preceded by “(Trained service technician only),” that step must be performed only by a trained service technician.

Symptom: The amount of system memory that is displayed is less than the amount of installed physical memory.
Action:
1. Make sure that:
   v No error LEDs are lit on the operator information panel or on the DIMM.
   v Memory mirroring does not account for the discrepancy.
   v The memory modules are seated correctly.
   v You have installed the correct type of memory.
   v If you changed the memory, you updated the memory configuration in the Setup utility.
   v All banks of memory are enabled. The server might have automatically disabled a memory bank when it detected a problem, or a memory bank might have been manually disabled.
2. Check the POST error log:
   v If a DIMM was disabled by a systems-management interrupt (SMI), replace the DIMM.
   v If a DIMM was disabled by the user or by POST, run the Setup utility and enable the DIMM.
3. Run memory diagnostics (see “Running the diagnostic programs” on page 97).
4. Make sure that there is no memory mismatch when the server is at the minimum memory configuration (one 1 GB DIMM); see the information about the minimum required configuration in “Solving undetermined problems” on page 138.
5. Add one pair of DIMMs at a time, making sure that the DIMMs in each pair match.
6. Reseat the DIMMs, and then restart the server.
7. Reverse the DIMMs between the channels (of the same microprocessor), and then restart the server. If the problem is related to a DIMM, replace the failing DIMM.
8. (Trained service technician only) Install the failing DIMM into a DIMM connector for microprocessor 2 (if installed) to verify that the problem is not the microprocessor or the DIMM connector.
9. (Trained service technician only) Replace the system board.

Symptom: Multiple rows of DIMMs in a branch are identified as failing.
Action:
1. Reseat the DIMMs; then, restart the server.
2. Remove the lowest-numbered DIMM pair of those that are identified and replace it with an identical pair of known good DIMMs; then, restart the server. Repeat as necessary. If the failures continue after all identified pairs are replaced, go to step 4.
3. Return the removed DIMMs, one pair at a time, to their original connectors, restarting the server after each pair, until a pair fails. Replace each DIMM in the failed pair with an identical known good DIMM, restarting the server after each DIMM. Replace the failed DIMM. Repeat step 3 until you have tested all removed DIMMs.
4. Replace the lowest-numbered DIMM pair of those identified; then, restart the server. Repeat as necessary.
5. Reverse the DIMMs between the channels (of the same microprocessor), and then restart the server. If the problem is related to a DIMM, replace the failing DIMM.
6. (Trained service technician only) Install the failing DIMM into a DIMM connector for microprocessor 2 (if installed) to verify that the problem is not the microprocessor or the DIMM connector.
7. (Trained service technician only) Replace the system board.
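
For the first memory symptom (displayed memory is less than installed memory), two of the checks lend themselves to a short sketch: whether memory mirroring explains the discrepancy, and what to do with a DIMM that the POST error log reports as disabled. The function names are hypothetical, and the halving rule is an assumption about mirrored mode (each DIMM holds a copy of its partner's data); confirm the exact behavior for your configuration.

```python
def expected_visible_gb(installed_gb: float, mirroring_enabled: bool) -> float:
    """Assumption: with memory mirroring enabled, half of the installed
    memory holds the mirror copy and is not reported as usable."""
    return installed_gb / 2 if mirroring_enabled else installed_gb


def action_for_disabled_dimm(disabled_by: str) -> str:
    """Map the cause recorded in the POST error log to the action in step 2."""
    source = disabled_by.strip().lower()
    if source == "smi":
        # Disabled by a systems-management interrupt: the DIMM is suspect.
        return "replace the DIMM"
    if source in ("user", "post"):
        return "enable the DIMM in the Setup utility"
    raise ValueError(f"unknown cause: {disabled_by!r}")
```

If the displayed amount matches `expected_visible_gb` for the installed DIMMs, mirroring accounts for the difference and no hardware action is needed.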

Microprocessor problems

v Follow the suggested actions in the order in which they are listed in the Action column until the problem is solved.
v See Chapter 4, “Parts listing, System x3400 M3 Types 7378 and 7379,” on page 141 to determine which components are customer replaceable units (CRU) and which components are field replaceable units (FRU).
v If an action step is preceded by “(Trained service technician only),” that step must be performed only by a trained service technician.

Symptom: The server emits a continuous beep during POST, indicating that the startup (boot) microprocessor is not working correctly.
Action:
1. Correct any errors that are indicated by the light path diagnostics LEDs (see “Light path diagnostics” on page 90).
2. Make sure that the server supports all the microprocessors and that the microprocessors match in speed and cache size.
3. (Trained service technician only) Reseat microprocessor 1.
4. (Trained service technician only) If there is no indication of which microprocessor has failed, isolate the error by testing with one microprocessor at a time.
5. Replace the following components one at a time, in the order shown, restarting the server each time:
   a. (Trained service technician only) Microprocessor 2
   b. VRM
   c. (Trained service technician only) System board
6. (Trained service technician only) If multiple error codes or light path diagnostics LEDs indicate a microprocessor error, reverse the locations of two microprocessors to determine whether the error is associated with a microprocessor or with a microprocessor socket.
   v If the error is associated with a microprocessor, replace the microprocessor.
   v If the error is associated with a VRM, replace the VRM.
   v If the error is associated with a microprocessor socket, replace the system board.
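
The reasoning behind the swap test in step 6 can be written out as a short sketch. It assumes two installed microprocessors in sockets numbered 1 and 2 and an error that is reported against one socket both before and after the swap; the function name and return strings are hypothetical. VRM faults are not covered, because the swap does not isolate them.

```python
def diagnose_after_swap(error_socket_before: int, error_socket_after: int) -> str:
    """Interpret where the error is reported after the two microprocessors
    have traded sockets."""
    for socket in (error_socket_before, error_socket_after):
        if socket not in (1, 2):
            raise ValueError("socket must be 1 or 2")
    if error_socket_after == error_socket_before:
        # The error stayed put although the microprocessor changed, so it
        # is associated with the socket (and therefore the system board).
        return "microprocessor socket: replace the system board"
    # The error moved with the part, so the microprocessor is at fault.
    return "microprocessor: replace the microprocessor"
```

For example, an error that is reported on socket 1 before the swap and on socket 2 afterward has followed the microprocessor, so the microprocessor is the part to replace.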