Note: Before using this information and the product it supports, read the general information in Appendix B, “Notices,” on page 327,
and the IBM Safety Information, Environmental Notices and User Guide documents on the IBM Documentation CD, and the
Warranty Information document that comes with the server.
This section contains information for trained service technicians.
Inspecting for unsafe conditions
Use the information in this section to help you identify potential unsafe conditions in
an IBM product that you are working on. Each IBM product, as it was designed and
manufactured, has required safety items to protect users and service technicians
from injury. The information in this section addresses only those items. Use good
judgment to identify potential unsafe conditions that might be caused by non-IBM
alterations or attachment of non-IBM features or options that are not addressed in
this section. If you identify an unsafe condition, you must determine how serious the
hazard is and whether you must correct the problem before you work on the
product.
Consider the following conditions and the safety hazards that they present:
v Electrical hazards, especially primary power. Primary voltage on the frame can
cause serious or fatal electrical shock.
v Explosive hazards, such as a damaged CRT face or a bulging capacitor.
v Mechanical hazards, such as loose or missing hardware.
To inspect the product for potential unsafe conditions, complete the following steps:
1. Make sure that the power is off and the power cord is disconnected.
2. Make sure that the exterior cover is not damaged, loose, or broken, and
observe any sharp edges.
3. Check the power cord:
v Make sure that the third-wire ground connector is in good condition. Use a
meter to measure third-wire ground continuity for 0.1 ohm or less between
the external ground pin and the frame ground.
v Make sure that the power cord is the correct type, as specified in “Power
cords” on page 145.
v Make sure that the insulation is not frayed or worn.
4. Remove the cover.
5. Check for any obvious non-IBM alterations. Use good judgment as to the safety
of any non-IBM alterations.
6. Check inside the server for any obvious unsafe conditions, such as metal filings,
contamination, water or other liquid, or signs of fire or smoke damage.
7. Check for worn, frayed, or pinched cables.
8. Make sure that the power-supply cover fasteners (screws or rivets) have not
been removed or tampered with.
IBM System x3400 M3 Types 7378 and 7379: Problem Determination and Service Guide
Guidelines for servicing electrical equipment
Observe the following guidelines when servicing electrical equipment:
v Check the area for electrical hazards such as moist floors, nongrounded power
extension cords, power surges, and missing safety grounds.
v Use only approved tools and test equipment. Some hand tools have handles that
are covered with a soft material that does not provide insulation from live
electrical currents.
v Regularly inspect and maintain your electrical hand tools for safe operational
condition. Do not use worn or broken tools or testers.
v Do not touch the reflective surface of a dental mirror to a live electrical circuit.
The surface is conductive and can cause personal injury or equipment damage if
it touches a live electrical circuit.
v Some rubber floor mats contain small conductive fibers to decrease electrostatic
discharge. Do not use this type of mat to protect yourself from electrical shock.
v Do not work alone under hazardous conditions or near equipment that has
hazardous voltages.
v Locate the emergency power-off (EPO) switch, disconnecting switch, or electrical
outlet so that you can turn off the power quickly in the event of an electrical
accident.
v Disconnect all power before you perform a mechanical inspection, work near
power supplies, or remove or install main units.
v Before you work on the equipment, disconnect the power cord. If you cannot
disconnect the power cord, have the customer power-off the wall box that
supplies power to the equipment and lock the wall box in the off position.
v Never assume that power has been disconnected from a circuit. Check it to
make sure that it has been disconnected.
v If you have to work on equipment that has exposed electrical circuits, observe
the following precautions:
– Make sure that another person who is familiar with the power-off controls is
near you and is available to turn off the power if necessary.
– When you are working with powered-on electrical equipment, use only one
hand. Keep the other hand in your pocket or behind your back to avoid
creating a complete circuit that could cause an electrical shock.
– When you use a tester, set the controls correctly and use the approved probe
leads and accessories for that tester.
– Stand on a suitable rubber mat to insulate you from grounds such as metal
floor strips and equipment frames.
v Use extreme care when you measure high voltages.
v To ensure proper grounding of components such as power supplies, pumps,
blowers, fans, and motor generators, do not service these components outside of
their normal operating locations.
v If an electrical accident occurs, use caution, turn off the power, and send another
person to get medical aid.
Safety statements
Important:
Each caution and danger statement in this document is labeled with a number. This
number is used to cross reference an English-language caution or danger
statement with translated versions of the caution or danger statement in the Safety
Information document.
For example, if a caution statement is labeled "Statement 1," translations for that
caution statement are in the Safety Information document under "Statement 1."
Be sure to read all caution and danger statements in this document before you
perform the procedures. Read any additional safety information that comes with the
server or optional device before you install the device.
Attention: Use No. 26 AWG or larger UL-listed or CSA certified
telecommunication line cord.
Statement 1:
DANGER
Electrical current from power, telephone, and communication cables is
hazardous.
To avoid a shock hazard:
v Do not connect or disconnect any cables or perform installation,
maintenance, or reconfiguration of this product during an electrical
storm.
v Connect all power cords to a properly wired and grounded electrical
outlet.
v Connect to properly wired outlets any equipment that will be attached to
this product.
v When possible, use one hand only to connect or disconnect signal
cables.
v Never turn on any equipment when there is evidence of fire, water, or
structural damage.
v Disconnect the attached power cords, telecommunications systems,
networks, and modems before you open the device covers, unless
instructed otherwise in the installation and configuration procedures.
v Connect and disconnect cables as described in the following table when
installing, moving, or opening covers on this product or attached
devices.
To Connect:
1. Turn everything OFF.
2. First, attach all cables to devices.
3. Attach signal cables to connectors.
4. Attach power cords to outlet.
5. Turn device ON.
To Disconnect:
1. Turn everything OFF.
2. First, remove power cords from outlet.
3. Remove signal cables from connectors.
4. Remove all cables from devices.
Statement 2:
CAUTION:
When replacing the lithium battery, use only IBM Part Number 33F8354 or an
equivalent type battery recommended by the manufacturer. If your system has
a module containing a lithium battery, replace it only with the same module
type made by the same manufacturer. The battery contains lithium and can
explode if not properly used, handled, or disposed of.
Do not:
v Throw or immerse into water
v Heat to more than 100°C (212°F)
v Repair or disassemble
Dispose of the battery as required by local ordinances or regulations.
Statement 3:
CAUTION:
When laser products (such as CD-ROMs, DVD drives, fiber optic devices, or
transmitters) are installed, note the following:
v Do not remove the covers. Removing the covers of the laser product could
result in exposure to hazardous laser radiation. There are no serviceable
parts inside the device.
v Use of controls or adjustments or performance of procedures other than
those specified herein might result in hazardous radiation exposure.
DANGER
Some laser products contain an embedded Class 3A or Class 3B laser
diode. Note the following.
Laser radiation when open. Do not stare into the beam, do not view directly
with optical instruments, and avoid direct exposure to the beam.
Class 1 Laser Product
Laser Klasse 1
Laser Klass 1
Luokan 1 Laserlaite
Appareil A Laser de Classe 1
Statement 4:
≥ 18 kg (39.7 lb)    ≥ 32 kg (70.5 lb)    ≥ 55 kg (121.2 lb)
CAUTION:
Use safe practices when lifting.
Statement 5:
CAUTION:
The power control button on the device and the power switch on the power
supply do not turn off the electrical current supplied to the device. The device
also might have more than one power cord. To remove all electrical current
from the device, ensure that all power cords are disconnected from the power
source.
Statement 8:
CAUTION:
Never remove the cover on a power supply or any part that has the following
label attached.
Hazardous voltage, current, and energy levels are present inside any
component that has this label attached. There are no serviceable parts inside
these components. If you suspect a problem with one of these parts, contact
a service technician.
Statement 11:
CAUTION:
The following label indicates sharp edges, corners, or joints nearby.
Statement 12:
CAUTION:
The following label indicates a hot surface nearby.
Statement 13:
DANGER
Overloading a branch circuit is potentially a fire hazard and a shock hazard
under certain conditions. To avoid these hazards, ensure that your system
electrical requirements do not exceed branch circuit protection
requirements. Refer to the information that is provided with your device for
electrical specifications.
Statement 15:
CAUTION:
Make sure that the rack is secured properly to avoid tipping when the server
unit is extended.
Statement 17:
CAUTION:
The following label indicates moving parts nearby.
Statement 26:
CAUTION:
Do not place any object on top of rack-mounted devices.
Attention: This server is suitable for use on an IT power distribution system
whose maximum phase-to-phase voltage is 240 V under any distribution fault
condition.
Chapter 1. Start here
You can solve many problems without outside assistance by following the
troubleshooting procedures in this Problem Determination and Service Guide and
on the IBM Web site. This document describes the diagnostic tests that you can
perform, troubleshooting procedures, and explanations of error messages and error
codes. The documentation that comes with your operating system and software
also contains troubleshooting information.
Diagnosing a problem
Before you contact IBM or an approved warranty service provider, follow these
procedures in the order in which they are presented to diagnose a problem with
your server:
1. Determine what has changed.
Determine whether any of the following items were added, removed, replaced,
or updated before the problem occurred:
v IBM System x Server Firmware (server firmware)
v Device drivers
v Firmware
v Hardware components
v Software
If possible, return the server to the condition it was in before the problem
occurred.
2. Collect data.
Thorough data collection is necessary for diagnosing hardware and software
problems.
a. Document error codes and system board LEDs.
v System error codes: See “Viewing the test log” on page 98 for
information about error codes.
v Software or operating-system error codes: See the documentation for
the software or operating system for information about a specific error
code. See the manufacturer's Web site for documentation.
v Light path diagnostics LEDs: See “Light path diagnostics” on page 90
for information about light path diagnostics LEDs that are lit.
v System board LEDs: See “System board LEDs” on page 19 for
information about system board LEDs that are lit.
b. Collect system data.
Run Dynamic System Analysis (DSA) to collect information about the
hardware, firmware, software, and operating system. Have this information
available when you contact IBM or an approved warranty service provider.
For instructions for running the DSA program, see “Running the diagnostic
programs” on page 97.
If you have to download the latest version of DSA, go to
http://www.ibm.com/systems/support/supportsite.wss/docdisplay?brandind=5000008&lndocid=SERV-DSA
or complete the following steps.
Note: Changes are made periodically to the IBM Web site. The actual
procedure might vary slightly from what is described in this document.
1) Go to http://www.ibm.com/systems/support/.
2) Under Product support, click System x.
3) Under Popular links, click Software and device drivers.
4) Under Related downloads, click Dynamic System Analysis (DSA).
For information about DSA command-line options, go to
http://publib.boulder.ibm.com/infocenter/toolsctr/v1r0/index.jsp?topic=/com.ibm.xseries.tools.doc/erep_tools_dsa.html
or complete the following steps:
1) Go to http://publib.boulder.ibm.com/infocenter/toolsctr/v1r0/index.jsp.
2) In the navigation pane, click IBM System x and BladeCenter ToolsCenter.
3) Click Tools reference > Error reporting and analysis tools > IBM Dynamic System Analysis.
3. Follow the problem-resolution procedures.
The four problem-resolution procedures are presented in the order in which they
are most likely to solve your problem. Follow these procedures in the order in
which they are presented:
a. Check for and apply code updates.
Most problems that appear to be caused by faulty hardware are actually
caused by IBM System x Server Firmware (server firmware), system
firmware, device firmware, or device drivers that are not at the latest levels.
Important: Some cluster solutions require specific code levels or
coordinated code updates. If the device is part of a cluster solution, verify
that the latest level of code is supported for the cluster solution before you
update the code.
1) Determine the existing code levels.
In DSA, click Firmware/VPD to view system firmware levels, or click
Software to view operating-system levels.
2) Download and install updates of code that is not at the latest level.
To display a list of available updates for your server, go to
http://www.ibm.com/systems/support/supportsite.wss/docdisplay?brandind=5000008&lndocid=MIGR-4JT
or complete the following steps.
Note: Changes are made periodically to the IBM Web site. The actual
procedure might vary slightly from what is described in this document.
a) Go to http://www.ibm.com/systems/support/.
b) Under Product support, click System x.
c) Under Popular links, click Software and device drivers.
d) Click System x3400 M3 to display the list of downloadable files for
the server.
You can install code updates that are packaged as an UpdateXpress
System Pack or UpdateXpress CD image. An UpdateXpress System
Pack contains an integration-tested bundle of online firmware and
device-driver updates for your server. Use UpdateXpress System Pack
Installer to acquire and apply UpdateXpress System Packs and
individual firmware and device-driver updates. For additional information
and to download the UpdateXpress System Pack Installer, go to the
System x and BladeCenter Tools Center at
http://publib.boulder.ibm.com/infocenter/toolsctr/v1r0/index.jsp and click
UpdateXpress System Pack Installer.
Be sure to separately install any listed critical updates that have release
dates that are later than the release date of the UpdateXpress System
Pack or UpdateXpress image.
When you click an update, an information page is displayed, including a
list of the problems that the update fixes. Review this list for your
specific problem; however, even if your problem is not listed, installing
the update might solve the problem.
b. Check for and correct an incorrect configuration.
If the server is incorrectly configured, a system function can fail to work
when you enable it; if you make an incorrect change to the server
configuration, a system function that has been enabled can stop working.
1) Make sure that all installed hardware and software are supported.
See http://www.ibm.com/servers/eserver/serverproven/compat/us/ to
verify that the server supports the installed operating system, optional
devices, and software levels. If any hardware or software component is
not supported, uninstall it to determine whether it is causing the problem.
You must remove nonsupported hardware before you contact IBM or an
approved warranty service provider for support.
2) Make sure that the server, operating system, and software are
installed and configured correctly.
Many configuration problems are caused by loose power or signal
cables or incorrectly seated adapters. You might be able to solve the
problem by turning off the server, reconnecting cables, reseating
adapters, and turning the server back on. For information about
performing the checkout procedure, see “Checkout procedure” on page
73.
If the problem is associated with a specific function (for example, if a
RAID hard disk drive is marked offline in the RAID array), see the
documentation for the associated adapter and management or
controlling software to verify that the adapter is correctly configured.
Problem determination information is available for many devices such as
RAID and network adapters.
For problems with operating systems or IBM software or devices,
complete the following steps.
Note: Changes are made periodically to the IBM Web site. The actual
procedure might vary slightly from what is described in this document.
a) Go to http://www.ibm.com/systems/support/.
b) Under Product support, click System x.
c) From the Product family list, select System x3400 M3.
d) Under Support & downloads, click Documentation, Install, and
Use to search for related documentation.
c. Check for troubleshooting procedures and RETAIN tips.
Troubleshooting procedures and RETAIN tips document known problems
and suggested solutions. To search for troubleshooting procedures and
RETAIN tips, complete the following steps.
Note: Changes are made periodically to the IBM Web site. The actual
procedure might vary slightly from what is described in this document.
1) Go to http://www.ibm.com/systems/support/.
2) Under Product support, click System x.
3) From the Product family list, select System x3400 M3.
4) Under Support & downloads, click Troubleshoot.
5) Select the troubleshooting procedure or RETAIN tip that applies to your
problem:
v Troubleshooting procedures are under Diagnostic.
v RETAIN tips are under Troubleshoot.
d. Check for and replace defective hardware.
If a hardware component is not operating within specifications, it can cause
unpredictable results. Most hardware failures are reported as error codes in
a system or operating-system log. For more information, see
“Troubleshooting tables” on page 75 and Chapter 5, “Removing and
replacing server components,” on page 149. Hardware errors are also
indicated by light path diagnostics LEDs.
A single problem might cause multiple symptoms. Follow the troubleshooting
procedure for the most obvious symptom. If that procedure does not
diagnose the problem, use the procedure for another symptom, if possible.
If the problem remains, contact IBM or an approved warranty service
provider for assistance with additional problem determination and possible
hardware replacement. To open an online service request, go to
http://www.ibm.com/support/electronic/. Be prepared to provide information
about any error codes and collected data.
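Step 1 of the procedure above (determine what has changed) can be sketched as a comparison of two inventory snapshots. The following Python example is illustrative only. It assumes that you have recorded the code levels and installed components by hand as simple name-to-level mappings; the component names and version strings are hypothetical, and the function does not read DSA output or any IBM file format.

```python
def diff_inventory(before, after):
    """Compare two snapshots (dicts of item name -> level) and report changes."""
    added = {name: after[name] for name in after.keys() - before.keys()}
    removed = {name: before[name] for name in before.keys() - after.keys()}
    changed = {
        name: (before[name], after[name])
        for name in before.keys() & after.keys()
        if before[name] != after[name]
    }
    return {"added": added, "removed": removed, "changed": changed}


# Hypothetical snapshots taken before and after the problem first appeared.
before_problem = {"server firmware": "1.00", "network driver": "2.1"}
after_problem = {"server firmware": "1.05", "RAID adapter": "installed"}

for kind, items in diff_inventory(before_problem, after_problem).items():
    print(kind, items)
```

Any item reported as added, removed, or changed is a candidate for returning the server to the condition it was in before the problem occurred.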
Undocumented problems
If you have completed the diagnostic procedure and the problem remains, the
problem might not have been previously identified by IBM. After you have verified
that all code is at the latest level, all hardware and software configurations are valid,
and no light path diagnostics LEDs or log entries indicate a hardware component
failure, contact IBM or an approved warranty service provider for assistance. To
open an online service request, go to http://www.ibm.com/support/electronic/. Be
prepared to provide information about any error codes and collected data and the
problem determination procedures that you have used.
Chapter 2. Introduction
This Problem Determination and Service Guide contains information to help you
solve problems that might occur in your IBM® System x3400 M3 Type 7378/7379
server. It describes the diagnostic tools that come with the server, error codes and
suggested actions, and instructions for replacing failing components.
Replaceable components are of four types:
v Consumable parts: Purchase and replacement of consumable parts
(components, such as batteries and printer cartridges, that have depletable life)
is your responsibility. If IBM acquires or installs a consumable part at your
request, you will be charged for the service.
v Tier 1 customer replaceable unit (CRU): Replacement of Tier 1 CRUs is your
responsibility. If IBM installs a Tier 1 CRU at your request, you will be charged for
the installation.
v Tier 2 customer replaceable unit: You may install a Tier 2 CRU yourself or
request IBM to install it, at no additional charge, under the type of warranty
service that is designated for your server.
v Field replaceable unit (FRU): FRUs must be installed only by trained service
technicians.
For information about the terms of the warranty and getting service and assistance,
see the Warranty Information document.
Related documentation
In addition to this document, the following documentation also comes with the
server:
v Environmental Notices and User's Guide
This document is in PDF on the IBM Documentation CD. It contains translated
environmental notices.
v IBM License Agreement for Machine Code
This document is in PDF on the IBM Documentation CD. It provides translated
versions of the IBM License Agreement for Machine Code for your product.
v Warranty Information
This is a document that comes with the server. It contains information about the
terms of the warranty and getting service and assistance.
v Installation and User's Guide
This document is in Portable Document Format (PDF) on the IBM Documentation
CD. It provides general information about setting up and cabling the server,
including information about features, and how to configure the server. It also
contains detailed instructions for installing, removing, and connecting optional
devices that the server supports.
v Licenses and Attributions Documents
This document is in PDF. It contains information about the open-source notices.
v Rack Installation Instructions
This printed document contains instructions for installing the server in a rack.
v Safety Information
This document is in PDF on the IBM Documentation CD. It contains translated
caution and danger statements. Each caution and danger statement that appears
in the documentation has a number that you can use to locate the corresponding
statement in your language in the Safety Information document.
The System x and xSeries Tools Center is an online information center that
contains information about tools for updating, managing, and deploying firmware,
device drivers, and operating systems. The System x and xSeries Tools Center is at
http://publib.boulder.ibm.com/infocenter/toolsctr/v1r0/index.jsp
Depending on the server model, additional documentation might be included on the
IBM Documentation CD.
The server might have features that are not described in the documentation that
comes with the server. The documentation might be updated occasionally to include
information about those features, or technical updates might be available to provide
additional information that is not included in the server documentation. These
updates are available from the IBM Web site. To check for updated documentation
and technical updates, complete the following steps.
Note: Changes are made periodically to the IBM Web site. The actual procedure
might vary slightly from what is described in this document.
1. Go to http://www.ibm.com/support/.
2. Under Product support, click System x.
3. Under Popular links, click Publications lookup.
4. From the Product family menu, select System x3400 and click Continue.
Notices and statements in this document
The caution and danger statements in this document are also in the multilingual
Safety Information document, which is on the IBM System x Documentation CD.
Each statement is numbered for reference to the corresponding statement in the
Safety Information document.
The following notices and statements are used in this document:
v Note: These notices provide important tips, guidance, or advice.
v Important: These notices provide information or advice that might help you avoid
inconvenient or problem situations.
v Attention: These notices indicate potential damage to programs, devices, or
data. An attention notice is placed just before the instruction or situation in which
damage might occur.
v Caution: These statements indicate situations that can be potentially hazardous
to you. A caution statement is placed just before the description of a potentially
hazardous procedure step or situation.
v Danger: These statements indicate situations that can be potentially lethal or
extremely hazardous to you. A danger statement is placed just before the
description of a potentially lethal or extremely hazardous procedure step or
situation.
Features and specifications
The following information is a summary of the features and specifications of the
server. Depending on the server model, some features might not be available, or
some specifications might not apply.
Table 1. Features and specifications
Microprocessor:
v Intel Xeon up to six-core with
integrated memory controller and
QuickPath Interconnect (QPI)
architecture
v Designed for LGA 1366 socket
v Scalable up to twelve cores
v 32 KB instruction cache, 32 KB
data cache, and 4 MB, 8 MB, or
12 MB cache that is shared among
the cores
v Support for up to two
microprocessors, second
microprocessor with pluggable
VRM
v Support for Intel Extended Memory
64 Technology (EM64T)
Note:
v Use the Setup utility to determine
the type and speed of the
microprocessors. For a list of
supported microprocessors, see
http://www.ibm.com/servers/eserver/serverproven/compat/us/.
v Do not install an Intel Xeon™ 5500
series microprocessor and an
Xeon™ 5600 series microprocessor
in the same server.
Video controller:
v Matrox G200eV video on system
board
v Compatible with SVGA and VGA
Power supply:
v Standard: One 670 watt (100 - 240
V AC)
Note: On models with eight 3.5-inch
or sixteen 2.5-inch hard disk drives,
you must upgrade the power supply to
920 watts.
Memory:
v Sixteen DIMM connectors (eight
per microprocessor)
v Minimum: 1 GB
Note: If you install a
ServeRAID-M1015 SAS/SATA
adapter, make sure at least 2 GB
of memory is installed in the
server before you run DSA from a
bootable CD.
v Maximum: 128 GB
– 48 GB using unbuffered
DIMMs (UDIMMs)
– 128 GB using registered
DIMMs (RDIMMs)
v Type: Registered or unbuffered
ECC double-data-rate 3 (DDR3)
800, 1066, and 1333 MHz DIMMs
only
v RDIMM sizes: 1 GB, 2 GB, 4 GB,
and 8 GB single-rank or
dual-rank
v UDIMM sizes: 1 GB, 2 GB, and
4 GB single-rank or dual-rank
v Chipkill supported
Drives:
v SATA:
– DVD (standard)
– DVD/CD-RW (optional)
– Maximum of two devices can
be installed
v Diskette (optional): External USB
1.44 MB
v Supported hard disk drives:
– Serial Attached SCSI (SAS)
Expansion bays:
v Sixteen 2.5-inch HDD bays (three
optical DVD drive bays)
v Four 3.5-inch simple-swap SATA
drives
v Eight 3.5-inch HDD bays (one
UltraSlim DVD drive)
v Three half-high 5.25-inch bays (one
DVD drive installed)
Note:
– SAS expander card does not
support 3 GB RAID adapters.
– If the server is configured for
RAID operation using a
ServeRAID adapter, you might
have to reconfigure your disk
arrays after you install drives.
See the ServeRAID adapter
documentation for additional
information about RAID operation
and complete instructions for
using the ServeRAID adapter.
– Full-high devices such as an
optional tape drive will occupy
two half-high 5.25-inch bays.
PCI and PCI-X expansion slots:
v Six PCI expansion slots on the
system board:
– Four PCI Express x8 (2x8 link,
2x4 link)
– One PCI Express x16 (x8 link)
– One PCI 32-bit
v One or two expansion slots on the
PCI extender card:
– Optional - One PCI Express x8
(x4 link) on the PCI-Express
extender card
– Optional - Two PCI-X 64/133
slots on the PCI-X extender card
Hot-swap fans:
v Three (maximum)
Size:
v Tower
– Height: 440 mm (17.3 in.)
– Depth: 767 mm (30.2 in.)
– Width: 218 mm (8.6 in.)
– Weight: approximately 37.85 kg
(83.4 lb) when fully configured
or 27.1 kg (59.7 lb) minimum
v Rack
– 5U
– Height: 218 mm (8.6 in.)
– Depth: 702 mm (27.6 in.)
– Width: 424 mm (16.7 in.)
– Weight: approximately 36 kg
(79.3 lb) when fully configured
or 25.8 kg (56.9 lb) minimum
Racks are marked in vertical
increments of 4.45 cm (1.75 inches).
Each increment is referred to as a
unit, or “U.” A 1-U-high device is 4.45
cm (1.75 inches) tall.
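The rack-unit definition above lends itself to a quick calculation. The following Python sketch is an illustration rather than part of the product documentation; the function names are hypothetical, and the only values taken from this table are the 4.45 cm unit height and the 218 mm rack-model height.

```python
import math

RACK_UNIT_MM = 44.5  # one rack unit (1U) is 4.45 cm, as stated in the table


def rack_units_to_mm(units):
    """Return the height in millimeters of a space that is `units` U tall."""
    return units * RACK_UNIT_MM


def rack_units_needed(height_mm):
    """Return the smallest whole number of rack units that fits `height_mm`."""
    return math.ceil(height_mm / RACK_UNIT_MM)


print(rack_units_to_mm(5))     # 222.5
print(rack_units_needed(218))  # 5
```

The 218 mm height that is listed for the rack model fits within the 222.5 mm of a 5U space, which is consistent with the 5U rating.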
Integrated functions:
v Integrated Management Module
(IMM), which provides service
processor control and monitoring
functions, video controller, and
(when the optional virtual media
key is installed) remote keyboard,
video, mouse, and remote hard
disk drive capabilities
v Dedicated or shared management
network connections
v Six-port Serial ATA (SATA)
controller embedded
v Serial over LAN (SOL) and serial
redirection over Telnet or Secure
Shell (SSH)
v USB flash device with embedded
hypervisor software.
v Support for remote management
presence
v One systems-management RJ-45
for connection to a dedicated
systems-management network.
This system management
connector is dedicated to the IMM
functions.
v Six Universal Serial Bus (USB)
ports standard (v2.0 supporting
v1.1)
– Four on rear of server
– Two on front of server
v One internal USB tape connector
v One Broadcom dual-port
10/100/1000 Ethernet controller
with Wake on LAN support
v One serial connector, shared with
the IMM
Note: In messages and
documentation, the term service
processor refers to the integrated
management module (IMM).
ServeRAID SAS adapter:
v ServeRAID-BR10i SAS/SATA
adapter that supports RAID levels 0,
1 and 1E (standard)
v ServeRAID-BR10il SAS/SATA
adapter that supports RAID levels 0,
1 and 1E (standard)
SAS/SATA adapter, which supports
RAID level 0, 1, 5, 10, 50
Note: If the server is configured for
RAID operation using a ServeRAID
adapter, you might have to
reconfigure your disk arrays after
you install drives. See the
ServeRAID adapter documentation
for additional information about
RAID operation and complete
instructions for using the ServeRAID
adapter.
Acoustical noise emissions:
v Sound power, idle: 5.5 bel declared
v Sound power, operating: 6.0 bel
declared
Environment:
v Air temperature:
– Server on: 10°C to 35°C (50.0°F
to 95.0°F); altitude: 0 to 915 m
(3000 ft)
– Server on: 10°C to 32°C (50.0°F
to 90.0°F); altitude: 915 m
(3000 ft) to 2134 m (7000 ft)
– Server on: 10°C to 28°C (50.0°F
to 83.0°F); altitude: 2134 m
(7000 ft) to 3050 m (10000 ft)
– Server off: 5°C to 45°C (41°F to
113°F)
– Shipping: -40°C to 60°C
(-40.0°F to 140°F)
Electrical input:
v Sine-wave input (50-60 Hz)
required
v Input voltage low range:
– Minimum: 100 V AC
– Maximum: 127 V AC
v Input voltage high range:
– Minimum: 200 V AC
– Maximum: 240 V AC
v Approximate input
kilovolt-amperes (kVA):
– Minimum: 0.60 kVA
– Maximum: 1.10 kVA
Notes:
1. Power consumption and heat
output vary depending on the
number and type of optional
features that are installed and
the power-management optional
features that are in use.
2. These levels were measured in
controlled acoustical
environments according to the
procedures that are specified by
the American National Standards
Institute (ANSI) S12.10 and ISO
7779 and are reported in
accordance with ISO 9296.
Actual sound-pressure levels in a
given location might exceed the
average stated values because
of room reflections and other
nearby noise sources. The
declared sound-power levels
indicate an upper limit, below
which a large number of
computers will operate.
Heat output:
Approximate heat output:
v Minimum configuration: 2013 Btu
per hour (590 watts)
v Maximum configuration: 3610 Btu
per hour (1058 watts)
Humidity:
v Server on: 20% to 80%, maximum
dew point 21°C, maximum rate of
change 5°C/hour
v Server off: 8% to 80%, maximum
dew point 27°C
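The altitude bands and heat-output figures in the table above can be checked with a short calculation. The following is a minimal sketch, not part of the IBM documentation: the function names are hypothetical, the factor of 3.412 Btu per hour per watt is the usual approximate conversion, and a shared band boundary (for example, exactly 915 m) is assumed to belong to the lower band.

```python
# Server-on temperature bands from the Environment list:
# (upper altitude bound in metres, maximum air temperature in deg C).
# The minimum server-on temperature is 10 deg C in every band.
OPERATING_BANDS = [
    (915, 35),
    (2134, 32),
    (3050, 28),
]

# Approximate conversion factor (assumption: 1 W is about 3.412 Btu/hr).
BTU_PER_HOUR_PER_WATT = 3.412


def max_operating_temp_c(altitude_m):
    """Return the maximum server-on air temperature for an altitude.

    Raises ValueError outside the documented 0 to 3050 m range.
    """
    if altitude_m < 0:
        raise ValueError("altitude is below the documented range")
    for upper_m, max_c in OPERATING_BANDS:
        if altitude_m <= upper_m:
            return max_c
    raise ValueError("altitude is above the documented 3050 m limit")


def watts_to_btu_per_hour(watts):
    """Convert a heat load in watts to Btu per hour, rounded to an integer."""
    return round(watts * BTU_PER_HOUR_PER_WATT)


print(max_operating_temp_c(1500))    # 32
print(watts_to_btu_per_hour(590))    # 2013
print(watts_to_btu_per_hour(1058))   # 3610
```

The two converted values agree with the minimum and maximum heat output listed in the table.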
Server controls, LEDs, and connectors
This section describes the controls, light-emitting diodes (LEDs), and connectors on
the front and rear of the server.
Power control button and power-on LED
Press this button to turn the server on and off manually or to wake the
server from a reduced-power state. The states of the power-on LED are as
follows:
Off: AC power is not present, or the power supply or the LED itself has
failed.
Flashing rapidly (4 times per second): The server is turned off and is
not ready to be turned on. The power-control button is disabled. This will
last approximately 20 to 40 seconds.
Note: Approximately 20 seconds after the server is connected to AC
power, the power-control button becomes active.
Flashing slowly (once per second): The server is turned off and is
ready to be turned on. You can press the power-control button to turn on
the server.
Lit: The server is turned on.
Fading on and off: The server is in a reduced-power state. To wake the
server, press the power-control button or use the IMM Web interface.
See “Logging on to the Web interface” on page 314 for information on
logging on to the IMM Web interface.
Hard disk drive activity LED
When this LED is flashing, it indicates that a hard disk drive is in use.
System-error LED
When this amber LED is lit, it indicates that a system error has occurred.
An LED on the system board might also be lit to help isolate the error. See
Chapter 3, “Diagnostics,” on page 23 for additional information.
USB connectors
Connect USB devices to these connectors.
DVD-eject button
Press this button to release a CD or DVD from the DVD drive.
DVD drive activity LED
When this LED is lit, it indicates that the DVD drive is in use.
Hot-swap hard disk drive activity LED (some models)
On some server models, each hot-swap drive has a hard disk drive activity
LED. When this green LED is flashing, it indicates that the drive is in use.
When the drive is removed, this LED also is visible on the SAS/SATA
backplane, next to the drive connector. The backplane is the printed circuit
board behind drive bays 4 through 7 on 3.5-inch hard disk drive models and
bays 4 through 11 on 2.5-inch hard disk drive models.
Hot-swap hard disk drive status LED (some models)
On some server models, each hot-swap hard disk drive has an amber
status LED. If this amber status LED for a drive is lit, it indicates that the
associated hard disk drive has failed.
If an optional ServeRAID adapter is installed in the server and the LED
flashes slowly (one flash per second), the drive is being rebuilt. If the LED
flashes rapidly (three flashes per second), the adapter is identifying the
drive.
When the drive is removed, this LED also is visible on the SAS/SATA
backplane, below the hot-swap hard disk drive activity LED.
Please see “Event logs” on page 23 for more information.
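The power-on LED states described earlier in this section can be kept as a small lookup table, for example in a script that records front-panel observations. This is an illustrative sketch only: the names are hypothetical, and the 2-flashes-per-second cut-off between "slow" (once per second) and "rapid" (4 times per second) is an assumption.

```python
# Power-on LED states and their meanings, as described in this section.
POWER_ON_LED_STATES = {
    "off": "AC power is not present, or the power supply or the LED itself has failed.",
    "flashing rapidly": "The server is turned off and is not ready to be turned on.",
    "flashing slowly": "The server is turned off and is ready to be turned on.",
    "lit": "The server is turned on.",
    "fading": "The server is in a reduced-power state.",
}

# Assumed cut-off between slow (1 per second) and rapid (4 per second).
RAPID_THRESHOLD_HZ = 2.0


def classify_flash_rate(flashes_per_second):
    """Map an observed flash rate to the state name used in the table."""
    if flashes_per_second <= 0:
        raise ValueError("flash rate must be positive")
    if flashes_per_second > RAPID_THRESHOLD_HZ:
        return "flashing rapidly"
    return "flashing slowly"


def describe_power_on_led(state):
    """Return the documented meaning of a power-on LED state."""
    return POWER_ON_LED_STATES[state.strip().lower()]


print(describe_power_on_led(classify_flash_rate(1)))
```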
Rear view
The following illustration shows the connectors and LEDs on the rear of the server.
[Illustration callouts: AC power LED, DC power LED, fault (error) LED, serial 1 (COM 1), video, system management Ethernet connector, NMI button, Ethernet 1 10/100/1000, USB 1, USB 2, USB 3, USB 4, Ethernet 2 10/100/1000, power cord connector, Ethernet transmit/receive activity LEDs, Ethernet link status LEDs]
Power-cord connector
Connect the power cord to this connector.
AC power LED
This green LED provides status information about the power supply. During
typical operation, both the AC and DC power LEDs are lit.
DC power LED
This green LED provides status information about the power supply. During
typical operation, both the AC and DC power LEDs are lit.
Power-error (Fault) LED
When this amber LED is lit, it indicates that the power supply has failed.
Video connector
Connect a monitor to this connector.
Note: The maximum video resolution is 1600 x 1200 at 85 Hz.
Serial connector
Connect a 9-pin serial device to this connector.
Systems-management Ethernet connector
Use this connector to manage the server, using a dedicated management
network. If you use this connector, the IMM cannot be accessed directly
from a production network. A dedicated management network provides
additional security by physically separating the management network traffic
from the production network. You can use the Setup utility to configure the
server to use a dedicated systems management network or a shared
network.
USB connectors
Connect USB devices to these connectors.
Ethernet connectors
Use either of these connectors to connect the server to a network. When
you use the Ethernet 1 connector, the network can be shared with the IMM
through a single network cable.
Ethernet transmit/receive activity LED
This LED is on the Ethernet connector. When this LED is lit, it indicates that
there is activity between the server and the network.
Ethernet link status LED
This LED is on the Ethernet connector. When this LED is lit, it indicates that
there is an active connection on the Ethernet port.
Power-supply LEDs
The following illustration shows the locations of the 670-watt power supply LEDs.
[Illustration callouts: AC power LED, DC power LED, fault (error) LED, power cord connector]
The following table describes the problems that are indicated by various
combinations of the power-supply LEDs. For more information about solving
power-supply problems, see “Power-supply LEDs” on page 96.
Table 2. Power-supply LEDs
AC: Off, DC: Off, Error: Off
Description: No AC power to the server or a problem with the AC power source
Action:
1. Check the AC power to the server.
2. Make sure that the power cord is connected to a functioning power source.
3. Turn the server off and then turn the server back on.
4. If the problem remains, replace the power supply.
Notes: This is a normal condition when no AC power is present.
AC: Off, DC: Off, Error: On
Description: No AC power to the server or a problem with the AC power source
and the power supply has detected an internal problem
Action:
1. Replace the power supply.
2. Make sure that the power cord is connected to a functioning power source.
Notes: This happens only when a second power supply is providing power to
the server.
AC: Off, DC: On, Error: Off
Description: Faulty power supply
Action: Replace the power supply.
AC: Off, DC: On, Error: On
Description: Faulty power supply
Action: Replace the power supply.
AC: On, DC: Off, Error: Off
Description: Power supply not fully seated, faulty system board, or faulty
power supply
Action:
1. If the system board error (fault) LED is not lit, replace the power supply.
2. If the system board error (fault) LED is lit, (Trained service technician only)
replace the system board.
Notes: Typically indicates that a power supply is not fully seated.
AC: On, DC: Off or Flashing, Error: On
Description: Faulty power supply
Action: Replace the power supply.
AC: On, DC: On, Error: Off
Description: Normal operation
AC: On, DC: On, Error: On
Description: Power supply is faulty but still operational
Action: Replace the power supply.
Note: Models with eight 3.5-inch or sixteen 2.5-inch hard disk drives require an
upgrade to the 920-watt power supply.
The following illustration shows the 920-watt power-supply LEDs on the rear of the
server.
Table 3. Power-supply LEDs
AC power | DC power | Power error | Description
Off | Off | Off | No AC power to the server or a problem with the AC power source
Off | Off | On | No AC power to the server or a problem with the AC power source and the power supply has detected an internal problem
Off | On | Off | Faulty power supply
Off | On | On | Faulty power supply
On | Off | Off | Power supply not fully seated, faulty system board, or faulty power supply
On | Off or flashing | On | Faulty power supply
On | On | Off | Normal operation
On | On | On | Power supply is faulty but still operational
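The LED combinations in the power-supply tables amount to a truth table over three indicators. The following sketch shows one way to encode it for a checklist or logging script. The function and variable names are hypothetical, and a flashing DC LED is accepted only in the one combination where the table allows it.

```python
# (AC, DC, error) LED states mapped to the description from the table.
POWER_SUPPLY_LED_TABLE = {
    ("off", "off", "off"): "No AC power to the server or a problem with the AC power source",
    ("off", "off", "on"): ("No AC power to the server or a problem with the AC power "
                           "source and the power supply has detected an internal problem"),
    ("off", "on", "off"): "Faulty power supply",
    ("off", "on", "on"): "Faulty power supply",
    ("on", "off", "off"): ("Power supply not fully seated, faulty system board, "
                           "or faulty power supply"),
    ("on", "off", "on"): "Faulty power supply",
    ("on", "on", "off"): "Normal operation",
    ("on", "on", "on"): "Power supply is faulty but still operational",
}


def diagnose_power_supply(ac, dc, error):
    """Return the table description for one combination of LED states."""
    key = (ac.strip().lower(), dc.strip().lower(), error.strip().lower())
    # The table lists "Off or flashing" for the DC LED only when the
    # AC and error LEDs are both on.
    if key == ("on", "flashing", "on"):
        key = ("on", "off", "on")
    if key not in POWER_SUPPLY_LED_TABLE:
        raise ValueError(f"combination not listed in the table: {key}")
    return POWER_SUPPLY_LED_TABLE[key]


print(diagnose_power_supply("On", "On", "Off"))  # Normal operation
```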
Internal LEDs, connectors, and jumpers
The illustrations in this section show the LEDs, connectors, and jumpers on the
internal boards. The illustrations might differ slightly from your hardware.
System board internal connectors
The following illustration shows the internal connectors on the system board.
The following illustration shows one additional PCI Express expansion slot that is
available on the PCI Express extender card, if equipped.
The following illustration shows two additional PCI-X expansion slots that are
available on the PCI-X extender card, if equipped.
System board switches and jumpers
The following tables show the settings of the switches and the jumpers.
See Table 4 and Table 5 for information about the switch and jumper settings.
Table 4. System board jumpers
Jumper JP1: CMOS clear
v Pins 1 and 2: Normal operation (default).
v Pins 2 and 3: Clears CMOS memory.
Jumper JP6: UEFI boot recovery
v Pins 1 and 2: Normal operation (default).
v Pins 2 and 3: Enable the UEFI recovery mode.
Jumper JP7: Trusted Platform Module (TPM)
v Pins 1 and 2: TPM physical presence is asserted.
v Pins 2 and 3: TPM physical presence is not asserted (default).
Note: Changing the TPM configuration requires physical presence, which is
set manually on the server. By default, the TPM is enabled and physical
presence is not asserted. Physical presence must be asserted to activate,
deactivate, clear, or change ownership of the TPM.
Note: If no jumper is present, the server responds as if the jumper were in the
default position.
Table 5. System board switch 6
SW 6 switch | Switch description
1 | Reserved (default off)
2 | Power-on password override when on. (default off)
3 | Reserved (default off)
4 | When this switch is off, the primary IMM firmware ROM page is loaded. When this switch is on, the secondary (backup) IMM firmware ROM page is loaded. (default off)
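When recording a board configuration, the SW6 settings above can be captured as data. This is a small sketch under stated assumptions: the helper names are hypothetical, and a switch position is represented simply by listing the switch numbers that are set to on.

```python
# Effect of each SW6 switch when it is set to on. All default to off.
SW6_ON_EFFECTS = {
    1: "Reserved",
    2: "Power-on password override",
    3: "Reserved",
    4: "Secondary (backup) IMM firmware ROM page is loaded",
}


def describe_sw6(on_switches):
    """Return the effects of the switches that are on, in switch order.

    on_switches is an iterable of switch numbers (1 to 4) set to on.
    """
    on = set(on_switches)
    unknown = on - set(SW6_ON_EFFECTS)
    if unknown:
        raise ValueError(f"SW6 has no switch numbered {sorted(unknown)}")
    return [f"{number}: {SW6_ON_EFFECTS[number]}" for number in sorted(on)]


def is_default(on_switches):
    """True when every switch is off, which is the documented default."""
    return not set(on_switches)


print(describe_sw6([4, 2]))
```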
Notes:
1. Before you change any switch settings or move any jumpers, turn off the server;
then, disconnect all power cords and external cables. (Review the information in
“Safety” on page vii, “Installation guidelines” on page 149, and “Handling
static-sensitive devices” on page 151.)
System board LEDs
The following illustration shows the LEDs on the system board.
The system board is equipped with a PCI extender card that provides either one or
two additional expansion slots. The following illustration shows the LEDs on the PCI
Express extender card, if equipped.
The following illustration shows the LEDs on the PCI-X extender card, if equipped.
System board external connectors
The following illustration shows the external input/output connectors on the system
board.
[Illustration callouts: NMI button, video port, USB ports, serial port, Ethernet, system management]
Hard disk drive backplane connectors
The following illustrations show the connectors on the 2.5-inch and 3.5-inch hard
disk drive backplanes.
Figure 1. Connectors on the 3.5-inch hard disk drive backplane
Figure 2. Connectors on the 2.5-inch hard disk drive backplane
Chapter 3. Diagnostics
This chapter describes the diagnostic tools that are available to help you solve
problems that might occur in the server.
If you cannot diagnose and correct a problem by using the information in this
chapter, see Appendix A, “Getting help and technical assistance,” on page 325 for
more information.
Diagnostic tools
The following tools are available to help you diagnose and solve hardware-related
problems:
v POST error messages
The power-on self-test (POST) generates messages to indicate successful test
completion or the detection of a problem. See “POST error codes” on page 26
for more information.
v Event logs
For information about the POST event log, the system-event log, the integrated
management module (IMM) event log, and the DSA log, see “Event logs” and
“System-event log” on page 37.
v Troubleshooting tables
These tables list problem symptoms and actions to correct the problems. See
“Troubleshooting tables” on page 75.
v Light path diagnostics
Use the light path diagnostics to diagnose system errors quickly. See “Light path
diagnostics” on page 90 for more information.
v Diagnostic programs, messages, and error codes
The diagnostic programs are the primary method of testing the major
components of the server. See “Diagnostic programs, messages, and error
codes” on page 97 for more information.
Event logs
Error codes and messages are displayed in the following types of event logs:
v POST event log: This log contains the three most recent error codes and
messages that were generated during POST. You can view the POST event log
through the Setup utility.
v System-event log: This log contains all IMM, POST, and system management
interrupt (SMI) events. You can view the system-event log through the Setup
utility and through the Dynamic System Analysis (DSA) program (as the IPMI
event log).
The system-event log is limited in size. When it is full, new entries will not
overwrite existing entries; therefore, you must periodically save and then clear
the system-event log through the Setup utility when the IMM logs an event that
indicates that the log is more than 75% full. When you are troubleshooting, you
might have to save and then clear the system-event log to make the most recent
events available for analysis.
Messages are listed on the left side of the screen, and details about the selected
message are displayed on the right side of the screen. To move from one entry
to the next, use the Up Arrow (↑) and Down Arrow (↓) keys.
Some IMM sensors cause assertion events to be logged when their setpoints are
reached. When a setpoint condition no longer exists, a corresponding
deassertion event is logged. However, not all events are assertion-type events.
v Integrated management module (IMM) event log: This log contains a filtered
subset of all IMM, POST, and system management interrupt (SMI) events. You
can view the IMM event log through the IMM Web interface and through the
Dynamic System Analysis (DSA) program (as the ASM event log).
v DSA log: This log is generated by the Dynamic System Analysis (DSA) program,
and it is a chronologically ordered merge of the system-event log (as the IPMI
event log), the IMM event log (as the ASM event log), and the operating-system
event logs. You can view the DSA log through the DSA program.
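The DSA log is described above as a chronologically ordered merge of several logs. The idea can be sketched as follows. This is not how the DSA program is implemented; it only illustrates a time-ordered merge, and the entry layout and sample events are hypothetical.

```python
import heapq
from datetime import datetime


def merge_event_logs(*logs):
    """Merge already-sorted logs into one chronologically ordered list.

    Each log is a list of (timestamp, source, message) tuples that is
    sorted by timestamp.
    """
    return list(heapq.merge(*logs, key=lambda entry: entry[0]))


# Hypothetical entries, for illustration only.
ipmi_log = [
    (datetime(2010, 5, 1, 8, 0, 0), "IPMI", "example event A"),
    (datetime(2010, 5, 1, 8, 5, 0), "IPMI", "example event C"),
]
asm_log = [
    (datetime(2010, 5, 1, 8, 2, 0), "ASM", "example event B"),
]
os_log = [
    (datetime(2010, 5, 1, 8, 9, 0), "OS", "example event D"),
]

merged = merge_event_logs(ipmi_log, asm_log, os_log)
for stamp, source, message in merged:
    print(stamp.isoformat(), source, message)
```

Because each input is already in time order, a streaming merge avoids re-sorting the combined list.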
Viewing event logs through the Setup utility
To view the POST event log or system-event log, complete the following steps:
1. Turn on the server.
2. When the prompt <F1> Setup is displayed, press F1. If you have set both a
power-on password and an administrator password, you must type the
administrator password to view the event logs.
3. Select System Event Logs and use one of the following procedures:
v To view the POST event log, select POST Event Viewer.
v To view the system-event log, select System Event Log.
Attention: If you set an administrator password and then forget it, there is no way
to change, override, or remove it. You must replace the system board.
Viewing event logs without restarting the server
If the server is not hung, methods are available for you to view one or more event
logs without having to restart the server.
If you have installed Portable or Installable Dynamic System Analysis (DSA), you
can use it to view the system-event log (as the IPMI event log), the IMM event log
(as the ASM event log), or the merged DSA log. You can also use DSA Preboot to
view these logs, although you must restart the server to use DSA Preboot. To install
Portable DSA, Installable DSA, or DSA Preboot or to download a DSA Preboot CD
image, go to
http://www.ibm.com/systems/support/supportsite.wss/docdisplay?lndocid=SERV-DSA&brandind=5000008
or complete the following steps.
Note: Changes are made periodically to the IBM Web site. The actual procedure
might vary slightly from what is described in this document.
1. Go to http://www.ibm.com/systems/support/.
2. Under Product support, click System x.
3. Under Popular links, click Software and device drivers.
4. Under Related downloads, click Dynamic System Analysis (DSA) to display
the matrix of downloadable DSA files.
If IPMItool is installed in the server, you can use it to view the system-event log.
Most recent versions of the Linux operating system come with a current version of
IPMItool. For information about IPMItool, see
http://publib.boulder.ibm.com/infocenter/toolsctr/v1r0/index.jsp?topic=/com.ibm.xseries.tools.doc/config_tools_ipmitool.html
or complete the following steps.
Note: Changes are made periodically to the IBM Web site. The actual procedure
might vary slightly from what is described in this document.
1. Go to http://publib.boulder.ibm.com/infocenter/toolsctr/v1r0/index.jsp.
2. In the navigation pane, click IBM System x and BladeCenter Tools Center.
For an overview of IPMI, go to
http://publib.boulder.ibm.com/infocenter/systems/index.jsp?topic=/liaai/ipmi/liaaiipmi.htm
or complete the following steps:
1. Go to http://publib.boulder.ibm.com/infocenter/systems/index.jsp.
2. In the navigation pane, click IBM Systems Information Center.
3. Expand Operating systems, expand Linux information, expand Blueprints
for Linux on IBM systems, and click Using Intelligent Platform Management
Interface (IPMI) on IBM Linux platforms.
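As an illustration of the IPMItool method, the sketch below wraps the `ipmitool sel list` command, which prints the system-event log. It is a hedged example rather than an IBM-supplied procedure: the Python helper names are hypothetical, the sample address and credentials are placeholders, and it assumes that IPMItool is installed, that the operating system has an IPMI driver loaded for local access, and that the IMM accepts IPMI over LAN for remote access.

```python
import shutil
import subprocess


def sel_command(host=None, user=None, password=None):
    """Build the ipmitool argument list for listing the system-event log.

    With no host, ipmitool uses the local system interface. With a host,
    the lanplus (IPMI v2.0) interface is used, which is an assumption
    about how the management controller is configured.
    """
    command = ["ipmitool"]
    if host is not None:
        if user is None or password is None:
            raise ValueError("remote access needs a user and a password")
        command += ["-I", "lanplus", "-H", host, "-U", user, "-P", password]
    return command + ["sel", "list"]


def read_sel(host=None, user=None, password=None):
    """Run ipmitool and return the system-event log as a list of lines."""
    if shutil.which("ipmitool") is None:
        raise RuntimeError("ipmitool is not installed or not on the PATH")
    result = subprocess.run(
        sel_command(host, user, password),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.splitlines()


# Local command line that would be run; no hardware is touched here.
print(" ".join(sel_command()))
```

Passing a password on the command line exposes it to other local users, so a real script would more likely prompt for it or read it from a protected source.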
You can view the IMM event log through the Event Log link in the integrated
management module (IMM) Web interface.
The following table describes the methods that you can use to view the event logs,
depending on the condition of the server. The first two conditions generally do not
require that you restart the server.
Table 6. Methods for viewing event logs
Condition: The server is not hung and is connected to a network.
Action: Use any of the following methods:
v Run Portable or Installable DSA to view the event logs or create an output file
that you can send to IBM service and support.
v Type the IP address of the IMM and go to the Event Log page.
v Use IPMItool to view the system-event log.
Condition: The server is not hung and is not connected to a network.
Action: Use IPMItool locally to view the system-event log.
Condition: The server is hung.
Action:
v If DSA Preboot is installed, restart the server and press F2 to start DSA Preboot
and view the event logs.
v If DSA Preboot is not installed, insert the DSA Preboot CD and restart the server
to start DSA Preboot and view the event logs.
v Alternatively, you can restart the server and press F1 to start the Setup utility and
view the POST event log or system-event log. For more information, see “Viewing
event logs through the Setup utility” on page 24.
POST error codes
When you turn on the server, it performs a series of tests to check the operation of
the server components and some optional devices in the server. This series of tests
is called the power-on self-test, or POST.
If a power-on password is set, you must type the password and press Enter, when
you are prompted, for POST to run.
If POST is completed without detecting any problems, the server startup is
completed.
If POST detects a problem, an error message is sent to the POST event log.
The following table describes the POST error codes and suggested actions to
correct the detected problems. These errors can appear as severe, warning, or
informational.
v Follow the suggested actions in the order in which they are listed in the Action column until the problem
is solved.
v See Chapter 4, “Parts listing, System x3400 M3 Types 7378 and 7379,” on page 141 to determine which
components are customer replaceable units (CRU) and which components are field replaceable units
(FRU).
v If an action step is preceded by “(Trained service technician only),” that step must be performed only by a
trained service technician.
Error code | Description | Action
0010002 | Microprocessor not supported
1. Reseat the following components one at a time,
in the order shown, restarting the server each
time:
a. (Trained service technician only)
Microprocessor 1
b. (Trained service technician only)
Microprocessor 2 (if one is installed)
2. (Trained service technician only) Remove
microprocessor 2 and restart the server.
3. (Trained service technician only) Remove
microprocessor 1 and install microprocessor 2 in
the microprocessor 1 connector. Restart the
server. If the error is corrected, microprocessor 1
is bad and must be replaced.
4. Replace the following components one at a time,
in the order shown, restarting the server each
time:
a. (Trained service technician only)
Microprocessor 1
b. (Trained service technician only)
Microprocessor 2
c. (Trained service technician only) System
board
0011000 | Invalid microprocessor type
1. Update the firmware (see “Updating the firmware”
on page 302).
2. (Trained service technician only) Remove and
replace the affected microprocessor (error LED is
lit) with a supported type.
0011002 | Microprocessor mismatch
1. Run the Setup utility and view the microprocessor
information to compare the installed
microprocessor specifications.
2. (Trained service technician only) Remove and
replace one of the microprocessors so that they
both match.
0011004 | Microprocessor failed BIST
1. Update the firmware (see “Updating the firmware”
on page 302).
2. (Trained service technician only) Reseat
microprocessor 2.
3. Replace the following components one at a time,
in the order shown, restarting the server each
time:
a. (Trained service technician only)
Microprocessor
b. (Trained service technician only) System
board
001100A | Microcode update failed
1. Update the server firmware (see “Updating the
firmware” on page 302).
2. (Trained service technician only) Replace the
microprocessor.
0050001 | DIMM disabled
Note: Each time you install or remove a DIMM, you
must disconnect the server from the power source;
then, wait 10 seconds before restarting the server.
1. Make sure the DIMM is installed correctly (see
“Installing a memory module” on page 231).
2. If the DIMM was disabled because of a memory
fault, follow the suggested actions for that error
event and restart the server.
3. Check the IBM support website for an applicable
RETAIN tip or firmware update that applies to this
memory event. If no memory fault is recorded in
the logs and no DIMM connector error LED is lit,
you can re-enable the DIMM through the Setup
utility or the Advanced Settings Utility (ASU).
0051003 | Uncorrectable DIMM error
Note: Each time you install or remove a DIMM, you
must disconnect the server from the power source;
then, wait 10 seconds before restarting the server.
1. Check the IBM support website for an applicable
RETAIN tip or firmware update that applies to this
memory error.
2. Manually re-enable all affected DIMMs if the
server firmware version is older than UEFI v1.10.
If the server firmware version is UEFI v1.10 or
newer, disconnect and reconnect the server to the
power source and restart the server.
3. If the problem remains, replace the failing DIMM
(see “Removing a memory module” on page 230
and “Installing a memory module” on page 231).
4. (Trained service technician only) If the problem
occurs on the same DIMM connector, check the
DIMM connector. If the connector contains any
foreign material or is damaged, replace the
system board (see “Removing the system board”
on page 296 and “Installing the system board” on
page 298).
5. (Trained service technician only) Remove the
affected microprocessor and check the
microprocessor socket pins for any damaged
pins. If damage is found, replace the system
board (see “Removing the system board” on page
296 and “Installing the system board” on page
298).
6. (Trained service technician only) Replace the
affected microprocessor (see “Removing a
microprocessor and heat sink” on page 284 and
“Installing a microprocessor and heat sink” on
page 286).
0051006 | DIMM mismatch detected
Make sure that the DIMMs match and are installed in
the correct sequence (see “Installing a memory
module” on page 231).
0051009 | No memory detected
1. Make sure that the server contains DIMMs.
2. Reseat the DIMMs (see “Removing a memory
module” on page 230 and “Installing a memory
module” on page 231).
3. Install DIMMs in the correct sequence (see
“Installing a memory module” on page 231).
4. (Trained service technician only) Replace the
failing microprocessor (see “Removing a
microprocessor and heat sink” on page 284 and
“Installing a microprocessor and heat sink” on
page 286).
5. (Trained service technician only) Replace the
system board (see “Removing the system board”
on page 296 and “Installing the system board” on
page 298).
0600369 | No memory detected
1. Make sure that the server contains DIMMs.
2. Reseat the DIMMs.
3. Install DIMMs in the correct sequence (see
“Installing a memory module” on page 231).
4. (Trained service technician only) Replace the
failing microprocessor.
5. (Trained service technician only) Replace the
system board.
005100A | No usable memory detected
1. Make sure that the server contains DIMMs.
2. Reseat the DIMMs (see “Removing a memory
module” on page 230 and “Installing a memory
module” on page 231).
3. Install DIMMs in the correct sequence (see
“Installing a memory module” on page 231).
4. Clear CMOS memory to re-enable all the memory
connectors (see “System board switches and
jumpers” on page 17). Note that all firmware
settings will be reset to the default settings.
0058001 | PFA threshold exceeded
1. Check the IBM support website for an applicable
RETAIN tip or firmware update that applies to this
memory error.
2. Swap the affected DIMMs (as indicated by the
error LEDs on the system board or the event
logs) to a different memory channel or
microprocessor (see “Installing a memory module”
on page 231 for memory population).
3. If the error still occurs on the same DIMM,
replace the affected DIMM.
4. (Trained service technician only) If the problem
occurs on the same DIMM connector, check the
DIMM connector. If the connector contains any
foreign material or is damaged, replace the
system board (see “Removing the system board”
on page 296 and “Installing the system board” on
page 298).
5. (Trained service technician only) Remove the
affected microprocessor and check the
microprocessor socket pins for any damaged
pins. If damage is found, replace the system
board (see “Removing the system board” on page
296 and “Installing the system board” on page
298).
6. (Trained service technician only) Replace the
affected microprocessor (see “Removing a
microprocessor and heat sink” on page 284 and
“Installing a microprocessor and heat sink” on
page 286).
0058007 | DIMM population is unsupported
1. Reseat the DIMMs, and then restart the server
(see “Removing a memory module” on page 230
and “Installing a memory module” on page 231).
2. Make sure that the DIMMs are installed in the
proper sequence (see “Installing a memory
module” on page 231).
0058008 | DIMM failed memory test
1. Check the IBM support website for an applicable
RETAIN tip or firmware update that applies to this
memory error.
2. Manually re-enable all affected DIMMs if the
server firmware version is older than UEFI v1.10.
If the server firmware version is UEFI v1.10 or
newer, disconnect and reconnect the server to the
power source and restart the server.
3. Swap the affected DIMMs (as indicated by the
error LEDs on the system board or the event
logs) to a different memory channel or
microprocessor (see “Installing a memory module”
on page 231 for memory population).
4. If the problem is related to a DIMM, replace the
failing DIMM (see “Removing a memory module”
on page 230 and “Installing a memory module” on
page 231).
5. (Trained service technician only) If the problem
occurs on the same DIMM connector, check the
DIMM connector. If the connector contains any
foreign material or is damaged, replace the
system board (see “Removing the system board”
on page 296 and “Installing the system board” on
page 298).
6. (Trained service technician only) Remove the
affected microprocessor and check the
microprocessor socket pins for any damaged
pins. If damage is found, replace the system
board (see “Removing the system board” on page
296 and “Installing the system board” on page
298).
7. (Trained service technician only) If the problem is
related to microprocessor socket pins, replace the
system board (see “Removing the system board”
on page 296 and “Installing the system board” on
page 298).
8. (Trained service technician only) Replace the
affected microprocessor (see “Removing a
microprocessor and heat sink” on page 284 and
“Installing a microprocessor and heat sink” on
page 286).
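Step 2 of the actions for error 0058008 depends on the server firmware level. The following Python sketch illustrates that decision, assuming the level is available as a dotted version string such as "1.10" and that levels compare numerically, component by component. The function name and the string format are hypothetical and are not part of any IBM tool.

```python
def dimm_recovery_action(uefi_version: str) -> str:
    """Return the step-2 action for error 0058008 (illustrative only)."""
    # Assumption: versions look like "1.10" or "v1.10" and compare
    # numerically by component, so "1.9" is older than "1.10".
    parts = tuple(int(p) for p in uefi_version.lstrip("vV").split("."))
    if parts < (1, 10):
        return "manually re-enable all affected DIMMs"
    return "disconnect and reconnect power, then restart the server"
```

If the error persists after this step, continue with step 3.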
0058015  Start to Activate Spare Memory Channel
Information only. A failed DIMM has been detected, and
the memory online-spare feature is being activated.
Check the event log for uncorrected DIMM failure events.
Note: The memory online-spare feature is supported
on server models with an Intel Xeon™ 5600 series
microprocessor.
00580A1  Invalid DIMM population for mirroring mode
1. If a fault LED is lit, resolve the failure.
2. Install the DIMMs in the correct sequence (see
“Installing a memory module” on page 231).
00580A4  Memory population changed
Information only. Memory has been added, moved, or
changed.
00580A5  Mirror failover complete
Information only. Memory redundancy has been lost.
Check the event log for uncorrected DIMM failure
events (see “Event logs” on page 23).
00580A6  Spare Memory Channel Activated
Information only. Memory online-spare channel has
been activated to back up a failed DIMM. Check the
event log for uncorrected DIMM failure events.
Note: The memory online-spare feature is supported
on server models with an Intel Xeon™ 5600 series
microprocessor.
0068002  CMOS battery cleared
1. Reseat the battery.
2. Clear the CMOS memory (see “System board
switches and jumpers” on page 17).
3. Replace the following components one at a time,
in the following order, restarting the server after
each one:
a. Battery
b. (Trained service technician only) System
board
2011000  PCI-X PERR
1. Check the extender card LEDs.
2. Reseat all affected adapters and extender cards.
3. Update the PCI device firmware.
4. Remove the adapters from the extender card.
5. Replace the following components one at a time,
in the order shown, restarting the server each
time:
a. Extender card
b. (Trained service technician only) System
board
2011001  PCI-X SERR
1. Check the extender-card LEDs.
2. Reseat all affected adapters and extender cards.
3. Update the PCI device firmware.
4. Remove the adapters from the extender card.
5. Replace the following components one at a time,
in the order shown, restarting the server each
time:
a. Extender card
b. (Trained service technician only) System
board
2018001  PCI Express uncorrected or uncorrectable error
1. Check the extender-card LEDs.
2. Reseat all affected adapters and extender cards.
3. Update the PCI device firmware.
4. Remove both adapters from the extender card.
5. Replace the following components one at a time,
in the order shown, restarting the server each
time:
a. Extender card
b. (Trained service technician only) System
board
2018002  Option ROM resource allocation failure
Informational message that some devices might not
be initialized.
1. If possible, rearrange the order of the adapters in
the PCI slots to change the load order of the
optional-device ROM code.
2. Run the Setup utility, select Start Options, and
change the boot priority to change the load order
of the optional-device ROM code.
3. Run the Setup utility and disable some other
resources, if their functions are not being used, to
make more space available. Select Devices and I/O Ports to disable any of the integrated devices.
4. Replace the following components one at a time,
in the order shown, restarting the server each
time:
a. Each adapter
b. (Trained service technician only) System
board
3xx0007 (xx can be 00 - 19)  Firmware fault detected, system halted
1. Recover the server firmware to the latest level.
2. Undo any recent configuration changes, or clear
CMOS memory to restore the settings to the
default values.
3. Remove any recently installed hardware.
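The entry 3xx0007 stands for a family of codes rather than a single code. As an illustration, the following Python sketch tests whether a logged code belongs to that family. It assumes that xx is a two-digit decimal value from 00 through 19; the guide does not say whether the range is decimal or hexadecimal. The helper name is hypothetical.

```python
import re

# "3", then xx in 00-19 (assumed decimal), then "0007".
_FIRMWARE_FAULT = re.compile(r"3[01][0-9]0007")

def is_firmware_fault(code: str) -> bool:
    """True if code matches the 3xx0007 family with xx in 00-19."""
    return _FIRMWARE_FAULT.fullmatch(code) is not None
```

A log-scanning script could use such a check to route every code in the family to the same three actions.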
3038003  Firmware corrupted
1. Run the Setup utility, select Load Default Settings, and save the settings to recover the
server firmware.
2. (Trained service technician only) Replace the
system board.
3048005  Booted secondary (backup) server firmware image
Information only. The backup switch was used to boot
the secondary bank.
3048006  Booted secondary (backup) server firmware image because of ABR
1. Run the Setup utility, select Load Default Settings, and save the settings to recover the
primary server firmware settings.
2. Turn off the server and remove it from the power
source.
3. Reconnect the server to the power source, and
then turn on the server.
305000A  RTC date/time is incorrect
1. Adjust the date and time settings in the Setup
utility, and then restart the server.
2. Reseat the battery.
3. Replace the following components one at a time,
in the order shown, restarting the server each
time:
a. Battery
b. (Trained service technician only) System
board
3058001  System configuration invalid
1. Run the Setup utility, and select Save Settings.
2. Run the Setup utility, select Load Default Settings, and save the settings.
3. Reseat the following components one at a time in
the order shown, restarting the server each time:
a. Battery
b. Failing device (if the device is a FRU, it must
be reseated by a trained service technician
only)
4. Replace the following components one at a time,
in the order shown, restarting the server each
time:
a. Battery
b. Failing device (if the device is a FRU, it must
be replaced by a trained service technician
only)
c. (Trained service technician only) System
board
3058004  Three boot failures
1. Undo any recent system changes, such as new
settings or newly installed devices.
2. Make sure that the server is attached to a reliable
power source.
3. Remove all hardware that is not listed on the
ServerProven Web site.
4. Make sure that the operating system is not
corrupted.
5. Run the Setup utility, save the configuration, and
then restart the server.
3108007  System configuration restored to default settings
Information only. This message is usually
associated with the CMOS battery clear event.
3138002  Boot configuration error
1. Remove any recent configuration changes that
you made in the Setup utility.
2. Run the Setup utility, select Load Default Settings, and save the settings.
3808000  IMM communication failure
1. Remove power from the server for 30 seconds,
and then reconnect the server to power and
restart it.
2. Update the IMM firmware. (See “Updating the
firmware” on page 302).
3. Make sure that the virtual media key is seated
and not damaged.
4. (Trained service technician only) Replace the
system board.
3808002  Error updating system configuration to IMM
1. Remove power from the server, and then
reconnect the server to power and restart it.
2. Run the Setup utility and select Save Settings.
3. Update the firmware.
3808003  Error retrieving system configuration from IMM
1. Remove power from the server, and then
reconnect the server to power and restart it.
2. Run the Setup utility and select Save Settings.
3. Update the IMM firmware.
3808004  IMM system-event log full
v When out-of-band, use the IMM Web interface or
IPMItool to clear the logs from the operating
system.
v When using the local console:
1. Run the Setup utility.
2. Select System Event Logs.
3. Select Clear System Event Log.
4. Restart the server.
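For the out-of-band case, the log can be cleared with IPMItool. The following Python sketch builds and runs such a command. It assumes that the IMM is reachable over the network through the lanplus interface and that the password is supplied in the IPMI_PASSWORD environment variable (the -E option). The host address, user name, and function names are hypothetical placeholders.

```python
import subprocess
from typing import List

def sel_command(host: str, user: str, action: str) -> List[str]:
    """Build an ipmitool command line for a system-event log action."""
    if action not in ("info", "list", "clear"):
        raise ValueError(f"unsupported SEL action: {action}")
    # -I lanplus: IPMI v2.0 over LAN; -E: read password from IPMI_PASSWORD.
    return ["ipmitool", "-I", "lanplus", "-H", host, "-U", user, "-E",
            "sel", action]

def clear_sel(host: str, user: str) -> None:
    """Clear the system-event log on the controller at the given address."""
    subprocess.run(sel_command(host, user, "clear"), check=True)
```

Running the "info" action first shows how full the log is before it is cleared.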
3818001  Core Root of Trust Measurement (CRTM) update failed
1. Run the Setup utility, select Load Default Settings, and save the settings.
2. (Trained service technician only) Replace the
system board.
3818002  Core Root of Trust Measurement (CRTM) update aborted
1. Run the Setup utility, select Load Default Settings, and save the settings.
2. (Trained service technician only) Replace the
system board.
3818003  Core Root of Trust Measurement (CRTM) flash lock failed
1. Run the Setup utility, select Load Default Settings, and save the settings.
2. (Trained service technician only) Replace the
system board.
3818004  Core Root of Trust Measurement (CRTM) system error
1. Run the Setup utility, select Load Default Settings, and save the settings.
2. (Trained service technician only) Replace the
system board.
3818005  Current Bank Core Root of Trust Measurement (CRTM) capsule signature invalid
1. Run the Setup utility, select Load Default Settings, and save the settings.
2. (Trained service technician only) Replace the
system board.
3818006  Opposite bank CRTM capsule signature invalid
1. Switch the firmware bank to the backup bank.
2. Run the Setup utility, select Load Default Settings, and save the settings.
3. Switch the bank back to the current bank.
4. (Trained service technician only) Replace the
system board.
3818007  CRTM update capsule signature invalid
1. Run the Setup utility, select Load Default Settings, and save the settings.
2. (Trained service technician only) Replace the
system board.
3828004  AEM power capping disabled
1. Check the settings and the event logs.
2. Make sure that the Active Energy Manager
feature is enabled in the Setup utility. Select
System Settings>Power>Active Energy
Manager>Capping Enabled.
3. Update the server firmware.
4. Update the IMM firmware.
System-event log
The system-event log contains messages of three types:
Information
Information messages do not require action; they record significant
system-level events, such as when the server is started.
Warning
Warning messages do not require immediate action; they indicate possible
problems, such as when the recommended maximum ambient temperature
is exceeded.
Error
Error messages might require action; they indicate system errors, such as
when a fan is not detected.
Each message contains date and time information, and it indicates the source of
the message (POST or the IMM).
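The three message types and the response each one calls for can be captured in a small lookup. The following Python sketch is only an illustration of the classification described above; the names Severity, ACTION_GUIDANCE, and guidance are hypothetical and do not correspond to any IMM interface.

```python
from enum import Enum

class Severity(Enum):
    INFORMATION = "Information"
    WARNING = "Warning"
    ERROR = "Error"

# Response expected for each message type, per the descriptions above.
ACTION_GUIDANCE = {
    Severity.INFORMATION: "no action required",
    Severity.WARNING: "no immediate action required",
    Severity.ERROR: "might require action",
}

def guidance(severity_text: str) -> str:
    """Map a severity label from the log to the expected response."""
    return ACTION_GUIDANCE[Severity(severity_text)]
```

For example, guidance("Warning") returns "no immediate action required".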
Integrated management module error messages
The following table describes the IMM error messages and suggested actions to
correct the detected problems. For more information about IMM, see the Integrated Management Module User's Guide at
http://www.ibm.com/systems/support/supportsite.wss/docdisplay?lndocid=MIGR-5079770&brandind=5000008.
v Follow the suggested actions in the order in which they are listed in the Action column until the problem
is solved.
v See Chapter 4, “Parts listing, System x3400 M3 Types 7378 and 7379,” on page 141 to determine which
components are customer replaceable units (CRU) and which components are field replaceable units
(FRU).
v If an action step is preceded by “(Trained service technician only),” that step must be performed only by a
trained service technician.
Message  Severity  Description  Action
Numeric sensor Ambient Temp going high (upper critical) has asserted.
Error
An upper critical sensor going high has asserted.
Reduce the ambient temperature.
Numeric sensor Ambient Temp going high (upper non-recoverable) has asserted.
Error
An upper nonrecoverable sensor going high has asserted.
Reduce the ambient temperature.
Numeric sensor Planar 3.3V going low (lower critical) has asserted.
Error
A lower critical sensor going low has asserted.
(Trained service technician only) Replace the system board.
Numeric sensor Planar 3.3V going high (upper critical) has asserted.
Error
An upper critical sensor going high has asserted.
(Trained service technician only) Replace the system board.
Numeric sensor Planar 5V going low (lower critical) has asserted.
Error
A lower critical sensor going low has asserted.
(Trained service technician only) Replace the system board.
Numeric sensor Planar 5V going high (upper critical) has asserted.
Error
An upper critical sensor going high has asserted.
(Trained service technician only) Replace the system board.
Numeric sensor Planar VBAT going low (lower critical) has asserted.
Error
A lower critical sensor going low has asserted.
Replace the 3 V battery.
Numeric sensor Fan n Tach going low (lower critical) has asserted.
(n = fan number)
Error
A lower critical sensor going low has asserted.
1. Reseat the failing fan n, which is indicated by a lit LED on the fan.
2. Replace the failing fan.
(n = fan number)
The Processor CPU n Status has Failed with IERR.
(n = microprocessor number)
Error
A processor failed - IERR condition has occurred.
1. Make sure that the latest levels of firmware and device
drivers are installed for all adapters and standard
devices, such as Ethernet, SCSI, and SAS.
Important: Some cluster solutions require specific code
levels or coordinated code updates. If the device is part
of a cluster solution, verify that the latest level of code is
supported for the cluster solution before you update
the code.
2. Run the DSA program for the hard disk drives and other I/O
devices.
3. (Trained service technician only) Replace microprocessor n.
(n = microprocessor number)
An Over-Temperature Condition has been detected on the Processor CPU n Status.
(n = microprocessor number)
Error
An overtemperature condition has occurred for microprocessor n.
(n = microprocessor number)
1. Make sure that the fans are operating, that there are no
obstructions to the airflow, that the air baffle is in place
and correctly installed, and that the server cover is
installed and completely closed.
2. Make sure that the heat sink for microprocessor n is
installed correctly.
3. (Trained service technician only) Replace microprocessor n.
(n = microprocessor number)
The Processor CPU n Status has
Failed with FRB1/BIST condition.
(n = microprocessor number)
Error
A processor failed -
FRB1/BIST condition has
occurred.
1. Check for a server firmware
update.
Important: Some cluster
solutions require specific code
levels or coordinated code
updates. If the device is part
of a cluster solution, verify
that the latest level of code is
supported for the cluster
solution before you update
the code.
2. Make sure that the installed
microprocessors are
compatible with each other
(see “Installing a
microprocessor and heat sink”
on page 286 for information
about microprocessor
requirements).
3. (Trained service technician
only) Reseat microprocessor
n.
4. (Trained service technician
only) Replace microprocessor
n.
(n = microprocessor number)
The Processor CPU n Status has a
Configuration Mismatch.
(n = microprocessor number)
Error
A processor configuration
mismatch has occurred.
1. Make sure that the installed
microprocessors are
compatible with each other
(see “Installing a
microprocessor and heat sink”
on page 286 for information
about microprocessor
requirements).
2. (Trained service technician
only) Replace the
incompatible microprocessor.
An SM BIOS Uncorrectable CPU complex error for Processor CPU n Status has asserted.
(n = microprocessor number)
Error
An SMBIOS uncorrectable CPU complex error has asserted.
1. Check for a server firmware update.
Important: Some cluster solutions require specific code
levels or coordinated code updates. If the device is part
of a cluster solution, verify that the latest level of code is
supported for the cluster solution before you update
the code.
2. Make sure that the installed microprocessors are
compatible with each other (see “Installing a
microprocessor and heat sink” on page 286 for information
about microprocessor requirements).
3. (Trained service technician only) Reseat microprocessor n.
4. (Trained service technician only) Replace microprocessor n.
(n = microprocessor number)
Sensor CPU n OverTemp has transitioned to critical from a less severe state.
(n = microprocessor number)
Error
A sensor has changed to Critical state from a less severe state.
1. Make sure that the fans are operating, that there are no
obstructions to the airflow, that the air baffle is in place
and correctly installed, and that the server cover is
installed and completely closed.
2. Make sure that the heat sink for microprocessor n is
installed correctly.
3. (Trained service technician only) Replace microprocessor n.
(n = microprocessor number)
Sensor CPU n OverTemp has
transitioned to non-recoverable from
a less severe state.
(n = microprocessor number)
Error
A sensor has changed to
Nonrecoverable state from a
less severe state.
1. Make sure that the fans are
operating, that there are no
obstructions to the airflow,
that the air baffle is in place
and correctly installed, and
that the server cover is
installed and completely
closed.
2. Make sure that the heat sink
for microprocessor n is
installed correctly.
3. (Trained service technician
only) Replace microprocessor
n.
(n = microprocessor number)
Sensor CPU n OverTemp has
transitioned to critical from a
non-recoverable state.
(n = microprocessor number)
Error
A sensor has changed to
Critical state from
Nonrecoverable state.
1. Make sure that the fans are
operating, that there are no
obstructions to the airflow,
that the air baffle is in place
and correctly installed, and
that the server cover is
installed and completely
closed.
2. Make sure that the heat sink
for microprocessor n is
installed correctly.
3. (Trained service technician
only) Replace microprocessor
n.
(n = microprocessor number)
Sensor CPU n OverTemp has
transitioned to non-recoverable.
(n = microprocessor number)
Error
A sensor has changed to
Nonrecoverable state.
1. Make sure that the fans are
operating, that there are no
obstructions to the airflow,
that the air baffle is in place
and correctly installed, and
that the server cover is
installed and completely
closed.
2. Make sure that the heat sink
for microprocessor n is
installed correctly.
3. (Trained service technician
only) Replace microprocessor
n.
(n = microprocessor number)
A diagnostic interrupt has occurred on system %1.
(%1 = CIM_ComputerSystem.ElementName)
Error
An operator information panel NMI/diagnostic interrupt has occurred.
If the NMI button on the system board has not been pressed,
complete the following steps:
1. Make sure that the NMI button is not pressed.
2. Replace the operator information panel cable.
3. Replace the operator information panel.
A bus timeout has occurred on system %1.
(%1 = CIM_ComputerSystem.ElementName)
Error
A bus timeout has occurred.
1. Remove the adapter from the PCI slot that is indicated by a
lit LED.
2. Replace the extender card.
3. Remove all PCI adapters.
4. (Trained service technician only) Replace the system board.
A software NMI has occurred on system %1.
(%1 = CIM_ComputerSystem.ElementName)
Error
A software NMI has occurred.
1. Check the device driver.
2. Reinstall the device driver.
The System %1 encountered a POST Error.
(%1 = CIM_ComputerSystem.ElementName)
Error
A POST error has occurred.
(Sensor = ABR Status)
1. Recover the server firmware from the backup page (see
“Recovering the server firmware” on page 134).
2. Update the server firmware to the latest level.
Important: Some cluster solutions require specific code
levels or coordinated code updates. If the device is part
of a cluster solution, verify that the latest level of code is
supported for the cluster solution before you update
the code.
The System %1 encountered a
POST Error.
(%1 = CIM_ComputerSystem.ElementName)
Error
A POST error has occurred.
(Sensor = Firmware Error)
1. Make sure that the server
contains DIMMs.
2. Reseat the DIMMs.
3. Install DIMMs in the correct
sequence (see “Installing a
memory module” on page
231).
4. Update the server firmware
on the primary page.
Important: Some cluster
solutions require specific code
levels or coordinated code
updates. If the device is part
of a cluster solution, verify
that the latest level of code is
supported for the cluster
solution before you update
the code.
5. (Trained service technician
only) Replace the failing
microprocessor.
6. (Trained service technician
only) Replace the system
board.
An Uncorrectable Bus Error has
occurred on system %1.
(%1 = CIM_ComputerSystem.ElementName)
Error
A bus uncorrectable error
has occurred.
(Sensor = Critical Int PCI)
1. Check the system-event log.
2. Check the PCI error LEDs.
3. Remove the adapter from the
indicated PCI slot.
4. Check for a server firmware
update.
Important: Some cluster
solutions require specific code
levels or coordinated code
updates. If the device is part
of a cluster solution, verify
that the latest level of code is
supported for the cluster
solution before you update
the code.
5. (Trained service technician
only) Replace the system
board.
An Uncorrectable Bus Error has occurred on system %1.
(%1 = CIM_ComputerSystem.ElementName)
Error
A bus uncorrectable error has occurred.
(Sensor = Critical Int CPU)
1. Check the system-event log.
2. Check the microprocessor error LEDs.
3. Remove the failing microprocessor from the system board.
4. Check for a server firmware update.
Important: Some cluster solutions require specific code
levels or coordinated code updates. If the device is part
of a cluster solution, verify that the latest level of code is
supported for the cluster solution before you update
the code.
5. Make sure that the two microprocessors are matching.
6. (Trained service technician only) Replace the system board.
An Uncorrectable Bus Error has occurred on system %1.
(%1 = CIM_ComputerSystem.ElementName)
Error
A bus uncorrectable error has occurred.
(Sensor = Critical Int DIM)
1. Check the system-event log.
2. Check the DIMM error LEDs.
3. Remove the failing DIMM from the system board.
4. Check for a server firmware update.
Important: Some cluster solutions require specific code
levels or coordinated code updates. If the device is part
of a cluster solution, verify that the latest level of code is
supported for the cluster solution before you update
the code.
5. Make sure that the installed DIMMs are supported and
configured correctly.
6. (Trained service technician only) Replace the system board.
Sensor Sys Board Fault has
transitioned to critical from a less
severe state.
Error
A sensor has changed to
Critical state from a less
severe state.
1. Check the system-event log.
2. Check for an error LED on
the system board.
3. Replace any failing device.
4. Check for a server firmware
update.
Important: Some cluster
solutions require specific code
levels or coordinated code
updates. If the device is part
of a cluster solution, verify
that the latest level of code is
supported for the cluster
solution before you update
the code.
5. (Trained service technician
only) Replace the system
board.
The Power Supply (Power Supply: n)
has Failed.
(n = power supply number)
Error
Power supply n has failed.
(n = power supply number)
1. If the power-on LED is lit,
complete the following steps:
a. Reduce the server to the
minimum configuration.
b. Reinstall the components
one at a time, restarting
the server each time.
c. If the error recurs, replace
the component that you
just reinstalled.
2. Reseat power supply n.
3. Replace power supply n.
(n = power supply number)
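The isolation procedure in step 1 (reduce the server to the minimum configuration, reinstall components one at a time, and replace the component that brings the error back) can be sketched in Python as follows. The function and parameter names are hypothetical; error_recurs_with stands in for restarting the server and checking whether the error returns with the given components installed.

```python
from typing import Callable, Iterable, List, Optional

def isolate_failing_component(
    components: Iterable[str],
    error_recurs_with: Callable[[List[str]], bool],
) -> Optional[str]:
    """Return the first component whose reinstallation brings the error back."""
    installed: List[str] = []             # start from the minimum configuration
    for component in components:
        installed.append(component)       # reinstall one component
        if error_recurs_with(installed):  # restart the server and check
            return component              # replace the one just reinstalled
    return None                           # error did not recur
```

If the function returns None, no reinstalled component reproduced the error, and the procedure continues with steps 2 and 3 for the power supply itself.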
Sensor PS n Fan Fault has
transitioned to critical from a less
severe state.
(n = power supply number)
Error
A sensor has changed to
Critical state from a less
severe state.
1. Make sure that there are no
obstructions, such as bundled
cables, to the airflow from the
power-supply fan.
2. Replace power supply n.
(n = power supply number)
Sensor Pwr Rail A Fault has transitioned to non-recoverable.
Error
A sensor has changed to Nonrecoverable state.
1. Turn off the server and disconnect it from power.
2. (Trained service technician only) Remove the PCI
adapter and microprocessor 1. Reinstall the
microprocessor in socket 1 and restart the server.
3. Restart the server.
4. Reinstall each device, one at a time, starting the server
each time to isolate the failing device.
5. Replace the failing device.
6. (Trained service technician only) Replace the system board.
Sensor Pwr Rail B Fault has transitioned to non-recoverable.
Error
A sensor has changed to Nonrecoverable state.
1. Turn off the server and disconnect it from power.
2. (Trained service technician only) Remove the PCI
adapter and microprocessor 2.
3. Restart the server.
4. Reinstall each device, one at a time, starting the server
each time to isolate the failing device.
5. Replace the failing device.
6. (Trained service technician only) Replace the system board.
Sensor Pwr Rail C Fault has
transitioned to non-recoverable.
Error: A sensor has changed to
Nonrecoverable state.
1. Turn off the server and
disconnect it from power.
2. Remove the hard disk drives,
hard disk drive backplanes,
and DIMMs in connectors 1
through 8.
3. Restart the server.
4. Reinstall each device, one at
a time, starting the server
each time to isolate the failing
device.
5. Replace the failing device.
6. (Trained service technician
only) Replace the system
board.
Sensor Pwr Rail D Fault has
transitioned to non-recoverable.
Error: A sensor has changed to
Nonrecoverable state.
1. Turn off the server and
disconnect it from power.
2. Remove the optical drive and
the DIMMs in connectors 9
through 16.
3. Restart the server.
4. Reinstall the microprocessor
in socket 1 and restart the
server.
5. (Trained service technician
only) Replace the failing
microprocessor.
6. (Trained service technician
only) Replace the system
board.
Sensor Pwr Rail E Fault has
transitioned to non-recoverable.
Error: A sensor has changed to
Nonrecoverable state.
1. Turn off the server and
disconnect it from power.
2. (Trained service technician
only) Remove the optical
drive and the PCI adapter.
3. Restart the server.
4. Reinstall each device, one at
a time, starting the server
each time to isolate the failing
device.
5. Replace the failing device.
6. (Trained service technician
only) Replace the system
board.
Sensor Pwr Rail F Fault has
transitioned to non-recoverable.
Error: A sensor has changed to
Nonrecoverable state.
1. Turn off the server and
disconnect it from power.
2. Remove the hard disk drives
and the hard disk drive
backplanes.
3. Restart the server.
4. Reinstall each device, one at
a time, starting the server
each time to isolate the failing
device.
5. Replace the failing device.
6. (Trained service technician
only) Replace the system
board.
Sensor PS n Therm Fault has
transitioned to critical from a less
severe state.
(n = power supply number)
Error: A sensor has changed to
Critical state from a less
severe state.
1. Make sure that there are no
obstructions, such as bundled
cables, to the airflow from the
power-supply fan.
2. Replace power supply n.
(n = power supply number)
Sensor PSn 12V OV Fault has
transitioned to non-recoverable.
(n = power supply number)
Error: A sensor has changed to
Nonrecoverable state.
1. Remove the power supplies.
2. Replace power supply n.
3. (Trained service technician
only) Replace the system
board.
(n = power supply number)
Sensor PSn 12V UV Fault has
transitioned to non-recoverable.
Error: A sensor has changed to
Nonrecoverable state.
1. Remove the power supplies.
2. Replace power supply n.
3. (Trained service technician
only) Replace the system
board.
(n = power supply number)
Sensor PSn 12V OC Fault has
transitioned to non-recoverable.
(n = power supply number)
Error: A sensor has changed to
Nonrecoverable state.
1. Remove the power supplies.
2. Replace power supply n.
3. (Trained service technician
only) Replace the system
board.
(n = power supply number)
Sensor PS n VCO Fault has
transitioned to non-recoverable.
(n = power supply number)
Error: A sensor has changed to
Nonrecoverable state.
1. Replace the failing power
supply.
(n = power supply number)
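The power-rail entries above differ mainly in which devices are removed before the server is restarted. As a quick reference, that mapping can be written as a lookup table. The sketch below only transcribes the removal steps listed for rails A through F; the dictionary and function names are illustrative assumptions, not part of the IMM firmware.

```python
import re
from typing import List

# Devices to remove for each power-rail fault, transcribed from the actions above.
RAIL_DEVICES = {
    "A": ["PCI adapter", "microprocessor 1"],
    "B": ["PCI adapter", "microprocessor 2"],
    "C": ["hard disk drives", "hard disk drive backplanes", "DIMMs in connectors 1 through 8"],
    "D": ["optical drive", "DIMMs in connectors 9 through 16"],
    "E": ["optical drive", "PCI adapter"],
    "F": ["hard disk drives", "hard disk drive backplanes"],
}


def devices_to_remove(message: str) -> List[str]:
    """Return the devices to remove for a 'Sensor Pwr Rail X Fault' message."""
    match = re.search(r"Sensor Pwr Rail ([A-F]) Fault", message)
    return list(RAIL_DEVICES[match.group(1)]) if match else []


print(devices_to_remove("Sensor Pwr Rail D Fault has transitioned to non-recoverable."))
```

Steps marked "(Trained service technician only)" in the table keep that restriction; the lookup does not replace the full action list.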
Redundancy Power Unit has been
reduced.
Error: Redundancy has been lost
and is insufficient to continue
operation.
1. Check the LEDs for both
power supplies.
2. Follow the actions in
“Power-supply LEDs” on page
96.
Sensor RAID Error has transitioned
to critical from a less severe state.
Error: A sensor has changed to
Critical state from a less
severe state.
1. Check the hard disk drive
LEDs.
2. Reseat the hard disk drive for
which the status LED is lit.
3. Replace the defective hard
disk drive.
The Drive n Status has been
removed from unit Drive 0 Status.
(n = hard disk drive number)
Error: A drive has been removed.
Reseat hard disk drive n.
(n = hard disk drive number)
The Drive n Status has been
disabled due to a detected fault.
(n = hard disk drive number)
Error: A drive has been disabled
because of a fault.
1. Run the hard disk drive
diagnostic test on drive n.
2. Reseat the following
components:
a. Hard disk drive
b. Cable from the system
board to the backplane
3. Replace the following
components one at a time, in
the order shown, restarting
the server each time:
a. Hard disk drive
b. Cable from the system
board to the backplane
c. Hard disk drive backplane
(n = hard disk drive number)
Array %1 is in critical condition.
(%1 = CIM_ComputerSystem.ElementName)
Error: An array is in Critical state.
(Sensor = Drive n Status)
(n = hard disk drive number)
Replace the hard disk drive that
is indicated by a lit status LED.
Array %1 has failed.
(%1 = CIM_ComputerSystem.ElementName)
Error: An array is in Failed state.
(Sensor = Drive n Status)
(n = hard disk drive number)
Replace the hard disk drive that
is indicated by a lit status LED.
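The %1 placeholder in these messages stands for a value that is filled in when the event is logged. How the IMM performs that substitution is not described here; the following is only a sketch of the idea, with a hypothetical helper name and a made-up element name.

```python
import re
from typing import Dict


def expand_message(template: str, values: Dict[int, str]) -> str:
    """Replace %1, %2, ... with the supplied values; unknown placeholders stay as-is."""
    def substitute(match: "re.Match") -> str:
        index = int(match.group(1))
        return str(values.get(index, match.group(0)))

    return re.sub(r"%(\d+)", substitute, template)


# "server-01" is a made-up CIM_ComputerSystem.ElementName value.
print(expand_message("Array %1 is in critical condition.", {1: "server-01"}))
```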
Memory uncorrectable error detected
for DIMM All DIMMs on Memory
Subsystem All DIMMs.
Error: A memory uncorrectable
error has occurred.
1. Check the IBM support
website for an applicable
retain tip or firmware update
that applies to this memory
error.
2. Manually re-enable all
affected DIMMs if the server
firmware version is older than
UEFI v1.10. If the server
firmware version is UEFI
v1.10 or newer, disconnect
and reconnect the server to
the power source and restart
the server.
3. Swap the affected DIMMs (as
indicated by the error LEDs
on the system board or the
event logs) to a different
memory channel or
microprocessor (see
“Installing a memory module”
on page 231 for memory
population).
4. If the problem follows the
DIMM, replace the failing
DIMM (see “Removing a
memory module” on page 230
and “Installing a memory
module” on page 231).
5. (Trained service technician
only) If the problem occurs on
the same DIMM connector,
check the DIMM connector. If
the connector contains any
foreign material or is
damaged, replace the system
board (see “Removing the
system board” on page 296
and “Installing the system
board” on page 298).
6. (Trained service technician
only) Remove the affected
microprocessor and check the
microprocessor socket pins
for any damaged pins. If
damage is found, replace the
system board (see “Removing
the system board” on page
296 and “Installing the system
board” on page 298).
7. (Trained service technician
only) Replace the affected
microprocessor (see
“Removing a microprocessor
and heat sink” on page 284
and “Installing a
microprocessor and heat sink”
on page 286).
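Step 2 of this action list branches on the UEFI firmware level. When that check is scripted, comparing versions as numeric tuples avoids the mistake of treating "1.9" as newer than "1.10". The sketch below assumes version strings of the form "v1.10"; the function names are illustrative and no IBM utility is called.

```python
from typing import Tuple


def parse_uefi_version(text: str) -> Tuple[int, ...]:
    """Turn a string such as 'v1.10' into a tuple of integers, for example (1, 10)."""
    return tuple(int(part) for part in text.strip().lstrip("vV").split("."))


def memory_recovery_action(uefi_version: str) -> str:
    """Pick the step 2 action for the installed UEFI level."""
    if parse_uefi_version(uefi_version) < (1, 10):
        return "manually re-enable all affected DIMMs"
    return "disconnect and reconnect the power source, then restart the server"


print(memory_recovery_action("v1.09"))
print(memory_recovery_action("v1.10"))
```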
Memory Logging Limit Reached for
DIMM All DIMMs on Memory
Subsystem All DIMMs.
Error: The memory logging limit
has been reached.
1. Check the IBM support
website for an applicable
retain tip or firmware update
that applies to this memory
error.
2. Swap the affected DIMMs (as
indicated by the error LEDs
on the system board or the
event logs) to a different
memory channel or
microprocessor (see
“Installing a memory module”
on page 231 for memory
population).
3. If the error still occurs on the
same DIMM, replace the
affected DIMM.
4. (Trained service technician
only) If the problem occurs on
the same DIMM connector,
check the DIMM connector. If
the connector contains any
foreign material or is
damaged, replace the system
board (see “Removing the
system board” on page 296
and “Installing the system
board” on page 298).
5. (Trained service technician
only) Remove the affected
microprocessor and check the
microprocessor socket pins
for any damaged pins. If
damage is found, replace the
system board (see “Removing
the system board” on page
296 and “Installing the system
board” on page 298).
6. (Trained service technician
only) Replace the affected
microprocessor (see
“Removing a microprocessor
and heat sink” on page 284
and “Installing a
microprocessor and heat sink”
on page 286).
Memory DIMM Configuration Error
for All DIMMs on Memory Subsystem
All DIMMs.
Error: A DIMM configuration error
has occurred.
Make sure that DIMMs are
installed in the correct sequence
and have the same size, type,
speed, and technology.
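The uniformity part of this check (same size, type, speed, and technology) can be expressed as a comparison across an inventory of installed DIMMs. The sketch below uses a hypothetical `Dimm` record with made-up field names and values. It does not check the installation sequence, which depends on the population rules in “Installing a memory module”.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class Dimm:
    connector: int
    size_gb: int
    dimm_type: str
    speed_mhz: int
    technology: str


def mismatched_attributes(dimms: Sequence[Dimm]) -> List[str]:
    """Return the attributes that are not identical across all installed DIMMs."""
    names = ("size_gb", "dimm_type", "speed_mhz", "technology")
    return [name for name in names if len({getattr(d, name) for d in dimms}) > 1]


# Hypothetical inventory: the DIMM in connector 3 differs in size and speed.
inventory = [
    Dimm(1, 4, "RDIMM", 1333, "DDR3"),
    Dimm(2, 4, "RDIMM", 1333, "DDR3"),
    Dimm(3, 8, "RDIMM", 1066, "DDR3"),
]
print(mismatched_attributes(inventory))
```

An empty result means the installed DIMMs agree on all four attributes.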
Memory DIMM disabled for All
DIMMs on Memory Subsystem All
DIMMs.
Info: DIMM disabled.
1. Make sure the DIMM is
installed correctly (see
“Installing a memory module”
on page 231).
2. If the DIMM was disabled
because of a memory fault
(memory uncorrectable error
or memory logging limit
reached), follow the
suggested actions for that
error event and restart the
server.
3. Check the IBM support
website for an applicable
retain tip or firmware update
that applies to this memory
event. If no memory fault is
recorded in the logs and no
DIMM connector error LED is
lit, you can re-enable the
DIMM through the Setup
utility or the Advanced
Settings Utility (ASU).
Memory uncorrectable error detected
for DIMM One of the DIMMs on
Memory Subsystem One of the
DIMMs.
Error: A memory uncorrectable
error has occurred.
1. Check the IBM support
website for an applicable
retain tip or firmware update
that applies to this memory
error.
2. Manually re-enable all
affected DIMMs if the server
firmware version is older than
UEFI v1.10. If the server
firmware version is UEFI
v1.10 or newer, disconnect
and reconnect the server to
the power source and restart
the server.
3. Swap the affected DIMMs (as
indicated by the error LEDs
on the system board or the
event logs) to a different
memory channel or
microprocessor (see
“Installing a memory module”
on page 231 for memory
population).
4. If the problem follows the
DIMM, replace the failing
DIMM (see “Removing a
memory module” on page 230
and “Installing a memory
module” on page 231).
5. (Trained service technician
only) If the problem occurs on
the same DIMM connector,
check the DIMM connector. If
the connector contains any
foreign material or is
damaged, replace the system
board (see “Removing the
system board” on page 296
and “Installing the system
board” on page 298).
6. (Trained service technician
only) Remove the affected
microprocessor and check the
microprocessor socket pins
for any damaged pins. If
damage is found, replace the
system board (see “Removing
the system board” on page
296 and “Installing the system
board” on page 298).
7. (Trained service technician
only) Replace the affected
microprocessor (see
“Removing a microprocessor
and heat sink” on page 284
and “Installing a
microprocessor and heat sink”
on page 286).
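The swap test in steps 3 through 5 distinguishes a failing DIMM from a failing connector by where the error is reported after the DIMM is moved. A minimal sketch of that reasoning follows; the function name, the connector numbers, and the "inconclusive" outcome are assumptions added for illustration.

```python
from typing import Optional


def interpret_dimm_swap(
    original_connector: int,
    new_connector: int,
    error_connector: Optional[int],
) -> str:
    """Interpret where the error is reported after the suspect DIMM has been moved."""
    if error_connector == new_connector:
        # The error moved with the DIMM (step 4).
        return "problem follows the DIMM: replace the failing DIMM"
    if error_connector == original_connector:
        # The error stayed on the connector (step 5, trained service technician only).
        return "problem stays on the connector: check the DIMM connector"
    return "inconclusive: continue with the remaining steps"


# Hypothetical case: the DIMM was moved from connector 3 to connector 9.
print(interpret_dimm_swap(3, 9, 9))
print(interpret_dimm_swap(3, 9, 3))
```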
Memory Logging Limit Reached for
DIMM One of the DIMMs on Memory
Subsystem One of the DIMMs.
Error: The memory logging limit
has been reached.
1. Check the IBM support
website for an applicable
retain tip or firmware update
that applies to this memory
error.
2. Swap the affected DIMMs (as
indicated by the error LEDs
on the system board or the
event logs) to a different
memory channel or
microprocessor (see
“Installing a memory module”
on page 231 for memory
population).
3. If the error still occurs on the
same DIMM, replace the
affected DIMM.
4. (Trained service technician
only) If the problem occurs on
the same DIMM connector,
check the DIMM connector. If
the connector contains any
foreign material or is
damaged, replace the system
board (see “Removing the
system board” on page 296
and “Installing the system
board” on page 298).
5. (Trained service technician
only) Remove the affected
microprocessor and check the
microprocessor socket pins
for any damaged pins. If
damage is found, replace the
system board (see “Removing
the system board” on page
296 and “Installing the system
board” on page 298).
6. (Trained service technician
only) Replace the affected
microprocessor (see
“Removing a microprocessor
and heat sink” on page 284
and “Installing a
microprocessor and heat sink”
on page 286).
Memory DIMM Configuration Error
for One of the DIMMs on Memory
Subsystem One of the DIMMs.
Error: A DIMM configuration error
has occurred.
Make sure that DIMMs are
installed in the correct sequence
and have the same size, type,
speed, and technology.
Memory DIMM disabled for One of
the DIMMs on Memory Subsystem
One of the DIMMs.
Info: DIMM disabled.
1. Make sure the DIMM is
installed correctly (see
“Installing a memory module”
on page 231).
2. If the DIMM was disabled
because of a memory fault
(memory uncorrectable error
or memory logging limit
reached), follow the
suggested actions for that
error event and restart the
server.
3. Check the IBM support
website for an applicable
retain tip or firmware update
that applies to this memory
event. If no memory fault is
recorded in the logs and no
DIMM connector error LED is
lit, you can re-enable the
DIMM through the Setup
utility or the Advanced
Settings Utility (ASU).
Memory DIMM scrub failure for
DIMM n Status on Memory
Subsystem DIMM n Status.
(n = DIMM number)
Error: DIMM scrub failure.
1. Check the IBM support
website for an applicable
retain tip or firmware update
that applies to this memory
error.
2. Manually re-enable all
affected DIMMs if the server
firmware version is older than
UEFI v1.10. If the server
firmware version is UEFI
v1.10 or newer, disconnect
and reconnect the server to
the power source and restart
the server.
3. Swap the affected DIMMs (as
indicated by the error LEDs
on the system board or the
event logs) to a different
memory channel or
microprocessor (see
“Installing a memory module”
on page 231 for memory
population).
4. If the problem is related to a
DIMM, replace the failing
DIMM (see “Removing a
memory module” on page 230
and “Installing a memory
module” on page 231).
5. (Trained service technician
only) If the problem occurs on
the same DIMM connector,
check the DIMM connector. If
the connector contains any
foreign material or is
damaged, replace the system
board (see “Removing the
system board” on page 296
and “Installing the system
board” on page 298).
6. (Trained service technician
only) Remove the affected
microprocessor and check the
microprocessor socket pins
for any damaged pins. If
damage is found, replace the
system board (see “Removing
the system board” on page
296 and “Installing the system
board” on page 298).
7. (Trained service technician
only) If the problem is related
to microprocessor socket
pins, replace the system
board (see “Removing the
system board” on page 296
and “Installing the system
board” on page 298).
8. (Trained service technician
only) Replace the affected
microprocessor (see
“Removing a microprocessor
and heat sink” on page 284
and “Installing a
microprocessor and heat sink”
on page 286).
Memory uncorrectable error detected
for DIMM n Status on Memory
Subsystem DIMM n Status.
(n = DIMM number)
Error: A memory uncorrectable
error has occurred.
1. Check the IBM support
website for an applicable
retain tip or firmware update
that applies to this memory
error.
2. Manually re-enable all
affected DIMMs if the server
firmware version is older than
UEFI v1.10. If the server
firmware version is UEFI
v1.10 or newer, disconnect
and reconnect the server to
the power source and restart
the server.
3. Swap the affected DIMMs (as
indicated by the error LEDs
on the system board or the
event logs) to a different
memory channel or
microprocessor (see
“Installing a memory module”
on page 231 for memory
population).
4. If the problem follows the
DIMM, replace the failing
DIMM (see “Removing a
memory module” on page 230
and “Installing a memory
module” on page 231).
5. (Trained service technician
only) If the problem occurs on
the same DIMM connector,
check the DIMM connector. If
the connector contains any
foreign material or is
damaged, replace the system
board (see “Removing the
system board” on page 296
and “Installing the system
board” on page 298).
6. (Trained service technician
only) Remove the affected
microprocessor and check the
microprocessor socket pins
for any damaged pins. If
damage is found, replace the
system board (see “Removing
the system board” on page
296 and “Installing the system
board” on page 298).
7. (Trained service technician
only) Replace the affected
microprocessor (see
“Removing a microprocessor
and heat sink” on page 284
and “Installing a
microprocessor and heat sink”
on page 286).
Memory Logging Limit Reached for
DIMM n Status on Memory
Subsystem DIMM n Status.
(n = DIMM number)
Error: The memory logging limit
has been reached.
1. Check the IBM support
website for an applicable
retain tip or firmware update
that applies to this memory
error.
2. Swap the affected DIMMs (as
indicated by the error LEDs
on the system board or the
event logs) to a different
memory channel or
microprocessor (see
“Installing a memory module”
on page 231 for memory
population).
3. If the error still occurs on the
same DIMM, replace the
affected DIMM.
4. (Trained service technician
only) If the problem occurs on
the same DIMM connector,
check the DIMM connector. If
the connector contains any
foreign material or is
damaged, replace the system
board (see “Removing the
system board” on page 296
and “Installing the system
board” on page 298).
5. (Trained service technician
only) Remove the affected
microprocessor and check the
microprocessor socket pins
for any damaged pins. If
damage is found, replace the
system board (see “Removing
the system board” on page
296 and “Installing the system
board” on page 298).
6. (Trained service technician
only) Replace the affected
microprocessor (see
“Removing a microprocessor
and heat sink” on page 284
and “Installing a
microprocessor and heat sink”
on page 286).
Memory DIMM Configuration Error
for DIMM n Status on Memory
Subsystem DIMM n Status.
(n = DIMM number)
Error: A DIMM configuration error
has occurred.
Make sure that DIMMs are
installed in the correct sequence
and have the same size, type,
speed, and technology.
Memory DIMM disabled for DIMM n
Status on Memory Subsystem DIMM
n Status.
(n = DIMM number)
Info: DIMM disabled.
1. Make sure the DIMM is
installed correctly (see
“Installing a memory module”
on page 231).
2. If the DIMM was disabled
because of a memory fault
(memory uncorrectable error
or memory logging limit
reached), follow the
suggested actions for that
error event and restart the
server.
3. Check the IBM support
website for an applicable
retain tip or firmware update
that applies to this memory
event. If no memory fault is
recorded in the logs and no
DIMM connector error LED is
lit, you can re-enable the
DIMM through the Setup
utility or the Advanced
Settings Utility (ASU).
Sensor DIMM n Temp has
transitioned to critical from a less
severe state.
(n = DIMM number)
Error: A sensor has changed to
Critical state from a less
severe state.
1. Make sure that the fans are
operating, that there are no
obstructions to the airflow,
that the air baffles are in
place and correctly installed,
and that the server cover is
installed and completely
closed.
2. If a fan has failed, complete
the action for a fan failure.
3. Replace DIMM n.
(n = DIMM number)
A PCI PERR has occurred on system
%1.
(%1 = CIM_ComputerSystem.ElementName)
Error: A PCI PERR has occurred.
(Sensor = PCI Slot n; n =
PCI slot number)
1. Check the extender-card
LEDs.
2. Reseat the affected adapters
and extender card.
3. Update the server and
adapter firmware (UEFI and
IMM).
Important: Some cluster
solutions require specific code
levels or coordinated code
updates. If the device is part
of a cluster solution, verify
that the latest level of code is
supported for the cluster
solution before you update
the code.
4. Remove the adapter from slot
n.
5. Replace the PCIe adapter.
6. Replace extender card n.
(n = PCI slot number)
A PCI SERR has occurred on system
%1.
(%1 = CIM_ComputerSystem.ElementName)
Error: A PCI SERR has occurred.
(Sensor = PCI Slot n; n =
PCI slot number)
1. Check the extender-card
LEDs.
2. Reseat the affected adapters
and extender card.
3. Update the server and
adapter firmware (UEFI and
IMM).
Important: Some cluster
solutions require specific code
levels or coordinated code
updates. If the device is part
of a cluster solution, verify
that the latest level of code is
supported for the cluster
solution before you update
the code.
4. Remove the adapter from slot
n.
5. Replace the PCIe adapter.
6. Replace extender card n.
(n = PCI slot number)
A PCI PERR has occurred on system
%1.
(%1 = CIM_ComputerSystem.ElementName)
Error: A PCI PERR has occurred.
(Sensor = One of PCI Err)
1. Check the extender-card
LEDs.
2. Reseat the affected adapters
and riser card.
3. Update the server and
adapter firmware (UEFI and
IMM).
Important: Some cluster
solutions require specific code
levels or coordinated code
updates. If the device is part
of a cluster solution, verify
that the latest level of code is
supported for the cluster
solution before you update
the code.
4. Remove both adapters.
5. Replace the PCIe adapter.
6. Replace the extender card.
7. (Trained service technician
only) Replace the system
board.
A PCI SERR has occurred on system
%1.
(%1 = CIM_ComputerSystem.ElementName)
Error: A PCI SERR has occurred.
(Sensor = One of PCI Err)
1. Check the extender-card
LEDs.
2. Reseat the affected adapters
and extender card.
3. Update the server and
adapter firmware (UEFI and
IMM).
Important: Some cluster
solutions require specific code
levels or coordinated code
updates. If the device is part
of a cluster solution, verify
that the latest level of code is
supported for the cluster
solution before you update
the code.
4. Remove both adapters.
5. Replace the PCIe adapter.
6. Replace the extender card.
7. (Trained service technician
only) Replace the system
board.
Message: Fault in slot System board on system %1. (%1 = CIM_ComputerSystem.ElementName)
Severity: Error
Action:
1. Check the extender-card LEDs.
2. Reseat the affected adapters and extender card.
3. Update the server and adapter firmware (UEFI and IMM).
Important: Some cluster solutions require specific code levels or coordinated code updates. If the
device is part of a cluster solution, verify that the latest level of code is supported for the
cluster solution before you update the code.
4. Remove both adapters.
5. Replace the PCIe adapter.
6. Replace the extender card.
7. (Trained service technician only) Replace the system board.
Message: Redundancy Bckup Mem Status has been reduced.
Severity: Error
Description: Redundancy has been lost and is insufficient to continue operation.
Action:
1. Check the system-event log for DIMM failure events (uncorrectable or PFA) and correct the failures.
2. Re-enable mirroring in the Setup utility.
Message: IMM Network Initialization Complete.
Severity: Info
Description: An IMM network has completed initialization.
Action: No action; information only.
Message: Certificate Authority %1 has detected a %2 Certificate Error.
(%1 = IBM_CertificateAuthority.CADistinguishedName; %2 = CIM_PublicKeyCertificate.ElementName)
Severity: Error
Description: A problem has occurred with the SSL Server, SSL Client, or SSL Trusted CA certificate
that has been imported into the IMM. The imported certificate must contain a public key that
corresponds to the key pair that was previously generated by the Generate a New Key and
Certificate Signing Request link.
Action:
1. Make sure that the certificate that you are importing is correct.
2. Try importing the certificate again.
Message: Ethernet Data Rate modified from %1 to %2 by user %3.
(%1 = CIM_EthernetPort.Speed; %2 = CIM_EthernetPort.Speed; %3 = user ID)
Severity: Info
Description: A user has modified the Ethernet port data rate.
Action: No action; information only.
Message: Ethernet Duplex setting modified from %1 to %2 by user %3.
(%1 = CIM_EthernetPort.FullDuplex; %2 = CIM_EthernetPort.FullDuplex; %3 = user ID)
Severity: Info
Description: A user has modified the Ethernet port duplex setting.
Action: No action; information only.
Ethernet MTU setting modified from
%1 to %2 by user %3.
Message: IP address of default gateway modified from %1 to %2 by user %3s.
(%1 = CIM_IPProtocolEndpoint.GatewayIPv4Address;
%2 = CIM_StaticIPAssignmentSettingData.DefaultGatewayAddress; %3 = user ID)
Severity: Info
Description: A user has modified the default gateway IP address of the IMM.
Action: No action; information only.
Message: OS Watchdog response %1 by %2. (%1 = Enabled or Disabled; %2 = user ID)
Severity: Info
Description: A user has enabled or disabled an OS Watchdog.
Action: No action; information only.
Message: DHCP[%1] failure, no IP address assigned. (%1 = IP address, xxx.xxx.xxx.xxx)
Severity: Info
Description: A DHCP server has failed to assign an IP address to the IMM.
Action:
1. Make sure that the network cable is connected.
2. Make sure that there is a DHCP server on the network that can assign an IP address to the IMM.
Message: Remote Login Successful. Login ID: %1 from %2 at IP address %3.
(%1 = user ID; %2 = ValueMap(CIM_ProtocolEndpoint.ProtocolIFType); %3 = IP address, xxx.xxx.xxx.xxx)
Severity: Info
Description: A user has successfully logged in to the IMM.
Action: No action; information only.
Message: Attempting to %1 server %2 by user %3.
(%1 = Power Up, Power Down, Power Cycle, or Reset; %2 = IBM_ComputerSystem.ElementName; %3 = user ID)
Severity: Info
Description: A user has used the IMM to perform a power function on the server.
Action: No action; information only.
Message: Security: Userid: '%1' had %2 login failures from WEB client at IP address %3.
(%1 = user ID; %2 = MaximumSuccessiveLoginFailures (currently set to 5 in the firmware);
%3 = IP address, xxx.xxx.xxx.xxx)
Severity: Error
Description: A user has exceeded the maximum number of unsuccessful login attempts from a Web
browser and has been prevented from logging in for the lockout period.
Action:
1. Make sure that the correct login ID and password are being used.
2. Have the system administrator reset the login ID or password.
Message: Security: Login ID: '%1' had %2 login failures from CLI at %3.
(%1 = user ID; %2 = MaximumSuccessiveLoginFailures (currently set to 5 in the firmware);
%3 = IP address, xxx.xxx.xxx.xxx)
Severity: Error
Description: A user has exceeded the maximum number of unsuccessful login attempts from the
command-line interface and has been prevented from logging in for the lockout period.
Action:
1. Make sure that the correct login ID and password are being used.
2. Have the system administrator reset the login ID or password.
Message: Remote access attempt failed. Invalid userid or password received. Userid is '%1' from
WEB browser at IP address %2. (%1 = user ID; %2 = IP address, xxx.xxx.xxx.xxx)
Severity: Error
Description: A user has attempted to log in from a Web browser by using an invalid login ID or
password.
Action:
1. Make sure that the correct login ID and password are being used.
2. Have the system administrator reset the login ID or password.
Message: Remote access attempt failed. Invalid userid or password received. Userid is '%1' from
TELNET client at IP address %2. (%1 = user ID; %2 = IP address, xxx.xxx.xxx.xxx)
Severity: Error
Description: A user has attempted to log in from a Telnet session by using an invalid login ID or
password.
Action:
1. Make sure that the correct login ID and password are being used.
2. Have the system administrator reset the login ID or password.
Message: The Chassis Event Log (CEL) on system %1 cleared by user %2.
(%1 = CIM_ComputerSystem.ElementName; %2 = user ID)
Severity: Info
Description: A user has cleared the IMM event log.
Action: No action; information only.
IMM reset was initiated by user %1.
(%1 = user ID)
Message: DHCP setting changed to by user %1. (%1 = user ID)
Severity: Info
Description: A user has changed the DHCP mode.
Action: No action; information only.
Message: IMM: Configuration %1 restored from a configuration file by user %2.
(%1 = CIM_ConfigurationData.ConfigurationName; %2 = user ID)
Severity: Info
Description: A user has restored the IMM configuration by importing a configuration file.
Action: No action; information only.
Message: Watchdog %1 Screen Capture Occurred. (%1 = OS Watchdog or Loader Watchdog)
Severity: Error
Description: An operating-system error has occurred, and the screen capture was successful.
Action:
1. Reconfigure the watchdog timer to a higher value.
2. Make sure that the IMM Ethernet over USB interface is enabled.
3. Reinstall the RNDIS or cdc_ether device driver for the operating system.
4. Disable the watchdog.
5. Check the integrity of the installed operating system.
Message: Watchdog %1 Failed to Capture Screen. (%1 = OS Watchdog or Loader Watchdog)
Severity: Error
Description: An operating-system error has occurred, and the screen capture failed.
Action:
1. Reconfigure the watchdog timer to a higher value.
2. Make sure that the IMM Ethernet over USB interface is enabled.
3. Reinstall the RNDIS or cdc_ether device driver for the operating system.
4. Disable the watchdog.
5. Check the integrity of the installed operating system.
6. Update the IMM firmware.
Important: Some cluster solutions require specific code levels or coordinated code updates. If the
device is part of a cluster solution, verify that the latest level of code is supported for the
cluster solution before you update the code.
Message: Running the backup IMM main application.
Severity: Error
Description: The IMM has resorted to running the backup main application.
Action: Update the IMM firmware.
Important: Some cluster solutions require specific code levels or coordinated code updates. If the
device is part of a cluster solution, verify that the latest level of code is supported for the
cluster solution before you update the code.
Message: Please ensure that the IMM is flashed with the correct firmware. The IMM is unable to
match its firmware to the server.
Severity: Error
Description: The server does not support the installed IMM firmware version.
Action: Update the IMM firmware to a version that the server supports.
Important: Some cluster solutions require specific code levels or coordinated code updates. If the
device is part of a cluster solution, verify that the latest level of code is supported for the
cluster solution before you update the code.
Message: IMM reset was caused by restoring default values.
Severity: Info
Description: The IMM has been reset because a user has restored the configuration to its default
settings.
Action: No action; information only.
Message: IMM clock has been set from NTP server %1. (%1 = IBM_NTPService.ElementName)
Severity: Info
Description: The IMM clock has been set to the date and time that is provided by the Network Time
Protocol server.
Action: No action; information only.
Message: SSL data in the IMM configuration data is invalid. Clearing configuration data region and
disabling SSL+H25.
Severity: Error
Description: There is a problem with the certificate that has been imported into the IMM. The
imported certificate must contain a public key that corresponds to the key pair that was
previously generated through the Generate a New Key and Certificate Signing Request link.
Action:
1. Make sure that the certificate that you are importing is correct.
2. Try to import the certificate again.
Message: Flash of %1 from %2 succeeded for user %3.
(%1 = CIM_ManagedElement.ElementName; %2 = Web or LegacyCLI; %3 = user ID)
Severity: Info
Description: A user has successfully updated one of the following firmware components:
v IMM main application
v IMM boot ROM
v Server firmware
v Diagnostics
v Integrated service processor
Action: No action; information only.
Message: Flash of %1 from %2 failed for user %3.
(%1 = CIM_ManagedElement.ElementName; %2 = Web or LegacyCLI; %3 = user ID)
Severity: Info
Description: An attempt to update a firmware component from the interface and IP address has
failed.
Action: Try to update the firmware again.
Message: The Chassis Event Log (CEL) on system %1 is 75% full. (%1 = CIM_ComputerSystem.ElementName)
Severity: Info
Description: The IMM event log is 75% full. When the log is full, older log entries are replaced
by newer ones.
Action: To avoid losing older log entries, save the log as a text file and clear the log.
Message: The Chassis Event Log (CEL) on system %1 is 100% full. (%1 = CIM_ComputerSystem.ElementName)
Severity: Info
Description: The IMM event log is full. When the log is full, older log entries are replaced by
newer ones.
Action: To avoid losing older log entries, save the log as a text file and clear the log.
Message: %1 Platform Watchdog Timer expired for %2.
(%1 = OS Watchdog or Loader Watchdog; %2 = OS Watchdog or Loader Watchdog)
Severity: Error
Description: A Platform Watchdog Timer Expired event has occurred.
Action:
1. Reconfigure the watchdog timer to a higher value.
2. Make sure that the IMM Ethernet over USB interface is enabled.
3. Reinstall the RNDIS or cdc_ether device driver for the operating system.
4. Disable the watchdog.
5. Check the integrity of the installed operating system.
Message: IMM Test Alert Generated by %1. (%1 = user ID)
Severity: Info
Description: A user has generated a test alert from the IMM.
Action: No action; information only.
Message: Security: Userid: '%1' had %2 login failures from an SSH client at IP address %3.
(%1 = user ID; %2 = MaximumSuccessiveLoginFailures (currently set to 5 in the firmware);
%3 = IP address, xxx.xxx.xxx.xxx)
Severity: Error
Description: A user has exceeded the maximum number of unsuccessful login attempts from SSH and
has been prevented from logging in for the lockout period.
Action:
1. Make sure that the correct login ID and password are being used.
2. Have the system administrator reset the login ID or password.
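The messages in the preceding table use numbered placeholders (%1, %2, %3) that stand for values such as a user ID or an IP address. As an illustration only, the following Python sketch shows one way such a template could be expanded when you are reading or post-processing exported log text. The function name render_event and the sample values are assumptions made for this example; they are not part of the IMM firmware or of any IBM tool.

```python
import re


def render_event(template, values):
    """Replace %1, %2, ... in an event-message template with values[0], values[1], ..."""
    def substitute(match):
        index = int(match.group(1)) - 1
        # Leave the placeholder untouched if no value was supplied for it.
        return str(values[index]) if 0 <= index < len(values) else match.group(0)

    # Only a percent sign followed by digits is treated as a placeholder,
    # so literal text such as "75% full" is left alone.
    return re.sub(r"%(\d+)", substitute, template)


# Hypothetical sample values, for illustration only.
print(render_event(
    "Flash of %1 from %2 failed for user %3.",
    ["IMM main application", "Web", "USERID"],
))
```

Placeholders without a supplied value are left in place, which makes an incomplete substitution easy to spot.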
Checkout procedure
The checkout procedure is the sequence of tasks that you should follow to
diagnose a problem in the server.
About the checkout procedure
Before you perform the checkout procedure for diagnosing hardware problems,
review the following information:
v Read the safety information that begins on page vii.
v The diagnostic programs provide the primary methods of testing the major
components of the server, such as the system board, Ethernet controller,
keyboard, mouse (pointing device), serial ports, and hard disk drives. You can
also use them to test some external devices. If you are not sure whether a
problem is caused by the hardware or by the software, you can use the
diagnostic programs to confirm that the hardware is working correctly.
v When you run the diagnostic programs, a single problem might cause more than
one error message. When this happens, correct the cause of the first error
message. The other error messages usually will not occur the next time you run
the diagnostic programs.
Exception: If multiple error codes or light path diagnostics LEDs indicate a
microprocessor error, the error might be in a microprocessor or in a
microprocessor socket. See “Microprocessor problems” on page 82 for
information about diagnosing microprocessor problems.
v Before you run the diagnostic programs, you must determine whether the failing
server is part of a shared hard disk drive cluster (two or more servers sharing
external storage devices). If it is part of a cluster, you can run all diagnostic
programs except the ones that test the storage unit (that is, a hard disk drive in
the storage unit) or the storage adapter that is attached to the storage unit. The
failing server might be part of a cluster if any of the following conditions is true:
– You have identified the failing server as part of a cluster (two or more servers
sharing external storage devices).
– One or more external storage units are attached to the failing server and at
least one of the attached storage units is also attached to another server or
unidentifiable device.
– One or more servers are located near the failing server.
Important: If the server is part of a shared hard disk drive cluster, run one test
at a time. Do not run any suite of tests, such as “quick” or “normal” tests,
because this might enable the hard disk drive diagnostic tests.
v If the server is halted and a POST error code is displayed, see “POST error
codes” on page 26. If the server is halted and no error message is displayed,
see “Troubleshooting tables” on page 75 and “Solving undetermined problems”
on page 138.
v For information about power-supply problems, see “Solving power problems” on
page 137 and “Power-supply LEDs” on page 96.
v For intermittent problems, check the system-event log; see “Event logs” on page
23, “System-event log” on page 37, and “Diagnostic programs, messages, and
error codes” on page 97.
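The cluster conditions listed above amount to an any-of check, and the restrictions that follow from it (skip tests of the shared storage unit or its adapter, and run one test at a time) can be written down as a simple planning rule. The following Python sketch is one possible way to express this. The ServerContext fields, the function names, and the test names are hypothetical and chosen only for this example; they do not correspond to options in the diagnostic programs.

```python
from dataclasses import dataclass


@dataclass
class ServerContext:
    # Hypothetical field names that mirror the three conditions in the list above.
    identified_as_cluster_member: bool = False
    shares_external_storage: bool = False  # a storage unit is also attached to another server or unidentifiable device
    other_servers_nearby: bool = False


def might_be_in_cluster(ctx):
    """True if any one of the listed conditions holds."""
    return any([
        ctx.identified_as_cluster_member,
        ctx.shares_external_storage,
        ctx.other_servers_nearby,
    ])


def plan_diagnostics(ctx, tests):
    """tests is a list of (name, touches_shared_storage) tuples.

    Returns a list of runs; each run is a list of test names.
    """
    if not might_be_in_cluster(ctx):
        # Not a cluster member: the tests may be run together as a suite.
        return [[name for name, _ in tests]]
    # Possible cluster member: skip storage-unit and storage-adapter tests,
    # and run the remaining tests one at a time.
    return [[name] for name, touches_storage in tests if not touches_storage]


tests = [("memory", False), ("hard disk drive", True), ("Ethernet controller", False)]
print(plan_diagnostics(ServerContext(shares_external_storage=True), tests))
```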
Performing the checkout procedure
To perform the checkout procedure, complete the following steps:
1. Is the server part of a cluster?
v No: Go to step 2.
v Yes: Shut down all failing servers that are related to the cluster. Go to step 2.
2. Complete the following steps:
a. Turn off the server and all external devices.
b. Check all cables and power cords.
c. Check all internal and external devices for compatibility at
http://www.ibm.com/servers/eserver/serverproven/compat/us/.
d. Set all display controls to the middle positions.
e. Turn on all external devices.
f. Turn on the server. If the server does not start, see “Troubleshooting tables”
on page 75.
g. Check the system-error LED on the operator information panel (see “Server
controls, LEDs, and connectors” on page 9). If it is flashing, check the light
path diagnostics LEDs (see “Light path diagnostics” on page 90).
h. Check for the following results:
v Successful completion of POST
v Successful completion of startup, indicated by a readable display of the
operating-system desktop
3. Are there readable instructions on the main menu?
v No: Find the failure symptom in “Troubleshooting tables” on page 75; if
necessary, see “Solving undetermined problems” on page 138.
v Yes: Run the diagnostic programs (see “Running the diagnostic programs” on
page 97).
– If you receive an error, see “Diagnostic messages” on page 98.
– If the diagnostic programs were completed successfully and you still
suspect a problem, see “Solving undetermined problems” on page 138.
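The branching in steps 2 and 3 can be summarized as a small decision function. The following Python sketch is only a reading aid. The function name and its parameters are assumptions made for this example, and the returned strings are the section titles that the procedure refers to.

```python
def next_reference(server_starts, menu_readable, diagnostics_error=False, still_suspect=False):
    """Return the section to consult next, or None if no further action is indicated."""
    if not server_starts:
        # Step 2f: the server does not start.
        return "Troubleshooting tables"
    if not menu_readable:
        # Step 3, "No" branch; "Solving undetermined problems" is the fallback if necessary.
        return "Troubleshooting tables"
    # Step 3, "Yes" branch: the diagnostic programs have been run.
    if diagnostics_error:
        return "Diagnostic messages"
    if still_suspect:
        return "Solving undetermined problems"
    return None


print(next_reference(server_starts=True, menu_readable=True, diagnostics_error=True))
```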
Troubleshooting tables
Use the troubleshooting tables to find solutions to problems that have identifiable
symptoms.
If you cannot find a problem in these tables, see “Running the diagnostic programs”
on page 97 for information about testing the server.
If you have just added new software or a new optional device and the server is not
working, complete the following steps before you use the troubleshooting tables:
1. Check the operator information panel and the light path diagnostics LEDs (see
“Light path diagnostics” on page 90).
2. Remove the software or device that you just added.
3. Run the diagnostic tests to determine whether the server is running correctly.
4. Reinstall the new software or new device.
DVD drive problems
v Follow the suggested actions in the order in which they are listed in the Action column until the problem
is solved.
v See Chapter 4, “Parts listing, System x3400 M3 Types 7378 and 7379,” on page 141 to determine which
components are customer replaceable units (CRU) and which components are field replaceable units
(FRU).
v If an action step is preceded by “(Trained service technician only),” that step must be performed only by a
trained service technician.
Symptom: The DVD drive is not recognized.
Action:
1. Make sure that:
v The SATA channel to which the DVD drive is attached (primary or secondary) is enabled in the
Setup utility.
v All cables and jumpers are installed correctly.
v The signal cable and connector are not damaged and the connector pins are not bent.
v The correct device driver is installed for the DVD drive.
2. Run the DVD drive diagnostic programs.
3. Reseat the following components:
a. DVD drive
b. DVD drive cables
4. Replace the following components one at a time, in the order shown, restarting the server each
time:
a. DVD drive
b. DVD drive and cables
c. (Trained service technician only) System board
Symptom: A DVD is not working correctly.
Action:
1. Clean the DVD.
2. Run the DVD drive diagnostic programs.
3. Reseat the DVD drive.
4. Replace the DVD drive.
Symptom: The DVD drive tray is not working.
Action:
1. Make sure that the server is turned on.
2. Insert the end of a straightened paper clip into the manual tray-release opening.
3. Reseat the DVD drive.
4. Replace the DVD drive.
General problems
v Follow the suggested actions in the order in which they are listed in the Action column until the problem
is solved.
v See Chapter 4, “Parts listing, System x3400 M3 Types 7378 and 7379,” on page 141 to determine which
components are customer replaceable units (CRU) and which components are field replaceable units
(FRU).
v If an action step is preceded by “(Trained service technician only),” that step must be performed only by a
trained service technician.
Symptom: A cover lock is broken, an LED is not working, or a similar problem has occurred.
Action: If the part is a CRU, replace it. If the part is a FRU, the part must be replaced by a
trained service technician.
Symptom: The server is hung while the screen is on. Cannot start the Setup utility by pressing F1.
Action:
1. See “Nx boot failure” on page 136 for more information.
2. See “Recovering the server firmware” on page 134 for more information.
Hard disk drive problems
v Follow the suggested actions in the order in which they are listed in the Action column until the problem
is solved.
v See Chapter 4, “Parts listing, System x3400 M3 Types 7378 and 7379,” on page 141 to determine which
components are customer replaceable units (CRU) and which components are field replaceable units
(FRU).
v If an action step is preceded by “(Trained service technician only),” that step must be performed only by a
trained service technician.
Symptom: Not all drives are recognized by the hard disk drive diagnostic tests.
Action: Remove the drive that is indicated by the diagnostic tests; then, run the hard disk
drive diagnostic tests again. If the remaining drives are recognized, replace the
drive that you removed with a new one.
Symptom: The server stops responding during the hard disk drive diagnostic test.
Action: Remove the hard disk drive that was being tested when the server stopped
responding, and run the diagnostic test again. If the hard disk drive diagnostic test
runs successfully, replace the drive that you removed with a new one.
Symptom: A hard disk drive was not detected while the operating system was being started.
Action: Reseat all hard disk drives and cables; then, run the hard disk drive diagnostic
tests again.
Symptom: A hard disk drive passes the diagnostic Fixed Disk Test, but the problem remains.
Action: Run the diagnostic SCSI Fixed Disk Test (see “Running the diagnostic programs”
on page 97).
Note: This test is not available on servers that have RAID arrays or servers that
have SATA hard disk drives.
Hypervisor problems
v Follow the suggested actions in the order in which they are listed in the Action
column until the problem is solved.
v See Chapter 4, “Parts listing, System x3400 M3 Types 7378 and 7379,” on page
141 to determine which components are customer replaceable units (CRU) and
which components are field replaceable units (FRU).
v If an action step is preceded by “(Trained service technician only),” that step must
be performed only by a trained service technician.
v Go to the IBM support Web site at http://www.ibm.com/systems/support/ to check
for technical information, hints, tips, and new device drivers or to submit a request
for information.
Symptom: An optional embedded hypervisor flash device is not listed in the expected boot order,
does not appear in the list of boot devices, or a similar problem has occurred.
Action:
1. Make sure that the optional embedded hypervisor flash device is selected on the boot manager
(<F12> Select Boot Device) at startup.
2. Make sure that the embedded hypervisor flash device is seated in the connector correctly (see
“Removing a USB embedded hypervisor flash device” on page 234 and “Installing a USB embedded
hypervisor flash device” on page 236).
3. See the documentation that comes with the optional embedded hypervisor flash device for setup
and configuration information.
4. Make sure that other software works on the server.
Intermittent problems
v Follow the suggested actions in the order in which they are listed in the Action column until the problem
is solved.
v See Chapter 4, “Parts listing, System x3400 M3 Types 7378 and 7379,” on page 141 to determine which
components are customer replaceable units (CRU) and which components are field replaceable units
(FRU).
v If an action step is preceded by “(Trained service technician only),” that step must be performed only by a
trained service technician.
Symptom: A problem occurs only occasionally and is difficult to diagnose.
Action:
1. Make sure that:
v All cables and cords are connected securely to the rear of the server and
attached devices.
v When the server is turned on, air is flowing from the fan grille. If there is no
airflow, the fan is not working. This can cause the server to overheat and
shut down.
2. Check the system-event log or IMM log (see “Event logs” on page 23).
3. See “Solving undetermined problems” on page 138.
Keyboard, mouse, or pointing-device problems
v Follow the suggested actions in the order in which they are listed in the Action column until the problem
is solved.
v See Chapter 4, “Parts listing, System x3400 M3 Types 7378 and 7379,” on page 141 to determine which
components are customer replaceable units (CRU) and which components are field replaceable units
(FRU).
v If an action step is preceded by “(Trained service technician only),” that step must be performed only by a
trained service technician.
Symptom: All or some keys on the keyboard do not work.
Action:
1. Make sure that:
v The keyboard cable is securely connected.
v The server and the monitor are turned on.
2. See http://www.ibm.com/servers/eserver/serverproven/compat/us/ for keyboard
compatibility.
3. If you are using a USB keyboard, run the Setup utility and enable keyboardless
operation to prevent the 301 POST error message from being displayed during
startup.
4. If you are using a USB keyboard and it is connected to a USB hub, disconnect
the keyboard from the hub and connect it directly to the server.
5. Replace the following components one at a time, in the order shown, restarting
the server each time:
a. Keyboard
b. (Trained service technician only) System board
Symptom: The mouse or pointing device does not work.
Action:
1. Make sure that:
v The mouse or pointing device is compatible with the server. See
v The mouse or pointing-device cable is securely connected to the server.
v The mouse or pointing-device device drivers are installed correctly.
v The server and the monitor are turned on.
v The mouse is enabled in the Setup utility.
2. If you are using a USB mouse or pointing device and it is connected to a USB
hub, disconnect the mouse or pointing device from the hub and connect it
directly to the server.
3. Replace the following components one at a time, in the order shown, restarting
the server each time:
a. Mouse or pointing device
b. (Trained service technician only) System board
Memory problems
v Follow the suggested actions in the order in which they are listed in the Action column until the problem
is solved.
v See Chapter 4, “Parts listing, System x3400 M3 Types 7378 and 7379,” on page 141 to determine which
components are customer replaceable units (CRU) and which components are field replaceable units
(FRU).
v If an action step is preceded by “(Trained service technician only),” that step must be performed only by a
trained service technician.
SymptomAction
The amount of system memory
that is displayed is less than the
amount of installed physical
memory.
1. Make sure that:
v No error LEDs are lit on the operator information panel or on the DIMM.
v Memory mirroring does not account for the discrepancy.
v The memory modules are seated correctly.
v You have installed the correct type of memory.
v If you changed the memory, you updated the memory configuration in the
Setup utility.
v All banks of memory are enabled. The server might have automatically
disabled a memory bank when it detected a problem, or a memory bank
might have been manually disabled.
2. Check the POST error log:
v If a DIMM was disabled by a systems-management interrupt (SMI), replace
the DIMM.
v If a DIMM was disabled by the user or by POST, run the Setup utility and
enable the DIMM.
3. Run memory diagnostics (see “Running the diagnostic programs” on page 97).
4. Make sure that there is no memory mismatch when the server is at the
minimum memory configuration (one 1 GB DIMM); see the information about
the minimum required configuration on page “Solving undetermined problems”
on page 138).
5. Add one pair of DIMMs at a time, making sure that the DIMMs in each pair
match.
6. Reseat the DIMMs, and then restart the server.
7. Reverse the DIMMs between the channels (of the same microprocessor), and
then restart the server. If the problem is related to a DIMM, replace the failing
DIMM.
8. (Trained service technician only) Install the failing DIMM into a DIMM connector
for microprocessor 2 (if installed) to verify that the problem is not the
microprocessor or the DIMM connector.
9. (Trained service technician only) Replace the system board.
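The mirroring check in step 1 and the POST error log check in step 2 can be sketched as a small decision aid. This Python sketch is illustrative only and is not part of any IBM tool. The function names and the `disabled_by` values are hypothetical. It also assumes that memory mirroring leaves half of the installed memory usable; confirm that figure against the memory configuration for your server.

```python
def expected_displayed_gb(installed_gb, mirroring_enabled):
    """Return the amount of memory the server is expected to report.

    Assumption (not stated in this guide): mirroring keeps a full copy
    of the data, so only half of the installed memory is usable.
    """
    return installed_gb / 2 if mirroring_enabled else installed_gb


def action_for_disabled_dimm(disabled_by):
    """Map the cause recorded in the POST error log to the step 2 action.

    disabled_by is a hypothetical label: "smi", "user", or "post".
    """
    cause = disabled_by.strip().lower()
    if cause == "smi":
        return "replace the DIMM"
    if cause in ("user", "post"):
        return "run the Setup utility and enable the DIMM"
    # Any other cause is outside step 2; continue with the next step.
    return "continue with step 3 (run memory diagnostics)"
```

For example, under the stated assumption, a server with 16 GB installed and mirroring enabled would be expected to display 8 GB, so that difference alone would not indicate a fault.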
Symptom: Multiple rows of DIMMs in a branch are identified as failing.
Action:
1. Reseat the DIMMs; then, restart the server.
2. Remove the lowest-numbered DIMM pair of those that are identified and
replace it with an identical pair of known good DIMMs; then, restart the server.
Repeat as necessary. If the failures continue after all identified pairs are
replaced, go to step 4.
3. Return the removed DIMMs, one pair at a time, to their original connectors,
restarting the server after each pair, until a pair fails. Replace each DIMM in the
failed pair with an identical known good DIMM, restarting the server after each
DIMM. Replace the failed DIMM. Repeat step 3 until you have tested all
removed DIMMs.
4. Replace the lowest-numbered DIMM pair of those identified; then, restart the
server. Repeat as necessary.
5. Reverse the DIMMs between the channels (of the same microprocessor), and
then restart the server. If the problem is related to a DIMM, replace the failing
DIMM.
6. (Trained service technician only) Install the failing DIMM into a DIMM connector
for microprocessor 2 (if installed) to verify that the problem is not the
microprocessor or the DIMM connector.
7. (Trained service technician only) Replace the system board.
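The pair-by-pair isolation in step 3 can be modeled as a simple loop. This Python sketch only illustrates the order of the checks; it does not talk to the server. The two callbacks, `pair_fails` and `dimm_fails`, are hypothetical stand-ins for what the technician observes after each restart.

```python
def find_failed_dimms(removed_pairs, pair_fails, dimm_fails):
    """Return the DIMMs to replace, following the order of step 3.

    removed_pairs: list of (dimm_a, dimm_b) tuples, in the order in which
        the pairs are returned to their original connectors.
    pair_fails: hypothetical callback; True if the server reports a
        failure after the pair is reinstalled and the server is restarted.
    dimm_fails: hypothetical callback; True if the failure persists while
        this DIMM is installed next to a known good partner.
    """
    failed = []
    for pair in removed_pairs:
        if not pair_fails(pair):
            continue  # The pair tested good; leave it installed.
        # Test each DIMM of the failing pair on its own.
        for dimm in pair:
            if dimm_fails(dimm):
                failed.append(dimm)
    return failed
```

A usage example with made-up DIMM labels: if only "DIMM4" is bad, calling `find_failed_dimms([("DIMM1", "DIMM2"), ("DIMM3", "DIMM4")], ...)` with callbacks that reflect that fault returns `["DIMM4"]`.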
Microprocessor problems
Symptom: The server emits a continuous beep during POST, indicating that
the startup (boot) microprocessor is not working correctly.
Action:
1. Correct any errors that are indicated by the light path diagnostics LEDs (see
“Light path diagnostics” on page 90).
2. Make sure that the server supports all the microprocessors and that the
microprocessors match in speed and cache size.
3. (Trained service technician only) Reseat microprocessor 1.
4. (Trained service technician only) If there is no indication of which
microprocessor has failed, isolate the error by testing with one microprocessor
at a time.
5. Replace the following components one at a time, in the order shown, restarting
the server each time:
a. (Trained service technician only) Microprocessor 2
b. VRM
c. (Trained service technician only) System board
6. (Trained service technician only) If multiple error codes or light path diagnostics
LEDs indicate a microprocessor error, reverse the locations of two
microprocessors to determine whether the error is associated with a
microprocessor or with a microprocessor socket.
v If the error is associated with a microprocessor, replace the microprocessor.
v If the error is associated with a VRM, replace the VRM.
v If the error is associated with a microprocessor socket, replace the system
board.
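The reasoning behind the swap test in step 6 can be sketched as follows. This Python function is a hypothetical illustration, not an IBM utility. It assumes a two-socket server and that the technician notes which socket reports the error before and after the microprocessors trade places. It does not cover the VRM case, which must be judged from the error codes or LEDs themselves.

```python
def diagnose_after_swap(socket_before, socket_after):
    """Interpret the result of swapping the two microprocessors.

    socket_before / socket_after: the socket number (1 or 2) that reports
    the error before and after the swap.
    """
    if socket_before not in (1, 2) or socket_after not in (1, 2):
        raise ValueError("socket numbers must be 1 or 2")
    if socket_after != socket_before:
        # The error moved with the microprocessor to the other socket.
        return "replace the microprocessor"
    # The error stayed with the socket even though the part changed.
    return "replace the system board"
```

For example, if the error is reported on socket 1 before the swap and on socket 2 afterward, the error followed the microprocessor, so the microprocessor is the component to replace.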