IBM System x3550 Type 7978 and 1913
Problem Determination and Service Guide
Note: Before using this information and the product it supports, read the general information in Appendix B, “Notices,” on page 161
and the Warranty and Support Information document on the IBM System x Documentation CD.
Fifth Edition (November 2006)
© Copyright International Business Machines Corporation 2006. All rights reserved.
US Government Users Restricted Rights – Use, duplication or disclosure restricted by GSA ADP Schedule Contract
with IBM Corp.
Contents
Safety . . . . . . . . . . . . . . . . . . . . . . . . . . . . vii
Guidelines for trained service technicians . . . . . . . . . . . . . . . viii
Inspecting for unsafe conditions . . . . . . . . . . . . . . . . . viii
Guidelines for servicing electrical equipment . . . . . . . . . . . . . viii
Safety statements . . . . . . . . . . . . . . . . . . . . . . . .x
Chapter 1. Introduction . . . . . . . . . . . . . . . . . . . . . .1
Related documentation . . . . . . . . . . . . . . . . . . . . . .1
Notices and statements in this document . . . . . . . . . . . . . . . .2
Features and specifications . . . . . . . . . . . . . . . . . . . . .3
Server controls, LEDs, and connectors . . . . . . . . . . . . . . . .5
Front view . . . . . . . . . . . . . . . . . . . . . . . . . .5
Light path diagnostics panel . . . . . . . . . . . . . . . . . . .7
Rear view . . . . . . . . . . . . . . . . . . . . . . . . . .8
Internal LEDs, connectors, and jumpers . . . . . . . . . . . . . . . .9
System-board internal connectors . . . . . . . . . . . . . . . . .10
Power backplane card internal connectors . . . . . . . . . . . . . .10
System-board switches and jumpers . . . . . . . . . . . . . . . .11
System-board external connectors . . . . . . . . . . . . . . . . .14
System-board LEDs . . . . . . . . . . . . . . . . . . . . . .15
System-board option connectors . . . . . . . . . . . . . . . . .17
Chapter 2. Diagnostics . . . . . . . . . . . . . . . . . . . . .19
Diagnostic tools . . . . . . . . . . . . . . . . . . . . . . . .19
POST . . . . . . . . . . . . . . . . . . . . . . . . . . . .19
POST beep codes . . . . . . . . . . . . . . . . . . . . . .19
Error logs . . . . . . . . . . . . . . . . . . . . . . . . . .26
No-beep symptoms . . . . . . . . . . . . . . . . . . . . . .27
POST error codes . . . . . . . . . . . . . . . . . . . . . . .28
Checkout procedure . . . . . . . . . . . . . . . . . . . . . . .41
About the checkout procedure . . . . . . . . . . . . . . . . . .41
Performing the checkout procedure . . . . . . . . . . . . . . . .41
Troubleshooting tables . . . . . . . . . . . . . . . . . . . . . .43
CD-RW/DVD drive problems . . . . . . . . . . . . . . . . . . .43
General problems . . . . . . . . . . . . . . . . . . . . . . .44
Hard disk drive problems . . . . . . . . . . . . . . . . . . . .44
Intermittent problems . . . . . . . . . . . . . . . . . . . . . .45
USB keyboard, mouse, or pointing-device problems . . . . . . . . . .46
Memory problems . . . . . . . . . . . . . . . . . . . . . . .47
Microprocessor problems . . . . . . . . . . . . . . . . . . . .48
Monitor problems . . . . . . . . . . . . . . . . . . . . . . .49
Optional-device problems . . . . . . . . . . . . . . . . . . . .51
Power problems . . . . . . . . . . . . . . . . . . . . . . .52
Serial port problems . . . . . . . . . . . . . . . . . . . . . .54
ServerGuide problems . . . . . . . . . . . . . . . . . . . . .54
Software problems . . . . . . . . . . . . . . . . . . . . . .55
Universal Serial Bus (USB) port problems . . . . . . . . . . . . . .56
Video problems . . . . . . . . . . . . . . . . . . . . . . . .56
Light path diagnostics . . . . . . . . . . . . . . . . . . . . . .56
Remind button . . . . . . . . . . . . . . . . . . . . . . . .58
Light path diagnostics switch . . . . . . . . . . . . . . . . . . .58
Light path diagnostics LEDs . . . . . . . . . . . . . . . . . . .58
Power-supply LEDs . . . . . . . . . . . . . . . . . . . . . . .60
Diagnostic programs, messages, and error codes . . . . . . . . . . . .61
Running the diagnostic programs . . . . . . . . . . . . . . . . .62
Diagnostic text messages . . . . . . . . . . . . . . . . . . . .63
Viewing the test log . . . . . . . . . . . . . . . . . . . . . .63
Diagnostic error codes . . . . . . . . . . . . . . . . . . . . .64
Recovering the BIOS code . . . . . . . . . . . . . . . . . . . .76
System-error log messages . . . . . . . . . . . . . . . . . . . .78
Solving power problems . . . . . . . . . . . . . . . . . . . . .84
Solving Ethernet controller problems . . . . . . . . . . . . . . . . .85
Solving undetermined problems . . . . . . . . . . . . . . . . . . .86
Problem determination tips . . . . . . . . . . . . . . . . . . . .86
Calling IBM for service . . . . . . . . . . . . . . . . . . . . . .87
Chapter 3. Parts listing, Type 7978 and 1913 server . . . . . . . . . .89
Replaceable server components . . . . . . . . . . . . . . . . . .90
Power cords . . . . . . . . . . . . . . . . . . . . . . . . . .93
Chapter 4. Removing and replacing server components . . . . . . . .95
Installation guidelines . . . . . . . . . . . . . . . . . . . . . .95
System reliability guidelines . . . . . . . . . . . . . . . . . . .96
Working inside the server with the power on . . . . . . . . . . . . .96
Handling static-sensitive devices . . . . . . . . . . . . . . . . .96
Returning a device or component . . . . . . . . . . . . . . . . .97
Removing and replacing Tier 1 CRUs . . . . . . . . . . . . . . . .98
Removing the cover . . . . . . . . . . . . . . . . . . . . . .98
Installing the cover . . . . . . . . . . . . . . . . . . . . . .98
Removing the air baffle . . . . . . . . . . . . . . . . . . . . .99
Installing the air baffle . . . . . . . . . . . . . . . . . . . . . 101
Removing an adapter . . . . . . . . . . . . . . . . . . . . . 102
Installing an adapter . . . . . . . . . . . . . . . . . . . . . 103
Removing a hard disk drive . . . . . . . . . . . . . . . . . . . 103
Installing a hard disk drive . . . . . . . . . . . . . . . . . . . 105
Removing and installing the internal CD-RW/DVD drive . . . . . . . . 107
Removing a memory module (DIMM) . . . . . . . . . . . . . . .110
Installing a memory module . . . . . . . . . . . . . . . . . . .110
Removing the Remote Supervisor Adapter II SlimLine . . . . . . . . .113
Installing the Remote Supervisor Adapter II SlimLine . . . . . . . . .114
Removing the RAID controller . . . . . . . . . . . . . . . . . .115
Installing the RAID controller . . . . . . . . . . . . . . . . . .117
Removing the RAID-controller battery . . . . . . . . . . . . . . .118
Installing the RAID-controller battery . . . . . . . . . . . . . . .119
Removing a power supply . . . . . . . . . . . . . . . . . . . 120
Installing a power supply . . . . . . . . . . . . . . . . . . . . 121
Removing a hot-swap fan assembly . . . . . . . . . . . . . . . . 122
Installing a hot-swap fan assembly . . . . . . . . . . . . . . . . 123
Removing the system-board battery . . . . . . . . . . . . . . . . 123
Installing the system-board battery . . . . . . . . . . . . . . . . 124
Removing and replacing Tier 2 CRUs . . . . . . . . . . . . . . . . 125
Removing a riser card assembly . . . . . . . . . . . . . . . . . 126
Installing a riser card assembly . . . . . . . . . . . . . . . . . 127
Removing a disk drive cage assembly . . . . . . . . . . . . . . . 128
Installing a disk drive cage assembly . . . . . . . . . . . . . . . 130
Removing the hot swap backplane or simple swap backplate . . . . . . 131
Installing the hot swap backplane or simple swap backplate . . . . . . . 133
Removing the power-supply backplane . . . . . . . . . . . . . . 135
Installing the power-supply backplane . . . . . . . . . . . . . . . 136
Removing and replacing FRUs . . . . . . . . . . . . . . . . . . 137
Removing a microprocessor . . . . . . . . . . . . . . . . . . 137
Installing a microprocessor . . . . . . . . . . . . . . . . . . . 138
Removing the operator information panel assembly . . . . . . . . . . 140
Installing the operator information panel assembly . . . . . . . . . . 142
Removing the system board . . . . . . . . . . . . . . . . . . 144
Installing the system board . . . . . . . . . . . . . . . . . . . 145
Chapter 5. Configuration information and instructions . . . . . . . . 149
Updating the firmware . . . . . . . . . . . . . . . . . . . . . . 149
Configuring the server . . . . . . . . . . . . . . . . . . . . . . 149
Using the ServerGuide Setup and Installation CD . . . . . . . . . . . 149
Using the Configuration/Setup Utility program . . . . . . . . . . . . 151
Configuring the Ethernet controller . . . . . . . . . . . . . . . . 152
Configuring hot-swap SAS or hot-swap SATA RAID . . . . . . . . . . 152
Configuring simple-swap SATA RAID . . . . . . . . . . . . . . . 155
Updating the UUID . . . . . . . . . . . . . . . . . . . . . . . 156
Updating the DMI/SMBIOS data . . . . . . . . . . . . . . . . . . 156
Appendix A. Getting help and technical assistance . . . . . . . . . . 159
Before you call . . . . . . . . . . . . . . . . . . . . . . . . 159
Using the documentation . . . . . . . . . . . . . . . . . . . . . 159
Getting help and information from the World Wide Web . . . . . . . . . 160
Software service and support . . . . . . . . . . . . . . . . . . . 160
Hardware service and support . . . . . . . . . . . . . . . . . . . 160
Appendix B. Notices . . . . . . . . . . . . . . . . . . . . . . 161
Trademarks . . . . . . . . . . . . . . . . . . . . . . . . . . 161
Important notes . . . . . . . . . . . . . . . . . . . . . . . . 162
Product recycling and disposal . . . . . . . . . . . . . . . . . . 163
Battery return program . . . . . . . . . . . . . . . . . . . . . 164
Electronic emission notices . . . . . . . . . . . . . . . . . . . . 164
Federal Communications Commission (FCC) statement . . . . . . . . 164
Industry Canada Class A emission compliance statement . . . . . . . . 165
Australia and New Zealand Class A statement . . . . . . . . . . . . 165
United Kingdom telecommunications safety requirement . . . . . . . . 165
European Union EMC Directive conformance statement . . . . . . . . 165
Taiwanese Class A warning statement . . . . . . . . . . . . . . . 166
Chinese Class A warning statement . . . . . . . . . . . . . . . . 166
Japanese Voluntary Control Council for Interference (VCCI) statement 166
Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . 167
Safety
Before installing this product, read the Safety Information.
Antes de instalar este produto, leia as Informações de Segurança.
Před instalací tohoto produktu si přečtěte příručku bezpečnostních instrukcí.
Læs sikkerhedsforskrifterne, før du installerer dette produkt.
Lees voordat u dit product installeert eerst de veiligheidsvoorschriften.
Ennen kuin asennat tämän tuotteen, lue turvaohjeet kohdasta Safety Information.
Avant d’installer ce produit, lisez les consignes de sécurité.
Vor der Installation dieses Produkts die Sicherheitshinweise lesen.
Prima di installare questo prodotto, leggere le Informazioni sulla Sicurezza.
Les sikkerhetsinformasjonen (Safety Information) før du installerer dette produktet.
Antes de instalar este produto, leia as Informações sobre Segurança.
Antes de instalar este producto, lea la información de seguridad.
Läs säkerhetsinformationen innan du installerar den här produkten.
Guidelines for trained service technicians
This section contains information for trained service technicians.
Inspecting for unsafe conditions
Use the information in this section to help you identify potential unsafe conditions in
an IBM product that you are working on. Each IBM product, as it was designed and
manufactured, has required safety items to protect users and service technicians
from injury. The information in this section addresses only those items. Use good
judgment to identify potential unsafe conditions that might be caused by non-IBM
alterations or attachment of non-IBM features or options that are not addressed in
this section. If you identify an unsafe condition, you must determine how serious the
hazard is and whether you must correct the problem before you work on the
product.
Consider the following conditions and the safety hazards that they present:
• Electrical hazards, especially primary power. Primary voltage on the frame can
cause serious or fatal electrical shock.
• Explosive hazards, such as a damaged CRT face or a bulging capacitor.
• Mechanical hazards, such as loose or missing hardware.
To inspect the product for potential unsafe conditions, complete the following steps:
1. Make sure that the power is off and the power cord is disconnected.
2. Make sure that the exterior cover is not damaged, loose, or broken, and
observe any sharp edges.
3. Check the power cord:
   • Make sure that the third-wire ground connector is in good condition. Use a
   meter to measure third-wire ground continuity for 0.1 ohm or less between
   the external ground pin and the frame ground.
   • Make sure that the power cord is the correct type, as specified in “Power
   cords” on page 93.
   • Make sure that the insulation is not frayed or worn.
4. Remove the cover.
5. Check for any obvious non-IBM alterations. Use good judgment as to the safety
of any non-IBM alterations.
6. Check inside the server for any obvious unsafe conditions, such as metal filings,
contamination, water or other liquid, or signs of fire or smoke damage.
7. Check for worn, frayed, or pinched cables.
8. Make sure that the power-supply cover fasteners (screws or rivets) have not
been removed or tampered with.
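The inspection steps above can be recorded as a simple checklist. The following is a minimal sketch, not an IBM tool: the names `ground_continuity_ok`, `InspectionResult`, and `failed_steps` are hypothetical, and the step descriptions are shortened for illustration. The only value taken from this section is the 0.1 ohm ground-continuity limit in step 3.

```python
from dataclasses import dataclass

# Limit taken from step 3: third-wire ground continuity of 0.1 ohm or less.
MAX_GROUND_RESISTANCE_OHMS = 0.1


def ground_continuity_ok(resistance_ohms: float) -> bool:
    """Return True when the measured ground resistance is within the limit."""
    return 0.0 <= resistance_ohms <= MAX_GROUND_RESISTANCE_OHMS


@dataclass
class InspectionResult:
    step: int          # step number from the procedure above
    description: str   # short label for the check (illustrative wording)
    passed: bool


def failed_steps(results: list[InspectionResult]) -> list[int]:
    """Return the step numbers of checks that did not pass, in order."""
    return [r.step for r in results if not r.passed]


# Example readings (hypothetical): a 0.25 ohm measurement exceeds the limit.
results = [
    InspectionResult(1, "power off, cord disconnected", True),
    InspectionResult(3, "ground continuity", ground_continuity_ok(0.25)),
    InspectionResult(7, "cables not worn, frayed, or pinched", True),
]
print(failed_steps(results))  # [3]
```

Any failed step still calls for the judgment described at the start of this section: decide how serious the hazard is and whether it must be corrected before you work on the product.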
Guidelines for servicing electrical equipment
Observe the following guidelines when servicing electrical equipment:
• Check the area for electrical hazards such as moist floors, nongrounded power
extension cords, power surges, and missing safety grounds.
• Use only approved tools and test equipment. Some hand tools have handles that
are covered with a soft material that does not provide insulation from live
electrical currents.
• Regularly inspect and maintain your electrical hand tools for safe operational
condition. Do not use worn or broken tools or testers.
• Do not touch the reflective surface of a dental mirror to a live electrical circuit.
The surface is conductive and can cause personal injury or equipment damage if
it touches a live electrical circuit.
• Some rubber floor mats contain small conductive fibers to decrease electrostatic
discharge. Do not use this type of mat to protect yourself from electrical shock.
• Do not work alone under hazardous conditions or near equipment that has
hazardous voltages.
• Locate the emergency power-off (EPO) switch, disconnecting switch, or electrical
outlet so that you can turn off the power quickly in the event of an electrical
accident.
• Disconnect all power before you perform a mechanical inspection, work near
power supplies, or remove or install main units.
• Before you work on the equipment, disconnect the power cord. If you cannot
disconnect the power cord, have the customer power-off the wall box that
supplies power to the equipment and lock the wall box in the off position.
• Never assume that power has been disconnected from a circuit. Check it to
make sure that it has been disconnected.
• If you have to work on equipment that has exposed electrical circuits, observe
the following precautions:
– Make sure that another person who is familiar with the power-off controls is
near you and is available to turn off the power if necessary.
– When you are working with powered-on electrical equipment, use only one
hand. Keep the other hand in your pocket or behind your back to avoid
creating a complete circuit that could cause an electrical shock.
– When using a tester, set the controls correctly and use the approved probe
leads and accessories for that tester.
– Stand on a suitable rubber mat to insulate you from grounds such as metal
floor strips and equipment frames.
• Use extreme care when measuring high voltages.
• To ensure proper grounding of components such as power supplies, pumps,
blowers, fans, and motor generators, do not service these components outside of
their normal operating locations.
• If an electrical accident occurs, use caution, turn off the power, and send another
person to get medical aid.
Safety statements
Important:
Each caution and danger statement in this documentation begins with a number.
This number is used to cross reference an English-language caution or danger
statement with translated versions of the caution or danger statement in the Safety
Information document.
For example, if a caution statement begins with a number 1, translations for that
caution statement appear in the Safety Information document under statement 1.
Be sure to read all caution and danger statements in this documentation before
performing the instructions. Read any additional safety information that comes with
your server or optional device before you install the device.
Statement 1:
DANGER
Electrical current from power, telephone, and communication cables is
hazardous.
To avoid a shock hazard:
• Do not connect or disconnect any cables or perform installation,
maintenance, or reconfiguration of this product during an electrical
storm.
• Connect all power cords to a properly wired and grounded electrical
outlet.
• Connect to properly wired outlets any equipment that will be attached to
this product.
• When possible, use one hand only to connect or disconnect signal
cables.
• Never turn on any equipment when there is evidence of fire, water, or
structural damage.
• Disconnect the attached power cords, telecommunications systems,
networks, and modems before you open the device covers, unless
instructed otherwise in the installation and configuration procedures.
• Connect and disconnect cables as described in the following table when
installing, moving, or opening covers on this product or attached
devices.
To Connect:
1. Turn everything OFF.
2. First, attach all cables to devices.
3. Attach signal cables to connectors.
4. Attach power cords to outlet.
5. Turn device ON.
To Disconnect:
1. Turn everything OFF.
2. First, remove power cords from outlet.
3. Remove signal cables from connectors.
4. Remove all cables from devices.
Statement 2:
CAUTION:
When replacing the lithium battery, use only IBM Part Number 33F8354 or an
equivalent type battery recommended by the manufacturer. If your system has
a module containing a lithium battery, replace it only with the same module
type made by the same manufacturer. The battery contains lithium and can
explode if not properly used, handled, or disposed of.
Do not:
• Throw or immerse into water
• Heat to more than 100°C (212°F)
• Repair or disassemble
Dispose of the battery as required by local ordinances or regulations.
Statement 3:
CAUTION:
When laser products (such as CD-ROMs, DVD drives, fiber optic devices, or
transmitters) are installed, note the following:
• Do not remove the covers. Removing the covers of the laser product could
result in exposure to hazardous laser radiation. There are no serviceable
parts inside the device.
• Use of controls or adjustments or performance of procedures other than
those specified herein might result in hazardous radiation exposure.
DANGER
Some laser products contain an embedded Class 3A or Class 3B laser
diode. Note the following.
Laser radiation when open. Do not stare into the beam, do not view directly
with optical instruments, and avoid direct exposure to the beam.
Class 1 Laser Product
Laser Klasse 1
Laser Klass 1
Luokan 1 Laserlaite
Appareil A Laser de Classe 1
Statement 4:
≥ 18 kg (39.7 lb) ≥ 32 kg (70.5 lb) ≥ 55 kg (121.2 lb)
CAUTION:
Use safe practices when lifting.
Statement 5:
CAUTION:
The power control button on the device and the power switch on the power
supply do not turn off the electrical current supplied to the device. The device
also might have more than one power cord. To remove all electrical current
from the device, ensure that all power cords are disconnected from the power
source.
Statement 8:
CAUTION:
Never remove the cover on a power supply or any part that has the following
label attached.
Hazardous voltage, current, and energy levels are present inside any
component that has this label attached. There are no serviceable parts inside
these components. If you suspect a problem with one of these parts, contact
a service technician.
Statement 26:
CAUTION:
Do not place any object on top of rack-mounted devices.
Attention: This server is suitable for use on an IT power distribution system,
whose maximum phase to phase voltage is 240 V under any distribution fault
condition.
WARNING: Handling the cord on this product or cords associated with accessories
sold with this product, will expose you to lead, a chemical known to the State of
California to cause cancer, and birth defects or other reproductive harm. Wash
hands after handling.
ADVERTENCIA: El contacto con el cable de este producto o con cables de
accesorios que se venden junto con este producto, pueden exponerle al plomo, un
elemento químico que en el estado de California de los Estados Unidos está
considerado como un causante de cancer y de defectos congénitos, además de
otros riesgos reproductivos. Lávese las manos después de usar el producto.
Chapter 1. Introduction
This Problem Determination and Service Guide contains information to help you
solve problems that might occur in your IBM® System x3550 Type 7978 and 1913
server. It describes the diagnostic tools that come with the server, error codes and
suggested actions, and instructions for replacing failing components.
Technical updates might be available to provide additional information that is not
included in the server documentation. To check for updates, go to
http://www.ibm.com/servers/eserver/support/xseries/index.html, select System
x3550 from the Hardware list, and click Go. For firmware updates, click the
Download tab. For documentation updates, click the Install and use tab, and click
Product documentation.
Replaceable components are of three types:
• Tier 1 customer replaceable unit (CRU): Replacement of Tier 1 CRUs is your
responsibility. If IBM installs a Tier 1 CRU at your request, you will be charged for
the installation.
• Tier 2 customer replaceable unit: You may install a Tier 2 CRU yourself or
request IBM to install it, at no additional charge, under the type of warranty
service that is designated for your server.
• Field replaceable unit (FRU): FRUs must be installed only by trained service
technicians.
For information about the terms of the warranty and getting service and assistance,
see the Warranty and Support Information document.
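The three component types can be modeled as a small lookup, for example in a parts-tracking script. This is only an illustrative sketch: the names `ComponentType`, `who_installs`, and `requires_trained_technician` are hypothetical, and the summary strings paraphrase the descriptions above rather than quoting warranty terms.

```python
from enum import Enum


class ComponentType(Enum):
    TIER_1_CRU = "Tier 1 CRU"
    TIER_2_CRU = "Tier 2 CRU"
    FRU = "FRU"


# Paraphrased from the list above; see the Warranty and Support Information
# document for the actual terms.
_INSTALLER_SUMMARY = {
    ComponentType.TIER_1_CRU: "customer; IBM installs on request for a charge",
    ComponentType.TIER_2_CRU: "customer, or IBM at no additional charge under the designated warranty service",
    ComponentType.FRU: "trained service technician only",
}


def who_installs(kind: ComponentType) -> str:
    """Return a short summary of who may install this component type."""
    return _INSTALLER_SUMMARY[kind]


def requires_trained_technician(kind: ComponentType) -> bool:
    """Only FRUs must be installed by trained service technicians."""
    return kind is ComponentType.FRU


print(who_installs(ComponentType.TIER_1_CRU))
print(requires_trained_technician(ComponentType.FRU))  # True
```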
Related documentation
In addition to this document, the following documentation also comes with the
server:
• Installation Guide
This printed document contains instructions for setting up the server and basic
instructions for installing some options.
• User’s Guide
This document is in Portable Document Format (PDF) on the IBM System x
Documentation CD. It provides general information about the server, including
information about features, and how to configure the server. It also contains
detailed instructions for installing, removing, and connecting optional devices that
the server supports.
• Rack Installation Instructions
This printed document contains instructions for installing the server in a rack.
• Safety Information
This document is in PDF on the IBM System x Documentation CD. It contains
translated caution and danger statements. Each caution and danger statement
that appears in the documentation has a number that you can use to locate the
corresponding statement in your language in the Safety Information document.
• Warranty and Support Information
This document is in PDF on the IBM System x Documentation CD. It contains
information about the terms of the warranty and getting service and assistance.
Depending on the server model, additional documentation might be included on the
IBM System x Documentation CD.
The server might have features that are not described in the documentation that
comes with the server. The documentation might be updated occasionally to include
information about those features, or technical updates might be available to provide
additional information that is not included in the server documentation. These
updates are available from the IBM Web site. To check for updated documentation
and technical updates, complete the following steps.
Note: Changes are made periodically to the IBM Web site. The actual procedure
might vary slightly from what is described in this document.
1. Go to http://www.ibm.com/support/.
2. Under Search technical support, type System x3550 and click Search.
Notices and statements in this document
The caution and danger statements that appear in this document are also in the
multilingual Safety Information document, which is on the IBM System x
Documentation CD. Each statement is numbered for reference to the corresponding
statement in the Safety Information document.
The following notices and statements are used in this document:
• Note: These notices provide important tips, guidance, or advice.
• Important: These notices provide information or advice that might help you avoid
inconvenient or problem situations.
• Attention: These notices indicate potential damage to programs, devices, or
data. An attention notice is placed just before the instruction or situation in which
damage could occur.
• Caution: These statements indicate situations that can be potentially hazardous
to you. A caution statement is placed just before the description of a potentially
hazardous procedure step or situation.
• Danger: These statements indicate situations that can be potentially lethal or
extremely hazardous to you. A danger statement is placed just before the
description of a potentially lethal or extremely hazardous procedure step or
situation.
Features and specifications
The following information is a summary of the features and specifications of the
server. Depending on the server model, some features might not be available, or
some specifications might not apply.
Table 1. Features and specifications

Microprocessor:
• Intel® Xeon™ FC-LGA 771 dual-core with 4096 KB (minimum) Level-2 cache
• Support for up to two microprocessors
• Support for Intel Extended Memory 64 Technology (EM64T)
Note:
• Use the Configuration/Setup Utility program to determine the type and speed of the microprocessors.
• For a list of supported microprocessors, see http://www.ibm.com/servers/eserver/serverproven/compat/us/

Memory:
• Minimum: 1 GB
• Maximum: 32 GB
• Type: PC2-5300, 667 MHz, ECC, DDR II fully buffered SDRAM DIMMs only
• Slots: Eight dual inline
• Supports 512 MB, 1 GB, 2 GB, and 4 GB (when available) DIMMs

Drives:
CD/DVD: IDE 24x CD-RW/8x DVD combination

Expansion bays (depending on model):
Either two 3.5-inch or four 2.5-inch hard disk drive bays
• Servers with a 2.5-inch hot-swap drive bay configuration support up to four 2.5-inch hot-swap SAS hard disk drives
• Servers with a 3.5-inch hot-swap drive bay configuration support up to two 3.5-inch SAS or SATA hot-swap hard disk drives
• Servers with a 3.5-inch simple-swap drive bay configuration support up to two 3.5-inch simple-swap SATA hard disk drives

PCI Expansion slots:
• One PCI Express x8 (half length)
• One PCI Express x8 (half length) or PCI-X (half length)

Power supply:
Maximum of two redundant 670-watt (110 or 220 V ac auto-sensing) hot-swap power supplies.

Hot-swap fans:
• Standard: five
• Maximum: six (with two microprocessors installed)

Size:
• Height: 43 mm (1.69 inches, 1 U)
• Depth: 711 mm (28 inches)
• Width: 440 mm (17.3 inches)
• Maximum weight: 15.4 kg (34 lb) when fully configured

Integrated functions:
• Two Broadcom NetXtreme II Gb Ethernet controllers with TOE and Wake on LAN® support
• Four Universal Serial Bus (USB) 2.0 ports (two front and two rear)
• One Advanced System Management RJ-45 (active only when a Remote Supervisor Adapter II SlimLine is installed)
• One serial port

Hard disk controllers:
• Serial ATA (SATA) controller with integrated RAID (simple-swap SATA models)
• Serial-attached SCSI (SAS) controller with integrated RAID (hot-swap SAS models)

Acoustical noise emissions:
• Sound power, idling: 6.8 bels maximum
• Sound power, operating: 6.8 bels maximum

Environment:
• Air temperature:
– Server on: 10° to 35°C (50.0° to 95.0°F); altitude: 0 to 914 m (2998.7 ft)
– Server off: -40° to 60°C (-40° to 140°F); maximum altitude: 2133 m (6998.0 ft)
• Humidity:
– Server on: 8% to 80%
– Server off: 8% to 80%

Heat output:
Approximate heat output in British thermal units (Btu) per hour:
• Minimum configuration: 662 Btu per hour (194 watts)
• Maximum configuration: 2390 Btu per hour (700 watts)

Electrical input:
• Sine-wave input (47-63 Hz) required
• Input voltage low range:
– Minimum: 100 V ac
– Maximum: 127 V ac
• Input voltage high range:
– Minimum: 200 V ac
– Maximum: 240 V ac
• Input kilovolt-amperes (kVA), approximately:
– Minimum: 0.194 kVA
– Maximum: 0.700 kVA

Video controller (integrated):
• ATI Radeon RN50 (dual ports - front and rear)
• Support for SPI Serial flash memory video BIOS
• Flexible memory support
– 8 MB to 256 MB
– DDR1 and DDR2 SDRAM and SGRAM

Notes:
1. Power consumption and heat output vary depending on the number and type of optional features installed and the power-management optional features in use.
2. These levels were measured in controlled acoustical environments according to the procedures specified by the American National Standards Institute (ANSI) S12.10 and ISO 7779 and are reported in accordance with ISO 9296. Actual sound-pressure levels in a given location might exceed the average values stated because of room reflections and other nearby noise sources. The declared sound-power levels indicate an upper limit, below which a large number of computers will operate.
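Several figures in Table 1 are given in two units. The short sketch below shows how the paired values relate, using standard conversion factors (1 ft = 0.3048 m; 1 W is approximately 3.412 Btu per hour). The function names are hypothetical and chosen only for this example; the input values are the ones listed in the table.

```python
def celsius_to_fahrenheit(celsius: float) -> float:
    """Convert a temperature from degrees Celsius to degrees Fahrenheit."""
    return celsius * 9.0 / 5.0 + 32.0


def meters_to_feet(meters: float) -> float:
    """Convert a length from meters to feet (1 ft = 0.3048 m exactly)."""
    return meters / 0.3048


def watts_to_btu_per_hour(watts: float) -> float:
    """Convert power in watts to Btu per hour (approximate factor)."""
    return watts * 3.412142


# Server-on temperature range: 10 to 35 degrees C
print(celsius_to_fahrenheit(10), celsius_to_fahrenheit(35))  # 50.0 95.0

# Server-on altitude limit: 914 m
print(round(meters_to_feet(914), 1))  # 2998.7

# Minimum-configuration heat output: 194 W
print(round(watts_to_btu_per_hour(194)))  # 662

# Maximum configuration: 700 W works out to roughly 2388 Btu per hour,
# which the table lists as the approximate value 2390.
print(watts_to_btu_per_hour(700))
```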
Server controls, LEDs, and connectors
This section describes the controls, light-emitting diodes (LEDs), and connectors on
the front and rear of the server.
Front view
The following illustration shows the controls, LEDs, and connectors on the front of
the server. This configuration supports up to four 2.5-inch hot-swappable hard disk
drives.
[Illustration: front of the server, 2.5-inch drive configuration. Callouts: rack release latches (two), USB 3 connector, USB 4 connector, video connector, operator information panel, 2.5-inch hard disk drives, hard disk drive status LED, hard disk drive activity LED, CD-RW/DVD eject button, CD-RW/DVD drive activity LED.]
The following illustration shows the controls, LEDs, and connectors on the front of
the server. This configuration supports up to two 3.5-inch hot-swappable hard disk
drives or two 3.5-inch simple-swap SATA hard disk drives.
[Illustration: front of the server, 3.5-inch drive configuration. Callouts: rack release latches (two), 3.5-inch hard disk drives, USB 3 connector, USB 4 connector, video connector, operator information panel, CD-RW/DVD eject button, CD-RW/DVD drive activity LED, hard disk drive status LED (SAS model), hard disk drive activity LED (SAS model).]
Note: The locations of the controls, LEDs, and connectors vary, depending on the
hardware configuration that you have.
v Operator information panel: This panel contains controls and LEDs that provide
information about the status of the server.
[Illustration: operator information panel. Callouts: power-on LED (green), system-locator LED (blue), system-error LED (amber), power-control button, hard drive activity LED (green), system-information LED (amber), release latch]
The following controls and LEDs are on the operator information panel:
– Power-on LED: When this green LED is lit and not flashing, it indicates that
the server is turned on. When this LED is flashing, it indicates that the server
is turned off and is still connected to an ac power source. When this LED is
off, it indicates that ac power is not present, or the power supply or the LED
itself has failed. A power LED is also on the rear of the server.
Note: If this LED is off, it does not mean that there is no electrical power in
the server. The LED might be burned out. To remove all electrical power from
the server, you must disconnect the power cord from the electrical outlet.
– System-locator LED: Use this blue LED to visually locate the server among
other servers. You can use IBM Director to light this LED remotely. This LED
is controlled by the BMC.
– System-error LED: When this amber LED is lit, it indicates that a system
error has occurred. A system-error LED is also on the rear of the server. An
LED on the light path diagnostics panel on the system board is also lit to help
isolate the error. This LED is controlled by the BMC.
– Release latch: Press the release latch to the left to slide out the operator
information panel and view the light path diagnostics LEDs and buttons. See
the Problem Determination and Service Guide for more information about the
light path diagnostics panel.
– System-information LED: When this amber LED is lit, it indicates that a
noncritical event has occurred. Check the error log for additional information.
See the information about light path diagnostics in the Problem Determination
and Service Guide for more information about error logs.
– Hard drive activity LED: When this green LED is lit, it indicates that one of
the hard disk drives is in use.
Notes:
1. For a SAS drive, a hard disk drive activity LED is shown in two places: on
the hard disk drive and on the operator information panel.
2. For a SATA drive, hard disk drive activity is indicated only by the hard disk
drive activity LED on the operator information panel.
– Power-control button: Press this button to turn the server on and off
manually.
v Rack release latches: Press the latches on each front side of the server to
remove the server from the rack.
v Video connector: Connect a monitor to this connector. The video connectors on
the front and rear of the server can be used simultaneously.
v USB connectors: Connect a USB device, such as a USB mouse, keyboard, or
other device to any of these connectors.
v CD-RW/DVD eject button: Press this button to release a DVD or CD from the
CD/DVD drive.
v CD-RW/DVD drive activity LED: When this LED is lit, it indicates that the
CD-RW/DVD drive is in use.
v Hard disk drive status LED: This LED is used on SAS hard disk drives. When
this LED is lit, it indicates that the drive has failed. If an optional IBM
ServeRAID™ controller is installed in the server, when this LED is flashing slowly
(one flash per second), it indicates that the drive is being rebuilt. When the LED
is flashing rapidly (three flashes per second), it indicates that the controller is
identifying the drive.
v Hard disk drive activity LED: This LED is used on SAS hard disk drives. Each
hot-swap hard disk drive has an activity LED, and when this LED is flashing, it
indicates that the drive is in use.
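The hard disk drive status LED states described above can be summarized as a small lookup. The following Python sketch is illustrative only: the function name, the enum, and the 2.0 flashes-per-second threshold (a midpoint between the documented one and three flashes per second) are assumptions and are not part of any IBM tool.

```python
from enum import Enum


class DriveStatus(Enum):
    NO_FAULT_INDICATED = "status LED is off; no fault is indicated"
    FAILED = "the drive has failed"
    REBUILDING = "the drive is being rebuilt"
    IDENTIFYING = "the controller is identifying the drive"
    UNDOCUMENTED = "flashing is described only for ServeRAID configurations"


def classify_status_led(lit, flashes_per_second=0.0, serveraid_installed=False):
    """Interpret an observed SAS drive status LED (hypothetical helper)."""
    if flashes_per_second > 0:
        if not serveraid_installed:
            # The guide defines flashing states only with a ServeRAID controller.
            return DriveStatus.UNDOCUMENTED
        # Assumed threshold: about 1 flash/s means rebuild, about 3 flashes/s means identify.
        if flashes_per_second >= 2.0:
            return DriveStatus.IDENTIFYING
        return DriveStatus.REBUILDING
    if lit:
        return DriveStatus.FAILED
    return DriveStatus.NO_FAULT_INDICATED


print(classify_status_led(False, 1.0, True).value)
```

A steadily lit LED is checked only after the flashing cases, so a flashing observation is never misread as a failed drive.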
Light path diagnostics panel
The light path diagnostics panel is on the top of the operator information panel.
To access the light path diagnostics panel, push the release button on the operator
panel to the left. Pull forward on the unit until the hinge of the operator panel is free
of the server chassis; then, pull down on the unit, so that the operator information
panel is at a right angle with the server.
[Illustration: operator information panel pulled out of the chassis. Callouts: operator information panel, light path LEDs, release button]
The following illustration shows the LEDs and controls on the light path diagnostics
panel.
[Illustration: light path diagnostics panel. LEDs: CPU, MEM, FAN, PCI, PS1, SP, PS2, VRM, CNFG, NMI, S ERR, RAID, DASD, TEMP, BRD, OVER SPEC. Button: REMIND]
v Remind button: This button places the system-error LED on the front panel into
Remind mode. In Remind mode, the system-error LED flashes rapidly until the
problem is corrected, the system is restarted, or a new problem occurs.
By placing the system-error LED indicator in Remind mode, you acknowledge
that you are aware of the last failure but will not take immediate action to correct
the problem. The remind function is handled by the BMC.
v Reset button: Press this button to reset the server and run the power-on
self-test (POST). You might have to use a pen or the end of a straightened paper
clip to press the button. The reset button is to the right of the remind button.
For information about light path diagnostics, see the System x3550 Problem
Determination and Service Guide on the IBM System x Documentation CD.
Rear view
The following illustration shows the connectors and LEDs on the rear of the server.
[Illustration: rear view. Callouts: Ethernet 1, Ethernet 2, PCI slot 1, PCI slot 2, USB 2, USB 1, systems-management Ethernet connector, serial connector, video connector, power connector, power-on LED, system-locator LED, system-error LED, ac power LED, dc power LED]
v PCI slot 1: Insert a PCI Express type adapter into this slot.
v PCI slot 2: Insert a PCI Express type adapter into this slot. You can purchase an
optional PCI-X riser card assembly to convert this slot to accept a PCI-X adapter.
v Power connector: Connect the power cord to this connector.
v AC power LED: Each hot-swap power supply has an ac power LED and a dc
power LED. When the ac power LED is lit, it indicates that sufficient power is
coming into the power supply through the power cord. During typical operation,
both the ac and dc power LEDs are lit. For any other combination of LEDs, see
the Problem Determination and Service Guide on the IBM System x
Documentation CD.
v DC power LED: Each hot-swap power supply has a dc power LED and an ac
power LED. When the dc power LED is lit, it indicates that the power supply is
supplying adequate dc power to the system. During typical operation, both the ac
and dc power LEDs are lit. For any other combination of LEDs, see the Problem
Determination and Service Guide on the IBM System x Documentation CD.
v System-error LED: When this LED is lit, it indicates that a system error has
occurred. An LED on the light path diagnostics panel is also lit to help isolate the
error.
v Power-on LED: When this LED is lit and not flashing, it indicates that the server
is turned on. When this LED is flashing, it indicates that the server is turned off
and still connected to an ac power source. When this LED is off, it indicates that
ac power is not present, or the power supply or the LED itself has failed.
v System-locator LED: Use this LED to visually locate the server among other
servers. You can use IBM Director to light this LED remotely.
v Video connector: Connect a monitor to this connector. The video connectors on
the front and rear of the server can be used simultaneously.
v Serial connector: Connect a 9-pin serial device to this connector. The serial port
is shared with the baseboard management controller (BMC). The BMC can take
control of the shared serial port to perform text console redirection and to redirect
serial traffic, using Serial over LAN (SOL).
v USB connectors: Connect a USB device, such as a USB mouse, keyboard, or
other device to any of these connectors.
v Systems-management Ethernet connector: Use this connector to connect the
server to a network for systems-management information control. This connector
is active only if you have installed a Remote Supervisor Adapter II SlimLine, and
it is used only by the Remote Supervisor Adapter II SlimLine.
[Illustration: Ethernet connector. Callouts: Ethernet port, Ethernet activity LED, Ethernet speed LED, Ethernet cable release lever]
v Ethernet activity LEDs: When these LEDs are lit, they indicate that the server is
transmitting to or receiving signals from the Ethernet LAN that is connected to
the Ethernet port.
v Ethernet speed LEDs: When these LEDs are lit, they indicate that there is an
active link connection on the 10BASE-T, 100BASE-TX, or 1000BASE-TX
interface for the Ethernet port.
v Ethernet connectors: Use either of these connectors to connect the server to a
network.
Internal LEDs, connectors, and jumpers
The illustrations in this section show the connectors, LEDs, and jumpers on the
internal boards. The illustrations might differ slightly from your hardware.
System-board internal connectors
The following illustration shows the internal connectors on the system board.
[Illustration: system-board internal connectors. Callouts: SAS signal connector (J65) (some models), SATA 1 signal connector (port 1) (some models), power supply backplane connector, microprocessor 1 connector, SATA 0 signal connector (port 0) (some models), CD-RW/DVD connector, operator information panel connector, video front panel connector, USB front panel connector (USB3 and USB4)]
Power backplane card internal connectors
The following illustration shows the internal connectors on the power backplane
card.
[Illustration: power backplane card internal connectors. Callouts: power supply connectors, system board connector, hard disk drive power connector]
System-board switches and jumpers
The following illustration shows the switches and jumpers on the system board.
Note: If a clear protective sticker is present on top of the SW2 switch block, you
must remove and discard it in order to access the switches.
[Illustration: system-board switches and jumpers. Callouts: boot block recovery jumper (J14, pins 1 through 3), system board switch block (SW2, switches 1 through 8, ON position marked), NMI switch (SW1)]
Table 2. Switch and jumper settings
Component | Default value | Settings
NMI (nonmaskable interrupt) switch (SW1) | Off | NMI button on rear of server pressed: NMI issued
Power-on password switch (SW2-1) | Off | Power-on password override. Changing the position of this switch bypasses the power-on password check the next time the server is turned on and starts the Configuration/Setup Utility program so that you can change or delete the power-on password. You do not have to move the switch back to the default position after the password is overridden. Changing the position of this switch does not affect the administrator password check if an administrator password is set. See the User’s Guide on the IBM System x Documentation CD for additional information about the power-on password.
BMC update switch (SW2-2) | Off | Force BMC update (trained service technician only). When toggled to On, this switch causes an update of BMC microcode from the on-board ROM.
BMC disable switch (SW2-3) | Off | Setting this to On might be necessary when a service processor adapter other than the optional Remote Supervisor Adapter II SlimLine is installed.
Force power-on switch (SW2-8) | Off | Power-on override. When toggled to On, this switch forces the server power on, overriding the power-on button.
Boot block recovery jumper (J14) | Pins 1 and 2 | Pins 1 and 2: Normal (default). Pins 2 and 3: Recover boot block.
Note: The server is shipped with a clear plastic shield on the face of switch SW2.
Remove and discard this shield if you need to change the switch settings.
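As an aid to reading Table 2, the SW2 switch functions can be held in a small lookup that reports every switch moved away from its default Off position. This Python sketch is an illustration under stated assumptions: the names SW2_FUNCTIONS and describe_sw2 are hypothetical, and positions that Table 2 does not list are reported as undocumented rather than guessed.

```python
# Functions of the SW2 positions that Table 2 describes; all default to Off.
SW2_FUNCTIONS = {
    1: "power-on password override",
    2: "force BMC update (trained service technician only)",
    3: "BMC disable",
    8: "force power-on, overriding the power-on button",
}


def describe_sw2(on_positions):
    """List the SW2 switches that are On, in position order (hypothetical helper)."""
    notes = []
    for position in sorted(set(on_positions)):
        if not 1 <= position <= 8:
            raise ValueError(f"SW2 has positions 1 through 8, got {position}")
        function = SW2_FUNCTIONS.get(position, "not described in Table 2")
        notes.append(f"SW2-{position} On: {function}")
    return notes


for note in describe_sw2({8, 1}):
    print(note)
```

An empty result means that every switch on the block is at its documented default.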
System-board external connectors
The following illustration shows the external connectors on the system board.
[Illustration: system-board external connectors. Callouts: USB 1 connector, USB 2 connector, serial connector, video connector, systems-management Ethernet connector, Ethernet 2 connector, Ethernet 1 connector]
System-board LEDs
The following illustration shows the light-emitting diodes (LEDs) on the system
board.
[Illustration: system-board LEDs. Callouts: power-on LED; location LED; system-error LED; PCI slot 1 and PCI slot 2 error LEDs; Remote Supervisor Adapter II SlimLine error LED; BMC status LED; system-board fault LED; system-board battery error LED; DIMM 1 through DIMM 8 error LEDs; light path diagnostics active LED; light path diagnostics switch; RAID error LED; microprocessor 1 and microprocessor 2 error LEDs; fan 1 through fan 6 error LEDs; power A, B, C, and D error LEDs]
Table 3. System-board LEDs
LED Description
Error LEDs The associated component has failed.
BMC status LED This LED flashes to indicate that the BMC (baseboard
management controller) is functioning normally.
Standby power LED When this LED is lit and not flashing, it indicates that the
server is turned on. When this LED is flashing, it indicates
that the server is turned off and still connected to an ac
power source. When this LED is off, it indicates that ac
power is not present, or the power supply or the LED itself
has failed.
12-volt power (A, B, C, D)
LEDs
If any of these LEDs is lit, there is a failure in the associated
system board power bus (see “Power problems” on page 52).
Location LED Use this LED to visually locate the server among other
servers. You can use IBM Director to light this LED remotely.
System-error LED When this LED is lit, it indicates that a system error has
occurred. An LED on the light path diagnostics panel is also
lit to help isolate the error.
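The three states that Table 3 gives for the standby power LED (the same states listed for the power-on LEDs on the front and rear of the server) map directly to a lookup. The sketch below is a minimal illustration; the state names "lit", "flashing", and "off" and the function name are assumptions chosen for this example.

```python
POWER_LED_MEANINGS = {
    "lit": "the server is turned on",
    "flashing": "the server is turned off and still connected to an ac power source",
    # An LED that is off does not prove that no electrical power is present:
    # the LED itself might have failed.
    "off": "ac power is not present, or the power supply or the LED itself has failed",
}


def interpret_power_led(state):
    """Return the documented meaning of a power LED state (hypothetical helper)."""
    key = state.strip().lower()
    if key not in POWER_LED_MEANINGS:
        raise ValueError(f"unknown LED state: {state!r}")
    return POWER_LED_MEANINGS[key]


print(interpret_power_led("Flashing"))
```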
System-board option connectors
The following illustration shows the connectors for user-installable options.
[Illustration: system-board option connectors. Callouts: Remote Supervisor Adapter II SlimLine connector (J60), microprocessor 2 connector, fan 1 connector, PCI Express or PCI-X riser-card connector slot 2 (J12), PCI Express riser-card connector slot 1 (J34), RAID controller connector (J3) (some models), DIMM 1 through DIMM 8 connectors]
Chapter 2. Diagnostics
This chapter describes the diagnostic tools that are available to help you solve
problems that might occur in the server.
If you cannot locate and correct the problem using the information in this chapter,
see Appendix A, “Getting help and technical assistance,” on page 159 for more
information.
Diagnostic tools
The following tools are available to help you diagnose and solve hardware-related
problems:
v POST beep codes, error messages, and error logs
The power-on self-test (POST) generates beep codes and messages to indicate
successful test completion or the detection of a problem. See “POST” for more
information.
v Troubleshooting tables
These tables list problem symptoms and actions to correct the problems. See
“Troubleshooting tables” on page 43.
v Light path diagnostics
Use the light path diagnostics to diagnose system errors quickly. See “Light path
diagnostics” on page 56 for more information.
v Diagnostic programs, messages, and error codes
The diagnostic programs, which are stored in upgradeable read-only memory
(ROM) on the system board, are the primary method of testing the major
components of the server. See “Diagnostic programs, messages, and error
codes” on page 61 for more information.
POST
When you turn on the server, it performs a series of tests to check the operation of
the server components and some optional devices in the server. This series of tests
is called the power-on self-test, or POST.
If a power-on password is set, you must type the password and press Enter, when
prompted, for POST to run.
If POST detects a problem, one or more beeps might sound, or an error message
is displayed. See “POST beep codes” and “POST error codes” on page 28 for more
information.
POST beep codes
A beep code is a combination of short or long beeps or a series of short beeps that
are separated by pauses. For example, a “1-2-3” beep code is one short beep, a
pause, two short beeps, a pause, and three short beeps. A beep code indicates
that POST has detected a problem.
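The notation can be expanded mechanically. The following Python sketch turns a hyphen-separated code into the beep sequence it stands for; it covers only the series-of-short-beeps form shown in the example above, and the function name is hypothetical.

```python
def describe_beep_code(code):
    """Expand a beep code such as "1-2-3" into a readable beep sequence."""
    # int() raises ValueError for empty or non-numeric groups.
    counts = [int(part) for part in code.split("-")]
    if any(count < 1 for count in counts):
        raise ValueError(f"each group must contain at least one beep: {code!r}")
    bursts = [f"{count} short beep{'s' if count != 1 else ''}" for count in counts]
    return ", pause, ".join(bursts)


print(describe_beep_code("1-2-3"))
# prints: 1 short beep, pause, 2 short beeps, pause, 3 short beeps
```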
A single problem might cause more than one error message. When this occurs,
correct the cause of the first error message. The other error messages usually will
not occur the next time POST runs.
© Copyright IBM Corp. 2006 19
Exception: If there are multiple error codes or light path diagnostics LEDs that
indicate a microprocessor error, the error might be in the microprocessor or in the
microprocessor socket. See “Microprocessor problems” on page 48 for information
about diagnosing microprocessor problems.
v Follow the suggested actions in the order in which they are listed in the Action column until the problem
is solved.
v See Chapter 3, “Parts listing, Type 7978 and 1913 server,” on page 89 to determine which components are
CRUs and which components are FRUs.
v If an action step is preceded by “(Trained service technician only),” that step must be performed only by a
trained service technician.
Beep code Description Action
No beep System board failure. (Trained service technician only) Replace the
system board.
1-1-2 Microprocessor register test failed.
1. (Trained service technician only) Reseat
the microprocessors.
2. (Trained service technician only) Replace
the following components one at a time, in
the order shown, restarting the server each
time:
a. Microprocessor 2 (if present).
b. Microprocessor 1.
1-1-3 CMOS write/read test failed.
1. Reseat the system-board battery.
2. Replace the following components one at
a time, in the order shown, restarting the
server each time:
a. System-board battery
b. (Trained service technician only)
System board
1-1-4 BIOS EEPROM checksum failed.
1. Reload the server BIOS (see “Recovering
the BIOS code” on page 76).
2. (Trained service technician only) Replace
the system board.
1-2-1 Programmable interval timer failed. (Trained service technician only) Replace the
system board.
1-2-2 DMA initialization failed. (Trained service technician only) Replace the
system board.
1-2-3 DMA page register write/read failed. (Trained service technician only) Replace the
system board.
1-2-4 RAM refresh verification failed.
1. Reseat the DIMMs.
2. Replace the following components, one at
a time, in the order shown:
a. DIMMs
b. (Trained service technician only)
System board
1-3-1 1st 64K RAM test failed.
1. Reseat the DIMMs.
2. Replace the following components, one at
a time, in the order shown:
a. DIMMs
b. (Trained service technician only)
System board
1-3-2 1st 64K RAM parity test failed.
1. Reseat the DIMMs.
2. Replace the following components, one at
a time, in the order shown:
a. DIMMs
b. (Trained service technician only)
System board
2-1-1 Secondary DMA register failed. (Trained service technician only) Replace the
system board.
2-1-2 Primary DMA register failed. (Trained service technician only) Replace the
system board.
2-1-3 Primary interrupt mask register failed. (Trained service technician only) Replace the
system board.
2-1-4 Secondary interrupt mask register failed. (Trained service technician only) Replace the
system board.
2-2-2 Keyboard controller failed. Replace the following components, one at a
time, in the order shown, restarting the server
each time:
1. Keyboard
2. (Trained service technician only) System
board
2-2-3 CMOS power failure and checksum
checks failed.
1. Reseat the system-board battery.
2. Replace the following components, one at
a time, in the order shown, restarting the
server each time:
a. System-board battery
b. (Trained service technician only)
System board
2-2-4 CMOS configuration information checks
failed.
1. Reseat the system-board battery.
2. Replace the following components, one at
a time, in the order shown, restarting the
server each time:
a. System-board battery
b. (Trained service technician only)
System board
2-3-1 Screen initialization failed. (Trained service technician only) Replace the
system board.
2-3-2 Screen memory failed. (Trained service technician only) Replace the
system board.
2-3-3 Screen retrace failed. (Trained service technician only) Replace the
system board.
2-3-4 Search for video ROM failed. (Trained service technician only) Replace the
system board.
2-4-1 Video failed. (Trained service technician only) Replace the
system board.
2-4-4 Memory configuration error.
1. Make sure that the DIMMs are installed in
the correct configuration.
2. Replace the following components one at
a time, in the order shown, restarting the
server each time:
a. Failing DIMM
b. (Trained service technician only)
System board
3-1-1 Timer tick interrupt failed. (Trained service technician only) Replace the
system board.
3-1-2 Interval timer channel 2 failed. (Trained service technician only) Replace the
system board.
3-1-3 RAM test failed above address 0FFFFh.
1. Reseat the DIMMs.
2. Replace the following components one at
a time, in the order shown, restarting the
server each time:
a. DIMMs
b. (Trained service technician only)
System board
3-1-4 Time-of-day clock failed.
1. Reseat the system-board battery.
2. Replace the following components one at
a time, in the order shown, restarting the
server each time:
a. System-board battery
b. (Trained service technician only)
System board
3-2-1 Serial port failed. (Trained service technician only) Replace the
system board.
3-2-4 Failed comparing CMOS memory size
against actual.
1. Reseat the following components:
a. DIMMs
b. System-board battery
2. Replace the following components, one at
a time, in the order shown, restarting the
server each time:
a. DIMMs
b. System-board battery
c. (Trained service technician only)
System board
3-3-1 Memory size mismatch occurred.
1. Reseat the following components:
a. DIMMs
b. System-board battery
2. Replace the following components, one at
a time, in the order shown, restarting the
server each time:
a. DIMMs
b. System-board battery
c. (Trained service technician only)
System board
3-3-2 Critical SMBUS (I2C bus) error occurred.
1. Disconnect server power, wait 30 seconds
and retry.
2. Reseat the following components:
a. (Trained service technician only)
Microprocessor
b. PCI-X/PCI Express riser (if present)
c. PCI-X/PCI Express adapter (if present)
d. DIMMs
e. Hard disk drives
f. Hard disk drive backplane
g. Hard disk drive power cable
h. Hard disk drive signal cable (only for
SAS drive)
3. Replace the following components, one at
a time, in the order shown, restarting the
server each time:
a. (Trained service technician only)
System board
b. (Trained service technician only)
Microprocessor
c. PCI-X/PCI Express riser (if present)
d. PCI-X/PCI Express adapter (if present)
e. DIMMs
f. Hard disk drives
g. Hard disk drive backplane
h. Hard disk drive power cable
i. Hard disk drive signal cable (only for
SAS drive)
3-3-3 No operational memory in system.
1. Make sure that the server contains the
correct number of DIMMs, in the correct
order; install or reseat DIMMs; then,
restart the server.
Important: In some memory
configurations, the 3-3-3 beep code might
sound during POST, followed by a blank
monitor screen. If this occurs and the Boot
Fail Count option in the Start Options of
the Configuration/Setup Utility program is
enabled, you must restart the server three
times to reset the configuration settings to
the default configuration (the memory
connector or bank of connectors enabled).
2. Replace the following components one at
a time, in the order shown, restarting the
server each time:
a. DIMMs
b. (Trained service technician only)
System board
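The beep code table marks some action steps with “(Trained service technician only)”. When the action lists are kept in a script or a ticketing note, that marker can be separated from the step text while the documented order is preserved. This is a minimal Python sketch; the function name and the tuple layout are assumptions for illustration.

```python
TECH_ONLY_PREFIX = "(Trained service technician only)"


def annotate_steps(steps):
    """Return (step text, technician_only) pairs in the original order."""
    annotated = []
    for raw in steps:
        step = raw.strip()
        tech_only = step.startswith(TECH_ONLY_PREFIX)
        if tech_only:
            # Drop the marker and keep only the action text.
            step = step[len(TECH_ONLY_PREFIX):].strip()
        annotated.append((step, tech_only))
    return annotated


# Action steps for beep code 1-1-4, taken from the table above.
steps_1_1_4 = [
    "Reload the server BIOS.",
    "(Trained service technician only) Replace the system board.",
]
for text, tech_only in annotate_steps(steps_1_1_4):
    print("technician" if tech_only else "anyone", "-", text)
```

The steps are annotated rather than filtered, because the guide asks that actions be followed in the order listed.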
Error logs
The POST error log contains the three most recent error codes and messages that
were generated during POST. The BMC log contains messages that were
generated by the BMC. The system event/error log is a combined log that contains
messages that were generated during POST and all system status messages from
the service processor (BMC).
The system event/error log and BMC System Event log are limited in size. When
each log is full, new entries will not overwrite existing entries; therefore, you must
periodically clear these logs through the Configuration/Setup Utility program (the
menu choices are described in the User’s Guide on the IBM System x
Documentation CD). When you are troubleshooting an error, be sure to clear both
the logs so that you can find current errors more easily.
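The two retention behaviors described above differ in what happens when a log is full: the POST error log keeps only the three most recent entries, while the size-limited logs stop accepting entries until they are cleared. The Python sketch below models both. It is illustrative only: the class names are hypothetical, the capacity value is an arbitrary assumption (the guide does not state the real log sizes), and dropping new entries is one reading of “new entries will not overwrite existing entries”.

```python
from collections import deque


class PostErrorLog:
    """Keeps only the three most recent entries, as described for the POST error log."""

    def __init__(self):
        self._entries = deque(maxlen=3)

    def record(self, entry):
        self._entries.append(entry)  # the oldest entry is discarded when full

    def entries(self):
        return list(self._entries)


class FixedSizeLog:
    """Stops accepting entries when full; it must be cleared to log again."""

    def __init__(self, capacity):
        self.capacity = capacity  # assumed value; the real log size is not documented here
        self._entries = []

    def record(self, entry):
        if len(self._entries) >= self.capacity:
            return False  # entry is lost because nothing is overwritten
        self._entries.append(entry)
        return True

    def clear(self):
        self._entries.clear()


post_log = PostErrorLog()
for code in ["062", "101", "102", "106"]:
    post_log.record(code)
print(post_log.entries())
```

This difference is why the guide recommends clearing the size-limited logs periodically: once one of them fills, later events would not be recorded.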
Important: After you complete a repair or correct an error, clear the BMC log to
turn off the system-error LED on the front of the server.
Entries that are written to the system event/error log during the early phase of
POST show an incorrect date and time as the default time stamp; however, the
date and time are corrected as POST continues.
Each system event/error log entry appears on its own page. To move from one
entry to the next, use the Up Arrow and Down Arrow keys.
You can view the contents of the POST error log, the BMC system event log, and the
system event/error log from the Configuration/Setup Utility program.
When you are troubleshooting PCI-X/PCI Express slots, note that the error logs
report the PCI-X/PCI Express buses numerically. The numerical assignments vary
depending on the configuration. You can check the assignments by running the
Configuration/Setup Utility program (see the User’s Guide on the IBM System x
Documentation CD for more information).
Viewing error logs from the Configuration/Setup Utility program
For complete information about using the Configuration/Setup Utility program, see
the User’s Guide on the IBM System x Documentation CD.
To view the error logs, complete the following steps:
1. Turn on the server.
2. When the prompt Press F1 for Setup appears, press F1. If you have set both a
power-on password and an administrator password, you must type the
administrator password to view the error logs.
3. Use one of the following procedures:
v To view the POST error log, select Event/Error Logs, and then select POST
Error Log.
v To view the BMC system event log, select Advanced Setup --> Baseboard
Management Controller (BMC) Settings --> BMC System Event Log.
v To view the combined system event/error log that is generated by the Remote
Supervisor Adapter II SlimLine, select Event/Error Logs, and then select
System Event/Error Log.
Clearing the error logs
For complete information about using the Configuration/Setup Utility program, see
the User’s Guide on the IBM System x Documentation CD.
To clear the error logs, complete the following steps:
1. Turn on the server.
2. When the prompt Press F1 for Setup appears, press F1. If you have set both a
power-on password and an administrator password, you must type the
administrator password to clear the error logs.
3. Use one of the following procedures:
v To clear the BMC system event log, select Advanced Setup --> Baseboard
Management Controller (BMC) Settings --> BMC System Event Log. Select
Clear BMC SEL; then, press Enter twice.
v To clear the combined system event/error log, select Event/Error Logs, and
then select System Event/Error Log. When any log entry is displayed, press
Enter (Clear event/error logs is highlighted on each entry page).
Note: The POST error log is automatically cleared each time the server is
restarted.
No-beep symptoms
The following table describes situations in which no beep code sounds when POST
is completed.
v Follow the suggested actions in the order in which they are listed in the Action column until the problem
is solved.
v See Chapter 3, “Parts listing, Type 7978 and 1913 server,” on page 89 to determine which components are
customer replaceable units (CRU) and which components are field replaceable units (FRU).
v If an action step is preceded by “(Trained service technician only),” that step must be performed only by a
trained service technician.
No-beep symptom Description Action
No beeps occur, and the server operates correctly.
Possible problem with the operator information panel.
1. Check the operator information panel cable
for damage.
2. Reseat the operator information panel
cable.
3. Replace the following components, one at a
time, in the order shown, restarting the
server each time:
a. (Trained service technician only)
Operator information panel
b. (Trained service technician only) System
board
No beeps occur after successful completion of POST.
The power-on status is Disabled.
1. Run the Configuration/Setup Utility program
and select Start Options; then, set
Power-On Status to Enable.
2. Check the operator information panel cable
for damage.
3. Reseat the operator information panel
cable.
4. (Trained service technician only) Replace
the system board.
No beeps occur, and there is no video.
Unknown problem.
See “Solving undetermined problems” on page 86.
No beep occurs, and the power-supply ac LED is off.
Possible power problem.
1. Make sure that the ac power cord is connected to the power supply and to an ac
outlet.
2. Reseat the power supplies.
3. If two power supplies are installed, swap them to determine whether one is defective.
4. Disconnect the cable from the hard disk drive backplane power connector (J13) on
the power backplane. If the ac power LED comes on, see “Solving undetermined
problems” on page 86.
No beep occurs, the server does not start, and the power-supply ac LED is lit.
Possible power problem.
See “Power-supply LEDs” on page 60.
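Many actions in these tables follow one pattern: act on one component at a time, in the order shown, and restart the server after each. The following Python sketch illustrates that loop only. The function name and both callbacks are hypothetical stand-ins for the manual work of reseating or replacing a part and then restarting the server; nothing here talks to real hardware.

```python
from typing import Callable, Iterable, Optional


def isolate_by_ordered_action(
    components: Iterable[str],
    act_on: Callable[[str], None],
    restart_and_check: Callable[[], bool],
) -> Optional[str]:
    """Act on one component at a time, in the order given, restarting after each.

    Returns the component whose reseat or replacement cleared the problem,
    or None if the problem persists after every listed component.
    """
    for component in components:
        act_on(component)        # reseat or replace this one component only
        if restart_and_check():  # restart the server; True means the problem is gone
            return component
    return None


# Illustrative run with stand-in callbacks (assumed, not part of the guide).
touched = []
suspects = ["Operator information panel", "System board"]
fixed_by = isolate_by_ordered_action(
    suspects,
    act_on=touched.append,
    restart_and_check=lambda: touched[-1] == "System board",
)
print(fixed_by)
```

Stopping at the first component that clears the symptom is the reason the tables insist on one change per restart: if several parts are changed at once, the failing part cannot be identified.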
POST error codes
The following table describes the POST error codes and suggested actions to
correct the detected problems.
• Follow the suggested actions in the order in which they are listed in the Action column until the problem
is solved.
• See Chapter 3, “Parts listing, Type 7978 and 1913 server,” on page 89 to determine which components are
customer replaceable units (CRU) and which components are field replaceable units (FRU).
• If an action step is preceded by “(Trained service technician only),” that step must be performed only by a
trained service technician.
Error code Description Action
062 Three consecutive boot failures using the
default configuration.
1. Run the Configuration/Setup Utility program, save
the configuration, and restart the server.
2. Update the system firmware to the latest level
(see “Updating the firmware” on page 149).
3. Reseat the following components, one at a time,
in the order shown, restarting the server each
time:
a. System-board battery
b. (Trained service technician only)
Microprocessor 1
4. Replace the components listed in step 3, one at a
time, in the order shown, restarting the server
each time.
101, 102, 106 System and microprocessor error.
1. (Trained service technician only) Replace the
system board.
111 Channel check error. Reseat the following components, one at a time, in
the order shown, restarting the server each time:
1. Adapter (if present)
2. DIMMs
Replace the following components, one at a time, in
the order shown, restarting the server each time:
1. Adapter (if present)
2. DIMMs
3. (Trained service technician only) System board
114 Adapter read-only memory error.
1. Reseat the adapter.
2. Replace the adapter.
129 Internal cache (L2) error.
1. (Trained service technician only) Reseat
microprocessor 1.
2. (Trained service technician only) Reseat
microprocessor 2 (if present).
3. (Trained service technician only) Replace the
following components one at a time, in the order
shown, restarting the server each time:
a. Microprocessor 1
b. Microprocessor 2 (if present)
c. System board
151 Real-time clock error.
1. Reseat the battery.
2. Replace the following components one at a time,
in the order shown, restarting the server each
time:
a. System-board battery
b. (Trained service technician only) System
board
161 Real-time clock battery error.
1. Run the Configuration/Setup Utility program,
select Load Default Settings, and save the
settings.
2. Reseat the battery.
3. Replace the following components one at a time,
in the order shown, restarting the server each
time:
a. System-board battery
b. (Trained service technician only) System
board
162 Device configuration error.
1. Run the Configuration/Setup Utility program,
select Load Default Settings, and save the
settings.
2. Reseat the following components, one at a time,
in the order shown, restarting the server each
time:
a. System-board battery
b. Failing device (if the device is a FRU, the
device must be reseated by a trained service
technician only)
3. Replace the following components one at a time,
in the order shown, restarting the server each
time:
a. System-board battery
b. Failing device (if the device is a FRU, the
device must be replaced by a trained service
technician only)
c. (Trained service technician only) System
board
163 Real-time clock error. (time of day not set)
1. Run the Configuration/Setup Utility program,
select Load Default Settings, make sure that the
date and time are correct, and save the settings.
2. Reseat the battery.
3. Replace the following components one at a time,
in the order shown, restarting the server each
time:
a. System-board battery
b. (Trained service technician only) System
board
164 Memory configuration changed.
1. Run the Configuration/Setup Utility program,
select Load Default Settings, and save the
settings.
2. Reseat the DIMMs.
3. Replace the following components one at a time,
in the order shown, restarting the server each
time:
a. DIMMs
b. (Trained service technician only) System
board
165 Service processor failure. (Trained service technician only) Replace the system
board.
175 Bad EEPROM CRC #1.
1. Run the Configuration/Setup Utility program,
select Load Default Settings, and save the
settings.
2. Update the Remote Supervisor Adapter II
SlimLine firmware (if present).
3. (Trained service technician only) Replace the
system board.
178 System VPD not available.
1. Run the Configuration/Setup Utility program,
select Load Default Settings, and save the
settings.
2. Reflash or update firmware for the BMC.
3. (Trained service technician only) Replace the
system board.
184 Power-on password damaged.
1. Restart the server and enter the administrator
password; then, run the Configuration/Setup
Utility program, select Load Default Settings,
and save the settings.
2. Reseat the battery.
3. Replace the following components one at a time,
in the order shown, restarting the server each
time:
a. System-board battery
b. (Trained service technician only) System
board
185 Drive startup sequence information
corrupted.
1. Run the Configuration/Setup Utility program,
select Load Default Settings, and save the
settings.
2. (Trained service technician only) Replace the
system board.
187 VPD serial number not set.
1. Run the Configuration/Setup Utility program, set
the serial number, and save the configuration.
2. (Trained service technician only) Replace the
system board.
188 Bad EEPROM CRC #2.
1. Run the Configuration/Setup Utility program,
select Load Default Settings, and save the
settings.
2. Reflash or update firmware for the BMC.
3. Update the Remote Supervisor Adapter II
SlimLine firmware (if present).
4. (Trained service technician only) Replace the
system board.
189 An attempt was made to access the server
with an incorrect password.
Restart the server and enter the administrator
password; then, run the Configuration/Setup Utility
program and change the power-on password.
201 Memory test error.
1. Make sure that the DIMM is installed correctly
(see “Installing a memory module” on page 110).
2. Reseat the DIMM.
3. Replace the DIMM.
4. (Trained service technician only) Replace the
system board.
229 Internal cache (L2) error. (Trained service technician only) Reseat the following
components one at a time, in the order shown,
restarting the server each time:
1. Microprocessor 1
2. Microprocessor 2 (if installed)
(Trained service technician only) Replace the
components listed above, one at a time, in the order
shown, restarting the server each time.
262 DRAM parity configuration error.
1. Run the Configuration/Setup Utility program,
select Load Default Settings, and save the
settings.
2. Reseat the battery.
3. Replace the following components one at a time,
in the order shown, restarting the server each
time:
a. System-board battery
b. (Trained service technician only) System
board
289 A DIMM has been disabled by the user or
by the system.
1. If the DIMM was disabled by the user, run the
Configuration/Setup Utility program and enable
the DIMM.
2. Make sure that the DIMM is installed correctly
(see “Installing a memory module” on page 110).
3. Reseat the DIMM.
4. Replace the DIMM.
5. (Trained service technician only) Replace the
system board.
301 Keyboard or keyboard controller error.
1. If you have installed a USB keyboard, run the
Configuration/Setup Utility program and enable
keyboardless operation to prevent the POST error
message 301 from being displayed during startup.
2. Reseat the keyboard cable in the connector.
3. If you are using an external USB hub, disconnect
the keyboard from the hub and connect it directly
to the server.
4. Replace the following components one at a time,
in the order shown, restarting the server each
time:
a. Keyboard
b. (Trained service technician only) System
board
303 Keyboard controller error.
1. If you have installed a USB keyboard, run the
Configuration/Setup Utility program and enable
keyboardless operation to prevent the POST error
message 301 from being displayed during startup.
2. Reseat the keyboard cable in the connector.
3. If you are using an external USB hub, disconnect
the keyboard from the hub and connect it directly
to the server.
4. Replace the following components one at a time,
in the order shown, restarting the server each
time:
a. Keyboard
b. (Trained service technician only) System
board
762 Coprocessor configuration error.
1. Run the Configuration/Setup Utility program,
select Load Default Settings, and save the
settings.
2. Reseat the battery.
3. Replace the following components one at a time,
in the order shown, restarting the server each
time:
a. System-board battery
b. Microprocessor 1
11xx Serial port configuration error.
1. Run the Configuration/Setup Utility program,
select Load Default Settings, and save the
settings.
2. (Trained service technician only) Replace the
system board.
1600 BMC failed BIST.
1. Run the Configuration/Setup Utility program,
select Load Default Settings, and save the
settings.
2. Reflash or update firmware for the BMC.
3. (Trained service technician only) Replace the
system board.
1601 BMC is not functioning.
1. Reflash or update firmware for the BMC.
2. (Trained service technician only) Replace the
system board.
1602 Remote Supervisor Adapter II SlimLine
communication error.
1. Run the Configuration/Setup Utility program,
select Load Default Settings, and save the
settings.
2. Reflash or update firmware for the Remote
Supervisor Adapter II SlimLine.
3. (Trained service technician only) Replace the
system board.
1603 Remote Supervisor Adapter II SlimLine
firmware needs to be updated.
Reflash or update firmware for the Remote
Supervisor Adapter II SlimLine.
1762 Hard drive configuration error.
1. Run the hard disk drive diagnostics tests on drive
x.
2. Reseat the following components:
a. Hard disk drive
b. Hard disk drive backplane cable or backplate
cables
3. Run the Configuration/Setup Utility program,
select Load Default Settings, and save the
settings.
4. Replace the following components one at a time,
in the order shown, restarting the server each
time:
a. Hard disk drive
b. Hard disk drive backplate
c. (Trained service technician only) System
board
178x Hard drive error.
Note: x is the drive that has the error.
1. Run the hard disk drive diagnostics tests on drive
x.
2. Reseat the following components:
a. Hard disk drive
b. Hard disk drive backplane cable or backplate
cables
3. Replace the following components one at a time,
in the order shown, restarting the server each
time:
a. Hard disk drive
b. Hard disk drive backplate
c. (Trained service technician only) System
board
1800 Unavailable PCI hardware interrupt.
1. Run the Configuration/Setup Utility program and
adjust the adapter settings.
2. Remove each adapter one at a time, restarting
the server each time, until the problem is isolated.
1801 An adapter has requested memory
resources that are not available.
1. Run the Configuration/Setup Utility program and
verify that sufficient memory is installed in the
server.
2. Run the Configuration/Setup Utility program and
disable some other resources to make more
space available.
3. Replace the following components one at a time,
in the order shown, restarting the server each
time:
a. Each adapter
b. (Trained service technician only) System
board
1962 A hard disk drive does not contain a valid
boot sector.
1. Make sure that a startable operating system is
installed.
2. Run the hard disk drive diagnostic tests.
3. Reseat the following components:
a. Hard disk drive
b. Hard disk drive backplane cable or backplate
cables
4. Replace the following components one at a time,
in the order shown, restarting the server each
time:
a. (Hot-swap models) Hard disk drive cables
b. Hard disk drive
c. Hard disk drive backplane or backplate
d. (Trained service technician only) System
board
2400 Video controller test failure.
1. Optional video adapter (if installed)
2. (Trained service technician only) System board
2462 Video memory configuration error.
1. Optional video adapter (if installed)
2. (Trained service technician only) System board
5962 IDE CD-ROM configuration error.
1. Run the Configuration/Setup Utility program,
select Load Default Settings, and save the
settings.
2. Reseat the following components:
a. CD-RW/DVD drive
b. CD-RW/DVD drive cable
c. System-board battery
3. Replace the following components one at a time,
in the order shown, restarting the server each
time:
a. CD-RW/DVD drive
b. CD-RW/DVD drive cable
c. System-board battery
d. (Trained service technician only) System
board
8603 Pointing device error.
1. Reseat the pointing-device cable.
2. If you are using an external USB hub, disconnect
the pointing device from the hub and connect it
directly to the server.
3. Replace the pointing device.
4. (Trained service technician only) Replace the
system board.
00012000 Microprocessor machine check error.
1. (Trained service technician only) Reseat the
following components:
a. Microprocessor 1
b. Microprocessor 2 (if present)
2. (Trained service technician only) Replace the
following components one at a time, in the order
shown, restarting the server each time:
a. Microprocessor 1
b. Microprocessor 2 (if present)
c. System board
00019501 Microprocessor 1 not functioning. (Trained service technician only) Replace the
following components one at a time, in the order
shown, restarting the server each time:
1. Microprocessor 1
2. System board
00019502 Microprocessor 2 not functioning. (Trained service technician only) Replace the
following components one at a time, in the order
shown, restarting the server each time:
1. Microprocessor 2 (if present)
2. System board
00019701 Microprocessor 1 failed BIST.
1. (Trained service technician only) Reseat
microprocessor 1.
2. (Trained service technician only) Replace the
following components one at a time, in the order
shown, restarting the server each time:
a. Microprocessor 1
b. System board
00019702 Microprocessor 2 failed BIST.
1. (Trained service technician only) Reseat
microprocessor 2 (if present).
2. (Trained service technician only) Replace the
following components one at a time, in the order
shown, restarting the server each time:
a. Microprocessor 2 (if present)
b. System board
00180100 No room for PCI option ROM.
1. Run the Configuration/Setup Utility program,
select Load Default Settings, and save the
settings.
2. Remove the PCI adapters and riser cards, one at
a time, until the problem is isolated.
3. (Trained service technician only) Replace the
system board.
00180200 No more I/O space available for PCI
adapter.
1. Run the Configuration/Setup Utility program,
select Load Default Settings, and save the
settings.
2. Remove the PCI adapters and riser cards, one at
a time, until the problem is isolated.
3. (Trained service technician only) Replace the
system board.
00180300 No more memory above 1 MB for PCI
adapter.
1. Run the Configuration/Setup Utility program,
select Load Default Settings, and save the
settings.
2. Remove the PCI adapters and riser cards, one at
a time, until the problem is isolated.
3. (Trained service technician only) Replace the
system board.
00180400 No more memory below 1 MB for PCI
adapter.
1. Run the Configuration/Setup Utility program,
select Load Default Settings, and save the
settings.
2. Remove the PCI adapters and riser cards, one at
a time, until the problem is isolated.
3. (Trained service technician only) Replace the
system board.
00180500 PCI option ROM checksum error.
1. Reseat each of the installed PCI adapters and
riser cards.
2. Replace each of the installed PCI adapters,
restarting the server each time.
3. (Trained service technician only) Replace the
system board.
00180600 PCI device BIST failure.
1. Run the Configuration/Setup Utility program,
select Load Default Settings, and save the
settings.
2. Reseat each installed PCI adapter and riser card.
3. Replace each installed PCI adapter, restarting the
server each time.
4. (Trained service technician only) Replace the
system board.
00180700 PCI device not responding.
1. Run the Configuration/Setup Utility program,
select Load Default Settings, make sure that
installed PCI devices are enabled, and save the
settings.
2. Reseat each installed PCI adapter and riser card.
3. (Trained service technician only) Replace the
system board.
4. Replace each installed PCI adapter, restarting the
server each time.
00180800 Unsupported PCI device installed.
1. Run the Configuration/Setup Utility program,
select Load Default Settings, make sure that
installed PCI devices are enabled, and save the
settings.
2. Reseat each installed PCI adapter and riser card.
3. Replace each installed PCI adapter, restarting the
server each time.
4. (Trained service technician only) Replace the
system board.
01298001 No update data for microprocessor 1.
1. Update the BIOS code again (see “Recovering
the BIOS code” on page 76).
2. (Trained service technician only) Replace
microprocessor 1.
01298002 No update data for microprocessor 2.
1. Update the BIOS code again (see “Recovering
the BIOS code” on page 76).
2. (Trained service technician only) Replace
microprocessor 2 (if present).
01298101 Bad update data for microprocessor 1.
1. Update the BIOS code again (see “Recovering
the BIOS code” on page 76).
2. (Trained service technician only) Replace
microprocessor 1.
01298102 Bad update data for microprocessor 2.
1. Update the BIOS code again (see “Recovering
the BIOS code” on page 76).
2. (Trained service technician only) Replace
microprocessor 2 (if present).
I9990301 Hard disk drive boot sector error.
1. Reseat the following components:
a. Hard disk drive
b. Hard disk drive backplane cable or backplate
cables
2. Replace the following components one at a time,
in the order shown, restarting the server each
time:
a. Hard disk drive backplane cable or backplate
cables
b. Hard disk drive
c. Hard disk drive backplane or backplate
d. (Trained service technician only) System
board
I9990305 Operating system not found. Run the Configuration/Setup Utility program to make
sure that a bootable operating system is installed on
one or more devices that are listed in the boot order.
I9990650 AC power has been restored.
1. Check the power cables.
2. Check for interruption of the ac power supply.
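When POST codes are recorded in a script or spreadsheet, the table above can be treated as a lookup keyed by the displayed code. The following Python sketch is one way to do that, assuming codes are stored as strings so that leading zeros (062) and letter prefixes (I9990305) survive, and that a lowercase x in a table entry (11xx, 178x) stands for any single digit. Only a handful of rows are copied in; the dictionary and function names are illustrative and are not part of any IBM tool.

```python
from typing import Optional

# A few entries copied from the table above; the full table has many more.
POST_ERROR_CODES = {
    "062": "Three consecutive boot failures using the default configuration.",
    "161": "Real-time clock battery error.",
    "201": "Memory test error.",
    "11xx": "Serial port configuration error.",
    "178x": "Hard drive error.",
    "00180500": "PCI option ROM checksum error.",
    "I9990305": "Operating system not found.",
}


def matches(pattern: str, code: str) -> bool:
    """Return True if code fits pattern; a lowercase 'x' matches any one digit."""
    if len(pattern) != len(code):
        return False
    return all(p == c or (p == "x" and c.isdigit()) for p, c in zip(pattern, code))


def describe_post_code(code: str) -> Optional[str]:
    """Look up a displayed POST code, preferring an exact match over a pattern."""
    if code in POST_ERROR_CODES:
        return POST_ERROR_CODES[code]
    for pattern, description in POST_ERROR_CODES.items():
        if matches(pattern, code):
            return description
    return None  # code is not in this partial copy of the table


print(describe_post_code("1782"))
print(describe_post_code("062"))
```

For a 178x code, the final digit is the drive that has the error, so a caller can read it from the last character of the code after a successful lookup.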
Checkout procedure
The checkout procedure is the sequence of tasks that you must follow to diagnose
a problem in the server.
About the checkout procedure
Before performing the checkout procedure for diagnosing hardware problems,
review the following information:
• Read the safety information that begins on page vii.
• The diagnostic programs provide the primary methods of testing the major
components of the server, such as the system board, Ethernet controller,
keyboard, mouse (pointing device), serial ports, and hard disk drives. You can
also use them to test some external devices. If you are not sure whether a
problem is caused by the hardware or by the software, you can use the
diagnostic programs to confirm that the hardware is working correctly.
• When you run the diagnostic programs, a single problem might cause more than
one error message. When this happens, correct the cause of the first error
message. The other error messages usually will not occur the next time you run
the diagnostic programs.
Exception: If there are multiple error codes or light path diagnostics LEDs that
indicate a microprocessor error, the error might be in the microprocessor or in the
microprocessor socket. See “Microprocessor problems” on page 48 for
information about diagnosing microprocessor problems.
• Before running the diagnostic programs, you must determine whether the failing
server is part of a shared hard disk drive cluster (two or more servers sharing
external storage devices). If it is part of a cluster, you can run all diagnostic
programs except the ones that test the storage unit (that is, a hard disk drive in
the storage unit) or the storage adapter that is attached to the storage unit. The
failing server might be part of a cluster if any of the following conditions is true:
– You have identified the failing server as part of a cluster (two or more servers
sharing external storage devices).
– One or more external storage units are attached to the failing server and at
least one of the attached storage units is also attached to another server or
unidentifiable device.
– One or more servers are located near the failing server.
Important: If the server is part of a shared hard disk drive cluster, run one test
at a time. Do not run any suite of tests, such as “quick” or “normal” tests,
because this might enable the hard disk drive diagnostic tests.
• If the server is halted and a POST error code is displayed, see “Error logs” on
page 26. If the server is halted and no error message is displayed, see
“Troubleshooting tables” on page 43 and “Solving undetermined problems” on
page 86.
• For information about power-supply problems, see “Solving power problems” on
page 84.
• For intermittent problems, check the error log; see “Error logs” on page 26 and
“Diagnostic programs, messages, and error codes” on page 61.
Performing the checkout procedure
To perform the checkout procedure, complete the following steps:
1. Is the server part of a cluster?
• No: Go to step 2.
• Yes: Shut down all failing servers that are related to the cluster. Go to step 2.
2. Complete the following steps:
a. Check the power supply LEDs (see “Power problems” on page 52).
b. Turn off the server and all external devices.
c. Check all internal and external devices for compatibility.
d. Check all cables and power cords.
e. Make sure the server is cabled correctly.
f. Set all display controls to the middle positions.
g. Turn on all external devices.
h. Turn on the server. If the server does not start, see “Troubleshooting tables”
on page 43.
i. Check the system-error LED on the operator information panel. If it is
flashing, check the light path diagnostics LEDs (see “Light path diagnostics”
on page 56).
j. Check for the following results:
• Successful completion of POST (see “POST” on page 19 for more
information)
• Successful completion of startup
3. Did one or more beeps sound?
• No: Find the failure symptom in “Troubleshooting tables” on page 43; if
necessary, run the diagnostic programs (see “Running the diagnostic
programs” on page 62).
– If you receive an error, see “Diagnostic error codes” on page 64.
– If the diagnostic programs were completed successfully and you still
suspect a problem, see “Solving undetermined problems” on page 86.
• Yes: Find the beep code in “POST beep codes” on page 19; if necessary,
see “Solving undetermined problems” on page 86.
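The branching in the checkout procedure can be summarized as a short decision function. The following Python sketch is a reading aid only: the function name, its three yes/no inputs, and the wording of the returned steps are assumptions made for illustration, and it condenses the step 2 checks into a single entry.

```python
from typing import List


def checkout_plan(in_cluster: bool, server_starts: bool, beeps_sounded: bool) -> List[str]:
    """Return the ordered actions the checkout procedure leads to."""
    plan = []
    # Step 1: a clustered server is shut down first; either answer continues to step 2.
    if in_cluster:
        plan.append("Shut down all failing servers that are related to the cluster")
    # Step 2: physical checks, then power on the external devices and the server.
    plan.append("Complete the step 2 checks and turn on the server")
    if not server_starts:
        plan.append("See 'Troubleshooting tables'")
        return plan
    # Step 3: the presence or absence of beeps selects the next reference.
    if beeps_sounded:
        plan.append("Find the beep code in 'POST beep codes'")
    else:
        plan.append("Find the failure symptom in 'Troubleshooting tables'")
    return plan


for step in checkout_plan(in_cluster=True, server_starts=True, beeps_sounded=False):
    print(step)
```

Both step 3 branches can still end at “Solving undetermined problems” if the referenced section does not resolve the fault; the sketch stops at the first reference for brevity.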
Troubleshooting tables
Use the troubleshooting tables to find solutions to problems that have identifiable
symptoms.
If you cannot find the problem in these tables, see “Running the diagnostic
programs” on page 62 for information about testing the server.
If you have just added new software or a new optional device and the server is not
working, complete the following steps before using the troubleshooting tables:
1. Check the system-error LED on the operator information panel; if it is lit, check
the light path diagnostics LEDs (see “Light path diagnostics” on page 56).
2. Remove the software or device that you just added.
3. Run the diagnostic tests to determine whether the server is running correctly.
4. Reinstall the new software or new device.
CD-RW/DVD drive problems
• Follow the suggested actions in the order in which they are listed in the Action column until the problem
is solved.
• See Chapter 3, “Parts listing, Type 7978 and 1913 server,” on page 89 to determine which components are
customer replaceable units (CRU) and which components are field replaceable units (FRU).
• If an action step is preceded by “(Trained service technician only),” that step must be performed only by a
trained service technician.
Symptom Action
The CD-RW/DVD drive is not recognized.
1. Make sure that:
• The IDE channel to which the CD-RW/DVD drive is attached (primary) is
enabled in the Configuration/Setup Utility program.
• All cables and jumpers are installed correctly.
• The pins on the cables are not bent.
• The correct device driver is installed for the CD-RW/DVD drive.
2. Run the CD-RW/DVD drive diagnostic programs.
3. Reseat the following components:
a. CD-RW/DVD drive
b. CD-RW/DVD drive cable
4. Replace the following components one at a time, in the order shown, restarting
the server each time:
a. CD-RW/DVD drive
b. CD-RW/DVD drive cable
c. (Trained service technician only) System board
The CD-RW/DVD is not working correctly.
1. Clean the CD-RW/DVD drive.
2. Run the CD-RW/DVD drive diagnostic programs.
3. Check the signal cable for bent pins.
4. Reseat the following components:
a. CD-RW/DVD drive
b. CD-RW/DVD drive cable
5. Replace the CD-RW/DVD drive.
Symptom: The CD-RW/DVD drive tray is not working.
Action:
1. Make sure that the server is turned on.
2. Insert the end of a straightened paper clip into the manual tray-release
opening.
3. Reseat the CD-RW/DVD drive cable.
4. Replace the CD-RW/DVD drive.
General problems
Symptom: An LED is not working, or a similar problem has occurred.
Action: If the part is a CRU, replace it. If the part is a FRU, the part must be replaced by a
trained service technician.
Hard disk drive problems
Symptom: Not all drives are recognized by the hard disk drive diagnostic test.
Action: Remove the drive that is indicated by the diagnostic tests; then, run the hard disk
drive diagnostic test again. If the remaining drives are recognized, replace the drive
that you removed with a new one.
Symptom: The server stops responding during the hard disk drive diagnostic test.
Action: Remove the hard disk drive that was being tested when the server stopped
responding, and run the diagnostic test again. If the hard disk drive diagnostic test
runs successfully, replace the drive that you removed with a new one.
Symptom: A hard disk drive was not detected while the operating system was being started.
Action: Reseat all hard disk drives and cables; then, run the hard disk drive diagnostic
tests again.
Symptom: A hard disk drive passes the diagnostic Fixed Disk Test, but the problem remains.
Action: Run the diagnostic for SCSI Attached Disks (see “Running the diagnostic
programs” on page 62).
Note: This test is not available on server models that use any of the available
optional RAID controllers. For these server models, check the system error log for
RAID device errors (see “Error logs” on page 26) and use the RAID device utilities
to confirm correct disk drive setup (“Using the Configuration/Setup Utility program”
on page 151).
Symptom: A hard disk drive that you are installing does not fit correctly in the cage.
Action: Make sure that the type of drive is correct for this server (see Chapter 3, “Parts
listing, Type 7978 and 1913 server,” on page 89).
Intermittent problems
Symptom: A problem occurs only occasionally and is difficult to diagnose.
Action:
1. Make sure that:
v All cables and cords are connected securely to the rear of the server and
attached devices.
v There is adequate cooling airflow. Reduced airflow due to a failed fan or an
internal or external obstruction can cause the server to overheat and shut
down.
2. Check the system-error log or BMC log (see “Error logs” on page 26).
3. See “Solving undetermined problems” on page 86.
If the problem remains, call for service.
Symptom: The server resets (restarts) occasionally.
Action:
1. If the reset occurs during POST and the POST watchdog timer is enabled (click
Advanced Setup --> Baseboard Management Controller (BMC) Setting -->
BMC Post Watchdog in the Configuration/Setup Utility program to see the
POST watchdog setting), make sure that sufficient time is allowed in the
watchdog timeout value (BMC POST Watchdog Timeout). See the User’s
Guide for information about the settings in the Configuration/Setup Utility
program. If the server continues to reset during POST, see “POST” on page 19
and “Diagnostic programs, messages, and error codes” on page 61.
2. If the reset occurs after the operating system starts, disable any automatic
server restart (ASR) utilities, such as the IBM Automatic Server Restart IPMI
Application for Windows, or ASR devices that may be installed.
Note: ASR utilities operate as operating-system utilities and are related to the
IPMI device driver. If the reset continues to occur after the operating system
starts, the operating system might have a problem; see “Software problems” on
page 55.
3. If neither condition applies, check the system-error log or BMC log (see “Error
logs” on page 26).
If the problem remains, call for service.
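The branching in the actions for occasional resets can be summarized as a small decision function. This is only an illustrative sketch: the function name and its three boolean inputs are hypothetical and are not part of any IBM utility. It returns the first action that the table suggests for a given situation.

```python
def reset_triage(during_post: bool, watchdog_enabled: bool, after_os_start: bool) -> str:
    """Return the first suggested action for a server that resets occasionally.

    The three flags are hypothetical observations made by the person
    servicing the server; they are not read from the hardware.
    """
    if during_post and watchdog_enabled:
        # Step 1: the POST watchdog might be timing out too early.
        return "Allow sufficient time in the BMC POST Watchdog Timeout value"
    if after_os_start:
        # Step 2: an ASR utility or device might be restarting the server.
        return "Disable automatic server restart (ASR) utilities and devices"
    # Step 3: neither condition applies.
    return "Check the system-error log or BMC log"


if __name__ == "__main__":
    print(reset_triage(during_post=True, watchdog_enabled=True, after_os_start=False))
```

If the suggested action does not solve the problem, continue with the remaining steps in the table and, if the problem remains, call for service.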
USB keyboard, mouse, or pointing-device problems
Symptom: All or some keys on the keyboard do not work.
Action:
1. Run the Configuration/Setup Utility program and enable keyboardless operation
to prevent the POST error message 301 from being displayed during startup.
2. Make sure that:
v The keyboard is compatible with the server.
v The keyboard cable is securely connected.
v The server and the monitor are turned on.
3. Reseat the keyboard cable.
4. If you are using an external USB hub, disconnect the keyboard from the hub
and connect it directly to the server.
5. Replace the following components one at a time, in the order shown, restarting
the server each time:
a. Keyboard
b. (Trained service technician only) System board
Symptom: The USB mouse or USB pointing device does not work.
Action:
1. Make sure that:
v The mouse is compatible with the server.
v The mouse or pointing-device USB cable is securely connected to the
server, and the keyboard and the device drivers are installed correctly.
v The server and the monitor are turned on.
v Keyboardless operation has been enabled in the Configuration/Setup Utility
program.
2. Reseat the mouse or pointing-device cable.
3. If you are using an external USB hub, disconnect the mouse or pointing device
from the hub and connect it directly to the server.
4. Replace the following components one at a time, in the order shown, restarting
the server each time:
a. Mouse or pointing device
b. (Trained service technician only) System board
Memory problems
Symptom: The amount of system memory that is displayed is less than the amount of
installed physical memory.
Action:
1. Make sure that:
v No light path diagnostics LEDs are lit on the operator information panel.
v Memory mirroring or sparing does not account for the discrepancy.
v The DIMMs are seated correctly.
v You have installed the correct type of memory. See “Installing a memory
module” on page 110.
v If you changed the memory, you updated the memory configuration in the
Configuration/Setup Utility program.
v All banks of memory are enabled. The server might have automatically
disabled a memory bank when it detected a problem, or a memory bank
might have been manually disabled.
2. Check the POST error log for error message 289:
v If a DIMM was disabled by a system-management interrupt (SMI), replace
the DIMM.
v If a DIMM was disabled by the user or by POST, run the Configuration/Setup
Utility program and enable the DIMM.
3. Run memory diagnostics (see “Running the diagnostic programs” on page 62).
4. Add one pair of DIMMs at a time, making sure that the DIMMs in each pair are
matching. Install the DIMMs in the sequence described in “Installing a memory
module” on page 110.
5. Reseat the DIMMs. See “Installing a memory module” on page 110.
6. Replace the following components one at a time, in the order shown, restarting
the server each time:
a. DIMMs
b. (Trained service technician only) System board
Symptom: Multiple rows of DIMMs in a branch are identified as failing.
Action:
1. Reseat the DIMMs; then, restart the server.
2. If the DIMM was disabled by the user or by POST, run the Configuration/Setup
Utility program and enable the DIMM.
3. Replace the lowest-numbered DIMM pair of those identified; then, restart the
server. Repeat as necessary.
4. (Trained service technician only) Replace the system board.
Microprocessor problems
Symptom: The server emits a continuous beep during POST, indicating that the startup
(boot) microprocessor is not working correctly.
Action:
1. Correct any errors that are indicated by the light path diagnostics LEDs (see
“Light path diagnostics” on page 56).
2. Make sure that the server supports the microprocessor.
3. (Trained service technician only) Make sure that the microprocessor is seated
correctly.
4. (Trained service technician only) Replace the microprocessor.
Monitor problems
Some IBM monitors have their own self-tests. If you suspect a problem with your
monitor, see the documentation that comes with the monitor for instructions for
testing and adjusting the monitor. If you cannot diagnose the problem, call for
service.
Symptom: Testing the monitor.
Action:
1. Make sure that the monitor cables are firmly connected.
2. Try using a different monitor on the server, or try testing the monitor on a
different server.
3. Run the diagnostic programs. If the monitor passes the diagnostic programs,
the problem might be a video device driver.
4. Reseat the Remote Supervisor Adapter II SlimLine (if one is present).
5. Replace the following components one at a time, in the order shown, restarting
the server each time:
a. Remote Supervisor Adapter II SlimLine (if one is present)
b. (Trained service technician only) System board
Symptom: The screen is blank.
Action:
1. If an external USB hub is in use, disconnect the monitor from the hub and
connect it directly to the server.
2. Make sure that:
v The server is turned on. If there is no power to the server, see “Power
problems” on page 52.
v The monitor cables are connected correctly.
v The monitor is turned on and the brightness and contrast controls are
adjusted correctly.
v No beep codes sound when the server is turned on.
Important: In some memory configurations, the 3-3-3 beep code might sound
during POST, followed by a blank monitor screen. If this occurs and the Boot
Fail Count option in the Start Options of the Configuration/Setup Utility
program is enabled, you must restart the server three times to reset the
configuration settings to the default configuration (the memory connector or
bank of connectors enabled).
3. Make sure that the correct server is controlling the monitor, if applicable.
4. Make sure that damaged BIOS code is not affecting the video; see “Recovering
the BIOS code” on page 76.
5. See “Solving undetermined problems” on page 86.
Symptom: The monitor works when you turn on the server, but the screen goes blank
when you start some application programs.
Action:
1. Make sure that:
v The application program is not setting a display mode that is higher than the
capability of the monitor.
v You installed the necessary device drivers for the application.
2. Run video diagnostics (see “Running the diagnostic programs” on page 62).
v If the server passes the video diagnostics, the video is good; see “Solving
undetermined problems” on page 86.
v (Trained service technician only) If the server fails the video diagnostics,
replace the system board.
Symptom: The monitor has screen jitter, or the screen image is wavy, unreadable,
rolling, or distorted.
Action:
1. If the monitor self-tests show that the monitor is working correctly, consider the
location of the monitor. Magnetic fields around other devices (such as
transformers, appliances, fluorescent lights, and other monitors) can cause
screen jitter or wavy, unreadable, rolling, or distorted screen images. If this
happens, turn off the monitor.
Attention: Moving a color monitor while it is turned on might cause screen
discoloration.
Move the device and the monitor at least 305 mm (12 in.) apart, and turn on
the monitor.
Notes:
a. To prevent diskette drive read/write errors, make sure that the distance
between the monitor and any external diskette drive is at least 76 mm (3
in.).
b. Non-IBM monitor cables might cause unpredictable problems.
2. Reseat the following components:
v Monitor cable
v Remote Supervisor Adapter II SlimLine (if one is present)
3. Replace the following components one at a time, in the order shown, restarting
the server each time:
a. Monitor cable
b. Monitor
c. Remote Supervisor Adapter II SlimLine (if one is present)
d. (Trained service technician only) System board
Symptom: Wrong characters appear on the screen.
Action:
1. If the wrong language is displayed, update the BIOS code (see “Updating the
firmware” on page 149) with the correct language.
2. Reseat the monitor cable.
3. Replace the following components one at a time, in the order shown, restarting
the server each time:
a. Monitor
b. (Trained service technician only) System board
Optional-device problems
Symptom: An IBM optional device that was just installed does not work.
Action:
1. Make sure that:
v The device is designed for the server (see
http://www.ibm.com/servers/eserver/serverproven/compat/us/).
v You followed the installation instructions that came with the device and the
device is installed correctly.
v You have not loosened any other installed devices or cables.
v You updated the configuration information in the Configuration/Setup Utility
program. Whenever memory or any other device is changed, you must
update the configuration.
2. Reseat the device that you just installed.
3. Replace the device that you just installed.
Symptom: An IBM optional device that worked previously does not work now.
Action:
1. Make sure that all of the hardware and cable connections for the device are
secure.
2. If the device comes with test instructions, use those instructions to test the
device.
3. Reseat the failing device.
4. Replace the failing device.
Power problems
Symptom: The power-control button does not work, and the reset button does work (the
server does not start).
Note: The power-control button will not function until 20 seconds after the server has
been connected to ac power.
Action:
1. Make sure that the power-control button is working correctly:
a. Disconnect the server power cords.
b. Reconnect the power cords.
c. Press the power-control button.
d. If the server does not start, disconnect the server power cords and reseat
the operator information panel cables; then, repeat steps 1a through 1c. If
the problem remains, replace the operator information panel.
2. Make sure that:
v The power cords are correctly connected to the server and to a working
electrical outlet.
v The server contains the correct type of DIMMs.
v The DIMMs are correctly seated.
v (Trained service technician only) The microprocessor is correctly installed.
3. If you just installed an optional device, remove it, and restart the server. If the
server now turns on, you might have installed more devices than the power
supply supports.
4. Reseat the following components:
a. DIMMs
b. (Trained service technician only) Power backplane
5. Replace the following components one at a time, in the order shown, restarting
the server each time:
a. DIMMs
b. Power supply
c. (Trained service technician only) Power backplane
d. (Trained service technician only) System board
6. See “Solving undetermined problems” on page 86.
Symptom: The server does not start.
Action: Check the four 12-volt power LEDs (A, B, C, and D) on the system board. See
“Internal LEDs, connectors, and jumpers” on page 9 for the LED locations.
1. If the Channel A power LED is lit, check components in the following order.
a. Remove all PCI adapters and riser cards. Try restarting the server. If the
server starts, reinstall the PCI adapters and riser cards, one at a time, to
isolate the defective adapter.
b. (Trained service technician only) System board
c. (Trained service technician only) Power backplane
2. If the Channel B power LED is lit, check components in the following order.
a. Fans 1 and 2
b. (Trained service technician only) Remove microprocessor 2 (if present). Try
restarting the server.
c. (Trained service technician only) System board
d. (Trained service technician only) Power backplane
3. If the Channel C power LED is lit, check components in the following order.
a. Fans 3 and 4
b. (Trained service technician only) System board
c. (Trained service technician only) Power backplane
d. (Trained service technician only) Microprocessor 1
4. If the Channel D power LED is lit, check components in the following order.
a. Remove all DIMMs. Try restarting the server, listening for any memory error
beep codes. If the server restarts, reinstall the DIMMs, one pair at a time, to
isolate the defective DIMM (see “Installing a memory module” on page 110).
b. Fans 5 and 6
c. (Trained service technician only) System board
d. (Trained service technician only) Power backplane
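The channel-to-component check order above can be kept as a lookup table, for example in a service checklist script. The following is a minimal sketch that only restates the table; the names CHECK_ORDER, TECHNICIAN_ONLY, and checks_for are hypothetical, and nothing here reads the LEDs from the server.

```python
# Check order for each lit 12-volt channel power LED, as listed in the table.
CHECK_ORDER = {
    "A": ["PCI adapters and riser cards", "System board", "Power backplane"],
    "B": ["Fans 1 and 2", "Microprocessor 2 (if present)", "System board", "Power backplane"],
    "C": ["Fans 3 and 4", "System board", "Power backplane", "Microprocessor 1"],
    "D": ["DIMMs", "Fans 5 and 6", "System board", "Power backplane"],
}

# Items that the table marks "(Trained service technician only)".
TECHNICIAN_ONLY = {
    "System board",
    "Power backplane",
    "Microprocessor 1",
    "Microprocessor 2 (if present)",
}


def checks_for(lit_channels):
    """Return (channel, component, technician_only) tuples in check order."""
    plan = []
    for channel in sorted(c.upper() for c in lit_channels):
        for component in CHECK_ORDER[channel]:
            plan.append((channel, component, component in TECHNICIAN_ONLY))
    return plan


if __name__ == "__main__":
    for channel, component, tech_only in checks_for(["D"]):
        prefix = "(Trained service technician only) " if tech_only else ""
        print(f"Channel {channel}: {prefix}{component}")
```

The observed LED state is entered by hand; the sketch only orders the checks and flags the steps that must be left to a trained service technician.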
Symptom: The server does not turn off.
Action:
1. Determine whether you are using an Advanced Configuration and Power
Interface (ACPI) or a non-ACPI operating system. If you are using a non-ACPI
operating system, complete the following steps:
a. Press Ctrl+Alt+Delete.
b. Turn off the server by pressing the power-control button for 5 seconds.
c. Restart the server.
d. If the server fails POST and the power-control button does not work,
disconnect the ac power cord for 20 seconds; then, reconnect the ac power
cord and restart the server.
Symptom: The server unexpectedly shuts down, and the LEDs on the operator information
panel are not lit.
Action: See “Solving undetermined problems” on page 86.
Serial port problems
Symptom: The number of serial ports that are identified by the operating system is less
than the number of installed serial ports.
Action:
1. Make sure that:
v Each port is assigned a unique address in the Configuration/Setup Utility
program and none of the serial ports is disabled.
v The serial-port adapter (if one is present) is seated correctly.
2. Reseat the serial port adapter.
3. Replace the serial port adapter.
Symptom: A serial device does not work.
Action:
1. Make sure that:
v The device is compatible with the server.
v The serial port is enabled and is assigned a unique address.
v The device is connected to the correct connector (see “Internal LEDs,
connectors, and jumpers” on page 9).
2. Reseat the following components:
a. Failing serial device
b. Serial cable
c. Remote Supervisor Adapter II SlimLine (if one is present)
3. Replace the following components one at a time, in the order shown, restarting
the server each time:
a. Failing serial device
b. Serial cable
c. Remote Supervisor Adapter II SlimLine (if one is present)
d. (Trained service technician only) System board
ServerGuide problems
Symptom: The ServerGuide™ Setup and Installation CD will not start.
Action:
1. Make sure that the server supports the ServerGuide program and has a
startable (bootable) CD-RW/DVD drive.
2. If the startup (boot) sequence settings have been changed, make sure that the
CD-RW/DVD drive is first in the startup sequence.
3. If more than one CD-RW/DVD drive is installed, make sure that only one drive
is set as the primary drive. Start the CD from the primary drive.
Symptom: The ServeRAID program cannot view all installed drives, or the operating
system cannot be installed.
Action:
1. Make sure that there are no duplicate IRQ assignments.
2. Make sure that the hard disk drive is connected correctly.
3. Make sure that the hard disk drive cables are securely connected.
Symptom: The operating-system installation program continuously loops.
Action: Make more space available on the hard disk.
Symptom: The ServerGuide program will not start the operating-system CD.
Action: Make sure that the operating-system CD is supported by the ServerGuide program.
See the ServerGuide Setup and Installation CD label for a list of supported
operating-system versions.
Symptom: The operating system cannot be installed; the option is not available.
Action: Make sure that the server supports the operating system. If it does, no logical drive
is defined (RAID servers). Run the ServerGuide program and make sure that setup
is complete.
Software problems
Symptom: You suspect a software problem.
Action:
1. To determine whether the problem is caused by the software, make sure that:
v The server has the minimum memory that is needed to use the software. For
memory requirements, see the information that comes with the software. If
you have just installed an adapter or memory, the server might have a
memory-address conflict.
v The software is designed to operate on the server.
v Other software works on the server.
v The software works on another server.
2. If you received any error messages when using the software, see the
information that comes with the software for a description of the messages and
suggested solutions to the problem.
3. Contact your place of purchase of the software.
Universal Serial Bus (USB) port problems
Symptom: A USB device does not work.
Action:
1. Make sure that:
v The correct USB device driver is installed.
v The operating system supports USB devices.
2. Make sure that the USB configuration options are set correctly in the
Configuration/Setup Utility program (see the User’s Guide on the IBM System x
Documentation CD for more information).
3. If you are using an external USB hub, disconnect the USB device from the hub
and connect it directly to the server.
Video problems
See “Monitor problems” on page 49.
Light path diagnostics
Light path diagnostics is a system of LEDs on various external and internal
components of the server. When an error occurs, LEDs are lit throughout the
server. By viewing the LEDs in a particular order, you can often identify the source
of the error.
When LEDs are lit to indicate an error, they remain lit when the server is turned off,
provided that the server is still connected to power and the power supply is
operating correctly.
Before working inside the server to view light path diagnostics LEDs, read the
safety information that begins on page vii and “Handling static-sensitive
devices” on page 96.
If an error occurs, view the light path diagnostics LEDs in the following order:
1. Look at the operator information panel on the front of the server.
v If the information LED is lit, it indicates that information about a suboptimal
condition in the server is available in the BMC log or in the system-error log.
v If the system-error LED is lit, it indicates that an error has occurred; go to
step 2 on page 57.
The following illustration shows the operator information panel.
[Illustration: operator information panel. Labeled items: power-on LED, hard drive
activity LED, information LED, release latch, power-control button, system locator
LED, system-error LED.]
2. To view the light path diagnostics panel, slide the latch to the left on the front of
the light path diagnostics drawer. This reveals the light path diagnostics panel.
Lit LEDs on this panel indicate the type of error that has occurred.
The following illustration shows the light path diagnostics panel.
[Illustration: Light Path Diagnostics panel. Labeled LEDs: CPU, MEM, FAN, PCI, PS1,
SP, PS2, VRM, CNFG, NMI, S ERR, RAID, DASD, TEMP, BRD, OVER SPEC. Labeled button:
REMIND.]
Note any LEDs that are lit, and then close the drawer.
Look at the system service label on the top of the server, which gives an
overview of internal components that correspond to the LEDs on the light path
diagnostics panel. This information and the information in “Light path diagnostics
LEDs” on page 58 can often provide enough information to diagnose the error.
3. Remove the server cover and look inside the server for lit LEDs. A lit LED on or
beside a component identifies the component that is causing the error.
The following illustration shows the LEDs on the system board.
[Illustration: system-board LEDs. Labeled items: power-on LED; location LED;
system-error LED; PCI slot 1 and PCI slot 2 error LEDs; Remote Supervisor Adapter II
SlimLine error LED; BMC status LED; system-board fault LED; fan 1 through fan 6
error LEDs; power A, B, C, and D error LEDs; system-board battery error LED; DIMM 1
through DIMM 8 error LEDs; light path diagnostics active LED; light path diagnostics
switch; RAID error LED; microprocessor 1 and microprocessor 2 error LEDs.]
Remind button
You can use the remind button on the light path diagnostics panel to put the
system-error LED on the operator information panel into Remind mode. When you
press the remind button, you acknowledge the error but indicate that you will not
take immediate action. The system-error LED flashes while it is in Remind mode
and stays in Remind mode until one of the following conditions occurs:
v All known errors are corrected.
v The server is restarted.
v A new error occurs, causing the system-error LED to be lit again.
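The Remind-mode behavior described above can be read as a small state machine. The following Python sketch is illustrative only: the state names, the event names, and the `next_state` function are hypothetical and do not correspond to any IBM firmware or tool. It assumes that leaving Remind mode after a restart or after all errors are corrected turns the LED off, and that any error detected afterward arrives as a separate `new_error` event.

```python
from enum import Enum


class LedState(Enum):
    OFF = "off"
    LIT = "lit"           # an unacknowledged error is present
    REMIND = "flashing"   # error acknowledged, action deferred


def next_state(state, event):
    """Return the system-error LED state after an event (hypothetical model)."""
    if event == "new_error":
        # A new error lights the LED again, even from Remind mode.
        return LedState.LIT
    if event == "remind_pressed" and state is LedState.LIT:
        return LedState.REMIND
    if event in ("errors_corrected", "restart") and state is LedState.REMIND:
        # Assumption: leaving Remind mode turns the LED off.
        return LedState.OFF
    return state
```

For example, `next_state(LedState.REMIND, "new_error")` returns `LedState.LIT`, which matches the third condition in the list above.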
Light path diagnostics switch
The light path diagnostics switch allows you to review error indications after the
server has been powered down. Press and hold the diagnostics switch, which is
located on the system board, to relight the LEDs that were lit before you removed
power from the server. The LEDs remain lit for as long as you press the switch, to a
maximum of 25 seconds.
Light path diagnostics LEDs
The following table describes the LEDs on the light path diagnostics panel and
suggested actions to correct the detected problems.
Note: Check the system-error log or BMC log for additional information before
replacing a FRU.
v Follow the suggested actions in the order in which they are listed in the Action column until the problem
is solved.
v See Chapter 3, “Parts listing, Type 7978 and 1913 server,” on page 89 to determine which components are
customer replaceable units (CRU) and which components are field replaceable units (FRU).
v If an action step is preceded by “(Trained service technician only),” that step must be performed only by a
trained service technician.
Lit light path
diagnostics LED with
the system-error or
information LED also
lit Description Action
None An error has occurred and cannot be
diagnosed, or the Advanced System
Management (ASM) processor on the
Remote Supervisor Adapter II SlimLine
has failed. The error is not represented
by a light path diagnostics LED.
Check the system error log for information about
the error.
OVER SPEC The power supplies are using more
power than their maximum rating.
Replace the failing power supply, or remove
optional devices from the server.
PS1 The power supply in bay 1 has failed. Replace the failed power supply.
PS2 The power supply in bay 2 has failed. Replace the failed power supply.
CPU A microprocessor has failed. Make sure that the failing microprocessor, which is
indicated by a lit LED on the system board, is
installed correctly. See “Installing a microprocessor”
on page 138 for information about installing a
microprocessor.
VRM Reserved. Reserved.
CNFG Microprocessor configuration error. v Check the microprocessor options for
compatibility.
v Check the system error log for information
indicating incompatible components.
MEM A memory error has occurred. Replace the failing DIMM, which is indicated by the
lit LED on the system board.
NMI A machine check error has occurred. Check the system error log for information about
the error.
S ERR Reserved
SP The service processor has failed. Remove ac power from the server; then, reconnect
the server to ac power and restart the server.
If a Remote Supervisor Adapter II SlimLine is
installed, replace it.
DASD A hard disk drive error has occurred. Check the LEDs on the hard disk drives and
replace the indicated drive.
BRD An error has occurred on the system
board.
v Check the LEDs on the system board to identify
the component that is causing the error.
v Check the system error log for information about
the error.
FAN A fan has failed, is operating too slowly,
or has been removed. A failing fan can
also cause the TEMP LED to be lit.
Replace the failing fan, which is indicated by a lit
LED near the fan connector on the system board.
TEMP The system temperature has exceeded
a threshold level.
v Determine whether a fan has failed. If it has,
replace it.
v Make sure that the room temperature is not too
high. See “Features and specifications” on page
3 for temperature information.
v Make sure that the air vents are not blocked.
RAID A RAID controller error has occurred. Check the system error log for information about
the error. If an optional RAID controller is installed,
see the documentation that comes with the RAID
controller.
PCI An error has occurred on a PCI bus or
on the system board. An additional LED
will be lit next to a failing PCI slot.
v Check the LEDs at the PCI slots to identify the
component that is causing the error.
v Check that the PCI riser assemblies are seated
correctly.
v Check the system error log for information about
the error.
v If you cannot isolate the failing adapter through
the LEDs and the information in the system error
log, remove one adapter at a time from the
failing PCI bus, and restart the server after each
adapter is removed.
Power-supply LEDs
The following minimum configuration is required for the DC LED on the power
supply to be lit:
v Power supply
v Power backplane
v Power cord
The following minimum configuration is required for the server to start:
v One microprocessor in microprocessor socket 1
v Two 512 MB DIMMs on the system board
v One power supply
v Power backplane
v Power cord
v Five cooling fans
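The start-up minimum listed above can be expressed as a simple inventory check. This is a minimal sketch, assuming a hypothetical inventory dictionary keyed by component name; the key names and the `missing_for_start` function are illustrative and are not part of any IBM utility.

```python
# Minimum configuration for the server to start, as listed above.
# The key names are hypothetical labels chosen for this sketch.
MIN_TO_START = {
    "microprocessor_socket_1": 1,
    "dimm_512mb": 2,
    "power_supply": 1,
    "power_backplane": 1,
    "power_cord": 1,
    "cooling_fan": 5,
}


def missing_for_start(installed):
    """Return {component: shortfall} for parts below the minimum count."""
    return {
        part: need - installed.get(part, 0)
        for part, need in MIN_TO_START.items()
        if installed.get(part, 0) < need
    }
```

An empty result means that the inventory meets the listed minimum; it does not mean that the installed parts are working.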
The following illustration shows the locations of the power-supply LEDs.
[Illustration: power-supply LED locations. Callouts: power connector, power-on LED, system-locator LED, system-error LED, AC power LED, DC power LED, Ethernet 1, Ethernet 2, PCI slot 1, PCI slot 2, USB 1, USB 2, systems-management Ethernet connector, serial connector, video connector.]
The following table describes the problems that are indicated by various
combinations of the power-supply LEDs and the power-on LED on the operator
information panel and suggested actions to correct the detected problems.
v Follow the suggested actions in the order in which they are listed in the Action column until the problem
is solved.
v See Chapter 3, “Parts listing, Type 7978 and 1913 server,” on page 89 to determine which components are
customer replaceable units (CRU) and which components are field replaceable units (FRU).
v If an action step is preceded by “(Trained service technician only),” that step must be performed only by a
trained service technician.
AC LED DC LED Operator information panel power-on LED Description Action
Off Off Off No power to the server, or a problem with the ac power source.
1. Check the ac power to the server.
2. Make sure that the power cord is connected to a
functioning power source.
3. Remove one power supply at a time.
Lit Off Off DC source power problem.
1. Remove one power supply at a time.
2. View the system-error log (see “Error logs” on page
26).
Lit Lit Off Standby power problem.
1. View the event log (see “Error logs” on page 26).
2. Remove one power supply at a time.
3. (Trained service technician only) Replace the power
backplane.
Lit Lit Flashing The power is good. The server is not powered on. No action is necessary.
Lit Lit Lit The power is good. The server is powered on. No action is necessary.
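The LED combinations in the table above lend themselves to a lookup keyed on the three LED states. The following Python sketch is an illustration only; the dictionary layout, the state strings, and the `describe_power_leds` function are assumptions made for this example, and only the descriptions are taken from the table.

```python
# (AC LED, DC LED, operator information panel power-on LED) -> description.
POWER_LED_STATES = {
    ("off", "off", "off"): "No power to the server, or a problem with the ac power source.",
    ("lit", "off", "off"): "DC source power problem.",
    ("lit", "lit", "off"): "Standby power problem.",
    ("lit", "lit", "flashing"): "The power is good. The server is not powered on.",
    ("lit", "lit", "lit"): "The power is good. The server is powered on.",
}


def describe_power_leds(ac, dc, power_on):
    """Return the table description for an LED combination (hypothetical helper)."""
    key = (ac.lower(), dc.lower(), power_on.lower())
    return POWER_LED_STATES.get(key, "Combination not listed in the table.")
```

A combination that the table does not list returns a fallback string rather than a guess at the cause.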
Diagnostic programs, messages, and error codes
The diagnostic programs are the primary method of testing the major components
of the server. As you run the diagnostic programs, text messages and error codes
are displayed on the screen and are saved in the test log. A diagnostic text
message or error code indicates that a problem has been detected; to determine
what action you should take as a result of a message or error code, see the table in
“Diagnostic error codes” on page 64.
Running the diagnostic programs
To run the diagnostic programs, complete the following steps:
1. Turn off the server and any peripheral devices.
2. Turn on all attached devices; then, turn on the server.
3. When the prompt F2 for Diagnostics appears, press F2.
Note: To run the diagnostic programs, you must start the server with the
highest level password that is set. That is, if an administrator password is set,
you must enter the administrator password, not the user password, to run the
diagnostic programs.
4. Type the applicable password; then, press Enter.
5. Select either Extended or Basic from the top of the screen.
6. From the diagnostic programs screen, select the test that you want to run, and
follow the instructions on the screen.
You can press F1 while running the diagnostic programs to obtain help information.
You also can press F1 from within a help screen to obtain online documentation
from which you can select different categories. To exit from the help information and
return to where you left off, press Esc.
If the server stops during testing and you cannot continue, restart the server and try
running the diagnostic programs again. If the problem remains, replace the
component that was being tested when the server stopped.
The keyboard and mouse (pointing device) tests assume that a keyboard and
mouse are attached to the server.
If you run the diagnostic programs with no mouse attached to the server, you will
not be able to navigate between test categories using the Next Cat and Prev Cat
buttons. All other functions provided by mouse-selectable buttons are also available
using the function keys.
You can test the USB keyboard by using the regular keyboard test. The regular
mouse test can test a USB mouse. Also, you can run the USB interface test only if
there are no USB devices attached.
You can view server configuration information (such as system configuration,
memory contents, interrupt request (IRQ) use, direct memory access (DMA) use,
device drivers, and so on) by selecting Hardware Info from the top of the screen.
When you are diagnosing hard disk drives, select SCSI Attached Disks for the
most thorough test. Select Fixed Disks for any of the following situations:
v You want to run a faster test.
v The server contains RAID arrays.
v The server contains simple-swap SATA hard disk drives.
To determine what action you should take as a result of a diagnostic text message
or error code, see the table in “Diagnostic error codes” on page 64.
If the diagnostic programs do not detect any hardware errors but the problem
remains during normal server operations, a software error might be the cause. If
you suspect a software problem, see the information that comes with your software.
A single problem might cause more than one error message. When this happens,
correct the cause of the first error message. The other error messages usually will
not occur the next time you run the diagnostic programs.
Exception: If there are multiple error codes or diagnostics LEDs that indicate a
microprocessor error, the error might be in a microprocessor or in a microprocessor
socket. See “Microprocessor problems” on page 48 for information about diagnosing
microprocessor problems.
Diagnostic text messages
Diagnostic text messages are displayed while the tests are running. A diagnostic
text message contains one of the following results:
Passed: The test was completed without any errors.
Failed: The test detected an error.
User Aborted: You stopped the test before it was completed.
Not Applicable: You attempted to test a device that is not present in the server.
Aborted: The test could not proceed because of the server configuration.
Warning: The test could not be run. There was no failure of the hardware that was
being tested, but there might be a hardware failure elsewhere, or another problem
prevented the test from running; for example, there might be a configuration
problem, or the hardware might be missing or is not being recognized.
The result is followed by an error code or other additional information about the
error.
Viewing the test log
To view the test log when the tests are completed, select Utility from the top of the
screen and then select View Test Log. The summary test log is displayed. To view
the detailed test log, press the Tab key while viewing the summary log.
The test-log data is maintained only while you are running the diagnostic programs.
When you exit from the diagnostic programs, the test log is cleared.
To save the test log to a file on a diskette or to the hard disk, click Save Log on the
diagnostic programs screen and specify a location and name for the saved log file.
Notes:
1. To create and use a diskette, you must add an optional external diskette drive to
the server before you turn it on.
2. To save the test log to a diskette, you must use a diskette that you have
formatted yourself; this function does not work with preformatted diskettes. If the
diskette has sufficient space for the test log, the diskette can contain other data.
Diagnostic error codes
The following table describes the error codes that the diagnostic programs might
generate and suggested actions to correct the detected problems.
If the diagnostic programs generate error codes that are not listed in the table,
make sure that the latest levels of BIOS, Remote Supervisor Adapter II SlimLine,
and ServeRAID code are installed.
In the error codes, x can be any numeral or letter. However, if the three-digit
number in the central position of the code is 000, 195, or 197, do not replace a
CRU or FRU. These numbers appearing in the central position of the code have the
following meanings:
000 The server passed the test. Do not replace a CRU or FRU.
195 The Esc key was pressed to end the test. Do not replace a CRU or FRU.
197 This is a warning error, but it does not indicate a hardware failure. Take
the action that is indicated in the Action column, but do not replace a
CRU or FRU. See the description of Warning in “Diagnostic text messages”
on page 63 for more information.
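The rule about the central three-digit field can be sketched in code. The following Python example is illustrative and is not part of the diagnostic programs; the function names and the regular expression are assumptions. It assumes that codes have three hyphen-separated fields of three characters each, and the `matches` helper handles only the x placeholder, not the s, n, or NNN placeholders that some table entries use.

```python
import re

# Central-field values for which no CRU or FRU should be replaced.
NO_REPLACE = {
    "000": "The server passed the test.",
    "195": "The Esc key was pressed to end the test.",
    "197": "Warning error; does not indicate a hardware failure.",
}

# Assumed shape: three fields of three letters or digits, separated by hyphens.
CODE_RE = re.compile(r"^([0-9A-Za-z]{3})-([0-9A-Za-z]{3})-([0-9A-Za-z]{3})$")


def central_field(code):
    """Return the middle field of a diagnostic error code."""
    m = CODE_RE.match(code.strip())
    if m is None:
        raise ValueError(f"not a three-field diagnostic code: {code!r}")
    return m.group(2)


def replacement_ruled_out(code):
    """True when the central field is 000, 195, or 197."""
    return central_field(code) in NO_REPLACE


def matches(pattern, code):
    """Match a table pattern such as '005-xxx-000', where x is any character."""
    return len(pattern) == len(code) and all(
        p == c or p == "x" for p, c in zip(pattern, code)
    )
```

For example, `replacement_ruled_out("180-197-000")` is true because the central field is 197, so the listed action applies but no CRU or FRU is replaced.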
v Follow the suggested actions in the order in which they are listed in the Action column until the problem
is solved.
v See Chapter 3, “Parts listing, Type 7978 and 1913 server,” on page 89 to determine which components are
customer replaceable units (CRU) and which components are field replaceable units (FRU).
v If an action step is preceded by “(Trained service technician only),” that step must be performed only by a
trained service technician.
Error code Description Action
001-250-000 Failed microprocessor board ECC.
1. Check the system-error log and the BMC log for
messages that indicate the cause of the error
(see “Error logs” on page 26).
2. From the diagnostic programs, run Quick Memory
Test All Banks (see “Running the diagnostic
programs” on page 62).
3. From the diagnostic programs, run the ECC test
again (see “Running the diagnostic programs” on
page 62).
4. (Trained service technician only) Replace the
system board.
001-xxx-000 Failed core tests. (Trained service technician only) Replace the system
board.
001-xxx-001 Failed core tests. (Trained service technician only) Replace the system
board.
001-292-000 Failed microprocessor board ECC. Load BIOS code defaults and run the test again.
005-xxx-000 Failed video test.
1. Reseat the optional video adapter, if one is
installed.
2. (Trained service technician only) Replace the
system board.
011-xxx-000 Failed COM1 serial port test.
1. Check the loopback plug that is connected to the
serial port.
2. (Trained service technician only) Replace the
system board.
011-xxx-001 Failed COM2 serial port test.
1. Check the loopback plug that is connected to the
serial port.
2. (Trained service technician only) Replace the
system board.
030-xxx-000 Failed internal SAS interface test. (Trained service technician only) Replace the system
board.
035-285-001 Adapter communication error.
1. Update the RAID controller firmware.
2. Reseat the RAID controller.
3. Replace the RAID controller.
035-286-001 Adapter CPU test error.
1. Update the RAID controller firmware.
2. Reseat the RAID controller.
3. Replace the RAID controller.
035-287-001 Adapter local RAM test error.
1. Update the RAID controller firmware.
2. Reseat the RAID controller.
3. Replace the RAID controller.
035-288-001 Adapter NVSRAM test error.
1. Update the RAID controller firmware.
2. Reseat the RAID controller.
3. Replace the RAID controller.
035-289-001 Adapter cache test error.
1. Update the RAID controller firmware.
2. Reseat the RAID controller.
3. Replace the RAID controller.
035-292-001 Adapter parameter set error.
1. Update the RAID controller firmware.
2. Reseat the RAID controller.
3. Replace the RAID controller.
035-230-001 Battery low. Replace the battery module of the RAID controller.
035-231-001 Abnormal battery temperature. Replace the battery module of the RAID controller.
035-230-001 Battery status unknown. Replace the battery module of the RAID controller.
035-xxx-snn Failed hard disk drive with ID nn on RAID
adapter in slot s.
1. Check the system-error log and replace any
indicated failing devices.
2. Reseat the disk with ID nn on adapter in slot s.
3. Replace the disk with ID nn on adapter in slot s.
035-xxx-099 No adapters were found. If an adapter is installed:
1. Reseat the adapter.
2. Check the adapter cables to be sure they are
secure.
035-xxx-s99 Failed RAID test: s = number of failing
adapter slot.
1. Check the system-error log and replace any
indicated failing devices.
2. Reseat the following components, one at a time,
in the order shown, restarting the server each
time:
a. RAID adapter in slot s
b. Cable for the RAID adapter in slot s
c. Riser card
3. Replace the following components one at a time,
in the order shown, restarting the server each
time:
a. RAID adapter in slot s
b. Cable for the RAID adapter in slot s
c. Riser card
d. (Trained service technician only) System
board
035-253-s99 RAID adapter initialization failure.
1. Reseat the following components, one at a time,
in the order shown, restarting the server each
time:
a. ServeRAID adapter
b. Hot-swap hard disk drive backplane cable
2. Replace the components listed in step 1 one at a
time, in the order shown, restarting the server
each time.
089-xxx-00n Failed microprocessor test.
1. Make sure that the BIOS code is at the latest
level.
2. Trained service technician only:
a. Reseat microprocessor 1 (if n = 0 or 1) or
microprocessor 2 (if n = 2 or 3).
b. Replace microprocessor 1 (if n = 0 or 1) or
microprocessor 2 (if n = 2 or 3).
165-060-000 Service Processor: ASM may be busy.
1. Rerun the diagnostic test.
2. Fix other error conditions that may be keeping the
ASM busy. Refer to the error log and diagnostic
panel.
3. Disconnect all server and option power cords
from the server, wait 30 seconds, reconnect, and
retry.
4. (Trained service technician only) Replace the
system board.
165-198-000 Service Processor: Aborted.
1. Rerun the diagnostic test.
2. Fix other error conditions that may be keeping
ASM busy. Refer to the error log and diagnostic
panel.
3. Disconnect all server and option power cords
from the server, wait 30 seconds, reconnect, and
retry.
4. (Trained service technician only) Replace the
system board.
165-201-000 Service Processor: Failed.
1. Disconnect all server and option power cords
from the server, wait 30 seconds, reconnect, and
retry.
2. (Trained service technician only) Replace the
system board.
165-330-000 Service Processor: Failed. Update to the latest ROM diagnostic level and retry.
165-342-000 Service Processor: Failed.
1. Ensure that the latest firmware levels for ASM
and BIOS are installed.
2. Disconnect all server and option power cords
from the server, wait 30 seconds, reconnect, and
retry.
3. (Trained service technician only) Replace the
system board.
165-051-000 System Management: Failed. (Unable to
communicate with RSA. It may be busy.
Run the test again.)
1. Update to the latest levels of firmware (BIOS,
service processor, diagnostics).
2. Rerun the diagnostic test.
3. Correct other error conditions (including failed
system management tests and items logged in
Remote Supervisor Adapter II SlimLine
system-error log and BMC log) and retry.
4. Disconnect all server and option power cords
from the server, wait 30 seconds, reconnect, and
retry.
5. Reseat the Remote Supervisor Adapter II
SlimLine.
6. Replace the Remote Supervisor Adapter II
SlimLine.
166-060-000 System Management: Failed. (Unable to
communicate with RSA. It may be busy.
Run the test again.)
1. Flash the latest levels of the firmware (BIOS,
service processor, diagnostics).
2. Rerun the diagnostic test.
3. Correct other error conditions (including failed
system management tests and items logged in
Remote Supervisor Adapter II SlimLine
system-error log and BMC log) and retry.
4. Disconnect all server and option power cords
from the server, wait 30 seconds, reconnect, and
retry.
5. Reseat the Remote Supervisor Adapter II
SlimLine.
6. Replace the Remote Supervisor Adapter II
SlimLine.
166-070-000 System Management: Failed. (Unable to
communicate with RSA. It may be busy.
Run the test again.)
1. Flash the latest levels of the firmware (BIOS,
service processor, diagnostics).
2. Rerun the diagnostic test.
3. Correct other error conditions (including failed
system management tests and items logged in
Remote Supervisor Adapter II SlimLine
system-error log and BMC log) and retry.
4. Disconnect all server and option power cords
from the server, wait 30 seconds, reconnect, and
retry.
5. Reseat the Remote Supervisor Adapter II
SlimLine.
6. Replace the Remote Supervisor Adapter II
SlimLine.
166-198-000 System Management: Aborted. (Unable to
communicate with RSA. It may be busy.
Run the test again.)
1. Run the diagnostic test again.
2. Correct other error conditions and retry. These
include other failed system management tests
and items logged in the system-error log of the
optional Remote Supervisor Adapter II SlimLine.
3. Disconnect all server and option power cords
from the server, wait 30 seconds, reconnect, and
retry.
4. Remote Supervisor Adapter II SlimLine, if
installed.
5. (Trained service technician only) Replace the
system board.
166-201-001 System Management: Failed (I2C bus
error(s). See SERVPROC and DIAGS
entries in the event log.)
Reseat the following components, one at a time, in
the order shown, restarting the server each time:
1. Remote Supervisor Adapter II SlimLine (if installed).
2. DIMMs.
Replace the following components, one at a time, in
the order shown, restarting the server each time:
1. Remote Supervisor Adapter II SlimLine (if installed).
2. DIMMs.
3. (Trained service technician only) System board.
166-201-002 System Management: Failed (I2C bus
error(s). See SERVPROC and DIAGS
entries in the event log.)
Reseat the following components, one at a time, in
the order shown, restarting the server each time:
1. I2C cable between the operator information panel
and the system board (“System-board internal
connectors” on page 10).
2. Operator information panel.
Replace the following components, one at a time, in
the order shown, restarting the server each time:
1. I2C cable between the operator information panel
and the system board (“System-board internal
connectors” on page 10).
2. Operator information panel.
3. (Trained service technician only) System board.
166-201-003 System Management: Failed (I2C bus
error(s). See SERVPROC and DIAGS
entries in the event log.)
Reseat the following components, one at a time, in
the order shown, restarting the server each time:
1. Power backplane.
2. Power supply.
Replace the following components, one at a time, in
the order shown, restarting the server each time:
1. Power backplane.
2. Power supply.
3. (Trained service technician only) System board.
166-201-004 System Management: Failed
(I2C bus error(s). See SERVPROC and
DIAGS entries in the event log.)
Reseat the SAS backplane. Replace the following
components, one at a time, in the order shown,
restarting the server each time:
1. SAS backplane.
2. (Trained service technician only) System board.
166-201-005 System Management: Failed
(I2C bus error(s). See SERVPROC and
DIAGS entries in the event log.)
Reseat the following components, one at a time, in
the order shown, restarting the server each time:
1. DIMMs.
2. (Trained service technician only) Microprocessors.
Replace the following components, one at a time, in
the order shown, restarting the server each time:
1. DIMMs.
2. Microprocessors.
3. (Trained service technician only) System board.
166-250-000 System Management: Failed (I2C cable is
disconnected. Reconnect I2C cable between
RSA and system board.)
1. Reseat the Remote Supervisor Adapter II
SlimLine.
2. Replace the Remote Supervisor Adapter II
SlimLine.
3. (Trained service technician only) Replace the
system board.
166-260-000 System Management: Failed
(Restart RSA Error. After restarting, RSA
communication was lost. Unplug and cold
boot to reset RSA.)
1. Disconnect all the option and power cords from
the server, wait 30 seconds, reconnect, and retry.
2. Reseat the Remote Supervisor Adapter II
SlimLine.
3. Replace the Remote Supervisor Adapter II
SlimLine.
166-342-000 System Management: Failed
(RSA adapter BIST indicates failed tests.)
1. Ensure the latest firmware levels for the Remote
Supervisor Adapter II SlimLine and BIOS are
installed.
2. Disconnect all the option and power cords from
the server, wait 30 seconds, reconnect, and retry.
3. Reseat the Remote Supervisor Adapter II
SlimLine.
4. Replace the Remote Supervisor Adapter II
SlimLine.
166-400-000 System Management: Failed (BMC self test
result failed tests: x where x = Flash, RAM,
or ROM.)
1. Reflash or update the firmware for the BMC.
2. (Trained service technician only) Replace the
system board.
166-404-001 System Management: Failed (BMC indicates
failure in I2C bus test.)
1. Disconnect all server and option power cords
from the server, wait 30 seconds, reconnect, and
retry.
2. Reflash or update the firmware for the BMC.
3. Reseat the power backplane.
4. Replace the power backplane.
5. (Trained service technician only) Replace the
system board.
166-406-001 System Management: Failed (BMC indicates
failure in I2C bus test.)
1. Disconnect all server and option power cords
from the server, wait 30 seconds, reconnect, and
retry.
2. Reflash or update the firmware for the BMC.
3. Reseat the SAS backplane and the SAS
backplane cable.
Replace the following components, one at a time, in
the order shown, restarting the server each time:
1. SAS backplane
2. SAS backplane cable
3. (Trained service technician only) System board.
166-407-001 System Management: Failed (BMC indicates
failure in I2C bus test.)
1. Disconnect all server and option power cords
from the server, wait 30 seconds, reconnect, and
retry.
2. Reflash or update the firmware for the BMC.
3. Operator information panel cable.
4. Operator information panel.
5. (Trained service technician only) Replace the
system board.
166-NNN-001 System Management: Failed (BMC indicates
failure in self test where NNN=300 to 320.)
1. Disconnect all server and option power cords
from the server, wait 30 seconds, reconnect, and
retry.
2. Reflash or update the firmware for the BMC.
3. (Trained service technician only) Replace the
system board.
166-NNN-001 System Management: Failed (BMC indicates
failure in I2C bus test where NNN=400 to
420 (excluding 412, 414, and 415).)
1. Disconnect all server and option power cords
from the server, wait 30 seconds, reconnect, and
retry.
2. Reflash or update the firmware for the BMC.
3. (Trained service technician only) Replace the
system board.
180-197-000 SAS ASPI driver not installed. Ignore this message if the server is a SATA system.
This test is not supported for SATA drives.
1. Update the SAS configuration parameters (see
“Configuring hot-swap SAS or hot-swap SATA
RAID” on page 152).
2. (Trained service technician only) Replace the
system board.
180-197-000 Hard disk drive backplane not found. Ignore this message if the server is a SATA system.
This test is not supported for SATA drives. Reseat the
following components, one at a time, in the order
shown, restarting the server each time:
1. SAS backplane.
2. SAS backplane cable.
Replace the following components, one at a time, in
the order shown, restarting the server each time:
1. SAS backplane.
2. SAS backplane cable.
3. (Trained service technician only) System board.
180-198-000 Test aborted. Review the error log for the failure condition that
caused the test to abort.
180-358-000 Ethernet failure.
1. Enable Ethernet with the Configuration/Setup
Utility program (see “Using the
Configuration/Setup Utility program” on page
151).
2. Update the Ethernet firmware (see “Updating the
firmware” on page 149).
3. (Trained service technician only) Replace the
system board.
180-361-003 Failed fan LED test. Reseat the following components, one at a time, in
the order shown, restarting the server each time:
1. Fan cable.
2. Fan.
Replace the following components, one at a time, in
the order shown, restarting the server each time:
1. Fan cable.
2. Fan.
3. (Trained service technician only) System board.
180-xxx-000 Diagnostics LED failure. Run the diagnostics panel LED test for the failing
LED.
180-xxx-001 Failed front LED panel test. Reseat the operator information card cable
connection on the system board. Replace the
following components, one at a time, in the order
shown, restarting the server each time:
1. Operator information card.
2. (Trained service technician only) Replace the
system board.
180-xxx-002 Failed diagnostics LED panel test. Trained service technician only:
1. Disconnect the server power cords and reseat the
operator information panel cable. Restart the
server.
2. Replace the operator information panel.
180-xxx-003 Failed system board LED test. (Trained service technician only) Replace the system
board.
180-xxx-005 Failed SAS backplane LED test. Reseat the following components, one at a time, in
the order shown, restarting the server each time:
1. SAS backplane.
2. SAS backplane cable.
Replace the following components, one at a time, in
the order shown, restarting the server each time:
1. SAS backplane.
2. SAS backplane cable.
3. (Trained service technician only) System board.
201-xxx-0nn Failed memory test.
Note: nn = slot number of failing DIMM.
Replace the following components one at a time, in
the order shown, restarting the server each time:
1. DIMM identified by nn.
2. (Trained service technician only) System board.
201-xxx-n99 Multiple DIMM failure.
Note: n = bank number of failing pair.
1. See the error text to identify the failing DIMMs.
2. Replace the following components one at a time,
in the order shown, restarting the server each
time:
a. DIMMs in bank n.
b. (Trained service technician only) System
board.
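The two 201 rows above differ only in the last three digits: 0nn names a single DIMM slot, and n99 names a bank. The following Python sketch shows one way to tell them apart, assuming that xxx is three alphanumeric characters and that a suffix ending in 99 always means the multiple-DIMM row. Both points are assumptions, and the function name is illustrative.

```python
import re

def decode_memory_code(code: str) -> dict:
    """Decode a 201-xxx-0nn or 201-xxx-n99 memory test code."""
    m = re.fullmatch(r"201-([0-9A-Za-z]{3})-(\d)(\d{2})", code.strip())
    if m is None:
        raise ValueError(f"not a 201 memory test code: {code!r}")
    first_digit, last_two = m.group(2), m.group(3)
    if last_two == "99":
        # 201-xxx-n99: the first digit is the bank number of the failing pair.
        return {"failure": "multiple DIMMs", "bank": int(first_digit)}
    if first_digit == "0":
        # 201-xxx-0nn: the last two digits are the failing DIMM slot.
        return {"failure": "single DIMM", "slot": int(last_two)}
    raise ValueError(f"suffix matches neither 0nn nor n99: {code!r}")
```

With this sketch, a code ending in 004 points to the DIMM in slot 4, and a code ending in 299 points to the DIMMs in bank 2.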
202-xxx-00n Failed system cache test.
1. Trained service technician only:
a. Reseat microprocessor 1 (if n = 0 or 1) or
microprocessor 2 (if n = 2 or 3).
b. Replace microprocessor 1 (if n = 0 or 1) or
microprocessor 2 (if n = 2 or 3).
c. Replace the system board.
215-xxx-000 Failed CD or DVD test.
1. Run the test again with a different CD or DVD.
2. Reseat the following components:
a. CD-RW/DVD drive
b. CD-RW/DVD drive cable
c. (Trained service technician only) operator
information panel assembly
3. Replace the following components one at a time,
in the order shown, restarting the server each
time:
a. CD-RW/DVD drive cable
b. CD-RW/DVD drive
217-198-xxx Could not establish drive parameters.
1. Reseat the hard disk drive cables.
2. Reseat the hard disk drive.
3. Replace the following components in the order
shown, restarting the server each time:
a. Hard disk drive
b. Hard disk drive cable
c. (Hot-swap models) RAID controller
d. Hard disk drive backplane or backplate
217-xxx-000 Failed fixed disk test.
1. Reseat the hard disk drive 1 cables.
2. Reseat hard disk drive 1.
3. Replace hard disk drive 1.
217-xxx-001 Failed fixed disk test.
1. Reseat the hard disk drive 2 cables.
2. Reseat hard disk drive 2.
3. Replace hard disk drive 2.
217-xxx-002 Failed fixed disk test.
1. Reseat the hard disk drive 3 cables.
2. Reseat hard disk drive 3.
3. Replace hard disk drive 3.
217-xxx-003 Failed fixed disk test.
1. Reseat the hard disk drive 4 cables.
2. Reseat hard disk drive 4.
3. Replace hard disk drive 4.
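In the four fixed disk test rows above, the last digit of the code counts from 0 while the drive numbers count from 1. The following Python sketch makes that offset explicit, assuming that xxx is three alphanumeric characters. The function name is illustrative, and 217-198-xxx is rejected because it has its own row in the table.

```python
import re

def failing_drive(code: str) -> int:
    """Return the 1-based hard disk drive number for a 217-xxx-00n code."""
    m = re.fullmatch(r"217-([0-9A-Za-z]{3})-00([0-3])", code.strip())
    if m is None:
        raise ValueError(f"not a 217-xxx-00n code: {code!r}")
    if m.group(1) == "198":
        # 217-198-xxx is the "Could not establish drive parameters" row.
        raise ValueError("217-198-xxx is not a fixed disk test failure")
    # The code digit is zero-based; the drive number is one-based.
    return int(m.group(2)) + 1
```

For example, a code ending in 002 identifies hard disk drive 3.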
301-xxx-000 Failed keyboard test.
1. Reseat the keyboard cable.
2. Replace the following components one at a time,
in the order shown, restarting the server each
time:
a. Keyboard
b. (Trained service technician only) System
board
405-xxx-000 Failed Ethernet test on controller on the
system board.
1. Verify that Ethernet is not disabled in BIOS.
2. (Trained service technician only) Replace the
system board.
405-xxx-00n Failed Ethernet test on adapter in PCI slot n.
Reseat the adapter in PCI slot n. Replace the
following components one at a time, in the order
shown, restarting the server each time:
1. Adapter in PCI slot n
2. (Trained service technician only) System board
405-xxx-a0n Failed Ethernet test on adapter in PCI slot a.
1. For a = 0, (trained service technician only)
replace the system board.
2. For a > 0,
a. Reseat the adapter in PCI slot a.
b. Replace the adapter in PCI slot a.
Recovering the BIOS code
If the BIOS code has become damaged, such as from a power failure during an
update, you can recover the BIOS code using the boot block jumper and a BIOS
recovery diskette.
Notes:
1. You can obtain a BIOS recovery diskette from one of the following sources:
v Download the BIOS code update from the World Wide Web and use it to
make a recovery diskette.
v Contact your IBM service representative.
2. To create and use a diskette, you must add an optional external diskette drive to
the server.
To download the BIOS code update from the World Wide Web, complete the
following steps:
1. Go to http://www.ibm.com/support.
2. In the Search technical support box, enter x3550 bios
3. Download the latest BIOS code update.
4. Create the BIOS recovery diskette, following the instructions that come with the
update file that you downloaded.
The flash memory of the server consists of a primary page and a backup page. The
backup page is a protected area that cannot be overwritten. The recovery boot
block is a section of code in this protected area that enables the server to start up
and to read a recovery diskette. The recovery utility recovers the system BIOS code
from the BIOS recovery files on the diskette.
To recover the BIOS code and restore the server operation to the primary page,
complete the following steps:
1. Turn off the server, and disconnect all power cords and external cables.
2. Remove the server cover. See “Removing the cover” on page 98 for more
information.
3. Locate the boot block recovery jumper block (J14) on the system board.
(Illustration: system board locations of the NMI button (SW1), the boot block recovery jumper (J14, pins 1 to 3), and the system board switch block (SW2, switches 1 to 8).)
4. Move the jumper from pins 1 and 2 to pins 2 and 3 to enable the BIOS
recovery mode.
5. Connect an external USB diskette drive to the server and insert the BIOS
recovery diskette.
6. Reinstall the server cover; then, reconnect all power cords.
7. Restart the server. The system begins the power-on self test (POST).
8. Select 1 - Update POST/BIOS from the menu that contains various flash
update options.
9. When prompted as to whether you want to save the current code to a diskette,
press N.
10. When prompted to choose a language, select a language (from 0 to 7), and
press Enter to accept your choice.
11. Remove the BIOS recovery diskette from the diskette drive.
12. Turn off the server, and disconnect all power cords and external cables; then,
remove the server cover.
13. Remove the jumper from the boot block recovery jumper block, or move it to
pins 1 and 2, to return to normal startup mode.
14. Reinstall the server cover; then, reconnect all external cables and power
cords, and turn on the peripheral devices.
15. Restart the server. The server starts up normally.
System-error log messages
A system-error log is generated only if a Remote Supervisor Adapter II SlimLine is
installed. The system-error log can contain messages of three types:
Message Messages do not require action; they record significant system-level
events, such as when the server is started.
Warning Warning messages do not require immediate action; they indicate
possible problems, such as when the recommended maximum
ambient temperature is exceeded.
Error Error messages might require action; they indicate system errors,
such as when a fan is not detected.
Each message contains date and time information, and it indicates the source of
the message (POST/BIOS or the BMC service processor).
Note: The BMC log, which you can view through the Configuration/Setup Utility
program, also contains many information, warning, and error messages.
In the following example, the system-error log message indicates that the server
was turned on at the recorded time.
- - - - - - - - - - - - - - - - - - - - - - - - -
Date/Time: 2002/05/07 15:52:03
DMI Type:
Source: SERVPROC
Error Code: System Complex Powered Up
Error Code:
Error Data:
Error Data:
- - - - - - - - - - - - - - - - - - - - - - - - -
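An entry such as the one above is a list of "name: value" lines between two dashed rules, and some field names (Error Code, Error Data) appear more than once. The following Python sketch reads one entry into a dictionary of lists. It assumes that every entry follows the layout of this single example, which the guide does not state, and the function names are illustrative.

```python
from datetime import datetime

SAMPLE = """\
- - - - - - - - - - - - - - - - - - - - - - - - -
Date/Time: 2002/05/07 15:52:03
DMI Type:
Source: SERVPROC
Error Code: System Complex Powered Up
Error Code:
Error Data:
Error Data:
- - - - - - - - - - - - - - - - - - - - - - - - -
"""

def parse_log_entry(text: str) -> dict:
    """Return {field name: [non-empty values]} for one log entry."""
    fields = {}
    for raw in text.splitlines():
        # Drop the dashed rule, including a rule fused onto a field line.
        line = raw.strip().lstrip("- ").strip()
        if ":" not in line:
            continue
        # Split at the first colon only, so the time of day stays intact.
        name, _, value = line.partition(":")
        values = fields.setdefault(name.strip(), [])
        if value.strip():
            values.append(value.strip())
    return fields

def entry_timestamp(fields: dict) -> datetime:
    """Convert the Date/Time field to a datetime object."""
    return datetime.strptime(fields["Date/Time"][0], "%Y/%m/%d %H:%M:%S")
```

Keeping a list per field name preserves repeated fields such as Error Code without overwriting the first value.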
The following table describes the possible system-error log messages and
suggested actions to correct the detected problems.
Note: These actions have the following meaning:
Reseat the power supply
Complete the following steps:
1. Remove the power supply from the server.
2. Check the power supply for damage and for damaged connectors.
3. Install the power supply in the server (see “Installing a power supply” on
page 121).
Reseat the microprocessor
Complete the following steps:
1. Remove the heatsink and the microprocessor from the server using a
vacuum tool (see “Removing a microprocessor” on page 137).
2. Visually inspect the microprocessor and the microprocessor socket for
damage.
3. Reinstall the microprocessor and the heatsink in the server, taking
special care that the layer of thermal grease is intact (see “Installing a
microprocessor” on page 138).
Attention: If the layer of thermal grease is disturbed, the
microprocessor could overheat and be damaged.
v Follow the suggested actions in the order in which they are listed in the Action column until the problem
is solved.
v See Chapter 3, “Parts listing, Type 7978 and 1913 server,” on page 89 to determine which components are
customer replaceable units (CRU) and which components are field replaceable units (FRU).
v If an action step is preceded by “(Trained service technician only),” that step must be performed only by a
trained service technician.
System event/error log message Action
+12v critical over voltage fault
1. If the OVER SPEC LED on the light path diagnostics panel is
lit, or any of the four power channel error LEDs (A, B, C, or D)
on the system board are lit, see the entries about
power-channel error LEDs in “Power problems” on page 52.
(See “System-board LEDs” on page 15 for the location of the
power channel error LEDs.)
2. If the actions in “Power problems” on page 52 do not identify a
defective component, complete the following steps:
a. Remove the power supplies. Replace the power supplies
one at a time, restarting the server each time, to isolate a
failing power supply.
b. If the server fails to start, (trained service technician only)
replace the power backplane. Restart the server.
c. If the server fails to start, (trained service technician only)
replace the system board.
+12v critical under voltage fault
1. If the OVER SPEC LED on the light path diagnostics panel is
lit, or any of the four power channel error LEDs (A, B, C, or D)
on the system board are lit, see the entries about
power-channel error LEDs in “Power problems” on page 52.
(See “System-board LEDs” on page 15 for the location of the
power channel error LEDs.)
2. If the actions in “Power problems” on page 52 do not identify a
defective component, complete the following steps:
a. Remove the power supplies. Replace the power supplies
one at a time, restarting the server each time, to isolate a
failing power supply.
b. If the server fails to start, (trained service technician only)
replace the power backplane. Restart the server.
c. If the server fails to start, (trained service technician only)
replace the system board.
12v planar fault
1. If the OVER SPEC LED on the light path diagnostics panel is
lit, or any of the four power channel error LEDs (A, B, C, or D)
on the system board are lit, see the entries about
power-channel error LEDs in “Power problems” on page 52.
(See “System-board LEDs” on page 15 for the location of the
power channel error LEDs.)
2. If the actions in “Power problems” on page 52 do not identify a
defective component, complete the following steps:
a. Remove the power supplies. Replace the power supplies
one at a time, restarting the server each time, to isolate a
failing power supply.
b. If the server fails to start, (trained service technician only)
replace the power backplane. Restart the server.
c. If the server fails to start, (trained service technician only)
replace the system board.
+5v critical over voltage fault
1. Remove the following devices, which are powered by 5 volts:
v All PCI adapters
v USB devices
v CD-RW/DVD drive
v (Trained service technician only) Hard disk drive backplane
2. Reinstall each I/O device removed in step 1, one at a time,
restarting the server each time, to isolate a defective device.
Replace any defective device.
3. If the error continues, (trained service technician only) replace
the power backplane. Restart the server.
4. If the error continues, (trained service technician only) replace
the system board.
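Steps 1 and 2 above describe a sequential isolation: start with every 5-volt device removed, then add the devices back one at a time and check for the error after each restart. The following Python sketch models that logic only. The error_present callback is hypothetical and stands in for the manual work of restarting the server and checking the log; it is not an IBM interface.

```python
from typing import Callable, Iterable, List, Optional

def isolate_defective(devices: Iterable[str],
                      error_present: Callable[[List[str]], bool]) -> Optional[str]:
    """Return the first device whose reinstallation brings the error back.

    Returns None when no device can be blamed: either the error is present
    with every device removed, or it never returns after all devices are
    reinstalled.
    """
    installed: List[str] = []
    if error_present(installed):
        # The error continues with all devices removed, so go on to the
        # later steps (power backplane, then system board).
        return None
    for device in devices:
        installed.append(device)       # reinstall one device
        if error_present(installed):   # restart the server and check
            return device
    return None
```

The sketch stops at the first defective device. After that device is replaced, the remaining devices still have to be reinstalled and checked in the same way.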
+5v critical under voltage fault
1. Remove the following devices, which are powered by 5 volts:
v All PCI adapters
v USB devices
v CD-RW/DVD drive
v (Trained service technician only) Hard disk drive backplane
2. Reinstall each I/O device removed in step 1, one at a time,
restarting the server each time, to isolate a defective device.
Replace any defective device.
3. If the error continues, (trained service technician only) replace
the power backplane. Restart the server.
4. If the error continues, (trained service technician only) replace
the system board.
5V fault
1. Remove the following devices, which are powered by 5 volts:
v All PCI adapters
v USB devices
v CD-RW/DVD drive
v (Trained service technician only) Hard disk drive backplane
2. Reinstall each I/O device removed in step 1, one at a time,
restarting the server each time, to isolate a defective device.
Replace any defective device.
3. If the error continues, replace the power backplane. Restart the
server.
4. If the error continues, (trained service technician only) replace
the system board.
+2.5v critical over voltage fault Information only
+2.5v critical under voltage fault Information only
+1.8v critical over voltage fault Information only
+1.8v critical under voltage fault Information only
The system real time clock battery is no longer reliable. Replace the battery.
+3.3v critical over voltage fault
1. Remove all PCI adapters.
2. Reinstall each PCI adapter, one at a time, restarting the server
each time, to isolate a defective adapter. Replace any defective
adapter.
3. If the error continues, (trained service technician only) replace
the system board.
+3.3v critical under voltage fault
1. Remove all PCI adapters.
2. Reinstall each PCI adapter, one at a time, restarting the server
each time, to isolate a defective adapter. Replace any defective
adapter.
3. If the error continues, (trained service technician only) replace
the system board.
3.3V Bus Fault
1. Remove all PCI adapters.
2. Reinstall each PCI adapter, one at a time, restarting the server
each time, to isolate a defective adapter. Replace any defective
adapter.
3. If the error continues, (trained service technician only) replace
the system board.
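In the voltage fault rows above, the action depends only on the voltage rail named at the start of the message, not on whether the fault is over voltage, under voltage, or a bus or planar fault. The following Python sketch groups the messages by rail. The summaries are abbreviations of the first action in each row, and the function name is illustrative.

```python
import re

# First action for each voltage rail, abbreviated from the table rows.
FIRST_ACTION = {
    "12": "check the OVER SPEC and power channel error LEDs",
    "5": "remove the devices that are powered by 5 volts",
    "3.3": "remove all PCI adapters",
    "2.5": "information only",
    "1.8": "information only",
}

def first_action_for(message: str) -> str:
    """Return the abbreviated first action for a voltage fault message."""
    # Accepts forms such as "+12v ...", "12v ...", "5V ...", and "3.3V ...".
    m = re.match(r"\+?(\d+(?:\.\d+)?)v\b", message.strip(), re.IGNORECASE)
    if m is None or m.group(1) not in FIRST_ACTION:
        raise ValueError(f"not a voltage fault message: {message!r}")
    return FIRST_ACTION[m.group(1)]
```

For example, "+3.3v critical under voltage fault" and "3.3V Bus Fault" both lead to removing all PCI adapters.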
Power Good Fault
1. Reseat the power supplies.
2. If the error continues, (trained service technician only) replace
the power backplane.
VRM 1 Power Good Fault
1. (Trained service technician only) Reseat microprocessor 1.
2. (Trained service technician only) Replace microprocessor 1.
3. (Trained service technician only) Replace the system board.
VRM 2 Power Good Fault
1. (Trained service technician only) Reseat microprocessor 2.
2. (Trained service technician only) Replace microprocessor 2.
3. (Trained service technician only) Replace the system board.
Memory Area non-critical over temperature
warning
1. Make sure that the fans are operating and are not obstructed.
2. Make sure that the air baffles are in place and correctly
installed.
3. Make sure that the server cover is installed and fully closed.
Memory Area non-recoverable over temperature
fault
1. Make sure that the fans are operating and are not obstructed.
2. Make sure that the air baffles are in place and correctly
installed.
3. Make sure that the server cover is installed and fully closed.
4. (Trained service technician only) Replace the system board.
Fan n Failure
n = the fan number
1. Make sure that the connector on the fan is not damaged.
2. Make sure that the fan connector on the system board is not
damaged.
3. Make sure that the fan is fully installed (press down on the
fan).
4. Reseat fan n.
5. Replace fan n.
Fan n Fault
n = the fan number
1. Make sure that the connector on the fan is not damaged.
2. Make sure that the fan connector on the system board is not
damaged.
3. Make sure that the fan is fully installed (press down on the
fan).
4. Reseat fan n.
5. Replace fan n.
Hard Drive n Fault
n = the hard disk drive number
1. Reseat hard disk drive n.
2. Replace hard disk drive n.
Hard drive n removal detected.
n = the hard disk drive number
Reseat hard disk drive n.
Power supply n removed
n = the power supply number
1. Reseat power supply n.
2. Replace power supply n.
3. Replace the power backplane.
Power supply n fault
n = the power supply number
1. If the server power-on LED is lit, perform the following steps:
a. Reduce the server to the minimum configuration (see
“Power-supply LEDs” on page 60).
b. Reinstall the components you removed, one at a time,
restarting the server each time.
c. If the error reoccurs, the component you just reinstalled is
defective; replace the defective component.
2. Reseat the following components:
a. Power supply n
b. (Trained service technician only) power backplane
3. Replace the components listed in step 2, one at a time, in the
order shown, restarting the server each time.
Power supply n AC power removed
n = the power supply number
1. Make sure that the power cords are correctly connected to the
server and to a working electrical outlet.
2. (Trained service technician only) Replace power supply n.
3. (Trained service technician only) Replace the power backplane.
Power supply n fan fault
n = the power supply number
1. Make sure that there are no obstructions, such as bundled
cables, to the airflow on the power-supply fan.
2. Replace power supply n.
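The fan, hard drive, and power supply messages above all follow the pattern "component n event", where n identifies the unit to reseat or replace. The following Python sketch splits such a message into its three parts. It assumes that the logged text contains a decimal number in place of n, which the table implies but does not show, and the function name is illustrative.

```python
import re
from typing import Tuple

_MESSAGE = re.compile(
    r"(Fan|Hard drive|Power supply)\s+(\d+)\s+(.+?)\.?",
    re.IGNORECASE,
)

def parse_component_message(message: str) -> Tuple[str, int, str]:
    """Split a message into (component, unit number, event), lowercased."""
    m = _MESSAGE.fullmatch(message.strip())
    if m is None:
        raise ValueError(f"not a numbered component message: {message!r}")
    component, number, event = m.groups()
    return component.lower(), int(number), event.lower()
```

For example, "Power supply 1 fan fault" yields the component "power supply", the unit number 1, and the event "fan fault", which is enough to select the matching row in the table.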