Note: Before using this information and the product it supports, be sure to read the general information
under “Notices” on page 160.
Sixth Edition (September 2003)
INTERNATIONAL BUSINESS MACHINES CORPORATION PROVIDES THIS PUBLICATION “AS IS” WITHOUT
WARRANTY OF ANY KIND, EITHER EXPRESS OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. Some states do not allow
disclaimer of express or implied warranties in certain transactions, therefore, this statement may not apply to you.
This publication could include technical inaccuracies or typographical errors. Changes are periodically made to the
information herein; these changes will be incorporated in new editions of the publication. IBM may make
improvements and/or changes in the product(s) and/or the program(s) described in this publication at any time.
This publication was developed for products and services offered in the United States of America. IBM may not offer
the products, services, or features discussed in this document in other countries, and the information is subject to
change without notice. Consult your local IBM representative for information on the products, services, and features
available in your area.
Requests for technical information about IBM products should be made to your IBM reseller or IBM marketing
representative.
US Government Users Restricted Rights – Use, duplication or disclosure restricted by GSA ADP Schedule Contract
with IBM Corp.
About this manual
This manual contains diagnostic information, a Symptom-to-FRU index, service
information, error codes, error messages, and configuration information for the IBM
Eserver™ xSeries™ 350 Type 8682 server.
Important: This manual is intended for trained servicers who are familiar with IBM
xSeries products. Before servicing an IBM product, be sure to review
“Safety information” on page 127.
Important safety information
Be sure to read all caution and danger statements in this book before performing
any of the instructions.
Leia todas as instruções de cuidado e perigo antes de executar qualquer operação.
Prenez connaissance de toutes les consignes de type Attention et
Danger avant de procéder aux opérations décrites par les instructions.
Lesen Sie alle Sicherheitshinweise, bevor Sie eine Anweisung ausführen.
Accertarsi di leggere tutti gli avvisi di attenzione e di pericolo prima di effettuare
qualsiasi operazione.
Lea atentamente todas las declaraciones de precaución y peligro ante de llevar a
cabo cualquier operación.
WARNING: Handling the cord on this product or cords associated with accessories
sold with this product, will expose you to lead, a chemical known to the State of
California to cause cancer, and birth defects or other reproductive harm. Wash
hands after handling.
ADVERTENCIA: El contacto con el cable de este producto o con cables de
accesorios que se venden junto con este producto, pueden exponerle al plomo, un
elemento químico que en el estado de California de los Estados Unidos está
considerado como un causante de cancer y de defectos congénitos, además de
otros riesgos reproductivos. Lávese las manos después de usar el producto.
Online support
You can download the most current diagnostic, BIOS flash, and device driver files
from http://www.ibm.com/pc/support.
IBM xSeries 350 Type 8682: Hardware Maintenance Manual
General checkout
The server diagnostic programs are stored in upgradable read-only memory (ROM)
on the system board. These programs are the primary method of testing the major
components of the server: The system board, Ethernet controller, video controller,
RAM, keyboard, mouse (pointing device), diskette drive, serial ports, hard drives,
and parallel port. You can also use them to test some external devices. See
“Diagnostic programs and error messages” on page 13.
Also, if you cannot determine whether a problem is caused by the hardware or by
the software, you can run the diagnostic programs to confirm that the hardware is
working properly.
When you run the diagnostic programs, a single problem might cause several error
messages. When this occurs, work to correct the cause of the first error message.
After the cause of the first error message is corrected, the other error messages
might not occur the next time you run the test.
A failed system might be part of a shared DASD cluster (two or more systems
sharing the same external storage device(s)). Prior to running diagnostics, verify
that the failing system is not part of a shared DASD cluster.
A system might be part of a cluster if:
v The customer identifies the system as part of a cluster.
v One or more external storage units are attached to the system and at least one
of the attached storage units is additionally attached to another system or
unidentifiable source.
v One or more systems are located near the failing system.
If the failing system is suspected to be part of a shared DASD cluster, all diagnostic
tests can be run except diagnostic tests which test the storage unit (DASD residing
in the storage unit) or the storage adapter attached to the storage unit.
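The cluster indicators and the test restriction above can be sketched in Python. This is an illustration only: the class, function, and test names are hypothetical and are not part of the server diagnostic programs.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set


@dataclass
class ClusterIndicators:
    """The three indicators listed above (field names are hypothetical)."""
    customer_says_clustered: bool
    shares_external_storage: bool  # an attached storage unit is also attached elsewhere
    other_systems_nearby: bool


def might_be_clustered(ind: ClusterIndicators) -> bool:
    # Any single indicator is enough to suspect a shared DASD cluster.
    return (ind.customer_says_clustered
            or ind.shares_external_storage
            or ind.other_systems_nearby)


def allowed_tests(all_tests: Iterable[str], storage_tests: Set[str],
                  suspected_cluster: bool) -> List[str]:
    # When a cluster is suspected, skip every test that exercises the
    # storage unit or the storage adapter attached to it.
    if not suspected_cluster:
        return list(all_tests)
    return [t for t in all_tests if t not in storage_tests]
```

For example, with a suspected cluster, `allowed_tests(["memory", "storage adapter"], {"storage adapter"}, True)` keeps only the memory test.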
Notes:
1. For systems that are part of a shared DASD cluster, run one test at a time in
looped mode. Do not run all tests in looped mode, as this could enable the
DASD diagnostic tests.
2. If multiple error codes are displayed, diagnose the first error code displayed.
3. If the computer hangs with a POST error, go to the “Symptom-to-FRU index” on
page 97.
4. If the computer hangs and no error is displayed, go to “Undetermined problems”
on page 122.
5. For power supply problems, see “Symptom-to-FRU index” on page 97.
6. For safety information, see “Safety information” on page 127.
7. For intermittent problems, check the error log; see “POST error messages” on
page 12.
1. IS THE SYSTEM PART OF A CLUSTER?
YES. Schedule maintenance with the customer. Shut down all systems related to
the cluster. Run storage test.
NO. Go to step 2.
2. IF THE SYSTEM IS NOT PART OF A CLUSTER:
v Power-off the computer and all external devices.
v Check all cables and power cords.
v Set all display controls to the middle position.
v Power-on all external devices.
v Power-on the computer.
v Record any POST error messages displayed on the screen. If an error is
displayed, look up the first error in the “POST error codes” on page 110.
v Check the information LED panel System Error LED; if on, see “Diagnostic
panel error LEDs” on page 101.
v Check the System Error Log. If an error was recorded by the system, see
“Symptom-to-FRU index” on page 97.
v Start the Diagnostic Programs. See “Diagnostic programs and error
messages” on page 13.
v Check for the following responses:
a. One beep.
b. Readable instructions or the Main Menu.
3. DID YOU RECEIVE BOTH OF THE CORRECT RESPONSES?
NO. Find the failure symptom in “Symptom-to-FRU index” on page 97.
YES. Run the Diagnostic Programs. If necessary, refer to “Diagnostic programs and
error messages” on page 13.
If you receive an error, go to “Symptom-to-FRU index” on page 97.
If the diagnostics completed successfully and you still suspect a problem, see
“Undetermined problems” on page 122.
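As an illustration only, the decisions in the checkout procedure can be sketched as a small Python function. The function name, parameters, and returned strings are hypothetical; the branches follow steps 1 through 3 as written in this section.

```python
from typing import Optional


def checkout_next_step(in_cluster: bool,
                       one_beep: bool,
                       readable_menu: bool,
                       diagnostics_error: Optional[bool] = None) -> str:
    """Return the next action in the general checkout procedure.

    diagnostics_error is None until the diagnostic programs have been run.
    """
    # Step 1: a clustered system needs scheduled maintenance first.
    if in_cluster:
        return ("Schedule maintenance, shut down all systems related to "
                "the cluster, run storage test")
    # Step 3: both correct responses are required.
    if not (one_beep and readable_menu):
        return "Symptom-to-FRU index"
    if diagnostics_error is None:
        return "Run the diagnostic programs"
    if diagnostics_error:
        return "Symptom-to-FRU index"
    # Diagnostics passed; this applies only if a problem is still suspected.
    return "Undetermined problems"
```

A call such as `checkout_next_step(False, True, False)` returns "Symptom-to-FRU index", because only one of the two expected responses was received.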
General information
The IBM xSeries 350 server is a high-performance server with the capability of
microprocessor upgrade to a symmetric multiprocessing (SMP) server. It is ideally
suited for networking environments that require superior microprocessor
performance, efficient memory management, flexibility, and large amounts of reliable
data storage.
Performance, ease of use, reliability, and expansion capabilities were key
considerations during the design of the server. These design features make it
possible for you to customize the system hardware to meet your needs today, while
providing flexible expansion capabilities for the future.
The xSeries 350 server comes with a three-year limited warranty and 90-Day IBM
Start Up Support. If you have access to the World Wide Web, you can obtain
up-to-date information about the server model and other IBM server products at the
following World Wide Web address: http://www.ibm.com/eserver/xseries/
Features and specifications
The following provides a summary of the features and specifications for the xSeries
350 server.
v Microprocessor:
– Intel Pentium III Xeon
– 32 KB of level-1 cache
– 1 MB or 2 MB Level-2 cache depending upon model
– 100 MHz front-side bus (FSB)
– Supports up to four microprocessors
v Memory:
– Maximum: 16 GB
– Type: ECC, SDRAM, Registered DIMMs
– 16 slots, 4-way interleaved
v Drives standard:
– Diskette: 1.44 MB
– CD-ROM: 40X IDE
v Expansion bays:
– Hot-swap drives: Three standard slim-high, three optional slim-high
v Size:
– Height: 178 mm (7 in.) (4 U)
– Depth: 711.2 mm (28 in.)
– Width: 482.6 mm (19 in.)
– Weight: 34.9 kg (77 lb.) to 50.4 kg (111 lb.) depending upon configuration
v Integrated functions:
– Advanced System Management processor with Light Path Diagnostics
– Dual channel Ultra160 SCSI controller (one internal and one external channel)
(non-RAID)
– One 10BASE-T/100BASE-TX AMD Ethernet controller
– Two serial ports
– One parallel port
– Two universal serial bus ports
– Keyboard port
– Mouse port
– Video port
v Acoustical noise emissions:
– Sound power, idling: 6.3 bel maximum
– Sound power, operating: 6.3 bel maximum
– Sound pressure, operating: 48 dBa maximum
v Environment:
– Air temperature:
- Server on: 10° to 35°C (50° to 95°F). Altitude: 0 to 914 m (3000 ft.)
- Server on: 10° to 32°C (50° to 89.6°F). Altitude: 914 m (3000 ft.) to 2133 m
(7000 ft.)
- Server off: 10° to 43°C (50° to 110°F). Maximum altitude: 2133 m (7000 ft.)
– Humidity:
- Server on: 8% to 80%
- Server off: 8% to 80%
v Heat output:
Approximate heat output in British Thermal Units (BTU) per hour:
– Minimum configuration: 461 BTU per hour (0.14 kilowatts)
– Maximum configuration: 1796 BTU per hour (0.53 kilowatts)
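The kilowatt figures in the heat output entry follow from the BTU-per-hour figures. BTU per hour is already a rate, so its equivalent is a power in kilowatts. The short Python sketch below shows the arithmetic; the constant and function names are illustrative.

```python
# One BTU per hour is approximately 0.29307107 watts.
WATTS_PER_BTU_PER_HOUR = 0.29307107


def btu_per_hour_to_kilowatts(btu_per_hour: float) -> float:
    """Convert a heat output in BTU per hour to kilowatts."""
    return btu_per_hour * WATTS_PER_BTU_PER_HOUR / 1000.0


for label, btu in (("Minimum configuration", 461), ("Maximum configuration", 1796)):
    print(f"{label}: {btu} BTU/hr = {btu_per_hour_to_kilowatts(btu):.2f} kW")
```

Rounded to two decimal places, the results are 0.14 kW and 0.53 kW, matching the values in the list.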
Server features
The unique design of the server takes advantage of advancements in symmetric
multiprocessing (SMP), data storage, and memory management. The server
combines:
v Impressive performance using an innovative approach to SMP
The server supports up to four Pentium III Xeon processors. The server comes
with at least one processor installed; you can install additional processors to
enhance performance and provide SMP capability.
v Large data-storage and hot-swap capabilities
All models of the server support up to three standard and three optional 26 mm
(1-inch) slim-high 3.5-inch hot-swap hard disk drives in the hot-swap bays. This
hot-swap feature enables you to remove and replace hard disk drives without
turning off the server.
v Active PCI (hot-plug) adapter capabilities
The server has six hot-plug slots for PCI adapters. With operating system
support, you can replace failing hot-plug PCI adapters without turning off the
server. If the hot-add feature is supported by the operating system and the PCI
adapter, you can also add PCI adapters in these slots without turning off the
server.
v Redundant cooling and power capabilities
The redundant cooling and hot-swap capabilities of the fans in the server enable
continued operation if one of the fans fails. You can also replace a failing fan
without turning off the server.
The server comes standard with one 270-watt power supply. Install three
270-watt power supplies to ensure redundancy and hot-swap capability for a
typical configuration. (See “Installing a hot-swap power supply” on page 70 for
instructions.)
v 100 MHz front-side bus (FSB)
The FSB is the processor external bus. This bus is the interface between the
processors and the system board. The FSB is also known as the processor/host
bus.
v Large system memory
The memory bus in the server supports up to 16 GB of system memory. The
memory controller provides error correcting code (ECC) support for up to 16
industry-standard, 3.3 V, 168-pin, 8-byte, PCI, PC100 registered, dual inline
memory modules (DIMMs). The memory controller also provides Chipkill™
memory protection. Chipkill memory protection is a technology that protects the
system from a single chip failure on a DIMM.
v System-management capabilities
The server comes with an Advanced System Management Processor on the
system board. This processor enables you to manage the functions of the server
locally and remotely. The Advanced System Management Processor also
provides system monitoring, event recording, and dial-out alert capability.
Note: The Advanced System Management Processor is sometimes referred to
as the service processor.
v Integrated network environment support
The server comes with an Ethernet controller on the system board. This Ethernet
controller has an interface for connecting to 10-Mbps or 100-Mbps networks. The
server automatically selects between 10BASE-T and 100BASE-TX. The controller
provides full-duplex (FDX) capability, which enables simultaneous transmission
and reception of data on the Ethernet local area network (LAN).
v Redundant network-interface card (NIC)
The addition of an optional, redundant network-interface card (NIC) provides a
failover capability to a redundant Ethernet connection. If a problem occurs with
the primary Ethernet connection, all Ethernet traffic associated with this primary
connection is automatically switched to the redundant NIC. This switching occurs
without data loss and without user intervention.
v IBM ServerGuide™ CDs
The ServerGuide CDs included with xSeries servers provide programs to help
you set up the server and install the network operating system (NOS). The
ServerGuide program detects the hardware options installed, and provides the
correct configuration program and device drivers. In addition, the ServerGuide
CDs include a variety of application programs such as IBM Update Connector™
to help keep the server basic input/output system (BIOS) and microcode
updated.
Note: The latest level of BIOS for the server is also available through the World
Wide Web. Refer to “Recovering BIOS” on page 20 for the appropriate
World Wide Web addresses and bulletin-board telephone numbers.
The server is designed to be cost-effective, powerful, and flexible. It uses peripheral
component interconnect (PCI) bus architecture to provide compatibility with a wide
range of existing hardware devices and software applications.
As always, the IBM server meets stringent worldwide certifications for power,
electromagnetic compatibility (EMC), and safety. See “Related service information”
on page 127 for additional information.
Reliability, availability, and serviceability
Three of the most important features in server design are reliability, availability, and
serviceability (RAS). These factors help to ensure the integrity of the data stored on
the server; that the server is available when you want to use it; and that should a
failure occur, you can easily diagnose and repair the failure with minimal
inconvenience.
The following is an abbreviated list of the RAS features that the server supports.
v Cooling fans with speed-sensing capability (hot-swap)
v Error correcting code (ECC) FSBs
v ECC L2 cache
v ECC memory
v Fast power-on self-test (POST)
v 45°C (113°F) normal operating temperature for hard disk drives
v Parity checking on the small computer system interface (SCSI) bus and PCI
buses
v Power Managed - Advanced Configuration and Power Interface (ACPI) level
v System management monitoring via Intra-Integrated Circuit (I2C) bus
v Ambient temperature monitoring
v Automatic error retry/recovery
v Automatic restart after a power failure
v Built-in temperature/fan/voltages monitoring
v Chipkill memory protection
v Fault-resistant startup
v Hot-swap drive bays
v Hot-swap hard disk drives
v Active PCI (hot-plug) adapter slots
v Information and diagnostic LED panels
v Menu-driven setup, system configuration, SCSISelect configuration, and
diagnostic programs
v Memory scrubbing and Predictive Failure Analysis® (PFA) (background and real
time)
v Microcode and diagnostic levels available
v NIC failover support
v Power and temperature monitoring
v Power-supply redundancy monitoring
v Predictive Failure Analysis (PFA) alerts
v Redundant Ethernet capabilities (with optional adapter)
v Redundant hot-swap cooling
v Redundant and hot-swap power supplies
v Remote Connect
v Remote system problem-determination support
v System auto-configuring from a configuration menu
v Upgradable POST, BIOS, diagnostics, and Advanced System Management
Processor microcode
v Wake on LAN® capability
v Windows NT® failover support
v Alert on LAN™ capability
v Backup BIOS switching by jumper
v Error codes and messages
v Integrated service processor subsystem provides control for remote system
management
v Processor serial number access
v Standard cables present detection
v System error logging (POST and Advanced System Management Processor)
v Vital Product Data (VPD) on microprocessors, system board, power supplies,
hot-swap-drive backplane, and power backplane
Start the server
Use the following procedure to start the server.
1. Turn on all external devices, such as the monitor.
Note: After you plug the power cord into an outlet, wait 20 seconds before
pressing the power control button. During this time, the
system-management processor is initializing and the power control button
does not respond.
2. Press the power control button on the front of the server. The power-on light
comes on and the power-on self-test (POST) begins.
v If the server is turned on and a power failure occurs, the server will start
automatically when power is restored.
v The server can also be turned on by the Advanced System Management
Processor.
When you turn off the server, observe the following precaution:
Statement 5
CAUTION:
The power control button on the device and the power switch on the power supply do
not turn off the electrical current supplied to the device. The device also might have
more than one power cord. To remove all electrical current from the device, ensure
that all power cords are disconnected from the power source.
The server can be turned off as follows:
v You can turn off the server by pressing the power-control button on the front of
the server.
Note: After turning off the server, wait at least five seconds before pressing the
power-control button to turn on the server again.
v You can disconnect the server power cords from the electrical outlets to shut off
all power to the server.
Note: Wait about 15 seconds after disconnecting the power cords for the system
to stop running. Watch for the system-power light on the information LED
panel to stop blinking.
The following section describes the controls and indicators on the server.
Controls and indicators
1 Power-control button: Press this button to manually turn on or off the
server.
2 Reset button: Press this button to reset the server and run the power-on
self-test (POST).
3 Hard-disk drive activity light: Each hot-swap drive has a hard-disk drive
activity light. When this green light is flashing, the drive is being accessed.
4 Hard-disk drive status light: Each hot-swap drive has a hard-disk drive
status light. With a ServeRAID™ installation, if this amber light is on
continuously, it means that the drive has failed.
Information LED panel
The information panel on the front of the server contains status lights.
The following illustration shows the server information panel.
(Illustration labels: POWER, RESET, SCSI ACT, LINK OK, TX/RX, 100 MB, INFO, SYS ERROR)
1 System power: When this green light is on, system power is present in the
server. When this light flashes, the server is in standby mode (the system
power supply is turned off and ac current is present). When this light is off,
either a power supply, ac power, or a light has failed. The power light is
located above and between the power-control button and the reset button.
Attention: If this light is off, it does not mean there is no electrical current
present in the server. The light might be burned out. To remove
all electrical current from the server, you must unplug the server
power cords from the electrical outlets.
2 Hard disk drive activity light: This green light is on when there is activity
on a hard disk drive.
3 Ethernet-link status light: When this green light is on, there is an active
connection on the Ethernet port. The Ethernet transmit/receive activity light
is also located on the Ethernet (RJ-45) connector on the rear of the server.
4 Information light: When this amber light is on, the server power supplies
are nonredundant or some other noncritical event has occurred. Check the
diagnostic LED panel for more information (see “Diagnostic panel LEDs” on
page 18).
5 System error light: This amber light is on when a system error occurs. A
light on the diagnostics LED panel will also be on to further isolate the error.
(For more information, see “Diagnostic panel LEDs” on page 18.)
6 Ethernet transmit/receive activity light: When this green light is on, there
is activity between the server and the network. The Ethernet
transmit/receive activity light is also located on the Ethernet (RJ-45)
connector on the rear of the server.
7 Ethernet speed 100 Mbps: When this green light is on, the Ethernet speed
is 100 Mbps. When the light is off, the Ethernet speed is 10 Mbps.
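The meaning of several information panel lights can be summarized in a short Python sketch. It is an illustration of the descriptions above, not server firmware; the function name, parameter names, and the three power-state strings are hypothetical.

```python
from typing import List


def describe_information_panel(system_power: str,
                               information: bool,
                               system_error: bool,
                               ethernet_100: bool) -> List[str]:
    """Translate observed light states into the meanings given above.

    system_power is "on", "flashing", or "off"; the other arguments are
    True when the corresponding light is on.
    """
    notes = []
    if system_power == "on":
        notes.append("System power is present")
    elif system_power == "flashing":
        notes.append("Server is in standby mode")
    elif system_power == "off":
        # The light being off does not prove that no current is present.
        notes.append("A power supply, ac power, or the light has failed")
    else:
        raise ValueError(f"unknown system power state: {system_power!r}")
    if information:
        notes.append("Power supplies are nonredundant or a noncritical "
                     "event occurred; check the diagnostic LED panel")
    if system_error:
        notes.append("System error; check the diagnostic LED panel")
    notes.append("Ethernet speed: 100 Mbps" if ethernet_100
                 else "Ethernet speed: 10 Mbps")
    return notes
```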
Diagnostics
This section provides basic troubleshooting information to help you resolve some
common problems that might occur with the server.
If you cannot locate and correct the problem using the information in this section,
refer to “Symptom-to-FRU index” on page 97 for more information.
Diagnostic tools overview
The following tools are available to help you identify and resolve hardware-related
problems:
v POST beep codes, error messages, and error logs
The power-on self-test (POST) generates beep codes and messages to indicate
successful test completion or the detection of a problem. See “POST” for more
information.
v Diagnostic programs and error messages
The server diagnostic programs are stored in upgradable read-only memory
(ROM) on the system board. These programs are the primary method of testing
the major components of the server. See “Diagnostic programs and error
messages” on page 13 for more information.
v Light Path Diagnostics
The server has light-emitting diodes (LEDs) to help you identify problems with
server components. These LEDs are part of the light-path diagnostics that are
built into the server. By following the path of lights, you can quickly identify the
type of system error that occurred. See “Light path diagnostics” on page 16 for
more information.
v Error symptoms
These charts list problem symptoms, along with suggested steps to correct the
problems. See “Diagnosing errors” on page 23 for more information.
POST
When you turn on the server, it performs a series of tests to check the operation of
server components and some of the options installed in the server. This series of
tests is called the power-on self-test or POST.
If POST finishes without detecting any problems, a single beep sounds and the first
screen of the operating system or application program appears.
If POST detects a problem, more than one beep sounds and an error message
appears on the screen. See “POST beep codes” on page 12 and “POST error
messages” on page 12 for more information.
Notes:
1. If you have a power-on password or administrator password set, you must type
the password and press Enter, when prompted, before POST will continue.
2. A single problem might cause several error messages. When this occurs, work
to correct the cause of the first error message. After you correct the cause of
the first error message, the other error messages usually will not occur the next
time you run the test.
POST beep codes
POST generates beep codes to indicate successful completion or the detection of a
problem.
v One beep indicates the successful completion of POST.
v More than one beep indicates that POST detected a problem. For more
information, see “Beep symptoms” on page 97.
POST error messages
POST error messages occur during startup when POST finds a problem with the
hardware or detects a change in the hardware configuration. For a list of POST
errors, see “POST error codes” on page 110.
Event/error logs
The POST error log contains the three most recent error codes and messages that
the system generated during POST. The System Event/Error Log contains all error
messages issued during POST and all system status messages from the Advanced
System Management Processor.
To view the contents of the error logs, start the Configuration/Setup Utility program
(see “Starting the Configuration/Setup Utility program” on page 33); then, select
Event/Error Logs from the main menu.
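The difference between the two logs can be pictured with a small Python model. This is a sketch under stated assumptions, not the server's implementation: the class and method names are hypothetical, and the error codes in the usage note are placeholders rather than real POST codes.

```python
from collections import deque


class ErrorLogs:
    """Toy model of the POST error log and the System Event/Error Log."""

    def __init__(self) -> None:
        # The POST error log keeps only the three most recent entries.
        self.post_error_log = deque(maxlen=3)
        # The System Event/Error Log keeps all POST errors and all
        # Advanced System Management Processor status messages.
        self.system_event_log = []

    def record_post_error(self, code: str, message: str) -> None:
        entry = (code, message)
        self.post_error_log.append(entry)    # oldest entry drops off when full
        self.system_event_log.append(entry)

    def record_asm_status(self, message: str) -> None:
        self.system_event_log.append(("ASM", message))
```

After four calls to `record_post_error` with placeholder codes "E1" through "E4", the POST error log holds only E2, E3, and E4, while the System Event/Error Log holds all four.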
Small computer system interface messages
If you receive a SCSI error message, see “SCSI error codes” on page 118.
Note: If the server does not have a hard disk drive, ignore any message that
indicates that the BIOS is not installed.
You will get these messages only when running the SCSISelect Utility.
ServerGuide error symptoms
Look for the symptom in the following chart. The probable solution appears under
Action for that symptom.
Setup
Symptom: Setup and Installation CD won’t start.
Action:
v Be sure the system is a supported eServer with a startable
(bootable) CD-ROM drive.
v If the startup (boot) sequence settings have been altered, be sure
the CD-ROM is first in the boot sequence.
v If more than one CD-ROM drive is installed, be sure that only one
drive is set as the primary drive. Start the CD from the primary
drive.
Symptom: ServeRAID program cannot view all installed
drives – or – cannot install NOS.
Action:
v Be sure there are no duplicate SCSI IDs or IRQ assignments.
v Be sure that the hard disk drive is connected properly.
Symptom: The Operating System Installation program
continuously loops.
Action: Free up more space on the hard disk.
Symptom: ServerGuide won’t start your NOS CD.
Action: Be sure the NOS CD you have is supported by ServerGuide. See the
Setup and Installation CD label for a list of NOS versions supported.
Symptom: Can’t install NOS – option is grayed out.
Action: Either there is no logical drive defined (ServeRAID systems) or the
ServerGuide system partition is not present. Run the setup and
configuration program.
TechConnect CD
Symptom: Can’t start TechConnect® CD.
Action: Be sure you’re starting the CD on a system with Microsoft
Windows® installed.
Symptom: Can’t view publications from TechConnect CD,
or text is unreadable.
Action: Be sure you have the Adobe reader installed (available from the
TechConnect CD).
Diskette Factory CD
Symptom: Get “time out” or “Unknown host” errors.
Action: Be sure you have access to the Internet through FTP directly.
Diagnostic programs and error messages
The server diagnostic programs are stored in upgradable read-only memory (ROM)
on the system board. These programs are the primary method of testing the major
components of the server.
Diagnostic error messages indicate that a problem exists; they are not intended to
be used to identify a failing part. Troubleshooting and servicing of complex
problems that are indicated by error messages should be performed by trained
service personnel.
Sometimes the first error to occur causes additional errors. In this case, the server
displays more than one error message. Always follow the suggested action
instructions for the first error message that appears.
The following sections contain the error codes that might appear in the detailed test
log and summary log when running the diagnostic programs.
The error code format is as follows:
fff-ttt-iii-date-cc-text message
where:
fff
is the three-digit function code that indicates the function being
tested when the error occurred. For example, function code 089 is
for the microprocessor.
ttt
is the three-digit failure code that indicates the exact test failure that
was encountered.
iii
is the three-digit device ID.
date
is the date that the diagnostic test was run and the error recorded.
cc
is the check digit that is used to verify the validity of the information.
text message
is the diagnostic message that indicates the reason for the problem.
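The error code layout can be illustrated with a small Python parser. This is a sketch under stated assumptions, not an IBM tool: it assumes the date field contains no hyphens (the manual does not state the date format), and the class name, function name, and sample line are hypothetical. Only the meaning of function code 089 comes from this manual.

```python
from dataclasses import dataclass


@dataclass
class DiagnosticError:
    function_code: str   # fff
    failure_code: str    # ttt
    device_id: str       # iii
    date: str            # format not specified in the manual
    check_digit: str     # cc
    text: str            # text message


def parse_error_code(line: str) -> DiagnosticError:
    """Split 'fff-ttt-iii-date-cc-text message' into its six fields."""
    # Split on the first five hyphens only, so hyphens inside the
    # text message are preserved. Assumes the date has no hyphens.
    parts = line.strip().split("-", 5)
    if len(parts) != 6:
        raise ValueError(f"expected 6 fields, got {len(parts)}")
    fff, ttt, iii, date, cc, text = parts
    for name, value in (("function code", fff),
                        ("failure code", ttt),
                        ("device ID", iii)):
        if len(value) != 3 or not value.isdigit():
            raise ValueError(f"{name} must be three digits: {value!r}")
    return DiagnosticError(fff, ttt, iii, date, cc, text)


# Invented sample line; every field value other than 089 is a placeholder.
sample = "089-000-000-20030901-00-Microprocessor: Failed (placeholder)"
print(parse_error_code(sample))
```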
Text messages
The diagnostic text message format is as follows:
Function Name: Result (test specific string)
where:
Function Name
is the name of the function being tested when the error occurred. This
corresponds to the function code (fff) given in the previous list.
Result
can be one of the following:
Passed
This result occurs when the diagnostic test completes without any
errors.
Failed
This result occurs when the diagnostic test discovers an error.
User Aborted
This result occurs when you stop the diagnostic test before it is
complete.
Not Applicable
This result occurs when you specify a diagnostic test for a device
that is not present.
Aborted
This result occurs when the test could not proceed because of the
system configuration.
Warning
This result occurs when a possible problem is reported during the
diagnostic test, such as when a device that is to be tested is not
installed.
Test Specific String
This is additional information that you can use to analyze the problem.
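A text message in this format can be split into its parts with a regular expression. The Python sketch below is illustrative: it assumes the test specific string appears in parentheses exactly as the format line shows, and the names `TextMessage` and `parse_text_message` are hypothetical.

```python
import re
from typing import NamedTuple, Optional


class TextMessage(NamedTuple):
    function_name: str
    result: str
    test_specific: Optional[str]


# The six result values listed above.
_RESULTS = ("Passed", "Failed", "User Aborted",
            "Not Applicable", "Aborted", "Warning")

_PATTERN = re.compile(
    r"^(?P<function>.+?):\s*"
    r"(?P<result>" + "|".join(re.escape(r) for r in _RESULTS) + r")"
    r"(?:\s*\((?P<detail>.*)\))?\s*$"
)


def parse_text_message(message: str) -> TextMessage:
    """Parse 'Function Name: Result (test specific string)'."""
    match = _PATTERN.match(message.strip())
    if match is None:
        raise ValueError(f"not a diagnostic text message: {message!r}")
    return TextMessage(match.group("function"),
                       match.group("result"),
                       match.group("detail"))
```

The test specific string is optional in this sketch, so a message with no parenthesized part parses with `test_specific` set to `None`.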
Starting the diagnostic programs
You can press F1 while running the diagnostic programs to obtain Help information.
You also can press F1 from within a help screen to obtain online documentation
from which you can select different categories. To exit Help and return to where you
left off, press Esc.
To start the diagnostic programs:
1. Turn on the server and watch the screen.
Note: To run the diagnostic programs, you must start the server with the
highest level password that is set. That is, if an administrator password is
set, you must enter the administrator password, not the power-on
password, to run the diagnostic programs.
2. When the message F2 for Diagnostics appears, press F2.
3. Type in the appropriate password; then, press Enter.
4. Select either Extended or Basic from the top of the screen.
5. When the Diagnostic Programs screen appears, select the test you want to run
from the list that appears; then, follow the instructions on the screen.
Notes:
a. If the server stops during testing and you cannot continue, restart the server
and try running the diagnostic programs again.
b. The keyboard and mouse (pointing device) tests assume that a keyboard
and mouse are attached to the server.
c. If you run the diagnostic programs with no mouse attached to the server,
you will not be able to navigate between test categories using the Next Cat
and Prev Cat buttons. All other functions provided by mouse-selectable
buttons are also available using the function keys.
d. You can run the USB interface test and the USB external loopback test only
if there are no USB devices attached.
e. You can view server configuration information (such as system configuration,
memory contents, interrupt request (IRQ) use, direct memory access (DMA)
use, device drivers, and so on) by selecting Hardware Info from the top of
the screen.
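The password rule in the note to step 1 (use the highest level password that is set) can be expressed as a short Python function. The function name and return values are hypothetical and serve only to illustrate the rule.

```python
from typing import Optional


def required_diagnostics_password(power_on_set: bool,
                                  administrator_set: bool) -> Optional[str]:
    """Return which password must be entered to run the diagnostic programs."""
    # The administrator password outranks the power-on password.
    if administrator_set:
        return "administrator"
    if power_on_set:
        return "power-on"
    return None  # no password is set


print(required_diagnostics_password(power_on_set=True, administrator_set=True))
```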
When the tests have completed, you can view the Test Log by selecting Utility from
the top of the screen.
If the hardware checks out OK but the problem persists during normal server
operations, a software error might be the cause. If you suspect a software problem,
refer to the information that comes with the software package.
Viewing the test log
The test log will not contain any information until after the diagnostic program has
run.
Note: If you already are running the diagnostic programs, begin with step 3.
To view the test log:
1. Turn on the server and watch the screen.
If the server is on, shut down the operating system and restart the server.
2. When the message F2 for Diagnostics appears, press F2.
If a power-on password or administrator password is set, the server prompts
you for it. Type in the appropriate password; then, press Enter.
3. When the Diagnostic Programs screen appears, select Utility from the top of
the screen.
4. Select View Test Log from the list that appears; then, follow the instructions on
the screen.
The system maintains the test-log data while the server is powered on. When
you turn off the power to the server, the test log is cleared.
Diagnostic error message tables
For descriptions of the error messages that might appear when you run the
diagnostic programs, see “Diagnostic error codes” on page 103. If diagnostic error
messages appear that are not listed in those tables, make sure that the server has
the latest levels of BIOS, Advanced System Management Processor, ServeRAID,
and diagnostics microcode installed.
Diagnostics15
Light path diagnostics
The server has LEDs to help you identify problems with some server components.
These LEDs are part of the light path diagnostics built into the server. By following
the path you can quickly identify the type of system error that occurred.
Status LEDs are located on the following components:
• Information panel
• Hard disk drive trays
• Power supply
• Diagnostic panel
• System board
Power supply LEDs
The ac and dc power LEDs on the power supply provide status information about
the power supply. See “Installing a hot-swap power supply” on page 70 for the
location of these LEDs.
1  Filler panel
2  AC power light
3  DC power light
4  Power supply handle
5  Power supply
The following table describes the ac and dc power LEDs.
AC power LED   DC power LED   Description and action
On             On             The power supply is on and operating correctly.
On             Off            There is a dc power problem.
                              Possible causes:
                              1. The server is not turned on (the power LED is
                                 blinking on the front of the server).
                                 Action: Press the power-control button to start
                                 the server.
                              2. The power supply has failed.
                                 Action: Replace the power supply.
Off            Off            There is an ac power problem.
                              Possible causes:
                              1. There is no ac power to the power supply.
                                 Actions: Verify that:
                                 • The electrical cord is properly connected to
                                   the server.
                                 • The electrical outlet functions properly.
                              2. The power supply has failed.
                                 Action: Replace the power supply.
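The table above is a two-input lookup, and the following minimal Python sketch restates it in that form as a reading aid. The names `POWER_SUPPLY_LED_TABLE` and `diagnose_power_supply` are hypothetical and are not part of any IBM tool. The combination of ac LED off and dc LED on does not appear in the table, so the sketch reports it as not covered.

```python
# Illustrative restatement of the ac/dc power LED table.
# Keys are (ac_led_on, dc_led_on); all names here are hypothetical.
POWER_SUPPLY_LED_TABLE = {
    (True, True): (
        "The power supply is on and operating correctly.",
        [],
    ),
    (True, False): (
        "There is a dc power problem.",
        [
            ("The server is not turned on (front power LED is blinking).",
             "Press the power-control button to start the server."),
            ("The power supply has failed.",
             "Replace the power supply."),
        ],
    ),
    (False, False): (
        "There is an ac power problem.",
        [
            ("There is no ac power to the power supply.",
             "Verify that the electrical cord is properly connected to the "
             "server and that the electrical outlet functions properly."),
            ("The power supply has failed.",
             "Replace the power supply."),
        ],
    ),
}


def diagnose_power_supply(ac_led_on, dc_led_on):
    """Return (description, [(possible cause, action), ...]) for the LED states."""
    not_covered = ("This LED combination is not covered by the table.", [])
    return POWER_SUPPLY_LED_TABLE.get((ac_led_on, dc_led_on), not_covered)


if __name__ == "__main__":
    description, causes = diagnose_power_supply(ac_led_on=True, dc_led_on=False)
    print(description)
    for cause, action in causes:
        print(" -", cause, "Action:", action)
```

The possible causes are kept in the same order as in the table, so the simpler check (the server is not turned on) is listed before the replacement action.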
Diagnostic panel LEDs
The following illustration shows the LEDs on the diagnostics panel inside the server.
See Table 1 on page 19 for information on identifying problems using these LEDs.
CPU                 Microprocessor fault
Memory              Memory fault
PCI Bus A (PCIA)    PCI bus A fault
PCI Bus B (PCIB)    PCI bus B fault
PCI Bus C (PCIC)    PCI bus C fault
PCI Bus D (PCID)    Not implemented at this time
Power supply 1      Power supply number 1 failure
Power supply 2      Power supply number 2 failure
Power supply 3      Power supply number 3 failure
FAN                 Fan failure
DASD                Hard disk drive fault
NMI                 Nonmaskable interrupt
SP Bus              Service processor failure
Event Log           Not implemented at this time
NON RED             Nonredundant power mode
OVER SPEC           Over specification
TEMP                System temperature failure
Notes:
1. The server does not support replaceable voltage regulator modules (VRMs).
2. The server supports a maximum of three PCI buses.
3. The server supports a maximum of three power supplies.
Light Path Diagnostics
You can use the light path diagnostics built into the server to quickly identify the
type of system error that occurred. The server is designed so that LEDs remain
illuminated when the server shuts down, as long as the power supplies are
operating properly. This feature helps you to isolate the problem if an error causes
the server to shut down.
If the system error LED (on the information LED panel) is not lit and no diagnostics
panel LEDs are lit, it means that the light path diagnostics have not detected a
system error.
If the system error LED (on the information LED panel) is lit, it means that a system
error was detected. Check to see which of the LEDs on the diagnostics panel inside
the server are lit and refer to the following table:
Table 1. Light Path Diagnostics
LED on      Cause
None        1. The system error log is 75% or more full or a PFA alert was
               logged. (See “Diagnostic panel error LEDs” on page 101)
            2. Bad, missing, or mis-installed processor terminator.
               (See “Diagnostic panel error LEDs” on page 101)
CPU         One of the microprocessors has failed. (See “Diagnostic panel
            error LEDs” on page 101)
Memory      A memory error occurred. (See “Diagnostic panel error LEDs” on
            page 101)
PCIA        An error occurred on PCI bus A. An adapter in PCI slot 1, or the
            system board, caused the error.
PCIB        An error occurred on PCI bus B. An adapter in PCI slot 2, 3, or 4,
            or the system board, caused the error. (See “Diagnostic panel
            error LEDs” on page 101)
PCIC        An error occurred on PCI bus C. An adapter in PCI slot 5 or 6, or
            the system board, caused the error. (See “Diagnostic panel error
            LEDs” on page 101)
PCID        Not implemented at this time.
PS1         The first power supply has failed. (See “Diagnostic panel error
            LEDs” on page 101)
PS2         The second power supply has failed. (See “Diagnostic panel error
            LEDs” on page 101)
PS3         The third power supply has failed. (See “Diagnostic panel error
            LEDs” on page 101)
Fan         One of the fan assemblies has failed or is operating too slowly.
            Note: A failing fan can also cause the TEMP and/or DASD LEDs to
            be on; see “Diagnostic panel error LEDs” on page 101.
DASD        A hot-swap hard disk drive has failed on SCSI channel B (see
            “Diagnostic panel error LEDs” on page 101).
NMI         A nonmaskable interrupt occurred. (The PCIA, PCIB, PCIC, or
            Memory LED will probably also be on; see “Diagnostic panel error
            LEDs” on page 101.)
            Note: The NMI LED can only be reset by completely removing power
            from the system.
SP          The service processor has failed. (See “Diagnostic panel error
            LEDs” on page 101)
Event Log   Not implemented at this time.
Non Red     System is operating in non-redundant power mode. (See “Diagnostic
            panel error LEDs” on page 101)
Over Spec   The server is drawing more power than the power supplies are
            rated for. (See “Diagnostic panel error LEDs” on page 101)
Temp        The system temperature has exceeded the maximum rating. (See
            “Diagnostic panel error LEDs” on page 101)
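As a reading aid for the PCI rows of Table 1, the following minimal Python sketch restates the slot-to-bus mapping given there (slot 1 on bus A, slots 2 through 4 on bus B, slots 5 and 6 on bus C). The names `PCI_BUS_SLOTS` and `suspect_pci_slots` are hypothetical and are not part of any IBM utility. According to the table the system board is also a possible cause, so the returned slots are only the adapter candidates.

```python
# Illustrative mapping from a lit PCI bus LED to the adapter slots on that bus,
# as listed in Table 1. All names here are hypothetical.
PCI_BUS_SLOTS = {
    "PCIA": (1,),
    "PCIB": (2, 3, 4),
    "PCIC": (5, 6),
}


def suspect_pci_slots(lit_leds):
    """Return the PCI slot numbers implicated by the lit PCI bus LEDs.

    LEDs other than PCIA, PCIB, and PCIC are ignored. The system board
    remains a possible cause in every case.
    """
    slots = []
    for led in ("PCIA", "PCIB", "PCIC"):
        if led in lit_leds:
            slots.extend(PCI_BUS_SLOTS[led])
    return slots


if __name__ == "__main__":
    # Example: the NMI and PCIB LEDs are lit together.
    print(suspect_pci_slots({"NMI", "PCIB"}))
```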
Power checkout
Power problems can be difficult to troubleshoot. For instance, a short circuit can
exist anywhere on any of the power distribution busses. Usually a short circuit will
cause the power subsystem to shut down because of an overcurrent condition.
A general procedure for troubleshooting power problems is as follows:
1. Power off the system and disconnect the AC cord(s).
2. Check for loose cables in the power subsystem. Also check for short circuits;
for instance, a loose screw can cause a short circuit on a circuit board.
3. Remove adapters and disconnect the cables and power connectors to all
internal and external devices until the system is at the minimum configuration
required for power on (see “Minimum operating requirements” on page 122).
4. Reconnect the AC cord and power on the system. If the system powers up
successfully, reinstall adapters and devices one at a time until the problem is
isolated. If the system does not power up from the minimum configuration,
replace the FRUs of the minimum configuration one at a time until the problem
is isolated.
To use this method, you must know the minimum configuration required for the
system to power up (see “Minimum operating requirements” on page 122). For
specific problems, see “Power error messages” on page 119.
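The isolation logic in the procedure above can be sketched in a few lines of Python. This is only an illustration of the reasoning, not a tool: `isolate_power_fault` and the `powers_up` callback are hypothetical names, and `powers_up` stands in for the servicer's own observation of whether the system powers on with a given set of parts installed.

```python
from typing import Callable, Iterable, List, Optional


def isolate_power_fault(
    minimum_config: List[str],
    removed_devices: Iterable[str],
    powers_up: Callable[[List[str]], bool],
) -> Optional[str]:
    """Mirror the power checkout procedure.

    Returns "minimum configuration" if the system does not power up with
    only the minimum configuration installed (its FRUs must then be swapped
    one at a time), the name of the first reinstalled device after which the
    system no longer powers up, or None if the fault is not reproduced.
    """
    installed = list(minimum_config)
    if not powers_up(installed):
        return "minimum configuration"
    for device in removed_devices:
        installed.append(device)  # reinstall one device at a time
        if not powers_up(installed):
            return device
    return None


if __name__ == "__main__":
    # Simulated observation: the system fails whenever "adapter-2" is installed.
    def simulated(config: List[str]) -> bool:
        return "adapter-2" not in config

    print(isolate_power_fault(
        ["system board", "power supply"],
        ["adapter-1", "adapter-2", "drive-1"],
        simulated,
    ))
```

Reinstalling one device per power-on is what makes the result unambiguous: the last device added before the failure is the suspect.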
Recovering BIOS
If the BIOS code in the server has become corrupted, such as from a power failure
during a flash update, you can recover the BIOS using the recovery boot block and
a BIOS flash diskette.
Note: You can obtain a BIOS flash diskette from one of the following sources:
• Use the ServerGuide program to make a BIOS flash diskette.
• Download a BIOS flash diskette from the World Wide Web. Go to
http://www.pc.ibm.com/support/, select IBM Server Support, and make the
selections for the server.
The flash memory of the server consists of a primary page and a backup page. The
J14 jumper controls which page is used to start the server. If the BIOS in the
primary page is corrupted, you can use the backup page to start the server; then
boot the BIOS Flash Diskette to restore the BIOS to the primary page.
To recover the BIOS:
1. Turn off the server and peripheral devices and disconnect all external cables
and power cords; then, remove the cover.
2. Locate jumper J14 on the processor board (see “System board jumpers” on
page 44).
3. Move J14 to pins 1 and 2 to enable the secondary boot block page.