IBM Netfinity 5600 - Type 8664
Models 11Y, 1RY, 21Y, 2RY, 31Y, 3RY
Hardware Maintenance Manual
November 1999
We Want Your Comments!
(Please see page 262)
S09N-1595-00
IBM Netfinity Servers
Note
Before using this information and the product it
supports, be sure to read the general information
under “Notices” on page 266.
First Edition (November 1999)
The following paragraph does not apply to the United
Kingdom or any country where such provisions are
inconsistent with local law: INTERNATIONAL
BUSINESS MACHINES CORPORATION PROVIDES THIS
PUBLICATION “AS IS” WITHOUT WARRANTY OF ANY
KIND, EITHER EXPRESS OR IMPLIED, INCLUDING, BUT
NOT LIMITED TO, THE IMPLIED WARRANTIES OF
MERCHANTABILITY OR FITNESS FOR A PARTICULAR
PURPOSE. Some states do not allow disclaimer of
express or implied warranties in certain transactions,
therefore, this statement may not apply to you.
This publication could include technical inaccuracies or
typographical errors. Changes are periodically made to
the information herein; these changes will be incorporated
in new editions of the publication. IBM may make
improvements and/or changes in the product(s) and/or the
program(s) described in this publication at any time.
This publication was developed for products and services
offered in the United States of America. IBM may not offer
the products, services, or features discussed in this
document in other countries, and the information is subject
to change without notice. Consult your local IBM
representative for information on the products, services,
and features available in your area.
Requests for technical information about IBM products
should be made to your IBM reseller or IBM marketing
representative.
Copyright International Business Machines
Corporation 1997, 1999. All rights reserved.
Note to U.S. Government users–Documentation related to
Restricted rights–Use, duplication, or disclosure is subject
to restrictions set forth in GSA ADP Schedule Contract
with IBM Corp.
iiNetfinity Server HMM
About this supplement
This supplement contains diagnostic information,
Symptom-to-FRU Indexes, service information, error
codes, error messages, and configuration information for
the Netfinity 5600 - Type 8664.
Important
This manual is intended for trained servicers who are
familiar with IBM PC Server products.
Important safety information
Be sure to read all caution and danger statements in this
book before performing any of the instructions.
Leia todas as instruções de cuidado e perigo antes de
executar qualquer operação.
Prenez connaissance de toutes les consignes de type
Attention et Danger avant de procéder aux opérations
décrites par les instructions.
Lesen Sie alle Sicherheitshinweise, bevor Sie eine
Anweisung ausführen.
Accertarsi di leggere tutti gli avvisi di attenzione e di
pericolo prima di effettuare qualsiasi operazione.
Lea atentamente todas las declaraciones de precaución y
peligro antes de llevar a cabo cualquier operación.
Online support
Use the World Wide Web (WWW) to download Diagnostic,
BIOS Flash, and Device Driver files.
General checkout
The server diagnostic programs are stored in upgradable
read-only memory (ROM) on the system board. These
programs are the primary method of testing the major
components of the server: the system board, Ethernet
controller, video controller, RAM, keyboard, mouse
(pointing device), diskette drive, serial port, and parallel
port. You can also use them to test some external
devices. See “Diagnostic programs” on page 10.
Also, if you cannot determine whether a problem is caused
by the hardware or by the software, you can run the
diagnostic programs to confirm that the hardware is
working properly.
When you run the diagnostic programs, a single problem
might cause several error messages. When this occurs,
work to correct the cause of the first error message. After
the cause of the first error message is corrected, the other
error messages might not occur the next time you run the
test.
A failed system might be part of a shared DASD cluster
(two or more systems sharing the same external storage
device(s)). Prior to running diagnostics, verify that the
failing system is not part of a shared DASD cluster.
A system might be part of a cluster if:
The customer identifies the system as part of a
cluster.
One or more external storage units are attached to
the system and at least one of the attached storage
units is additionally attached to another system or
unidentifiable source.
One or more systems are located near the failing
system.
If the failing system is suspected to be part of a shared
DASD cluster, all diagnostic tests can be run except
those that test the storage unit (DASD
residing in the storage unit) or the storage adapter
attached to the storage unit.
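The cluster indicators and the test restriction above can be summarized as a short decision sketch. This is an illustration only and is not part of the IBM diagnostic programs; the function names, parameter names, and test category names are hypothetical.

```python
# Hypothetical category names, used only for this illustration.
STORAGE_TESTS = {"storage unit", "storage adapter"}


def might_be_in_cluster(customer_says_cluster, shared_external_storage,
                        systems_nearby):
    """Return True if any of the three indicators listed above applies."""
    return bool(customer_says_cluster or shared_external_storage
                or systems_nearby)


def allowed_tests(all_tests, suspected_cluster):
    """Drop storage-related tests when a shared DASD cluster is suspected."""
    if not suspected_cluster:
        return list(all_tests)
    return [t for t in all_tests if t not in STORAGE_TESTS]


if __name__ == "__main__":
    suspected = might_be_in_cluster(False, True, False)
    print(allowed_tests(["memory", "storage unit", "serial port"], suspected))
```

Any single indicator is enough to treat the system as a possible cluster member, which is why the check uses a logical OR.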
Notes
1. For systems that are part of a shared DASD
cluster, run one test at a time in looped mode.
Do not run all tests in looped mode, as this could
enable the DASD diagnostic tests.
2. If multiple error codes are displayed, diagnose
the first error code displayed.
3. If the computer hangs with a POST error, go to
the “Symptom-to-FRU index” on page 194.
4. If the computer hangs and no error is displayed,
go to “Undetermined problems” on page 215.
5. For power supply problems, see “Symptom-to-FRU
index” on page 194.
6. For safety information, see “Safety information” on
page 230.
7. For intermittent problems, check the error log;
see “POST Error Log” on page 36.
001
IS THE SYSTEM PART OF A CLUSTER?
Yes: continue at Step 003. No: continue at Step 002.
002
Go to Step 004.
003
Schedule maintenance with the customer. Shut down all
systems related to the cluster. Run storage test.
004
– Power off the computer and all external devices.
– Check all cables and power cords.
– Set all display controls to the middle position.
– Power on all external devices.
– Power on the computer.
– Record any POST error messages displayed on the
screen. If an error is displayed, look up the first error in
the “Symptom-to-FRU index” on page 194.
– Check the information LED panel System Error LED; if
on, see “Light path diagnostics” on page 13.
– Check the System Error Log. If an error was recorded
by the system, see “Symptom-to-FRU index” on
page 194.
– Start the Diagnostic Programs. See “Running diagnostic
programs” on page 10.
– Check for the following responses:
1. One beep.
2. Readable instructions or the Main Menu.
DID YOU RECEIVE BOTH OF THE CORRECT
RESPONSES?
Yes: continue at Step 006. No: continue at Step 005.
005
Find the failure symptom in “Symptom-to-FRU index”
on page 194. Or, use remote video mode to
monitor and access POST or to look at the System
Error Log.
006
– Run the Diagnostic Programs. If necessary, refer to
“Running diagnostic programs” on page 10.
If you receive an error, go to “Symptom-to-FRU index”
on page 194.
If the diagnostics completed successfully and you still
suspect a problem, see “Undetermined problems” on
page 215.
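The checkout steps 001 through 006 can be read as a small decision procedure. The sketch below is an illustration only. It assumes the usual reading of the flow, in which the No branch leads to the next-numbered step and the Yes branch to the one after it. The function name, its parameters, and the final "No problem found" result are hypothetical and do not appear in the procedure itself.

```python
def general_checkout(in_cluster, one_beep, readable_menu,
                     diagnostics_error=False, still_suspect=False):
    """Return the next action suggested by the general checkout flow."""
    if in_cluster:
        # Step 003: the system is part of a cluster.
        return ("Schedule maintenance, shut down all cluster systems, "
                "run storage test")
    # Step 004: power cycle, check cables, record POST errors,
    # then start the diagnostic programs and check the two responses.
    if not (one_beep and readable_menu):
        # Step 005: one or both expected responses were missing.
        return "See Symptom-to-FRU index"
    # Step 006: run the diagnostic programs.
    if diagnostics_error:
        return "See Symptom-to-FRU index"
    if still_suspect:
        return "See Undetermined problems"
    return "No problem found"  # hypothetical label, not from the manual


if __name__ == "__main__":
    print(general_checkout(False, True, True, still_suspect=True))
```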
Diagnostic tools
The following tools are available to help identify and
resolve hardware-related problems:
Diagnostic programs
Power-on self-test (POST)
POST beep codes
Error messages
System error log
Option diskettes
Light path diagnostics
Diagnostic programs
The server diagnostic programs are stored in upgradable
read-only memory (ROM) on the system board. These
programs are the primary method of testing the major
components of your server, such as the system board,
Ethernet controller, video controller, RAM, keyboard,
mouse (pointing device), diskette drive, serial port, and
parallel port. You can also use them to test some external
devices.
Also, if you cannot determine whether a problem is caused
by the hardware or by the software, you can run the
diagnostic programs to confirm that the hardware is
working properly.
Note
When you run the diagnostic programs, a single
problem might cause several error messages. When
this occurs, work to correct the cause of the first error
message. After the cause of the first error message is
corrected, the other error messages might not occur
the next time you run the test.
Running diagnostic programs
While you are running the diagnostic programs, pressing F1
displays Help information. Pressing F1 from within a help
screen provides online documentation from which you
can select different categories. Pressing Esc exits Help
and returns you to where you left off.
Important
If you run the diagnostic programs with either no
mouse or a USB mouse attached to your server, you
will not be able to navigate between test categories
using the Next Cat and Prev Cat buttons. All other
functions provided by mouse-selectable buttons are
also available using the function keys.
You can test the USB keyboard using the regular
keyboard test. The regular mouse test cannot test a
USB mouse. Also, you can run the USB hub test only
if there are no USB devices attached.
Notes
1. To run the diagnostic programs, you must start
the server with the highest level password that is
set. That is, if an administrator password is set,
you must enter the administrator password, not
the power-on password, to run the diagnostic
programs.
2. If the server stops during testing and you cannot
continue, restart the server and try running the
diagnostic programs again. If the problem
persists, go to “Undetermined problems” on
page 215.
3. If the diagnostic tests do not find a problem but
the problem persists during normal operations,
see “Symptom-to-FRU index” on page 194 and
look for the problem symptom.
4. You might have to install a wrap connector on
your active parallel, serial, or Ethernet port to
obtain accurate test results for these ports. If
you do not have a wrap connector, contact your
IBM reseller or IBM marketing representative.
5. You might need a scratch diskette (that is, a
diskette which has no contents that you want to
save) to obtain accurate test results when testing
the diskette drive.
6. The keyboard and mouse (pointing device) tests
assume that a keyboard and mouse are attached
to the server.
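The password rule in note 1 can be sketched as follows. This is an illustration only; the function name and its return values are hypothetical, and the sketch assumes that the administrator password is the highest level and the power-on password the next level, as the note implies.

```python
def password_needed_for_diagnostics(power_on_set, administrator_set):
    """Return which password must be entered to run the diagnostics.

    The highest-level password that is set is the one required.
    Returns None when no password is set.
    """
    if administrator_set:
        return "administrator"
    if power_on_set:
        return "power-on"
    return None


if __name__ == "__main__":
    # Both passwords are set, so the administrator password is required.
    print(password_needed_for_diagnostics(True, True))
```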
To start the diagnostic programs:
1. Turn on the server and watch the screen.
If the server is turned on already, shut down your
operating system and restart the server.
2. When the message F2 for Diagnostics appears,
press F2.
If a power-on password or administrator password is
set, the server prompts you for it. Type in the
appropriate password; then, press Enter.
3. The Diagnostic Programs screen appears.
4. Select either Extended or Basic from the top of the
screen.
5. Select the test you want to run from the list that
appears; then, follow the instructions on the screen.
When the tests have completed, you can view the
Test Log by selecting Utility from the top of the
screen.
Also, you can view server configuration information
(such as system configuration, memory contents,
interrupt request (IRQ) use, direct memory access
(DMA) use, device drivers, and so on) by selecting
Hardware Info from the top of the screen.
If the hardware checks out OK but the problem persists
during normal server operations, a software error might be
the cause. If you suspect a software problem, refer to the
information that comes with the software package.
Viewing the test log
If you are already running the diagnostic programs,
continue with step 4 in this procedure.
Notes
1. The test log will not contain any information until
after the diagnostic program has run.
2. The test log is maintained in memory while the
server is powered on. Turning off the power
clears the test log.
To view the Test Log:
1. Turn on the server and watch the screen.
If the server is turned on already, shut down your
operating system and restart the server.
2. When the message F2 for Diagnostics appears,
press F2.
If a power-on password or administrator password is
set, the server prompts you for it. Type in the
appropriate password; then, press Enter.
3. The Diagnostic Programs screen appears.
4. Select Utility from the top of the screen.
5. Select View Test Log from the list that appears; then,
follow the instructions on the screen.
Power-on self-test (POST)
When you turn on the server, it performs a series of tests
to check the operation of server components and some of
the options installed in the server. This series of tests is
called the power-on self-test or POST.
POST does the following:
Checks some basic system-board operations
Checks the memory
Compares the current server configuration with the
stored server configuration information
Configures PCI adapters
Starts the video operation
Verifies that drives (such as the diskette, CD-ROM,
and hard disk drives) are connected properly
If you have a power-on password or administrator
password set, you must type the password and press
Enter, when prompted, before POST will continue.
While the memory is being tested, the amount of available
memory appears on the screen. These numbers advance
as the server progresses through POST and the final
number that appears on the screen represents the total
amount of memory available. If POST finishes without
detecting any problems, a single beep sounds, the first
screen of your operating system or application program
appears, and the System POST Complete (OK) light is
illuminated on the operator information panel.
If POST detects a problem, more than one beep sounds
and an error message appears on your screen.
Note
A single problem might cause several error messages.
When this occurs, work to correct the cause of the first
error message. After the cause of the first error
message is corrected, the other error messages
usually will not occur the next time you run the test.
POST beep codes
POST generates beep codes to indicate successful
completion or the detection of a problem.
One beep indicates the successful completion of
POST.
More than one beep indicates that POST detected a
problem. For more information, see “Beep
symptoms” on page 197.
Light path diagnostics
You can use the light path diagnostics built into your
server to quickly identify the type of system error that
occurred. Your server is designed so that any LEDs that
are illuminated remain illuminated when the server shuts
down as long as the AC power source is good and the
power supplies can supply +5V dc current to the server.
This feature helps you isolate the problem if an error
causes the server to shut down. See Table 1 on page 14.
Table 1. Light path diagnostics

System Error LED (information LED panel): On
A system error was detected. Check to see which of the LEDs
on the diagnostics panel inside the server are on.

Lit diagnostics panel LED: None
Cause: The system error log is 75% or more full or a PFA
alert was logged.
Action: Check the system error log and correct any problems.
See “POST Error Log” on page 36 for information about
clearing the error log. Disconnecting the server from all
power sources for at least 20 seconds will turn off the
System Error LED.

Lit diagnostics panel LED: CPU
Cause: One of the microprocessors has failed or a
microprocessor is installed in the wrong connector.
Action:
1. Check the Microprocessor Error LEDs on the processor
board.
2. Turn off the server, reseat the microprocessor indicated
by the lit Microprocessor Error LED, and restart the server.
3. If the problem persists, replace the microprocessor.

Lit diagnostics panel LED: VRM
Cause: One of the voltage regulator modules on the processor
board has failed.
Action:
1. Check the VRM Error LEDs on the processor board.
2. Turn off the server, reseat the VRM indicated by the lit
VRM Error LED, and restart the server.
3. If the problem persists, replace the VRM.

Lit diagnostics panel LED: MEMORY
Cause: A memory error occurred.
Action:
1. Check the DIMM Error LEDs on the memory board.
2. Replace the DIMM indicated by the lit DIMM Error LED.

Lit diagnostics panel LED: PCI BUS A
Cause: An error occurred on PCI bus A. An adapter in PCI
slot 1 or 2 or the system board caused the error.
Action:
1. Check the error log for additional information. If the
error log indicates a problem with the integrated Ethernet
controller, replace the system board.
2. If you cannot isolate the failing adapter from the
information in the error log, try to determine the failing
adapter by removing one adapter at a time from PCI bus A
(PCI slot 1 and 2) and restarting the server after each
adapter is removed.

Lit diagnostics panel LED: PCI BUS B
Cause: An error occurred on PCI bus B. An adapter in PCI
slot 3, 4, or 5 or the system board caused the error.
Action:
1. Check the error log for additional information.
2. If you cannot correct the problem from the information
in the error log, try to determine the failing adapter by
removing one adapter at a time from PCI bus B (PCI
slots 3–5) and restarting the server after each adapter
is removed.

Lit diagnostics panel LED: HDD
Cause: A hot-swap hard disk drive has failed on bus 1.
Action:
1. Check the error log for additional information. If the
error log indicates a temperature problem and the fans
are working correctly, go to “General checkout” on
page 6.
2. If the amber Hard Disk Status LED on one of the
hot-swap hard disk drives is on, refer to the
“ServeRAID Information” section of this Server Library
for more information.

Lit diagnostics panel LED: NMI
Cause: A nonmaskable interrupt occurred.
Action:
1. If the PCI BUS A or PCI BUS B LED is on, follow the
instructions for those LEDs.
2. If the PCI BUS A or PCI BUS B LED is not on, restart
the server.

Lit diagnostics panel LED: SMI
Cause: A systems management event occurred.
Action: Restart the server.

Lit diagnostics panel LED: SERVICE PROCESSOR BUS
Cause: An error has occurred on the service processor bus.
Action: Disconnect all power from the server for 30 seconds.
Reconnect the power to the server; then, restart the server.

Lit diagnostics panel LED: POWER SUPPLY 1
Cause: Power supply 1 has failed.
Action: Replace power supply 1.

Lit diagnostics panel LED: POWER SUPPLY 2
Cause: Power supply 2 has failed.
Action: Replace power supply 2.

Lit diagnostics panel LED: POWER SUPPLY 3
Cause: Power supply 3 has failed.
Action: Replace power supply 3.

Lit diagnostics panel LED: POWER SUPPLY NON REDUNDANT
Cause: Power supply redundancy has been lost.
Action:
1. If one of the power supply LEDs is on, replace the
indicated power supply.
2. Install an additional power supply to regain
redundancy.

Lit diagnostics panel LED: FAN 1
Cause: Fan 1 has failed or is operating too slowly.
Notes:
1. An LED on the failing fan assembly will also be on.
2. A failing fan can also cause the TEMPERATURE and
HDD LEDs to be on.
Action: Replace fan 1.

Lit diagnostics panel LED: FAN 2
Cause: Fan 2 has failed or is operating too slowly.
Notes:
1. An LED on the failing fan assembly will also be on.
2. A failing fan can also cause the TEMPERATURE and
HDD LEDs to be on.
Action: Replace fan 2.

Lit diagnostics panel LED: FAN 3
Cause: Fan 3 has failed or is operating too slowly.
Notes:
1. An LED on the failing fan assembly will also be on.
2. A failing fan can also cause the TEMPERATURE and
HDD LEDs to be on.
Action: Replace fan 3.

Lit diagnostics panel LED: TEMPERATURE
Cause: The system temperature has exceeded a threshold level.
Action:
1. Check to see if a fan has failed. If it has, replace the
fan.
2. Make sure the room temperature is not too hot. (See
“Specifications” on page 55.)
If the problem persists, go to “General checkout” on
page 6.

System Error LED (information LED panel): Off

Lit diagnostics panel LED: None
Cause: The light path diagnostics have not detected a system
error.
Action: None
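The light path table lends itself to a simple lookup. The sketch below is an illustration only and is not part of any IBM tool. The cause summaries are paraphrased from Table 1, and the names CAUSES and triage are hypothetical. The triage function applies the table note that a failing fan can also light the TEMPERATURE and HDD LEDs, so those two are treated as possibly secondary whenever a fan LED is lit.

```python
# Cause summaries paraphrased from Table 1; keys are the
# diagnostics panel LED labels.
CAUSES = {
    "CPU": "microprocessor failed or installed in the wrong connector",
    "VRM": "voltage regulator module on the processor board failed",
    "MEMORY": "memory error",
    "PCI BUS A": "error on PCI bus A (slot 1 or 2, or system board)",
    "PCI BUS B": "error on PCI bus B (slot 3, 4, or 5, or system board)",
    "HDD": "hot-swap hard disk drive failed",
    "NMI": "nonmaskable interrupt",
    "SMI": "systems management event",
    "SERVICE PROCESSOR BUS": "error on the service processor bus",
    "POWER SUPPLY NON REDUNDANT": "power supply redundancy lost",
    "TEMPERATURE": "system temperature exceeded a threshold level",
}
for n in (1, 2, 3):
    CAUSES[f"POWER SUPPLY {n}"] = f"power supply {n} failed"
    CAUSES[f"FAN {n}"] = f"fan {n} failed or is operating too slowly"


def triage(lit_leds):
    """Split lit LEDs into (primary, possibly_secondary) lists.

    When any fan LED is lit, TEMPERATURE and HDD may be side effects
    of the failing fan, so they are reported as possibly secondary.
    """
    lit = set(lit_leds)
    fan_lit = any(led.startswith("FAN") for led in lit)
    secondary = sorted(lit & {"TEMPERATURE", "HDD"}) if fan_lit else []
    primary = sorted(lit - set(secondary))
    return primary, secondary


if __name__ == "__main__":
    primary, secondary = triage(["HDD", "FAN 2", "TEMPERATURE"])
    for led in primary:
        print(led, "->", CAUSES[led])
    print("possibly secondary:", secondary)
```

Addressing the primary LEDs first follows the same principle the manual applies to error messages: correct the first cause, then check whether the remaining indications clear.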
Error messages
Error messages indicate that a problem exists.
Hardware error messages that occur can be text, numeric,
or both. Messages generated by your software generally
are text messages, but they also can be numeric.
POST error messages: POST error messages
occur during startup when POST finds a problem with the
hardware or detects a change in the hardware
configuration. For more information, see
“Symptom-to-FRU index” on page 194.
Diagnostic error messages: Diagnostic error
messages occur when a test finds a problem with the
server hardware. These error messages are alphanumeric
and they are saved in the Test Log. For more information,
see “Error symptoms” on page 208.
Software-generated error messages: These
messages occur if a problem or conflict is found by an
application program, the operating system, or both.
Messages are generally text messages, but they also can
be numeric. For information about these error messages,
refer to the documentation that comes with your software.
System error log
The system error log contains all error and warning
messages issued during POST and all system status
messages from the Netfinity Advanced System
Management Processor. See “System Event/Error Log” on
page 36 for information about how to view the system
error log.
Option diskettes
An optional device or adapter might come with an Option
Diskette. Option Diskettes usually contain option-specific
diagnostic test programs or configuration files.
If your optional device or adapter comes with an Option
Diskette, follow the instructions that come with the option.
Different instructions apply depending on whether the
Option Diskette is startable or not.
Recovering BIOS
If your BIOS has become corrupted, such as from a power
failure during a flash update, you can recover your BIOS
using the recovery boot block and a BIOS flash diskette.
Note
You can obtain a BIOS flash diskette from one of the
following sources:
Use the ServerGuide program to make a BIOS
flash diskette.
Download a BIOS flash diskette from the World
Wide Web. Go to
http://www.pc.ibm.com/support/, select IBM
Server Support, and make the selections for your
server.
The flash memory of your server contains a protected area
that cannot be overwritten. The recovery boot block is a
section of code in this protected area that enables the
server to start up and to read a flash diskette. The flash
utility recovers the system BIOS from the BIOS recovery
files on the diskette.
To recover the BIOS:
Before you begin:
Read “Safety information” on page 230.
1. Turn off the server and peripheral devices and
disconnect all external cables and power cords (see
“Preparing to install options” on page 100); then
remove the cover (see “Removing the left-side cover
(tower model)” on page 103 or “Removing the cover
(rack model)” on page 104).
2. Locate switch block 2 (SW2) on the system board
(see “System board component locations” on
page 166).
3. Set switch 1 on switch block 2 to ON to enable BIOS
recovery mode.
4. Insert the BIOS flash diskette into the diskette drive.
5. Restart the server.
The Recovery Boot screen appears. A progress
report, Loading data from diskette xx%, is
displayed. When programming is underway, a further
progress report, Programming block n of 7 yy%, is
displayed. When recovery is complete, the message
Recovery complete, remove the diskette and return
boot block switch to the off position before
rebooting is displayed.
6. Remove the flash diskette from the diskette drive.
7. Turn the server off.
8. Set switch 1 on switch block 2 (SW2) to Off to return
to normal startup mode.
9. Restart the server. The system should start up
normally.
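If the recovery screen messages are transcribed into a service record, they can be classified with a short parser. The sketch below is an illustration only. It assumes the messages read exactly as quoted in step 5, with single spaces between words, which might not match the actual screen layout. The function name and the returned tuples are hypothetical.

```python
import re

# Patterns follow the message wording quoted in step 5.
# The exact on-screen spacing is an assumption.
LOADING = re.compile(r"Loading data from diskette\s+(\d+)%")
PROGRAMMING = re.compile(r"Programming block\s+(\d+) of 7\s+(\d+)%")


def parse_recovery_line(line):
    """Classify one transcribed Recovery Boot screen message.

    Returns ("loading", percent), ("programming", block, percent),
    ("complete",), or None if the line is not recognized.
    """
    match = LOADING.search(line)
    if match:
        return ("loading", int(match.group(1)))
    match = PROGRAMMING.search(line)
    if match:
        return ("programming", int(match.group(1)), int(match.group(2)))
    if line.strip().startswith("Recovery complete"):
        return ("complete",)
    return None


if __name__ == "__main__":
    print(parse_recovery_line("Programming block 3 of 7 50%"))
```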
Features
The following table summarizes the features of the
Netfinity 5600.
Microprocessor
Intel Pentium III microprocessor with MMX
technology and SIMD extensions
32 KB of level-1 cache
256 KB of level-2 cache (min.)
Expandable to two microprocessors