IBM 8686, Eserver xSeries 360 Type 8686 Hardware Maintenance Manual

Hardware Maintenance Manual
xSeries 360 Type 8686
Note:
Before using this information and the product it supports, be sure to read the general information under “Notices” on page 178.
Eighth Edition August 2004
The following paragraph does not apply to the United Kingdom or any country where such provisions are inconsistent with local law:
INTERNATIONAL BUSINESS MACHINES CORPORATION PROVIDES THIS PUBLICATION “AS IS” WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESS OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. Some states do not allow disclaimer of express or implied warranties in certain transactions, therefore, this statement may not apply to you.
This publication could include technical inaccuracies or typographical errors. Changes are periodically made to the information herein; these changes will be incorporated in new editions of the publication. IBM may make improvements and/or changes in the product(s) and/or the program(s) described in this publication at any time.
This publication was developed for products and services offered in the United States of America. IBM may not offer the products, services, or features discussed in this document in other countries, and the information is subject to change without notice. Consult your local IBM representative for information on the products, services, and features available in your area.
Requests for technical information about IBM products should be made to your IBM reseller or IBM marketing representative.
© Copyright International Business Machines Corporation 2003. All rights reserved.
US Government Users Restricted Rights - Use, duplication or disclosure restricted by GSA ADP Schedule Contract with IBM Corp.
About this manual
This manual contains diagnostic information, a Symptom-to-FRU index, service information, error codes, error messages, and configuration information for the IBM® Eserver xSeries® 360, Type 8686.
Important: This manual is intended for trained servicers who are familiar with IBM xSeries products.
Important safety information
Be sure to read all caution and danger statements in this book before performing any of the instructions.
Leia todas as instruções de cuidado e perigo antes de executar qualquer operação.
Prenez connaissance de toutes les consignes de type Attention et Danger avant de procéder aux opérations décrites par les instructions.
Lesen Sie alle Sicherheitshinweise, bevor Sie eine Anweisung ausführen.
Accertarsi di leggere tutti gli avvisi di attenzione e di pericolo prima di effettuare qualsiasi operazione.
Lea atentamente todas las declaraciones de precaución y peligro ante de llevar a cabo cualquier operación.
WARNING: Handling the cord on this product or cords associated with accessories sold with this product, will expose you to lead, a chemical known to the State of California to cause cancer, and birth defects or other reproductive harm. Wash hands after handling.
ADVERTENCIA: El contacto con el cable de este producto o con cables de accesorios que se venden junto con este producto, pueden exponerle al plomo, un elemento químico que en el estado de California de los Estados Unidos está considerado como un causante de cancer y de defectos congénitos, además de otros riesgos reproductivos. Lávese las manos después de usar el producto.
Online support
You can download the most current diagnostic, BIOS flash, and device driver files from http://www.ibm.com/pc/support on the World Wide Web.
Copyright IBM Corp. 2003
Contents
About this manual . . . . . . . . . . . . . . . . . . . . . . . iii
Important safety information . . . . . . . . . . . . . . . . . . . . iii
Online support . . . . . . . . . . . . . . . . . . . . . . . . . iii
General checkout . . . . . . . . . . . . . . . . . . . . . . . .1
Features overview . . . . . . . . . . . . . . . . . . . . . . . .3
Features and specifications . . . . . . . . . . . . . . . . . . . . .4
Notices and statements used in this book . . . . . . . . . . . . . . .4
What the xSeries 360 offers . . . . . . . . . . . . . . . . . . . .5
Reliability, availability, and serviceability . . . . . . . . . . . . . . . .7
Controls and indicators . . . . . . . . . . . . . . . . . . . . . .8
Operator information panel . . . . . . . . . . . . . . . . . . . . .9
Turning on the server . . . . . . . . . . . . . . . . . . . . .10
Turning off the server . . . . . . . . . . . . . . . . . . . . .11
Standby mode . . . . . . . . . . . . . . . . . . . . . . . .12
Diagnostics . . . . . . . . . . . . . . . . . . . . . . . . . .13
Diagnostic tools overview . . . . . . . . . . . . . . . . . . . . .13
POST . . . . . . . . . . . . . . . . . . . . . . . . . . . .14
Error logs . . . . . . . . . . . . . . . . . . . . . . . . . . .14
Small computer systems interface (SCSI) messages . . . . . . . . . . .14
Diagnostic programs and error messages . . . . . . . . . . . . . . .15
Starting the diagnostic programs . . . . . . . . . . . . . . . . .15
Error code format . . . . . . . . . . . . . . . . . . . . . . .16
Text message format . . . . . . . . . . . . . . . . . . . . . .16
Viewing the test log . . . . . . . . . . . . . . . . . . . . . .17
Viewing the System Event/Error log . . . . . . . . . . . . . . . .17
Diagnostic error messages . . . . . . . . . . . . . . . . . . .17
Recovering the BIOS code . . . . . . . . . . . . . . . . . . . .18
Power supply LEDs . . . . . . . . . . . . . . . . . . . . . . .19
Level 2 Light Path Diagnostics . . . . . . . . . . . . . . . . . . .19
Level 2 diagnostic panel LEDs . . . . . . . . . . . . . . . . . .20
Level 2 Light Path troubleshooting . . . . . . . . . . . . . . . . .21
Power checkout . . . . . . . . . . . . . . . . . . . . . . . .22
Error symptoms . . . . . . . . . . . . . . . . . . . . . . . .22
Ethernet controller troubleshooting . . . . . . . . . . . . . . . . . .22
Battery replacement . . . . . . . . . . . . . . . . . . . . . . .23
Configuration . . . . . . . . . . . . . . . . . . . . . . . . .27
Using the Configuration/Setup Utility . . . . . . . . . . . . . . . . .28
Starting the Configuration/Setup Utility . . . . . . . . . . . . . . .28
Configuration/Setup Utility main menu . . . . . . . . . . . . . . .28
Using passwords . . . . . . . . . . . . . . . . . . . . . . . .32
Power-on password . . . . . . . . . . . . . . . . . . . . . .32
Administrator password . . . . . . . . . . . . . . . . . . . . .33
Remote-control security settings . . . . . . . . . . . . . . . . .34
Using the SCSISelect utility program . . . . . . . . . . . . . . . . .34
Starting the SCSISelect utility program . . . . . . . . . . . . . . .34
SCSISelect menu . . . . . . . . . . . . . . . . . . . . . . .35
PXE Boot Agent Utility program . . . . . . . . . . . . . . . . . . .36
Starting the PXE Boot Agent Utility program . . . . . . . . . . . . .36
PXE Boot Agent Utility menu . . . . . . . . . . . . . . . . . . .36
ServeRAID configuration program . . . . . . . . . . . . . . . . . .37
Configuring the Ethernet controller . . . . . . . . . . . . . . . . . .37
Remote Supervisor Adapter . . . . . . . . . . . . . . . . . . . .38
Remote Supervisor Adapter features . . . . . . . . . . . . . . . .38
Setup requirements . . . . . . . . . . . . . . . . . . . . . .38
Using the documentation . . . . . . . . . . . . . . . . . . . .39
Cabling and configuring the Remote Supervisor Adapter . . . . . . . . .39
Configuring the adapter . . . . . . . . . . . . . . . . . . . . .41
Using the ASM interconnect network . . . . . . . . . . . . . . . .50
Using the ServerGuide CDs . . . . . . . . . . . . . . . . . . .57
Features at a glance . . . . . . . . . . . . . . . . . . . . . . .58
Setup and configuration overview . . . . . . . . . . . . . . . . . .58
System partition . . . . . . . . . . . . . . . . . . . . . . . .60
Typical NOS Installation . . . . . . . . . . . . . . . . . . . . .60
Setting up or updating multiple servers . . . . . . . . . . . . . . . .60
Installing the NOS without ServerGuide . . . . . . . . . . . . . . . .61
Additional programs included with ServerGuide . . . . . . . . . . . . .61
Installing components . . . . . . . . . . . . . . . . . . . . .63
Major components of the xSeries 360 server . . . . . . . . . . . . . .64
Internal connector and LED locations . . . . . . . . . . . . . . . . .65
System board locations . . . . . . . . . . . . . . . . . . . . .65
Level 2 diagnostic and system board LEDs . . . . . . . . . . . . .68
Memory board locations . . . . . . . . . . . . . . . . . . . .70
Remote Supervisor Adapter locations . . . . . . . . . . . . . . .71
Before you begin . . . . . . . . . . . . . . . . . . . . . . . .71
System reliability considerations . . . . . . . . . . . . . . . . .71
Working inside the server with the power on . . . . . . . . . . . . .72
Internal component locations . . . . . . . . . . . . . . . . . . . .72
Removing the server top cover and bezel . . . . . . . . . . . . . .72
Working with adapters . . . . . . . . . . . . . . . . . . . . .73
Internal drive installation . . . . . . . . . . . . . . . . . . . .79
Memory modules . . . . . . . . . . . . . . . . . . . . . . .84
Microprocessor installation . . . . . . . . . . . . . . . . . . .86
VRM replacement . . . . . . . . . . . . . . . . . . . . . . .89
Hot-swap power supply installation . . . . . . . . . . . . . . . .90
Hot-swap fan replacement . . . . . . . . . . . . . . . . . . . .92
Installing the serial cable . . . . . . . . . . . . . . . . . . . .93
Locking PCI slot 6 . . . . . . . . . . . . . . . . . . . . . . .94
Enabling the serial port . . . . . . . . . . . . . . . . . . . . .95
Completing the installation . . . . . . . . . . . . . . . . . . . .95
Connecting external options . . . . . . . . . . . . . . . . . . . .97
Input/output ports . . . . . . . . . . . . . . . . . . . . . . .97
Cabling the server . . . . . . . . . . . . . . . . . . . . . . 103
FRU information (service only) . . . . . . . . . . . . . . . . . . 105
Hot-swap drive backplane . . . . . . . . . . . . . . . . . . . . 105
Light path diagnostic card . . . . . . . . . . . . . . . . . . . . 106
Operator panel card . . . . . . . . . . . . . . . . . . . . . . 108
PCI adapter switch card . . . . . . . . . . . . . . . . . . . . . 109
Power supply backplane . . . . . . . . . . . . . . . . . . . . .110
System board/shuttle . . . . . . . . . . . . . . . . . . . . . .110
Replacing the serial port cable . . . . . . . . . . . . . . . . . . .113
Symptom-to-FRU index . . . . . . . . . . . . . . . . . . . . .115
Beep symptoms . . . . . . . . . . . . . . . . . . . . . . . .115
No beep symptoms . . . . . . . . . . . . . . . . . . . . . . .117
Level 2 light path LED errors . . . . . . . . . . . . . . . . . . .118
Diagnostic error codes . . . . . . . . . . . . . . . . . . . . . 120
Error symptoms . . . . . . . . . . . . . . . . . . . . . . . . 127
ServerGuide error symptoms . . . . . . . . . . . . . . . . . . . 131
SMI Handler messages . . . . . . . . . . . . . . . . . . . . . 132
Power supply LED errors . . . . . . . . . . . . . . . . . . . . . 133
POST error codes . . . . . . . . . . . . . . . . . . . . . . . 134
ServeRAID . . . . . . . . . . . . . . . . . . . . . . . . . . 139
SCSI error codes . . . . . . . . . . . . . . . . . . . . . . . 139
Temperature error messages . . . . . . . . . . . . . . . . . . . 139
Fan error messages . . . . . . . . . . . . . . . . . . . . . . 140
Power error messages . . . . . . . . . . . . . . . . . . . . . 140
System shutdown . . . . . . . . . . . . . . . . . . . . . . . 141
Voltage related system shutdown . . . . . . . . . . . . . . . . . 141
Temperature related system shutdown . . . . . . . . . . . . . . . 141
DASD checkout . . . . . . . . . . . . . . . . . . . . . . . . 141
Processor checkout . . . . . . . . . . . . . . . . . . . . . . . 142
I2C bus fault messages . . . . . . . . . . . . . . . . . . . . . 142
Undetermined problems . . . . . . . . . . . . . . . . . . . . . 143
Related service information . . . . . . . . . . . . . . . . . . . 145
Safety information . . . . . . . . . . . . . . . . . . . . . . . 145
General safety . . . . . . . . . . . . . . . . . . . . . . . 145
Electrical safety . . . . . . . . . . . . . . . . . . . . . . . 146
Safety inspection guide . . . . . . . . . . . . . . . . . . . . 147
Handling static-sensitive devices . . . . . . . . . . . . . . . . . 148
Grounding requirements . . . . . . . . . . . . . . . . . . . . 148
Safety notices (multilingual translations) . . . . . . . . . . . . . . 149
Problem determination tips . . . . . . . . . . . . . . . . . . . . 178
Notices . . . . . . . . . . . . . . . . . . . . . . . . . . . 178
Trademarks . . . . . . . . . . . . . . . . . . . . . . . . . . 179
Parts . . . . . . . . . . . . . . . . . . . . . . . . . . . . 181
System . . . . . . . . . . . . . . . . . . . . . . . . . . . 183
Keyboards . . . . . . . . . . . . . . . . . . . . . . . . . . 184
General checkout
The server diagnostic programs are stored in upgradable read-only memory (ROM) on the system board. These programs are the primary method of testing the major components of a server. Major components that can be tested include the system board, Ethernet controller, video controller, RAM, keyboard, mouse (pointing device), diskette drive, serial ports, hard drives, and parallel port. Not all servers include all of these items; your hardware might differ. You can also use the diagnostic programs to test some external devices. See “Diagnostic programs and error messages” on page 15.
If you cannot determine whether a problem is caused by the hardware or by the software, you can run the diagnostic programs to confirm that the hardware is working properly.
When you run the diagnostic programs, a single problem might cause several error messages. When this occurs, work to correct the cause of the first error message. After the cause of the first error message is corrected, the other error messages might not occur the next time you run the test.
A failed server might be part of a shared DASD cluster (two or more servers sharing the same external storage device(s)). Prior to running diagnostics, verify that the failing server is not part of a shared DASD cluster.
A server might be part of a cluster if:
- The customer identifies the server as part of a cluster.
- One or more external storage units are attached to the server and at least one of the attached storage units is additionally attached to another server or unidentifiable source.
- One or more servers are located near the failing server.
If the failing server is suspected to be part of a shared DASD cluster, all diagnostic tests can be run except diagnostic tests which test the storage unit (DASD residing in the storage unit) or the storage adapter attached to the storage unit.
Notes:
1. For servers that are part of a shared DASD cluster, run one test at a time in looped mode. Do not run all tests in looped mode, as this could enable the DASD diagnostic tests.
2. If multiple error codes are displayed, diagnose the first error code displayed.
3. If the server stops working with a POST error, go to “Symptom-to-FRU index” on page 115.
4. If the server stops working and no error is displayed, go to “Undetermined problems” on page 143.
5. For information about power supply problems, see “Power supply LED errors” on page 133.
6. For safety information, see “Safety information” on page 145.
7. For intermittent problems, check the error log; see “Error logs” on page 14.
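The cluster indicators and the test restriction described above can be sketched in a few lines of Python. This is only an illustration for planning a service call, not part of the diagnostic programs; the class, function, and test names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ClusterHints:
    """Observations a servicer can make before running diagnostics."""
    customer_says_cluster: bool = False    # customer identifies the server as clustered
    shared_external_storage: bool = False  # an attached storage unit also goes elsewhere
    other_servers_nearby: bool = False     # one or more servers located near this one


def might_be_in_cluster(hints: ClusterHints) -> bool:
    """Any single indicator is enough to suspect a shared DASD cluster."""
    return (hints.customer_says_cluster
            or hints.shared_external_storage
            or hints.other_servers_nearby)


def allowed_tests(all_tests, storage_tests, suspected_cluster):
    """Drop storage-unit and storage-adapter tests when a cluster is suspected."""
    if not suspected_cluster:
        return list(all_tests)
    return [t for t in all_tests if t not in storage_tests]


# Hypothetical test names, used only to show the filtering.
tests = ["memory", "microprocessor", "external DASD", "storage adapter"]
storage = {"external DASD", "storage adapter"}
hints = ClusterHints(other_servers_nearby=True)
print(allowed_tests(tests, storage, might_be_in_cluster(hints)))
```

The remaining tests would still be run one at a time in looped mode, as note 1 requires.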
001 IS THE SERVER PART OF A CLUSTER?
YES. Schedule maintenance with the customer. Shut down all servers related to the cluster. Run storage test.
NO. Go to step 002.
002 IF THE SERVER IS NOT PART OF A CLUSTER:
- Turn off the server and all external devices.
- Check all cables and power cords.
- Set all display controls to the middle position.
- Turn on all external devices.
- Turn on the server.
- Record any POST error messages displayed on the screen. If an error is displayed, look up the first error in the “POST error codes” on page 134.
- Check the diagnostic LED panel system error LED; if on, see “Level 2 Light Path Diagnostics” on page 19.
- Check the System Error Log. If the server recorded an error, see “Symptom-to-FRU index” on page 115.
- Start the diagnostic programs. See “Diagnostic programs and error messages” on page 15.
- Check for the following responses:
  1. One beep.
  2. Readable instructions or the Main Menu.
003 DID YOU RECEIVE BOTH OF THE CORRECT RESPONSES?
NO. Find the failure symptom in “Symptom-to-FRU index” on page 115.
YES. Run the diagnostic programs. If necessary, see “Diagnostic programs and error messages” on page 15.
If you receive an error, go to “Symptom-to-FRU index” on page 115.
If the diagnostics completed successfully and you still suspect a problem, see “Undetermined problems” on page 143.
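The decision points in steps 001 to 003 can be summarized as a small function. The sketch below is a reading aid only, assuming the outcomes are reduced to simple yes/no inputs; the function and parameter names are hypothetical, and the returned strings simply point back to the sections named in the procedure.

```python
def general_checkout(in_cluster, one_beep, readable_menu, diagnostics_error=False):
    """Return the next action suggested by steps 001 to 003."""
    # Step 001: a clustered server needs scheduled maintenance first.
    if in_cluster:
        return ("Schedule maintenance, shut down all servers related "
                "to the cluster, run storage test")
    # Steps 002 and 003: both responses (one beep, readable instructions
    # or the Main Menu) must be present after starting the diagnostics.
    if not (one_beep and readable_menu):
        return "Symptom-to-FRU index (page 115)"
    # Step 003, YES branch: run the diagnostic programs and act on the result.
    if diagnostics_error:
        return "Symptom-to-FRU index (page 115)"
    return "Undetermined problems (page 143) if a problem is still suspected"


print(general_checkout(in_cluster=False, one_beep=True, readable_menu=False))
```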
Features overview
The IBM Eserver xSeries 360 server, which features IBM X-Architecture technology, is a high-performance rack model server that can be upgraded to a symmetric multiprocessing (SMP) server. It is ideally suited for networking environments that require superior microprocessor performance, efficient memory management, flexibility, and large amounts of reliable data storage.
Performance, ease of use, reliability, and expansion capabilities are key features of the server. These design features make it possible for you to customize the hardware to meet your needs today, while providing flexible expansion capabilities for the future.
If you have access to the World Wide Web, you can obtain up-to-date information about the server model and other IBM server products at http://www.ibm.com/eserver/xseries/.
The machine type, model number, and serial number are on the ID labels, one located on the bezel to the left of the CD-ROM drive and the other at the rear of the right side, as shown in the following illustration.
Note: The illustrations in this document might differ slightly from your hardware.
ID labels
Features and specifications
The following table provides a summary of the features and specifications for the xSeries 360 server.
Table 1. Features and Specifications

Microprocessor:
- Intel® Xeon Processor MP (frequency varies with server model; each MP processor functions as two logical processors)
- Minimum: two
- Maximum: four
- Level-3 cache size varies with model
- 100 MHz front-side bus (FSB)

Chipset:
- IBM XA-32 Chipset with integrated memory, I/O controller, and remote I/O controller.

Memory:
- DDR registered SDRAM DIMMs with ECC and Chipkill™ memory
- Minimum: 512 MB
- Maximum: 8 GB
- Slots: Interleaved, 8 slots

Drives standard:
- Slim diskette: 1.44 MB
- Slim CD-ROM: 24X-10X IDE

Hard disk drives:
- Slim-high hot-swap drives (number of drives and drive capacity vary with model)
- Maximum: Three

Active PCI expansion slots:
- Two 100 MHz/64-bit PCI-X
- Four 66 MHz/64-bit PCI-X

Hot-swap power supplies:
- 370 W (115-230 V ac)
- Minimum: One or two power supplies are shipped with server, depending on model
- Maximum: Three

Hot-swap cooling:
- Six hot-swap fans

Video:
- S3 video controller
- Compatible with SVGA 4
- 8 MB SDRAM video memory

Size (3 U):
- Height: 134 mm (5.28 in.)
- Depth: 720 mm (28.35 in.)
- Width: 440 mm (17.32 in.)
- Weight: 25 kg (55 lb) to 31.7 kg (70 lb) depending upon configuration

Integrated functions:
- Light Path Diagnostics
- Ultra160 SCSI controller
- One 10BASE-T/100BASE-TX Intel Ethernet controller
- Remote Supervisor Adapter (service processor)
- ASM interconnect (peer-to-peer) port
- RXE Expansion port
- RXE management port
- Ethernet port
- Serial port
- Three universal serial bus ports
- Keyboard port
- Mouse port
- Video port
- Serial port (uses the serial cable that comes with the server to provide a serial connector on the rear of the server in the position usually occupied by an adapter in PCI slot 6)

Acoustical noise emissions:
- Sound power, idling: 6.3 bel maximum
- Sound power, operating: 6.3 bel maximum
- Sound pressure, operating: 47 dBa maximum

Environment:
- Air temperature:
  - Server on: 10° to 35°C (50° to 95°F). Altitude: 0 to 914 m (3000 ft)
  - Server on: 10° to 32°C (50° to 89.6°F). Altitude: 914 m (3000 ft) to 2133 m (7000 ft)
  - Server off: 10° to 43°C (50° to 110°F). Maximum altitude: 2133 m (7000 ft)
- Humidity:
  - Server on: 8% to 80%
  - Server off: 8% to 80%

Heat output:
Approximate heat output in British thermal units (Btu) per hour
- Minimum configuration: 1232 Btu (0.36 kilowatts)
- Maximum configuration: 3566 Btu (1.045 kilowatts)

Electrical input:
- Sine-wave input (50-60 Hz) required
- Input voltage low range:
  - Minimum: 100 V ac
  - Maximum: 127 V ac
- Input voltage high range:
  - Minimum: 200 V ac
  - Maximum: 240 V ac
- Input kilovolt-amperes (kVA) approximately:
  - Minimum: 0.08 kVA (ac power connected, server off)
  - Minimum: 0.38 kVA (dc power on, server idle)
  - Maximum: 1.1 kVA
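The heat output figures are given both in Btu per hour and in kilowatts. As a quick cross-check when sizing cooling for a rack, the conversion can be sketched as follows; the constant is the standard approximation of 3.412142 Btu/h per watt, and the function name is illustrative only.

```python
BTU_PER_HOUR_PER_WATT = 3.412142  # 1 W is approximately 3.412142 Btu/h


def btu_per_hour_to_kw(btu_per_hour):
    """Convert a heat output in Btu/h to kilowatts."""
    return btu_per_hour / BTU_PER_HOUR_PER_WATT / 1000.0


# Values taken from the heat output rows of the table.
for label, btu in (("minimum", 1232), ("maximum", 3566)):
    print(f"{label} configuration: {btu} Btu/h = {btu_per_hour_to_kw(btu):.3f} kW")
```

Both results agree with the kilowatt values in the table to within rounding.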
Notices and statements used in this book
The caution and danger statements used in this book also appear in the multilingual Safety Information book, provided on the Documentation CD. Each caution and danger statement is numbered for easy reference to the corresponding statements in the safety book.
The following types of notices and statements are used in this book:
- Note: These notices provide important tips, guidance, or advice.
- Important: These notices provide information or advice that might help you avoid inconvenient or problem situations.
- Attention: These notices indicate possible damage to programs, devices, or data. An attention notice is placed just before the instruction or situation in which damage could occur.
- Caution: These statements indicate situations that can be potentially hazardous to you. A caution statement is placed just before the description of a potentially hazardous procedure step or situation.
- Danger: These statements indicate situations that can be potentially lethal or extremely hazardous to you. A danger statement is placed just before the description of a potentially lethal or extremely hazardous procedure step or situation.
What the xSeries 360 offers
The design of the server takes advantage of advancements in symmetric multiprocessing (SMP), data storage, disk-array technologies, and memory management. The server combines:
- X-Architecture technology
  X-Architecture technology is an IBM design blueprint that takes full advantage of existing IBM technologies to build powerful, scalable, and reliable Intel processor-based servers.
  For more information about X-Architecture technology, go to the Web at http://www.pc.ibm.com/us/eserver/xseries/xarchitecture/index.html.
- Impressive performance using an innovative approach to SMP
  The server supports up to four Intel Xeon Processor MP microprocessors. The server comes with at least one microprocessor installed; you can install additional processors to enhance performance and provide SMP capability.
- Large data-storage and hot-swap capabilities
  The xSeries 360 server supports up to three 26 mm (1-inch) slim-high 3.5-inch hot-swap hard disk drives in the hot-swap bays. This hot-swap feature enables you to remove and replace hard disk drives without turning off the server.
- Active PCI-X (hot-plug) adapter capabilities
  The server has six hot-plug PCI-X slots for PCI or PCI-X adapters. With operating-system support, you can replace failing hot-plug PCI/PCI-X adapters without turning off the server. If the hot-add feature is supported by both the operating system and a PCI/PCI-X adapter, you can also add such PCI/PCI-X adapters in these slots without turning off the server.
- Redundant hot-swap cooling and redundant hot-swap power capabilities
  The fans in the server can operate at variable speeds so that if one fan fails, the remaining fans increase in speed to enable continued operation until the failing fan is replaced. You can replace a failing fan without turning off the server.
  The server supports up to three 370-watt power supplies. Multiple hot-swap 370-watt power supplies ensure redundancy and hot-swap capability for a typical configuration.
- Large system memory
  The memory bus in the server supports up to 8 GB of system memory. The memory controller provides error correcting code (ECC) support for up to eight industry-standard, 2.5 V, 168-pin, 16-byte, PC1600 registered DDR SDRAM dual inline memory modules (DIMMs). The memory controller also provides Chipkill Memory protection. Chipkill Memory protection is a technology that protects the server from a single chip failure on a DIMM.
- Light Path Diagnostics
  Light Path Diagnostics provide LEDs to assist in isolating problems with the server. An LED on the operator information panel is lit if an unusual condition or problem occurs. If this happens, you can look at the LEDs visible through a special cutout in the cover and isolate the cause. Under some circumstances, you might need to remove the cover and view LEDs on the system board to pinpoint the source of a problem.
- Systems-management capabilities
  The server comes with a Remote Supervisor Adapter installed. This adapter, in conjunction with the systems-management software provided with the server, enables you to manage the functions of the server both locally and remotely. The Remote Supervisor Adapter also provides system monitoring, event recording, and dial-out alert capability.
- Integrated network environment support
  The server comes with an Ethernet controller integrated into the system board. This Ethernet controller has an interface for connecting to 10-Mbps or 100-Mbps networks. The server automatically selects between 10BASE-T and 100BASE-TX. The controller provides full-duplex (FDX) capability, which enables simultaneous transmission and reception of data on the Ethernet local area network (LAN).
- Remote expansion enclosure
  The addition of an optional, IBM RXE-100 Remote Expansion Enclosure provides up to twelve additional hot-plug PCI/PCI-X slots for PCI or PCI-X adapters. The enclosure can be managed through the Remote Supervisor Adapter, enabling remote operation over a network or a modem.
- Redundant network adapter
  The addition of an optional, redundant network adapter provides a failover capability to a secondary Ethernet connection. If a problem occurs with the primary Ethernet connection, all Ethernet traffic associated with this primary connection is automatically switched to the redundant network adapter. If the appropriate device drivers are installed, this switching occurs without data loss and without user intervention.
- IBM ServerGuide CDs
  The ServerGuide CDs that are included with the server provide programs to help you set up the server and install the network operating system (NOS). The ServerGuide program detects the installed hardware options and provides the correct configuration programs and device drivers. In addition, the ServerGuide CDs include a variety of application programs for the server.
  Note: The latest level of basic input/output system (BIOS) code for the server is available through the World Wide Web. See “Recovering the BIOS code” on page 18 for more information.
  For more information about the ServerGuide CDs, see “Using the ServerGuide CDs” on page 57.
The server is designed to be cost-effective, powerful, and flexible. It uses peripheral component interconnect-extended (PCI-X) bus architecture to provide compatibility with a wide range of existing hardware devices and software applications.
The server meets stringent worldwide certifications for power, electromagnetic compatibility (EMC), and safety.
Reliability, availability, and serviceability
Three of the most important features in server design are reliability, availability, and serviceability (RAS). These factors help to ensure that the integrity of the data stored on the server is preserved, that the server is available when you want to use it, and that should a failure occur, you can easily diagnose and repair the failure with minimal inconvenience.
The following is an abbreviated list of the RAS features that the server supports:
- ECC memory, L3 cache, and front side buses (FSBs)
- Parity checking on the small computer system interface (SCSI) bus
- Advanced Configuration and Power Interface (ACPI) power management
- Power-on self-test (POST)
- Systems-management monitoring through Inter-Integrated Circuit (I2C) bus
- Light Path Diagnostics
- Automatic error retry and recovery
- Automatic restart after a power failure
- Built-in temperature, fan, and voltage monitoring
- Chipkill Memory protection
- Hot-swap cooling with fan speed-sensing capability
- Hot-swap and redundant power supplies
- Hot-swap hard disk drives
- Hot-plug (Active) PCI-X adapter slots
- Menu-driven setup, system configuration, SCSISelect configuration, and diagnostic programs
- Memory scrubbing and Predictive Failure Analysis® (PFA) (background and real time)
- Microcode and diagnostic program levels available
- System-management software and LANDesk® enabled
- Network adapter failover support
- Remote Supervisor Adapter to enable remote server management
- Remote Connect
- Remote system problem-determination support
- Upgradable POST/BIOS, diagnostics, and Remote Supervisor Adapter microcode
- Wake on LAN® and Alert on LAN™ capability
- Backup BIOS switching by means of a flash ROM page swap jumper
- Error codes and messages
- Microprocessor serial number access
- Cable presence detection for standard internal cables
- Vital product data (VPD) for microprocessors, system and memory boards, memory DIMMs, power supplies, hot-swap-drive backplane, and power supply backplane
Controls and indicators
The following illustration shows the controls and indicators on the front of the server.
Front view callouts: Operator information panel, AC LED, DC LED, Diskette drive, CD-ROM drive, Hard disk drive filler panels, Activity LED (green), Status LED (amber), Hard disk drive, Power supply 1, Power supply 2, Power supply 3
Operator information panel: This panel, also known as the front LED panel, contains controls, indicators, and one of the three USB ports. See “Operator information panel” on page 9 for details.
AC LED: Each hot-swap power supply has an ac power LED to indicate that ac power is available to the power supply.
DC LED: Each hot-swap power supply has a dc power LED to indicate that the power supply is active.
Diskette drive: The server contains one standard 3.5-inch diskette drive with an LED to indicate when it is active.
CD-ROM drive: The server contains one standard IDE CD-ROM drive with an LED to indicate when it is active.
Hard disk drive filler panels: A filler panel is used to cover empty hard disk drive bays. A filler panel must be installed in each empty bay to ensure proper system cooling.
Activity LED: Each hot-swap hard disk drive has an activity LED, which flashes if the drive is being accessed.
Status LED: Each hot-swap hard-disk drive has a status LED. If this amber LED is on continuously, the drive has failed. If an optional IBM ServeRAID adapter is installed in the server and the LED flashes slowly (one flash per second), the drive is being rebuilt. If the LED flashes rapidly (three flashes per second), the controller is identifying the drive.
Hard disk drive: The server supports a maximum of three hot-swap hard disk drives.
Power supplies 1, 2, and 3: The server comes with a minimum of one hot-swap power supply and supports a maximum of three hot-swap power supplies.
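The amber status LED on a hot-swap hard disk drive encodes three conditions. The following sketch restates that description as a lookup, assuming the observed behavior has already been reduced to "on continuously" or a flash rate; the function and parameter names are hypothetical, and anything not covered by the description is reported as such rather than guessed.

```python
def drive_status(on_continuously=False, flashes_per_second=0, serveraid_installed=False):
    """Interpret the amber hard disk drive status LED."""
    if on_continuously:
        return "drive has failed"
    # Slow flashing is described only for servers with a ServeRAID adapter.
    if serveraid_installed and flashes_per_second == 1:
        return "drive is being rebuilt"
    if flashes_per_second == 3:
        return "controller is identifying the drive"
    return "no condition described for this LED state"


print(drive_status(on_continuously=True))
print(drive_status(flashes_per_second=1, serveraid_installed=True))
print(drive_status(flashes_per_second=3))
```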
The following illustration shows the indicators on the rear of the server.
[Illustration: rear view of the server. Callouts: Remote Supervisor Adapter Ethernet port, Ethernet TX/RX LEDs, Ethernet link LEDs, Remote Supervisor Adapter fault LED (amber), Remote Supervisor Adapter power LED (green), Ethernet port.]
Ethernet Link LEDs: These LEDs (one for the Remote Supervisor Adapter, one for
the network) indicate that the related Ethernet link is operational.
Ethernet TX/RX LEDs: These LEDs (one for the Remote Supervisor Adapter, one
for the network) indicate that activity is taking place on the related network.
Remote Supervisor Adapter Ethernet port: This port enables you to manage the
server from a remote location.
Remote Supervisor Adapter fault LED: This LED indicates that the Remote
Supervisor Adapter has failed.
Remote Supervisor Adapter power LED: This LED indicates that power is being
supplied to the Remote Supervisor Adapter.
Ethernet port: This port enables you to connect the server to a network.
Operator information panel
The following illustration shows the controls, indicators, and USB port on the operator information panel (also known as the front LED panel).
[Illustration: operator information panel. Callouts: USB port, SCSI bus activity LED, Information LED, Fault LED, Reset button, Power-control button, Power LED.]
USB port: This port enables you to attach a USB device to the front of the server.
SCSI bus activity LED: This green LED is on when there is activity on the SCSI bus.
Information LED: This amber LED is on if the server power supplies are nonredundant or some other noncritical event has occurred. See the LEDs on the Level 2 diagnostic LED panel or on the system board for specific error information. For details, see “Level 2 diagnostic and system board LEDs” on page 68.
Fault LED: This amber LED is on if a system error has occurred. See the LEDs on the Level 2 diagnostic LED panel or on the system board for specific error information. For details, see “Level 2 diagnostic and system board LEDs” on page 68.
Reset button: Press this button to reset the server and run the power-on self-test (POST).
Power-control button: If the server is in operational mode, press this button to put it into standby mode. If the server is in standby mode, press this button to restore it to operational mode. See “Turning on the server” and “Turning off the server” on page 11 for more information about applying and removing external power to the server.
Power LED: This green LED is on if ac and dc power are present in the server. If this LED is flashing, the server is in standby mode (ac power is present, but the power supplies are turned off). If this LED is off, either ac power, all power supplies, or the LED itself has failed.
Note: If this LED is off, it does not mean that there is no electrical power present in the server. The LED might be burned out. To remove all electrical current from the server, you must unplug the server power cords from the electrical outlets.
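The power LED states and the accompanying note can be captured in a few lines. This Python sketch is a hedged illustration: the dictionary, the function names, and the state names ("on", "flashing", "off") are assumptions for this example only.

```python
# Hypothetical mapping of power LED states to the conditions described above.
POWER_LED_STATES = {
    "on": "ac and dc power are present in the server",
    "flashing": "standby mode: ac power is present, but the power supplies are turned off",
    "off": "ac power, all power supplies, or the LED itself has failed",
}


def interpret_power_led(state: str) -> str:
    """Return the documented meaning of a power LED state."""
    if state not in POWER_LED_STATES:
        raise ValueError(f"unknown power LED state: {state!r}")
    return POWER_LED_STATES[state]


def all_current_removed(power_led_state: str, power_cords_unplugged: bool) -> bool:
    """Report whether all electrical current has been removed.

    The LED state is deliberately ignored: an unlit LED might be burned out,
    so only unplugging the power cords removes all current.
    """
    return power_cords_unplugged


if __name__ == "__main__":
    print(interpret_power_led("flashing"))
    print(all_current_removed("off", power_cords_unplugged=False))
```

The second function exists to make one point explicit: a dark power LED is never evidence that the server is safe to service.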
Turning on the server
Turning on the server refers to the act of plugging the server power cords into the power source and pressing the power-control button. After you install the network operating system in the server, this typically causes the operating system to start.
Complete the following steps to turn on the server.
1. Turn on all external devices, such as the monitor.
2. Plug the server power cords into the power source.
   Note: After you plug the power cords into the power source, wait 30 seconds before pressing the power-control button on the operator information panel. During this time, the Remote Supervisor Adapter is initializing and the power-control button does not respond. The power LED on the information panel blinks, indicating that ac power is present in the server.
   Note: Plugging the power cords into the power source might cause the server to start automatically. This is an acceptable action.
3. Wait 30 seconds; then, press the power-control button on the operator information panel. The power LED comes on and the power-on self-test (POST) begins.
If the server is turned on and a power failure occurs, it restarts automatically when power is restored. You can turn on the server remotely by means of the Remote Supervisor Adapter.
When you start the server for the first time after you add or remove an internal option or an external SCSI device, you might see a message telling you that the configuration has changed. The Configuration/Setup Utility program automatically starts so that you can save the new configuration information. See “Configuration” on page 27 for details.
Some options have device drivers that you need to install. See the documentation that comes with the option for information about installing any required device drivers.
The server comes with at least one microprocessor installed on the system board. If you have installed one or more additional microprocessors, the server can now operate as an SMP server. Therefore, you might need to upgrade the operating system to support SMP. See “Using the ServerGuide CDs” on page 57 and the operating system documentation for additional information.
If the server has a ServeRAID adapter installed and you have installed or removed a hard disk drive, see the documentation that comes with the ServeRAID adapter for information about reconfiguring the disk arrays.
Turning off the server
Turning off the server refers to the act of stopping the operating system, pressing the power-control button, and disconnecting the server power cords from the power source.
Statement 5:
CAUTION: The power control button on the device and the power switch on the power supply do not turn off the electrical current supplied to the device. The device also might have more than one power cord. To remove all electrical current from the device, ensure that all power cords are disconnected from the power source.
See the operating system documentation for the proper procedure to shut down the operating system before turning off the server. Each operating system is different. Some allow an immediate shutdown; others require an orderly shutdown procedure.
Complete the following steps to turn off the server:
1. Press the power-control button on the operator information panel. This places the server in standby mode. The power LED on the operator information panel blinks. This does not remove all power from the server.
   Note: After doing this, wait at least 30 seconds before pressing the power-control button to turn on the server again.
2. Disconnect the server power cords from the power source.
   Note: Wait approximately 15 seconds after disconnecting the power cords for the server to stop running. The power LED on the operator information panel will stop blinking.
Standby mode
Standby mode refers to the condition in which the server operating system is not running and all core logic is shut down except for the Remote Supervisor Adapter.
Complete the following steps to put the server into standby mode:
1. See the operating system documentation for the proper procedure to shut down the operating system.
   Note: Each operating system is different. Read all the documentation about shutting down the operating system before continuing.
2. Press the power-control button on the front of the server.
You can put the server into standby mode remotely by means of the Remote Supervisor Adapter.
Diagnostics
This chapter provides basic troubleshooting information to help you resolve some common problems that might occur while setting up the server.
If you cannot locate and correct the problem using the information in this chapter, see “Symptom-to-FRU index” on page 115 for more information.
Diagnostic tools overview
The following tools are available to help you identify and resolve hardware-related problems:
• POST beep codes, error messages, and error logs
  The power-on self-test (POST) generates beep codes and messages to indicate successful test completion or the detection of a problem. See “POST” on page 14 for more information.
• Diagnostic programs and error messages
  The server diagnostic programs are stored in upgradable read-only memory (ROM) on the system board. These programs are the primary method of testing the major components of the server. See “Diagnostic programs and error messages” on page 15 for more information.
• Level 2 Light Path Diagnostics
  The server has light-emitting diodes (LEDs) to help you identify problems with server components. By following the light path, you can quickly identify the type of problem that occurred. The light path begins with the LEDs on the operator information panel. See “Level 2 Light Path Diagnostics” on page 19 for more information.
POST
When you turn on the server, it performs a series of tests to check the operation of server components and some of the options that are installed in the server. This series of tests is called the power-on self-test, or POST.
If POST finishes without detecting any problems, a single beep sounds, and the first screen of the operating system or application program appears.
If POST detects a problem, more than one beep sounds, and an error message appears on the screen. See “Beep symptoms” on page 115 and “POST error codes” on page 134 for more information.
Notes:
1. If you have a power-on password or administrator password set, you must type the password and press Enter, when prompted, before POST will continue.
2. A single problem might cause several error messages. When this occurs, work to correct the cause of the first error message. After you correct the cause of the first error message, the other error messages usually will not occur the next time you run the test.
Error logs
The POST error log contains the three most recent error codes and messages that the system generated during POST. The System Error log contains error messages that were issued during POST and all system status messages from the Remote Supervisor Adapter.
You can view the contents of the System Error log from either the Configuration/Setup Utility program or the diagnostic programs:
• Start the Configuration/Setup Utility program; then, select Error Logs from the main menu; then, select either Post Error Log or System Event/Error Log. See “Starting the Configuration/Setup Utility” on page 28 for details.
• Start the diagnostic programs; select Hardware Info from the top of the diagnostic programs screen; select System Error Log from the list that appears; then, follow the instructions on the screen. See “Starting the diagnostic programs” on page 15 for details.
Small computer systems interface (SCSI) messages
If you receive a SCSI error message while running the SCSISelect Utility program, one or more of the following might be causing the problem:
• A failing SCSI device (adapter, drive, controller)
• An improper SCSI configuration
• Duplicate SCSI IDs in the same SCSI chain
• An improperly installed SCSI terminator
• A defective SCSI terminator
• An improperly installed cable
• A defective cable
To solve the problem, verify that:
• The external SCSI devices are turned on. You must turn on all external SCSI devices before turning on the server.
• The cables for all external SCSI devices are connected correctly.
• The last device in each SCSI chain is terminated properly.
• The SCSI devices are configured correctly.
If you have verified these items and the problem remains, run the diagnostic programs to obtain additional information about the failing device.
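One of the causes listed above, duplicate SCSI IDs in the same chain, is easy to check once the ID of each device has been written down. The following Python sketch assumes a hand-entered inventory; the function name and the device names are hypothetical, and the code does not query any hardware or the SCSISelect Utility program.

```python
from collections import defaultdict
from typing import Dict, List


def find_duplicate_scsi_ids(chain: Dict[str, int]) -> Dict[int, List[str]]:
    """Return each SCSI ID that is assigned to more than one device.

    chain maps a device label (hypothetical, chosen by the technician)
    to the SCSI ID configured on that device.
    """
    devices_by_id = defaultdict(list)
    for device, scsi_id in chain.items():
        devices_by_id[scsi_id].append(device)
    # Keep only the IDs that are shared by two or more devices.
    return {
        scsi_id: sorted(devices)
        for scsi_id, devices in devices_by_id.items()
        if len(devices) > 1
    }


if __name__ == "__main__":
    inventory = {"adapter": 7, "disk0": 0, "disk1": 1, "tape": 1}
    print(find_duplicate_scsi_ids(inventory))
```

An empty result means the recorded IDs are unique; it does not rule out the other causes in the list, such as termination or cabling faults.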
Diagnostic programs and error messages
The server diagnostic programs are stored in upgradable read-only memory (ROM) on the system board. These programs are the primary method of testing the major components of the server.
Diagnostic error messages indicate that a problem exists; they are not intended to be used to identify a failing part. Troubleshooting and servicing of complex problems that are indicated by error messages should be performed by trained service personnel.
Sometimes the first error to occur causes additional errors. In this case, the server displays more than one error message. Always follow the suggested action instructions for the first error message that appears.
Starting the diagnostic programs
To start the diagnostic programs:
1. Turn on the server and watch the screen.
2. When the message F2 for Diagnostics appears, press F2.
   If a power-on password is set, the server prompts you for it. Type the power-on password, and press Enter.
3. When the Diagnostic Programs screen appears, select either Extended or Basic from the top of the screen.
4. Select the test that you want to run from the list that appears; then, follow the instructions on the screen.
   Notes:
   a. If the server stops during testing and you cannot continue, restart the server and try running the diagnostic programs again. If the problem remains, flash the server with the latest diagnostics code and BIOS, and then run the test again. If the problem remains, see “Undetermined problems” on page 143.
   b. The keyboard and mouse (pointing device) tests assume that a keyboard and mouse are attached to the server.
   c. If you run the diagnostic programs with no mouse attached to the server, you will not be able to navigate between test categories using the Next Cat and Prev Cat buttons. All other functions that are provided by mouse-selectable buttons are also available using the function keys.
   d. You can test the USB keyboard by using the regular keyboard test. The regular mouse test can test a USB mouse. Also, you can run the USB interface test only if there are no USB devices attached.
   e. You can view server configuration information (such as system configuration, memory contents, interrupt request (IRQ) use, direct memory access (DMA) use, device drivers, and so on) by selecting Hardware Info from the top of the screen.
   f. You can press F1 while running the diagnostic programs to obtain Help information. You also can press F1 from within a help screen to obtain online documentation from which you can select different categories. To exit from the help information and return to where you left off, press Esc.
If the diagnostic programs do not detect any hardware errors but the problem remains during typical server operations, a software error might be the cause. If you suspect a software problem, refer to the information that comes with the software package.
Error code format
This section shows the format of the error codes that might appear in the detailed test log and summary log when running the diagnostic programs.
The error code format is as follows:
fff-ttt-iii-date-cc-text message
where:
fff is the three-digit function code that indicates the function being tested when the error occurred. For example, function code 089 is for the microprocessor.
ttt is the three-digit failure code that indicates the exact test failure that was encountered.
iii is the three-digit device ID.
date is the date that the diagnostic test was run and the error recorded.
cc is the check value that is used to verify the validity of the information.
text message is the diagnostic message that indicates the reason for the problem.
Text message format
This section shows the format of the diagnostic text messages that might appear in the detailed test log and summary log when running the diagnostic programs.
The diagnostic text message format is as follows:
Function Name: Result (test specific string)
where:
Function Name
The name of the function being tested when the error occurred. This corresponds to the function code (fff) shown in the previous list.
Result
This can be one of the following:
Passed
This result occurs when the diagnostic test is completed without any errors.
Failed
This result occurs when the diagnostic test discovers an error.
User Aborted
This result occurs when you stop the diagnostic test before it is complete.
Not Applicable
This result occurs when you specify a diagnostic test for a device that is not present.
Aborted
This result occurs when the test could not proceed, for example, because of the system configuration.
Warning
This result occurs when a possible problem is reported during the diagnostic test, such as when a device that is to be tested is not installed.
test specific string
Additional information that you can use to analyze the problem.
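The two formats described above lend themselves to a short parsing sketch. The following Python example is an illustration under stated assumptions, not a tool supplied with the server: the class and function names are hypothetical, the sample values in the usage lines are invented, and the code assumes that the date field itself contains no hyphen (this manual does not specify the date layout).

```python
import re
from typing import NamedTuple, Optional


class DiagnosticErrorCode(NamedTuple):
    function_code: str   # fff
    failure_code: str    # ttt
    device_id: str       # iii
    date: str            # layout not specified here; assumed to contain no hyphen
    check_value: str     # cc
    text_message: str


class TextMessage(NamedTuple):
    function_name: str
    result: str
    test_specific_string: Optional[str]


# Result values listed in the text message format description.
_RESULTS = "Passed|Failed|User Aborted|Not Applicable|Aborted|Warning"
_TEXT_RE = re.compile(
    r"(?P<function>[^:]+):\s*(?P<result>" + _RESULTS + r")(?:\s*\((?P<detail>.*)\))?"
)


def parse_error_code(line: str) -> DiagnosticErrorCode:
    """Split 'fff-ttt-iii-date-cc-text message' into its six fields."""
    parts = line.strip().split("-", 5)  # the text message may contain hyphens
    if len(parts) != 6:
        raise ValueError("expected six hyphen-separated fields")
    for value in parts[:3]:
        if not re.fullmatch(r"\d{3}", value):
            raise ValueError(f"expected a three-digit field, got {value!r}")
    return DiagnosticErrorCode(*parts)


def parse_text_message(message: str) -> TextMessage:
    """Split 'Function Name: Result (test specific string)'."""
    match = _TEXT_RE.fullmatch(message.strip())
    if match is None:
        raise ValueError("message does not match the documented format")
    return TextMessage(match["function"].strip(), match["result"], match["detail"])


if __name__ == "__main__":
    # Invented sample line, for illustration only.
    code = parse_error_code("089-123-000-20040801-A7-Microprocessor: Failed (sample detail)")
    print(code.function_code, parse_text_message(code.text_message))
```

Limiting the split to five separators keeps any hyphens inside the text message intact; if real logs use a hyphenated date, the split logic would need to change accordingly.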
Viewing the test log
When the tests have been completed, you can view the test log by selecting Utility from the top of the screen and then selecting View Test Log.
Notes:
1. You can view the test log only while you are in the diagnostic programs. When you exit the diagnostic programs, the test log is cleared (saved test logs are not affected). To save the test log so that you can view it later, click Save Log on the diagnostic programs screen and specify a location and name for the saved log file.
2. To save the test log to a diskette, you must use a diskette that you have formatted yourself; this function does not work with preformatted diskettes. If the diskette has sufficient space for the test log, the diskette may contain other data.
For information about the system error log, see “Error logs” on page 14 and “Configuration/Setup Utility main menu” on page 28.
Viewing the System Event/Error log
You can also view the System Event/Error log from the diagnostic programs. See the instructions at “Error logs” on page 14.
Diagnostic error messages
Error messages may appear when you run the diagnostic programs.
Notes:
1. Depending on the server configuration, some of these error messages might not appear when you run the diagnostic programs.
2. If diagnostic error messages appear that are not listed, make sure that the server has the latest levels of BIOS, Remote Supervisor Adapter, ServeRAID, and diagnostics microcode installed.
For information about specific diagnostic error codes, see “Diagnostic error codes” on page 120.
Recovering the BIOS code
If the BIOS has become damaged, such as from a power failure during a flash update, you can recover the BIOS code using the flash ROM page swap jumper (J56) and a BIOS flash diskette.
Note: Use one of the following methods to obtain a BIOS flash diskette:
• Use the ServerGuide program to make a BIOS flash diskette.
• Download a BIOS flash diskette from the World Wide Web at http://www.ibm.com/pc/support.
The flash memory of the server consists of a primary page and a backup page. The J56 jumper controls which page is used to start the server. If the BIOS code in the primary page is damaged, you can use the backup page to start the server; then, start the BIOS flash diskette to restore the BIOS code to the primary page.
Complete the following steps to recover the BIOS:
1. Turn off the server and peripheral devices and disconnect all power cords and external cables; then, remove the cover.
2. Locate jumper J56 (flash ROM page swap) on the system board. See “System board locations” on page 65 for an illustration of the system board jumper locations.
3. Move the jumper to pins 2 and 3 to enable BIOS recovery mode.
4. Reconnect all external cables and power cords and turn on the peripheral devices.
5. Insert the BIOS flash diskette into the diskette drive and restart the server. The system begins the power-on self-test (POST).
6. The Flash Update Utility program displays the Flash Update Menu. Select 1 - Update POST/BIOS.
7. When prompted as to whether you want to move the current POST/BIOS image to the backup ROM location, press N.
   Attention: If you press Y, the damaged BIOS will be copied into the secondary page.
8. When prompted as to whether you want to save the current code to a diskette, press N.
9. When prompted to choose a language, select a language (from 0 to 7) and press Enter to accept your choice.
10. When prompted as to whether you want to change the serial number, ensure that the correct serial number appears.
11. When prompted as to whether you want to change the machine type, ensure that the correct machine type appears.
12. Remove the flash diskette from the diskette drive.
    Attention: Do not restart the server at this time.
13. Turn off the server.
14. Move the jumper on J56 to pins 1 and 2 to return to normal startup mode.
15. Replace the cover and start the server.
Power supply LEDs
The ac and dc power LEDs on a power supply provide status information about the power supply. The following illustration shows the location of the ac and dc power LEDs. For more information about interpreting the LEDs, see “Power supply LED errors” on page 133.
[Illustration: hot-swap power supplies. Callouts: power supply 1, power supply 2, power supply 3, locking latch, locking handle, AC power LED, DC power LED.]
Table 2. Power supply minimum configurations
[Table layout lost in extraction. Surviving entries: power supply; power supply, power backplane, system board, processor in slot 1; operator panel card.]
Level 2 Light Path Diagnostics
The Light Path Diagnostics LEDs help you to identify problems with server components. By following the light path, you can quickly identify and fix the type of problem that occurred. The light path begins with the LEDs on the operator information panel (also known as the front LED panel). If the server encounters a problem, it lights either the Information LED or the Fault LED.
To isolate the source of the trouble, examine the Level 2 Light Path Diagnostics LEDs through the special cutout on the server top cover (shown at “Level 2 diagnostic panel LEDs” on page 20). These LEDs can indicate a problem with a single component, or a problem with one of several similar components. In the latter case, to isolate the specific failing component, remove the server top cover and look for a lit LED on the system board, memory board, or failing component.
For example, if the FAN LED is lit, the specific failing fan is indicated by an LED viewable with the top cover removed. If the MEM LED is lit, the specific failing DIMM is indicated by an LED on the memory board.
Level 2 diagnostic panel LEDs
The illustration at “Removing the server top cover and bezel” on page 72 shows the location of the Level 2 diagnostic panel LEDs, visible through a special cutout area on the top cover. See “Level 2 Light Path troubleshooting” on page 21 for information about using them to identify problems. The following illustration identifies the diagnostic panel LEDs.
[Illustration: Level 2 diagnostic panel, showing each LED with its board designator.]
The meanings of these LEDs are as follows:
EXP (CR8): Attached expansion I/O failure
NMI (CR7): Nonmaskable interrupt
Over Temp (CR9): System over temperature condition
PCI (CR19): PCI failure
VRM (CR13): VRM failure
DASD (CR15): Hard disk drive failure
Out of Spec (CR14): Power supplies being overdriven
PS1 (CR12): Power supply number 1 failure
PS2 (CR11): Power supply number 2 failure
PS3 (CR10): Power supply number 3 failure
Non Redund (CR16): Nonredundant power mode
CPU (CR18): Microprocessor failure
MEM (CR3): Memory failure
FAN (CR6): Fan failure
SP Bus (CR4): Remote Supervisor Adapter failure
See Log (CR17): See event log or Remote Supervisor Adapter log for non-optimal condition information
Remind button (SW1): Press to invoke remind mode
Notes:
1. The server supports replaceable voltage regulator modules (VRMs).
2. The server supports a maximum of three power supplies.
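The LED meanings listed in this section can be kept in a lookup table for quick reference during service. This Python sketch is only an illustration: the dictionary and function names are hypothetical, the LED labels are taken from the list in this section, and nothing here reads the panel state from the server.

```python
from typing import Iterable, List, Tuple

# LED label -> meaning, as listed in this section (labels stored in upper case).
LEVEL2_LEDS = {
    "EXP": "Attached expansion I/O failure",
    "NMI": "Nonmaskable interrupt",
    "OVER TEMP": "System over temperature condition",
    "PCI": "PCI failure",
    "VRM": "VRM failure",
    "DASD": "Hard disk drive failure",
    "OUT OF SPEC": "Power supplies being overdriven",
    "PS1": "Power supply number 1 failure",
    "PS2": "Power supply number 2 failure",
    "PS3": "Power supply number 3 failure",
    "NON REDUND": "Nonredundant power mode",
    "CPU": "Microprocessor failure",
    "MEM": "Memory failure",
    "FAN": "Fan failure",
    "SP BUS": "Remote Supervisor Adapter failure",
    "SEE LOG": "See event log or Remote Supervisor Adapter log",
}


def describe_lit_leds(lit: Iterable[str]) -> List[Tuple[str, str]]:
    """Pair each observed LED label with its documented meaning."""
    report = []
    for label in lit:
        key = label.strip().upper()
        report.append((key, LEVEL2_LEDS.get(key, "not listed in this section")))
    return report


if __name__ == "__main__":
    for label, meaning in describe_lit_leds(["Fan", "Mem"]):
        print(f"{label}: {meaning}")
```

Normalizing labels to upper case lets a technician type them as printed on the panel or as written in the text without affecting the lookup.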