IBM xSeries 342 2RX, xSeries 342 2TG, xSeries 342 1TG, xSeries 342 1RX Maintenance Manual

Hardware Maintenance Manual

xSeries 342 Model 1RX, 2RX, 1TG, 2TG
Note: Before using this information and the product it supports, be sure to read the general information under “Notices” on page 147.
First Edition (June 2001)
The following paragraph does not apply to the United Kingdom or any country where such provisions are inconsistent with local law:
INTERNATIONAL BUSINESS MACHINES CORPORATION PROVIDES THIS PUBLICATION “AS IS” WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESS OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. Some states do not allow disclaimer of express or implied warranties in certain transactions, therefore, this statement may not apply to you.
This publication could include technical inaccuracies or typographical errors. Changes are periodically made to the information herein; these changes will be incorporated in new editions of the publication. IBM may make improvements and/or changes in the product(s) and/or the program(s) described in this publication at any time.
This publication was developed for products and services offered in the United States of America. IBM may not offer the products, services, or features discussed in this document in other countries, and the information is subject to change without notice. Consult your local IBM representative for information on the products, services, and features available in your area.
Requests for technical information about IBM products should be made to your IBM reseller or IBM marketing representative.
© Copyright International Business Machines Corporation 2000, 2001. All rights reserved.
US Government Users Restricted Rights – Use, duplication or disclosure restricted by GSA ADP Schedule Contract with IBM Corp.
Contents
About This Manual ..........v
Important Safety Information .........v
Online Support .............vi
IBM Online Addresses ..........vi
General checkout ..........1
General information .........3
Features and specifications..........3
Server features ..............5
Reliability, availability, and serviceability features . . 6
Controls and indicators ...........7
Powering on the server ..........7
Powering off the server ..........7
Operator information panel .........9
Diagnostics.............11
Diagnostic tools overview .........11
Identifying problems using LEDs .......11
Power supply LEDs ..........11
Light path diagnostics ..........11
Diagnostics panel ...........11
Light path diagnostics table ........12
POST ................12
POST error messages ..........12
Error logs ..............12
Small computer system interface messages (some models) ................13
Diagnostic programs and error messages ....13
Text messages ............14
Starting the diagnostic programs ......14
Viewing the test log ..........15
Diagnostic error message tables.......16
Recovering BIOS code ...........16
Troubleshooting the Ethernet controller .....17
Network connection problems .......17
Ethernet controller troubleshooting chart . . . 17
Power checkout .............19
Replacing the battery ...........19
Temperature checkout ...........21
Configuring the server ........23
Using the Configuration/Setup Utility program . . 23
Starting the Configuration/Setup Utility program 23
Choices available from the Configuration/Setup main menu .............24
Using passwords ...........27
Power-on password .........27
Remote-control security settings .....28
Using the SCSISelect utility program ......29
Starting the SCSISelect utility program ....29
Choices available from the SCSISelect menu . . 29
Using the PXE boot agent utility program . . 31
Installing options ..........35
Exploded view of the xSeries 342 server .....35
System board layout ...........35
System board options connectors ......36
System board internal cable connectors ....36
System board external port connectors ....37
System board switches and jumpers .....38
System board LED locations ........40
Before you begin ............42
System reliability considerations .......42
Working inside a server with power on .....43
Handling static-sensitive devices .......43
Removing the cover and bezel ........44
Working with adapters ..........46
Adapter considerations .........46
Adapter installation instructions .......47
Installing internal drives ..........49
Internal drive bays ...........49
SCSI drives .............50
SCSI IDs ..............50
Installing a hot-swap drive .........51
Installing a non-hot-swap drive........52
Installing memory modules .........53
Installing a microprocessor .........55
Installing a hot-swap power supply ......58
Installing an xSeries 3-Pack Ultra 160 Hot-Swap Expansion Kit ............61
Replacing a hot-swap fan assembly ......61
Installing the server cover and bezel ......62
Connecting external options .........63
Cabling requirements ..........63
Setting SCSI IDs for external devices .....63
Installation procedure ..........63
Input/Output ports ...........63
Video port ..............64
Keyboard port ............64
Auxiliary-device (pointing device) port ....65
Ultra 160 SCSI ports ..........65
SCSI cabling requirements .......66
Setting SCSI IDs ...........66
SCSI connector pin-number assignments. . . 66
Serial ports .............67
Viewing or changing the serial-port assignments ............67
Serial-port connectors .........68
Universal Serial Bus ports ........68
USB cables and hubs .........68
USB-port connectors .........69
Ethernet port .............69
Configuring the Ethernet controller ....69
Failover for redundant Ethernet .....69
Ethernet port connector ........73
Integrated System Management Processor ports 73
Cabling the Server ............73
© Copyright IBM Corp. 2000, 2001 iii
FRU information (service only) ....75
Removing the LED cover ..........75
Removing the LED board..........75
Removing the on/off reset board .......76
Removing the diskette/CDROM drive .....76
Removing the SCSI backplane ........77
Removing the hot-swap hard disk drive backplane assembly ...............77
Removing the power supply backplane .....78
Removing the AC Distribution Box ......79
Removing the system board .........79
Symptom-to-FRU index .......81
Beep symptoms .............81
No Beep symptoms ............83
Information panel system error LED ......83
Diagnostic error codes ...........85
Error symptoms .............89
Power supply LED errors ..........90
POST error codes ............91
SCSI error codes .............96
Temperature error messages .........97
Fan error messages ............97
Power error messages ...........98
System shutdown ............98
Voltage related system shutdown ......98
Temperature related system shutdown ....99
DASD checkout .............99
Host Built-In Self Test (BIST) ........99
Bus fault messages ...........100
Undetermined problems ..........100
Parts listing (xSeries 342 Model 1RX, 2RX, 1TG, 2TG) ..........103
Keyboards ..............104
Power cords .............105
Related service information .....107
Safety information............107
General safety ............107
Electrical safety............108
Safety inspection guide .........109
Handling electrostatic discharge-sensitive devices ..............110
Grounding requirements.........111
Safety notices (multi-lingual translations) . . . 111
Send us your comments! .........146
Problem determination tips.........147
Notices ...............147
Trademarks..............148
iv Hardware Maintenance Manual: xSeries 342 Model 1RX, 2RX, 1TG, 2TG
About This Manual
This manual contains diagnostic information, a Symptom-to-FRU index, service information, error codes, error messages, and configuration information for the IBM® xSeries 342.
Important: This manual is intended for trained servicers who are familiar with IBM PC Server products.
Important Safety Information
Be sure to read all caution and danger statements in this book before performing any of the instructions. See “Safety information” on page 107.
Leia todas as instruções de cuidado e perigo antes de executar qualquer operação.
Prenez connaissance de toutes les consignes de type Attention et Danger avant de procéder aux opérations décrites par les instructions.
Lesen Sie alle Sicherheitshinweise, bevor Sie eine Anweisung ausführen.
Accertarsi di leggere tutti gli avvisi di attenzione e di pericolo prima di effettuare qualsiasi operazione.
Lea atentamente todas las declaraciones de precaución y peligro antes de llevar a cabo cualquier operación.
Online Support
IBM Online Addresses
Use the World Wide Web (WWW) to download Diagnostic, BIOS Flash, and Device Driver files.
File download address is:
http://www.ibm.com/pc/files.html
The HMM manuals online address is:
http://www.us.pc.ibm.com/cdt/hmm.html
The IBM PC Company Support Page is:
http://www.ibm.com/pc/support
The IBM PC Company Home Page is:
http://www.ibm.com/pc
General checkout
The server diagnostic programs are stored in upgradable read-only memory (ROM) on the system board. These programs are the primary method of testing the major components of the server: the system board, Ethernet controller, video controller, RAM, keyboard, mouse (pointing device), diskette drive, serial ports, and hard drives. You can also use them to test some external devices. See “Diagnostic programs and error messages” on page 13.
Also, if you cannot determine whether a problem is caused by the hardware or by the software, you can run the diagnostic programs to confirm that the hardware is working properly.
When you run the diagnostic programs, a single problem might cause several error messages. When this occurs, work to correct the cause of the first error message. After the cause of the first error message is corrected, the other error messages might not occur the next time you run the test.
A failed system might be part of a shared DASD cluster (two or more systems sharing the same external storage device(s)). Prior to running diagnostics, verify that the failing system is not part of a shared DASD cluster.
A system might be part of a cluster if:
• The customer identifies the system as part of a cluster.
• One or more external storage units are attached to the system and at least one of the attached storage units is additionally attached to another system or unidentifiable source.
• One or more systems are located near the failing system.
If the failing system is suspected to be part of a shared DASD cluster, all diagnostic tests can be run except diagnostic tests which test the storage unit (DASD residing in the storage unit) or the storage adapter attached to the storage unit.
Notes:
1. For systems that are part of a shared DASD cluster, run one test at a time in looped mode. Do not run all tests in looped mode, as this could enable the DASD diagnostic tests.
2. If multiple error codes are displayed, diagnose the first error code displayed.
3. If the computer hangs with a POST error, go to “POST error codes” on page 91.
4. If the computer hangs and no error is displayed, go to “Undetermined problems” on page 100.
5. For power supply problems, see “Power supply LED errors” on page 90.
6. For safety information, see “Safety information” on page 107.
7. For intermittent problems, check the error log; see “Error logs” on page 12.
1. IS THE SYSTEM PART OF A CLUSTER?
YES. Schedule maintenance with the customer. Shut down all systems related to the cluster. Run storage test.
NO. Go to step 2.
2. THE SYSTEM IS NOT PART OF A CLUSTER.
• Power-off the computer and all external devices.
• Check all cables and power cords.
• Set all display controls to the middle position.
• Power-on all external devices.
• Power-on the computer.
• Record any POST error messages displayed on the screen. If an error is displayed, look up the first error in the “POST error codes” on page 91.
• Check the information LED panel System Error LED; if on, see “Information panel system error LED” on page 83.
• Check the System Error Log. If an error was recorded by the system, see “Symptom-to-FRU index” on page 81.
• Start the Diagnostic Programs. See “Starting the diagnostic programs” on page 14.
• Check for the following responses:
a. One beep.
b. Readable instructions or the Main Menu.
3. DID YOU RECEIVE BOTH OF THE CORRECT RESPONSES?
NO. Find the failure symptom in “Symptom-to-FRU index” on page 81.
YES. Run the Diagnostic Programs. If necessary, refer to “Starting the diagnostic programs” on page 14.
If you receive an error, go to “Symptom-to-FRU index” on page 81.
If the diagnostics completed successfully and you still suspect a problem, see “Undetermined problems” on page 100.
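The general checkout steps above can be read as a decision flow. The following Python sketch is one possible reading, not part of the IBM procedure: the `Observations` record, its field names, and the `next_action` function are hypothetical, and the sketch returns at the first finding even though the manual lists the step 2 checks as a sequence to work through.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Observations:
    """What the servicer recorded during checkout (hypothetical field names)."""
    part_of_cluster: bool
    post_errors: List[str]            # POST error messages, in display order
    system_error_led_on: bool
    system_error_log_has_entry: bool
    one_beep: bool
    readable_menu: bool


def next_action(obs: Observations) -> str:
    """Return the next manual reference suggested by the checkout steps."""
    # Step 1: cluster membership changes the whole procedure.
    if obs.part_of_cluster:
        return "Schedule maintenance, shut down all cluster systems, run storage test"
    # Step 2: only the first POST error is looked up.
    if obs.post_errors:
        return f"Look up first POST error '{obs.post_errors[0]}' in POST error codes (page 91)"
    if obs.system_error_led_on:
        return "See Information panel system error LED (page 83)"
    if obs.system_error_log_has_entry:
        return "See Symptom-to-FRU index (page 81)"
    # Step 3: both correct responses are required.
    if not (obs.one_beep and obs.readable_menu):
        return "Find the failure symptom in Symptom-to-FRU index (page 81)"
    return "Run the diagnostic programs (page 14)"
```

A servicer's notes could be entered as an `Observations` value and passed to `next_action` to get the section to consult next; the returned strings only point back to the pages named in the procedure.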
General information
Your IBM® xSeries 342 server is a high-performance server that supports symmetric multiprocessing (SMP). It is ideally suited for networking environments that require superior microprocessor performance, efficient memory management, flexibility, and large amounts of reliable data storage.
Performance, ease of use, reliability, and expansion capabilities were key considerations during the design of your server. These design features make it possible for you to customize the system hardware to meet your needs today, while providing flexible expansion capabilities for the future.
Your xSeries 342 comes with a three-year limited warranty and IBM Server StartUp Support. If you have access to the World Wide Web, you can obtain up-to-date information about your server model and other IBM server products at the following World Wide Web address: http://www.ibm.com/pc/us/netfinity/
Features and specifications
The following table provides a summary of the features and specifications for your xSeries 342:
Microprocessor:
• Intel® Pentium III
• 256 KB or 512 KB Level-2 cache
• Supports up to two microprocessors
Memory:
• Maximum: 4 GB
• Type: ECC, SDRAM, PC133, Registered DIMMs
• Slots: Four (two-way interleaved)
Drives standard:
• Diskette: 1.44 MB
• CD-ROM: 24X IDE
Expansion bays:
• Hot-swap: Three slim high
• Non-hot-swap: Two 5.25-inch, replaceable with a three slim-high hot-swap drive expansion option
PCI expansion slots:
• One 33 MHz/32-bit
• Two 33 MHz/64-bit
• Two 66 MHz/64-bit
Hot-swap power supplies:
• 270 Watt (115-230 V ac)
• Minimum: One
• Maximum: Two, second power supply provides redundant power
Redundant cooling:
• Three hot-swap fans
Video:
• S3 video controller
• Compatible with SVGA and VGA
• 8 MB video memory
Size (3U):
• Height: 128 mm (5 in.)
• Depth: 695 mm (27.3 in.)
• Width: 440 mm (17.3 in.)
• Weight: 21.3 kg to 29.5 kg (47 lb to 65 lb) depending upon configuration
Integrated functions:
• Dual channel Ultra 160 SCSI controller
• One 10BASE-T/100BASE-TX/100BASE-FX, Intel Ethernet controller with Alert on LAN and Wake on LAN® support
• Two serial ports
• Two Universal Serial Bus ports
• Keyboard port
• Mouse port
• Video port
• Integrated System Management (ISM) Processor
• Two ISM (RJ-45) connectors
• One system management Serial C port
Acoustical noise emissions:
• Sound power, idling (open bay): 6.6 bel maximum
• Sound power, operating: 6.8 bel maximum
• Sound pressure, operating: 53 dBa maximum
Environment:
• Air temperature:
– Server on: 10° to 35°C (50° to 95°F). Altitude: 0 to 914 m (2998 ft.)
– Server on: 10° to 32°C (50° to 89.6°F). Altitude: 914 m (2998 ft.) to 2133 m (6998 ft.)
– Server off: 10° to 43°C (50° to 109.4°F). Maximum altitude: 2133 m (6998 ft.)
• Humidity:
– Server on: 8% to 80%
– Server off: 8% to 80%
Heat output:
• Approximate heat output in British Thermal Units (BTU) per hour
– Minimum configuration: 375 BTU (110 watts)
– Maximum configuration: 1300 BTU (380 watts)
Electrical input:
• Sine-wave input (50-60 Hz) required
• Input voltage low range:
– Minimum: 100 V ac
– Maximum: 127 V ac
• Input voltage high range:
– Minimum: 200 V ac
– Maximum: 240 V ac
• Input kilovolt-amperes (kVA) approximately:
– Minimum: 0.08 kVA (0.076 kW)
– Maximum: 0.38 kVA
*KB equals approximately 1000 bytes. MB equals approximately 1000000 bytes. GB equals approximately 1000000000 bytes.
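The heat output and electrical input figures in the table above can be cross-checked with a short calculation. The following Python sketch is illustrative only: the function names are hypothetical, and the conversion factor of about 3.412 BTU per hour per watt is the standard physical constant rather than a value taken from this manual.

```python
# Standard conversion: 1 watt is approximately 3.412 BTU per hour.
BTU_PER_HOUR_PER_WATT = 3.412


def watts_to_btu_per_hour(watts: float) -> float:
    """Convert a heat load in watts to BTU per hour."""
    return watts * BTU_PER_HOUR_PER_WATT


def real_to_apparent_ratio(real_kw: float, apparent_kva: float) -> float:
    """Ratio of real power (kW) to apparent power (kVA)."""
    if apparent_kva <= 0:
        raise ValueError("apparent power must be positive")
    return real_kw / apparent_kva


if __name__ == "__main__":
    # Figures quoted in the specification table.
    print(watts_to_btu_per_hour(110))           # close to the listed 375 BTU
    print(watts_to_btu_per_hour(380))           # close to the listed 1300 BTU
    print(real_to_apparent_ratio(0.076, 0.08))  # ratio implied by the minimum input row
```

The listed BTU values appear to be the watt figures converted and rounded, which is why the maximum configuration comes out slightly under 1300.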
Server features
The xSeries 342 is designed to be cost-effective, powerful, and flexible. Your server offers:
• Impressive performance using an innovative approach to SMP
Your server supports up to two Intel Pentium III microprocessors. Your server comes with one microprocessor installed; you can install an additional microprocessor to enhance performance and provide SMP capability.
• Large data-storage and hot-swap capabilities
All models of the server support up to three hot-swap hard disk drives. This hot-swap feature enables you to remove and replace hard disk drives without turning off the server. The xSeries 3-Pack Ultra 160 Hot-Swap Expansion Kit option is available to add three additional drives.
• Optional PCI adapters
Your server uses peripheral component interconnect (PCI) bus architecture to provide compatibility with a wide range of existing hardware devices and software applications. Your server supports up to five PCI adapters in the expansion slots on the system board.
• Redundant cooling capability
The redundant cooling capability of the hot-swap fans in your server allows continued operation if one of the fans fails. You can also replace a failing hot-swap fan without turning off the server.
• Optional redundant power capability
You can install an additional 270-watt power supply in your server to provide redundant power for your server. The Power Non-Redundant (NON) light emitting diode (LED) in the group of diagnostic LEDs on the system board is lit when the power load is 270 watts or greater with two power supplies installed.
• Large system memory
The memory bus in your server supports up to 4 GB (GB equals approximately 1 000 000 000 bytes) of system memory. The memory controller provides error correcting code (ECC) support for up to four industry standard PC133, 3.3 V, 168-pin, 8-byte, registered, synchronous-dynamic-random access memory (SDRAM) dual inline memory modules (DIMMs).
• Integrated System Management (ISM) Processor
The IBM Integrated System Management Processor provides environmental monitoring for your server. This Integrated System Management Processor supports the Automatic Server Restart (ASR) feature, and it can issue system alerts using the Alert on LAN features of the integrated Ethernet controller. Future firmware code releases for the Integrated System Management Processor will support additional functions and features. These features will include dial-in support using the dedicated system management serial port C, alert forwarding through the Integrated System Management Processor connectors, error logging, and support for communication between the Integrated System Management Processor and more robust IBM system management adapters and controllers.
• Integrated network environment support
Your server comes with an Ethernet controller on the system board. This Ethernet controller has an interface for connecting to 10-Mbps or 100-Mbps networks. The server automatically selects between 10BASE-T and 100BASE-TX. The controller provides full-duplex (FDX) capability, which allows simultaneous transmission and reception of data on the Ethernet local area network (LAN).
• Redundant network-interface card
The addition of an optional, redundant network interface card (NIC) provides a failover capability to a redundant Ethernet connection. If a problem occurs with the primary Ethernet connection, all Ethernet traffic associated with this primary connection is automatically switched to the redundant NIC. This switching occurs without data loss and without user intervention.
• Optional digital linear tape drive
The addition of an optional digital linear tape drive (DLT) allows quick backup of large amounts of data.
• IBM ServerGuide CDs
The ServerGuide CDs included with your server provide programs to help you set up your server and install the network operating system (NOS). The ServerGuide program detects the hardware options installed, and provides the correct configuration programs and device drivers. In addition, the ServerGuide CDs include a variety of application programs for your server. See SERVERGUIDE for more information.
Reliability, availability, and serviceability features
Three of the most important features in server design are reliability, availability, and serviceability (RAS). These factors help to ensure the integrity of the data stored on your server; that your server is available when you want to use it; and that should a failure occur, you can easily diagnose and repair the failure with minimal inconvenience.
The following is an abbreviated list of the RAS features that your server supports.
• Menu-driven setup, system configuration, RAID configuration, and diagnostic programs
• Power-on self-test (POST)
• ROM resident diagnostics
• Integrated System Management Processor
• Predictive failure alerts
• Microprocessor built-in self-test (BIST), internal error signal monitoring, configuration checking, CPU/VRM failure identification through Light Path Diagnostics technology
• Diagnostic support of ServeRAID adapters and Ethernet adapters
• Cable detection
• Hot-swap drive bays
• Error codes and messages available with Remote Supervisor Adapter
• System error logging available with Remote Supervisor Adapter
• Upgradable BIOS, diagnostics, and system management code
• Automatic restart after a power failure
• Parity checking on the SCSI and PCI buses
• Error checking and correcting (ECC) memory
• Redundant hot-swap power supply option
• Redundant hot-swap cooling
• Redundant Ethernet capabilities (with optional adapter)
• Vital Product Data (VPD) on processor complex, system board, power backplane, SCSI backplane, and each power supply
• Operator information panel and group of diagnostic LEDs on the system board
• Remind button to change System Error LEDs to blipping at a duty cycle rate of 250 ms every 2 seconds for nonvital alerts
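The Remind timing in the last item of the list above works out to a small duty cycle. The following Python sketch only does that arithmetic; the function name is hypothetical, and it assumes the 250 ms figure is the lit time within each 2-second period, which the manual does not state explicitly.

```python
def duty_cycle(on_time_s: float, period_s: float) -> float:
    """Fraction of each period during which the LED is lit."""
    if period_s <= 0 or not 0 <= on_time_s <= period_s:
        raise ValueError("need 0 <= on_time_s <= period_s and period_s > 0")
    return on_time_s / period_s


if __name__ == "__main__":
    # Assumed reading: lit for 0.25 s out of every 2 s.
    print(f"{duty_cycle(0.25, 2.0):.1%}")
```

Under that reading the LED is lit for one eighth of the time, which is consistent with a brief blip rather than a steady light.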
Controls and indicators
The most commonly used controls and status indicators are on the front panel of the server.
[Figure: front view of the server, with callouts for the system power light (green), power-control-button shield (if installed), power control button, reset button, operator information panel, serial number, hard disk drive activity light (green), and hard disk drive status light (amber).]
System Power Light: When this green light is on, system power is present in the server. When this light flashes, the server is in standby mode (the system power supply is turned off and AC current is present). When this light is off, either a power supply, AC power, or a light has failed.
Attention: If this light is off, it does not mean there is no electrical current present in the server. The light might be burned out. To remove all electrical current from the server, you must unplug the server power cords from the electrical outlets or from the UPS.
Power-control button shield: You can install this circular disk over the power-control button to prevent accidental manual power-off. This disk is provided with your server.
Power-control Button: Press this button to manually turn the server on or off.
Powering on the server
You can start the server in several ways:
• You can turn on the server by pressing the power-control button on the front of the server.
• If the server is turned on, a power failure occurs, and unattended-start mode is enabled in the Configuration/Setup utility program, the server will start automatically when power is restored.
• If AC power is present, the server is off, and the wake-up feature is enabled in the Configuration/Setup utility program, the wake-up feature will turn on the server at the set time.
• The Integrated System Management Processor can also turn on the server.
Powering off the server
Statement 5:
CAUTION:
The power control button on the device and the power switch on the power supply do not turn off the electrical current supplied to the device. The device also might have more than one power cord. To remove all electrical current from the device, ensure that all power cords are disconnected from the power source.
The server can be turned off as follows:
• You can turn off the server by pressing the power-control button on the front of the server. Pressing the power-control button starts an orderly shutdown of the operating system, if this feature is supported by your operating system, and places the server in standby mode.
Note: After turning off the server, wait at least 5 seconds before pressing the power-control button to power the server on again.
• You can press and hold the power-control button for more than 4 seconds to cause an immediate shutdown of the server and place the server in standby mode. You can use this feature if the operating system hangs.
• You can disconnect the server power cords from the electrical outlets to shut off all power to the server.
Note: Wait about 15 seconds after disconnecting the power cords for your system to stop running. Watch for the System Power light on the operator information panel to stop blinking.
• If the system was turned on by the wake-up feature or Wake on LAN feature, you can turn it off by either a software routine or by the fail-safe, power-down counter.
• The Integrated System Management Processor can turn off the server.
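The power-control button timings described above can be summarized in a small model. This Python sketch is an illustration only, not server firmware: the function and constant names are hypothetical, and it takes the "more than 4 seconds" and "at least 5 seconds" thresholds literally from the text.

```python
IMMEDIATE_SHUTDOWN_HOLD_S = 4.0  # hold longer than this for an immediate shutdown
RESTART_WAIT_S = 5.0             # wait at least this long before powering on again


def classify_press(hold_seconds: float) -> str:
    """Classify a power-control button press made while the server is running."""
    if hold_seconds < 0:
        raise ValueError("hold time cannot be negative")
    if hold_seconds > IMMEDIATE_SHUTDOWN_HOLD_S:
        return "immediate shutdown"
    # An orderly shutdown happens only if the operating system supports it.
    return "orderly shutdown"


def safe_to_power_on(seconds_since_power_off: float) -> bool:
    """True once the recommended wait after power-off has elapsed."""
    return seconds_since_power_off >= RESTART_WAIT_S
```

In both cases the server ends up in standby mode; only disconnecting every power cord removes all power, as the caution statement says.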
Reset Button: Press this button to reset the server and run the power-on self-test (POST).
Operator Information Panel: The lights on this panel give status information for your server. See “Operator information panel” on page 9 for more information.
Hard Disk Drive Status Light: Each of the hot-swap drive bays has a Hard Disk Status light. When this amber light is on continuously, the drive has failed (only if RAID is installed).
If a ServeRAID adapter is installed and this light flashes slowly (one flash per second), the drive is being rebuilt. When the light flashes rapidly (three flashes per second), the controller is identifying the drive.
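The status-light patterns described above map onto a simple lookup. The following Python sketch is a hypothetical aid for note-taking, not an IBM tool: the `StatusPattern` names and the `interpret_status_light` function are assumptions, and patterns the manual does not describe are deliberately left out.

```python
from enum import Enum


class StatusPattern(Enum):
    STEADY_ON = "steady on"
    SLOW_FLASH = "slow flash"    # about one flash per second
    RAPID_FLASH = "rapid flash"  # about three flashes per second


def interpret_status_light(pattern: StatusPattern, raid_installed: bool) -> str:
    """Return the meaning the manual gives for an amber status-light pattern."""
    if not raid_installed:
        # The manual defines these meanings only when RAID is installed.
        return "not described in the manual without a RAID adapter"
    meanings = {
        StatusPattern.STEADY_ON: "drive has failed",
        StatusPattern.SLOW_FLASH: "drive is being rebuilt",
        StatusPattern.RAPID_FLASH: "controller is identifying the drive",
    }
    return meanings[pattern]
```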
Hard Disk Drive Activity Light: Each of the hot-swap drive bays has a Hard Disk Activity light. When this green light is flashing, the controller is accessing the drive.
[Figure: callouts for the AC power LED (green) and the DC power LED (green).]
AC Power Light: This light provides status information about the power supply. During normal operation, both the AC and DC Power lights are on. For any other combination of lights, see “Power supply LED errors” on page 90.
DC Power Light: This light provides status information about the power supply. During normal operation, both the AC and DC Power lights are on. For any other combination of lights, see “Power supply LED errors” on page 90.
Operator information panel
[Figure: operator information panel, with callouts for the SCSI Hard Disk Drive Activity Light (green), Ethernet Link Status Light (green), System Error Light (amber), Ethernet Transmit/Receive Activity Light (green), and Information Light (amber). Panel markings: TX/RX, LINK OK.]
SCSI Hard Disk Drive Activity Light: This green light is on when there is activity on a hard disk drive.
Ethernet Transmit/Receive Activity Light: When this green light is on, there is transmit or receive activity to or from the server. This light stays on even if the server power is turned off.
Ethernet Link Status Light: When this green light is on, there is an active connection on the Ethernet port. The light stays on even if the server power is turned off.
Information Light: This amber light is on when the system error log contains information about certain conditions in your server that might affect performance.
System Error Light: This amber light is lit when a system error occurs. An LED on the diagnostic LED panel may also be on to further isolate the error.
Diagnostics
This section provides basic troubleshooting information to help you resolve some common problems that might occur with your server.
Diagnostic tools overview
The following tools are available to help you identify and resolve hardware-related problems:
• POST beep codes, error messages, and error logs
The power-on self-test (POST) generates beep codes and messages to indicate successful test completion or the detection of a problem. See “POST” on page 12 for more information.
• Diagnostic programs and error messages
The server diagnostic programs are stored in upgradable read-only memory (ROM) on the system board. These programs are the primary method of testing the major components of your server. See “Diagnostic programs and error messages” on page 13 for more information.
• Light path diagnostics
Your server has light-emitting diodes (LEDs) to help you identify problems with server components. These LEDs are part of the light-path diagnostics that are built into your server. By following the path of lights, you can quickly identify the type of system error that occurred. See “Light path diagnostics” for more information.
Identifying problems using LEDs
Your server has LEDs to help you identify problems with some server components. These LEDs are part of the light path diagnostics built into the server. By following the path of lights, you can identify the type of system error that occurred. See the following sections for more information.
Power supply LEDs
The AC and DC Power LEDs on the power supply provide status information about the power supply. See “Power supply LED errors” on page 90.
Light path diagnostics
You can use the light path diagnostics built into your server to quickly identify the type of system error that occurred. The diagnostics panel is under the wind tunnel. Your server is designed so that any LEDs that are illuminated remain illuminated when the server shuts down as long as the AC power source is good and the power supplies can supply +5 V dc current to the server. This feature helps you isolate the problem if an error causes the server to shut down. See “Light path diagnostics table” on page 12.
Diagnostics panel
The following illustration shows the LEDs on the diagnostics panel on the system board. See “Light path diagnostics table” on page 12 for information on identifying problems using these LEDs.
[Figure: diagnostics panel, with LEDs labeled MEM, CPU, PCI A, PCI B, PCI C, VRM, DASD, SP, PS1, PS2, PS3, NON, OVER, NMI, TEMP, and FAN, and the REMIND button.]
Light path diagnostics table
The System Error LED on the operator information panel is lit when certain system errors occur. If the System Error LED on your server is lit, see the table in “Information panel system error LED” on page 83 to determine the cause of the error and the action you should take.
POST
When you turn on the server, it performs a series of tests to check the operation of server components and some of the options installed in the server. This series of tests is called the power-on self-test, or POST.
If POST finishes without detecting any problems, a single beep sounds and the first screen of your operating system or application program appears.
If POST detects a problem, more than one beep sounds and an error message appears on your screen. See “Beep symptoms” on page 81 and “POST error messages” for more information.
Notes:
1. If you have a power-on password, or administrator password set (with Remote Supervisor Adapter installed), you must type the password and press Enter, when prompted, before POST will continue.
2. A single problem might cause several error messages. When this occurs, work to correct the cause of the first error message. After you correct the cause of the first error message, the other error messages usually will not occur the next time you run the test.
POST error messages
The table, “POST error codes” on page 91, provides information about the POST error messages that can appear during startup.
Error logs
The POST error log contains the three most recent error codes and messages that the system generated during POST. The System Error Log contains error messages issued during POST and all system status messages from the IBM Remote Supervisor Adapter, if installed.
To view the contents of the error logs, start the Configuration/Setup Utility program; then, select Error Logs from the main menu.
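The retention behavior of the POST error log (only the three most recent entries are kept) can be pictured with a short sketch. This is an illustration only, not the server firmware; the entry names are placeholders and are not real POST error codes.

```python
from collections import deque

# The text states that the POST error log holds the three most recent entries.
POST_LOG_CAPACITY = 3


def make_post_log():
    """Return an empty log that drops its oldest entry once it is full."""
    return deque(maxlen=POST_LOG_CAPACITY)


post_log = make_post_log()
# Placeholder entries, not real POST error codes.
for entry in ["entry-1", "entry-2", "entry-3", "entry-4"]:
    post_log.append(entry)

print(list(post_log))  # ['entry-2', 'entry-3', 'entry-4']
```

Because older entries are discarded, record any codes you need before running further tests.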
Small computer system interface messages (some models)
If you receive a SCSI error message while using the SCSISelect Utility, use the following list to determine the possible cause of the error and what action to take.
Note: If your server does not have a hard disk drive, ignore any message that
indicates that the drive is not installed.
One or more of the following might be causing the problem.
• A failing SCSI device (adapter or drive)
• An improper SCSI configuration
• Duplicate SCSI IDs in the same SCSI chain
• An improperly installed SCSI terminator
• A defective SCSI terminator
• An improperly installed cable
• A defective cable
Verify that:
• The external SCSI devices are turned on. External devices must be turned on before the server.
• The cables for all external SCSI devices are connected correctly.
• The last device in each SCSI chain is terminated properly.
• The SCSI devices are configured correctly.
You will get these messages only when running the SCSISelect Utility. See “SCSI error codes” on page 96.
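One of the listed causes, duplicate SCSI IDs in the same chain, is easy to check on paper once you have written down the ID of each device. The following sketch shows the idea; the device labels and ID values are invented for the example, and the function does not query any hardware or the SCSISelect Utility.

```python
from collections import Counter


def find_duplicate_scsi_ids(chain):
    """Return the SCSI IDs that appear more than once in one chain.

    `chain` maps a device label to its SCSI ID. Both are written down by
    hand for this sketch; nothing is read from real hardware.
    """
    counts = Counter(chain.values())
    return sorted(scsi_id for scsi_id, n in counts.items() if n > 1)


# Invented example chain: two devices were both set to ID 1.
example_chain = {"adapter": 7, "drive_a": 0, "drive_b": 1, "tape": 1}
duplicates = find_duplicate_scsi_ids(example_chain)
print(duplicates)  # [1]
```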
Diagnostic programs and error messages
The server diagnostic programs are stored in upgradable read-only memory (ROM) on the system board. These programs are the primary method of testing the major components of your server.
Diagnostic error messages indicate that a problem exists; they are not intended to be used to identify a failing part. Troubleshooting and servicing of complex problems that are indicated by error messages should be performed by trained service personnel.
Sometimes the first error to occur causes additional errors. In this case, the server displays more than one error message. Always follow the suggested action instructions for the first error message that appears.
The following sections contain the error codes that might appear in the detailed test log and summary log when running the diagnostic programs.
The error code format is as follows:
fff-ttt-iii-date-cc-text message
where:
fff is the three-digit function code that indicates the function being tested when the error occurred. For example, function code 089 is for the microprocessor.
ttt is the three-digit failure code that indicates the exact test failure that was encountered.
iii is the three-digit device ID.
date is the date that the diagnostic test was run and the error recorded.
cc is the check digit that is used to verify the validity of the information.
text message is the diagnostic message that indicates the reason for the problem.
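If you copy error codes out of the test log for later review, the six fields can be separated mechanically. The following is a minimal sketch, assuming that the date field itself contains no hyphens (the date layout is not specified here). The sample line is made up for illustration; only function code 089 (microprocessor) comes from this manual.

```python
from typing import NamedTuple


class DiagnosticErrorCode(NamedTuple):
    function_code: str  # fff
    failure_code: str   # ttt
    device_id: str      # iii
    date: str           # kept as raw text; the date layout is not specified
    check_digit: str    # cc
    text_message: str


def parse_error_code(line: str) -> DiagnosticErrorCode:
    """Split 'fff-ttt-iii-date-cc-text message' into its six fields.

    Assumption: the date field contains no hyphens, so the first five
    hyphens are separators and everything after them is the message.
    """
    parts = line.split("-", 5)
    if len(parts) != 6:
        raise ValueError(f"expected 6 hyphen-separated fields, got {len(parts)}")
    fff, ttt, iii, date, cc, text = (p.strip() for p in parts)
    for name, value in (("fff", fff), ("ttt", ttt), ("iii", iii)):
        if len(value) != 3 or not value.isdigit():
            raise ValueError(f"{name} must be three digits, got {value!r}")
    return DiagnosticErrorCode(fff, ttt, iii, date, cc, text)


# Made-up sample line for illustration only.
sample = "089-001-000-20010601-07-Microprocessor: Failed (sample-only string)"
parsed = parse_error_code(sample)
print(parsed.function_code, "|", parsed.text_message)
```

Limiting the split to five separators keeps any hyphens inside the text message intact.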
Text messages
The diagnostic text message format is as follows:
Function Name: Result (test specific string)
where:
Function Name
is the name of the function being tested when the error occurred. This corresponds to the function code (fff) given in the previous list.
Result
can be one of the following:
  Passed
  This result occurs when the diagnostic test completes without any errors.
  Failed
  This result occurs when the diagnostic test discovers an error.
  User Aborted
  This result occurs when you stop the diagnostic test before it is complete.
  Not Applicable
  This result occurs when you specify a diagnostic test for a device that is not present.
  Aborted
  This result occurs when the test could not proceed because of the system configuration.
  Warning
  This result occurs when a possible problem is reported during the diagnostic test, such as when a device that is to be tested is not installed.
Test Specific String
This is additional information that you can use to analyze the problem.
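The text message layout described above can be split into its three parts with a small pattern. This sketch assumes the message is written exactly as “Function Name: Result (test specific string)” with the parenthesized part optional; the sample messages are invented.

```python
import re

# Result values listed in this section.
RESULTS = ("Passed", "Failed", "User Aborted", "Not Applicable", "Aborted", "Warning")

_PATTERN = re.compile(
    r"^(?P<function>.+?):\s*"
    r"(?P<result>" + "|".join(re.escape(r) for r in RESULTS) + r")"
    r"(?:\s*\((?P<detail>.*)\))?\s*$"
)


def parse_text_message(message):
    """Return (function_name, result, detail); detail is None when absent."""
    match = _PATTERN.match(message)
    if match is None:
        raise ValueError(f"unrecognized diagnostic message: {message!r}")
    return (
        match.group("function").strip(),
        match.group("result"),
        match.group("detail"),
    )


# Invented sample messages.
print(parse_text_message("Microprocessor: Failed (sample detail)"))
print(parse_text_message("Keyboard: User Aborted"))
```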
Starting the diagnostic programs
You can press F1 while running the diagnostic programs to obtain Help information. You also can press F1 from within a help screen to obtain online documentation from which you can select different categories. To exit Help and return to where you left off, press Esc.
To start the diagnostic programs:
1. Turn on the server and watch the screen.
Note: To run the diagnostic programs, you must start the server with the highest level password that is set. That is, if an administrator password is set, you must enter the administrator password, not the power-on password, to run the diagnostic programs.
2. When the message F2 for Diagnostics appears, press F2.
3. Type in the appropriate password; then, press Enter.
4. Select either Extended or Basic from the top of the screen.
5. When the Diagnostic Programs screen appears, select the test you want to run from the list that appears; then, follow the instructions on the screen.
Notes:
a. If the server stops during testing and you cannot continue, restart the server and try running the diagnostic programs again.
b. The keyboard and mouse (pointing device) tests assume that a keyboard and mouse are attached to the server.
c. If you run the diagnostic programs with either no mouse or a USB mouse attached to your server, you will not be able to navigate between test categories using the Next Cat and Prev Cat buttons. All other functions provided by mouse-selectable buttons are also available using the function keys.
d. You can test the USB keyboard by using the regular keyboard test. Also, you can run the USB Interface test only if there are no USB devices attached.
e. You can view server configuration information (such as system configuration, memory contents, interrupt request (IRQ) use, direct memory access (DMA) use, device drivers, and so on) by selecting Hardware Info from the top of the screen.
When the tests have completed, you can view the Test Log by selecting Utility from the top of the screen.
If the hardware checks out OK but the problem persists during normal server operations, a software error might be the cause. If you suspect a software problem, refer to the information that comes with the software package.
Viewing the test log
The test log will not contain any information until after the diagnostic program has run.
Note: If you already are running the diagnostic programs, begin with step 3.
To view the test log:
1. Turn on the server and watch the screen. If the server is on, shut down your operating system and restart the server.
2. When the message F2 for Diagnostics appears, press F2. If a power-on password or administrator password is set, the server prompts you for it. Type in the appropriate password; then, press Enter.
3. When the Diagnostic Programs screen appears, select Utility from the top of the screen.
4. Select View Test Log from the list that appears; then, follow the instructions on the screen.
The system maintains the test-log data while the server is powered on. When you turn off the power to the server, the test log is cleared.
Diagnostic error message tables
For descriptions of the error messages that might appear when you run the diagnostic programs, see “Diagnostic error codes” on page 85.
Attention: If diagnostic error messages appear that are not listed in the tables, make sure that your server has the latest levels of BIOS, Integrated System Management Processor, ServeRAID, and diagnostics microcode installed.
Recovering BIOS code
If your BIOS code has become damaged, such as from a power failure during a flash update, you can recover your BIOS using the recovery boot block and a BIOS flash diskette.
Note: You can obtain a BIOS flash diskette from one of the following sources:
• Use the ServerGuide program to make a BIOS flash diskette.
• Download a BIOS flash diskette from the World Wide Web. Go to http://www.ibm.com/pc/support/, click IBM Server Support, and make the selections for your server.
• Contact your IBM service representative.
The flash memory of your server contains a protected area that cannot be overwritten. The recovery boot block is a section of code in this protected area that enables the server to start up and to read a flash diskette. The flash utility recovers the system BIOS from the BIOS recovery files on the diskette.
To recover the BIOS:
1. Turn off the server and peripheral devices and disconnect all external cables and power cords; then, remove the cover.
2. Locate the boot-block jumper block (J16) on the system board.
3. Place a jumper on pins 2 and 3 to enable the BIOS backup page.
4. Insert the BIOS flash diskette into the diskette drive.
5. Restart the server.
6. The system completes the power-on self-test (POST). Select 1 -- Update POST/BIOS from the menu that contains various flash (update) options.
7. When you are asked if you would like to move the current POST/BIOS image to the backup ROM location, type N.
Attention: Typing Y will copy the corrupted BIOS into the secondary page.
8. When you are asked if you would like to save the current code to a diskette, select N.
9. You will be asked to choose which language you wish to use. Select your language (0-7) and press Enter to accept your choice. You will be prompted to remove the diskette and press Enter to restart the system. Remove the flash diskette from the diskette drive.
10. Turn off the server.
11. Remove the jumper on the boot-block jumper block or move it to pins 1 and 2 to return to normal startup mode.
12. Restart the server. The system should start up normally.
Troubleshooting the Ethernet controller
This section provides troubleshooting information for problems that might occur with the 10/100 Mbps Ethernet controller.
Network connection problems
If the Ethernet controller cannot connect to the network, check the following:
• Make sure that the cable is installed correctly.
  The network cable must be securely attached at all connections. If the cable is attached but the problem persists, try a different cable.
  If you set the Ethernet controller to operate at 100 Mbps, you must use Category 5 cabling.
  If you directly connect two workstations (without a hub), or if you are not using a hub with X ports, use a crossover cable.
  Note: To determine whether a hub has an X port, check the port label. If the label contains an X, the hub has an X port.
• Determine if the hub supports auto-negotiation. If not, try configuring the integrated Ethernet controller manually to match the speed and duplex mode of the hub.
• Check the Ethernet controller lights on the operator information panel. These lights indicate whether a problem exists with the connector, cable, or hub.
  – The Ethernet Link Status light illuminates when the Ethernet controller receives a LINK pulse from the hub. If the light is off, there might be a bad connector or cable, or a problem with the hub.
  – The Ethernet Transmit/Receive Activity light illuminates when the Ethernet controller sends or receives data over the Ethernet network. If the Ethernet Transmit/Receive Activity light is off, make sure that the hub and network are operating and that the correct device drivers are loaded.
  – The Ethernet Speed 100 Mbps light illuminates when the Ethernet controller LAN speed is 100 Mbps.
• Make sure that you are using the correct device drivers, supplied with your server.
• Check for operating system-specific causes for the problem.
• Make sure that the device drivers on the client and server are using the same protocol.
• Test the Ethernet controller.
  How you test the Ethernet controller depends on which operating system you are using (see the Ethernet controller device driver README file).
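The cabling rules in the preceding list reduce to two conditions, which the following sketch writes out. The function and parameter names are hypothetical and serve only as a memory aid; they are not part of any IBM tool.

```python
def cable_requirements(speed_mbps, uses_hub, hub_has_x_port=False):
    """Return cabling notes for one connection, following the rules above."""
    notes = []
    # Operating at 100 Mbps calls for Category 5 cabling.
    if speed_mbps == 100:
        notes.append("Category 5 cabling")
    # A crossover cable is called for on a direct connection (no hub)
    # or when the hub does not have X ports.
    if not uses_hub or not hub_has_x_port:
        notes.append("crossover cable")
    return notes


print(cable_requirements(100, uses_hub=False))
print(cable_requirements(10, uses_hub=True, hub_has_x_port=True))
```

The first call returns both notes; the second returns an empty list because neither condition applies.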
Ethernet controller troubleshooting chart
You can use the following troubleshooting chart to find solutions to 10/100 Mbps Ethernet controller problems that have definite symptoms.
Table 1. Ethernet troubleshooting chart

Ethernet controller problem: The server stops running when loading device drivers.
Suggested action: The PCI BIOS interrupt settings are incorrect. Check the following:
• Determine if the interrupt (IRQ) setting assigned to the Ethernet controller is also assigned to another device in the Configuration/Setup Utility program. Although interrupt sharing is allowed for PCI devices, some devices do not function well when they share an interrupt with a dissimilar PCI device. Try changing the IRQ assigned to the Ethernet controller or the other device. For example, for NetWare Versions 3 and 4 it is recommended that disk controllers not share interrupts with LAN controllers.
• Make sure that you are using the most recent device driver available from the World Wide Web.
• Run the network diagnostic program.
If the problem remains, go to “Starting the diagnostic programs” on page 14 to run the diagnostic programs.

Ethernet controller problem: Ethernet Link Status light does not light.
Suggested action: Check the following:
• Make sure that the hub is turned on.
• Check all connections at the Ethernet controller and the hub.
• Check the cable. A crossover cable is required unless the hub has an X designation.
• Use another port on the hub.
• If the hub does not support auto-negotiation, manually configure the Ethernet controller to match the hub.
• If you manually configured the duplex mode, make sure that you also manually configure the speed.
• Run diagnostics on the LEDs.
If the problem remains, go to “Starting the diagnostic programs” on page 14 to run the diagnostic programs.

Ethernet controller problem: The Ethernet Transmit/Receive Activity light does not light.
Suggested action: Check the following:
Note: The Ethernet Transmit/Receive Activity LED illuminates only when data is sent to or by this Ethernet controller.
• Make sure that you have loaded the network device drivers.
• The network might be idle. Try sending data from this workstation.
• Run diagnostics on the LEDs.
• The function of this LED can be changed by device driver load parameters. If necessary, remove any LED parameter settings when you load the device drivers.

Ethernet controller problem: Data is incorrect or sporadic.
Suggested action: Check the following:
• Make sure that you are using Category 5 cabling when operating the server at 100 Mbps.
• Make sure that the cables do not run close to noise-inducing sources like fluorescent lights.

Ethernet controller problem: The Ethernet controller stopped working when another adapter was added to the server.
Suggested action: Check the following:
• Make sure that the cable is connected to the Ethernet controller.
• Make sure that your PCI system BIOS is current.
• Reseat the adapter.
• Determine if the interrupt (IRQ) setting assigned to the Ethernet adapter is also assigned to another device in the Configuration/Setup Utility program. Although interrupt sharing is allowed for PCI devices, some devices do not function well when they share an interrupt with a dissimilar PCI device. Try changing the IRQ assigned to the Ethernet adapter or the other device.
If the problem remains, go to “Starting the diagnostic programs” on page 14 to run the diagnostic programs.

Ethernet controller problem: The Ethernet controller stopped working without apparent cause.
Suggested action: Check the following:
• Run diagnostics for the Ethernet controller.
• Try a different connector on the hub.
• Reinstall the device drivers. Refer to your operating-system documentation and to the ServerGuide information.
If the problem remains, go to “Starting the diagnostic programs” on page 14 to run the diagnostic programs.
Power checkout
Power problems can be difficult to troubleshoot. For instance, a short circuit can exist anywhere on any of the power distribution busses. Usually a short circuit will cause the power subsystem to shut down because of an overcurrent condition.
A general procedure for troubleshooting power problems is as follows:
1. Power off the system and disconnect the AC cord(s).
2. Check for loose cables in the power subsystem. Also check for short circuits, for instance, a loose screw causing a short circuit on a circuit board.
3. Remove adapters and disconnect the cables and power connectors to all internal and external devices until the system is at the minimum configuration required for power on (see “Minimum operating requirements” on page 101).
4. Reconnect the AC cord and power on the system. If the system powers up successfully, replace adapters and devices one at a time until the problem is isolated. If the system does not power up from the minimum configuration, replace FRUs of the minimum configuration one at a time until the problem is isolated.
To use this method, it is important to know the minimum configuration required for a system to power up (see page 101). For specific problems, see “Power error messages” on page 98.
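The isolation procedure above is an add-one-at-a-time search, sketched below. The `powers_up` check stands in for a technician observing whether the system powers on, and all part names are placeholders; see “Minimum operating requirements” for the actual minimum configuration.

```python
def isolate_power_fault(minimum_config, optional_devices, powers_up):
    """Reinstall devices one at a time and return the first suspect.

    `powers_up(installed)` is a caller-supplied check (hypothetical here;
    in practice a technician observes the result). Returns a device name,
    the string "minimum configuration" if the base system itself fails,
    or None if everything powers up.
    """
    installed = list(minimum_config)
    if not powers_up(installed):
        # Per the procedure: replace FRUs of the minimum configuration
        # one at a time until the problem is isolated.
        return "minimum configuration"
    for device in optional_devices:
        installed.append(device)
        if not powers_up(installed):
            return device
    return None


# Simulated check: pretend the placeholder part "adapter_2" causes the fault.
def fake_powers_up(installed):
    return "adapter_2" not in installed


suspect = isolate_power_fault(
    ["base_part_a", "base_part_b"],          # placeholder names
    ["adapter_1", "adapter_2", "drive_1"],   # placeholder names
    fake_powers_up,
)
print(suspect)  # adapter_2
```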
Replacing the battery
IBM has designed this product with your safety in mind. The lithium battery must be handled correctly to avoid possible danger. If you replace the battery, you must adhere to the following instructions.
• Statement 2
CAUTION:
When replacing the lithium battery, use only IBM Part Number 33F8354 or an equivalent type battery recommended by the manufacturer. If your system has a module containing a lithium battery, replace it only with the same module type made by the same manufacturer. The battery contains lithium and can explode if not properly used, handled, or disposed of.
Do not:
• Throw or immerse into water.
• Heat to more than 100°C (212°F)
• Repair or disassemble
Dispose of the battery as required by local ordinances or regulations.
Note: In the U.S., call 1-800-IBM-4333 for information about battery disposal.
If you replace the original lithium battery with a heavy-metal battery or a battery with heavy-metal components, be aware of the following environmental consideration. Batteries and accumulators that contain heavy metals must not be disposed of with normal domestic waste. They will be taken back free of charge by the manufacturer, distributor, or representative, to be recycled or disposed of in a proper manner.
Note: Before you begin, be sure to read “Before you begin” on page 42. Follow any special handling and installation instructions supplied with the replacement battery.
Note: After you replace the battery, you must reconfigure your server and reset the system date and time.
To replace the battery:
1. Review the information in “Before you begin” on page 42 and any special handling and installation instructions supplied with the replacement battery.
2. Turn off the server and peripheral devices and disconnect all external cables and power cords; then, remove the server cover.
3. Remove the battery:
a. Use one finger to lift the battery clip over the battery.
b. Use one finger to slightly slide the battery from its socket. The spring mechanism behind the battery will push the battery out toward you as you slide it from the socket.
c. Use your thumb and index finger to pull the battery from under the battery clip.
d. Ensure that the battery clip is touching the base of the battery socket by pressing gently on the clip.
4. Insert the new battery:
a. Tilt the battery so that you can insert it into the socket, under the battery clip.
b. As you slide it under the battery clip, press the battery down into the socket.
5. Reinstall the server cover and connect the cables.
6. Turn the server on.
7. Start the Configuration/Setup Utility program and set configuration parameters.
• Set the system date and time.
• Set the power-on password.
• Reconfigure your server.
Temperature checkout
Proper cooling of the system is important for proper operation and system reliability. For a typical xSeries 342 server, you should make sure:
• Each of the drive bays has either a drive or a filler panel installed
• Each of the power supply bays has either a power supply or a filler panel installed
• The top cover is in place during normal operation
• There is at least 50 mm (2 inches) of ventilated space at the sides of the server and 100 mm (4 inches) at the rear of the server
• The top cover is removed for no longer than 30 minutes while the server is operating
• The processor housing cover covering the processor and memory area is removed for no longer than ten minutes while the server is operating
• A removed hot-swap drive is replaced within two minutes of removal
• Cables for optional adapters are routed according to the instructions provided with the adapters (ensure that cables are not restricting air flow)
• The fans are operating correctly and the air flow is good
• A failed fan is replaced within 48 hours
In addition, ensure that the environmental specifications for the system are met. See “Features and specifications” on page 3.
For more information on specific temperature error messages, see “Temperature error messages” on page 97.
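The numeric limits in the cooling checklist in this section can be collected into one table for a quick check during service. The sketch below uses only the figures stated there; the key names and the way observations are recorded are assumptions made for the example.

```python
# Time limits quoted in the cooling checklist, in minutes.
LIMITS_MINUTES = {
    "top_cover_removed": 30,
    "processor_housing_cover_removed": 10,
    "hot_swap_drive_removed": 2,
    "failed_fan_unreplaced": 48 * 60,  # 48 hours
}

# Minimum ventilated clearance around the server, in millimeters.
CLEARANCE_MM = {"sides": 50, "rear": 100}


def cooling_violations(elapsed_minutes, clearance_mm):
    """Return a sorted list of the checklist items that are out of limits.

    `elapsed_minutes` and `clearance_mm` are hand-recorded observations
    keyed like the tables above; missing keys are treated as within limits.
    """
    problems = [
        key for key, limit in LIMITS_MINUTES.items()
        if elapsed_minutes.get(key, 0) > limit
    ]
    problems += [
        f"clearance_{side}" for side, minimum in CLEARANCE_MM.items()
        if clearance_mm.get(side, minimum) < minimum
    ]
    return sorted(problems)


# Invented observations: cover off for 45 minutes, 80 mm free at the rear.
found = cooling_violations(
    {"top_cover_removed": 45, "hot_swap_drive_removed": 1},
    {"sides": 60, "rear": 80},
)
print(found)  # ['clearance_rear', 'top_cover_removed']
```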