Lenovo 3822, 3821, 3719, 3823 User Manual


Hardware Maintenance Manual

ThinkServer TD200x Machine Types: 3719, 3821, 3822, and 3823


Note: Before using this information and the product it supports, read the general information in Appendix B, “Notices,” on page 279 and the Warranty and Support Information document on the Lenovo® ThinkServer Documentation DVD.

First Edition (July 2009)

LENOVO products, data, computer software, and services have been developed exclusively at private expense and are sold to governmental entities as commercial items as defined by 48 C.F.R. 2.101 with limited and restricted rights to use, reproduction and disclosure.

LIMITED AND RESTRICTED RIGHTS NOTICE: If products, data, computer software, or services are delivered pursuant to a General Services Administration "GSA" contract, use, reproduction, or disclosure is subject to restrictions set forth in Contract No. GS-35F-05925.

Contents

Chapter 1. About this manual ...................1

Important Safety Information ....................1

Important information about replacing RoHS compliant FRUs ........2

Turkish statement of compliance ...................3

Chapter 2. Safety information ...................5

Guidelines for trained service technicians ...............6

Inspecting for unsafe conditions ..................6

Guidelines for servicing electrical equipment .............6

Safety statements ........................8

Chapter 3. General information ..................15

Features and technologies ....................15

Specifications .........................17

Software ...........................18

EasyStartup .........................19

EasyManage.........................19

Chapter 4. General Checkout ...................21

Checkout procedure .......................21

About the checkout procedure ..................21

Performing the checkout procedure ................22

Diagnosing a problem ......................22

Undocumented problems .....................25

Chapter 5. Diagnostics .....................27

Diagnostic tools ........................27

Event logs ..........................27

Viewing event logs through the Setup utility .............28

Viewing event logs without restarting the server ............28

POST error codes........................30

System-event log ........................38

Integrated management module error messages ............38

Troubleshooting tables ......................64

DVD drive problems ......................64

General problems .......................65

Hard disk drive problems ....................65

Intermittent problems......................66

Keyboard, mouse, or pointing-device problems ............67

Memory problems .......................68

Microprocessor problems ....................69

Monitor problems .......................70

Optional-device problems ....................72

Power problems .......................73

Serial port problems ......................74

Software problems ......................75

Universal Serial Bus (USB) port problems ..............75

EasyLED diagnostics ......................76

Remind button ........................87

Power-supply LEDs ......................88

Diagnostic programs, messages, and error codes ............90

Running the diagnostic programs .................90

Diagnostic text messages ....................90


Viewing the test log ......................91

Diagnostic messages .....................91

Recovering from a Lenovo ThinkServer Server Firmware update failure ...122

Solving power problems .....................123

Solving Ethernet controller problems ................123

Solving undetermined problems ..................124

Problem determination tips ....................125

Chapter 6. Locating Server Controls and connectors .........127

Front view ..........................127

Operator information panel ...................127

EasyLED diagnostics panel ...................129

Rear view ..........................130

System-board internal connectors .................131

System-board external connectors .................133

System-board switches and jumpers ................133

System-board LEDs ......................135

SAS backplane connectors ....................138

Power-supply LEDs.......................139

Internal LEDs, connectors, and jumpers ...............140

System-board internal connectors ................141

System-board switches and jumpers ...............144

System-board LEDs .....................145

System-board external connectors ................146

2.5-inch hard disk drive backplane connectors ............147

Server power features ......................147

Turning on the server .....................147

Turning off the server .....................148

Chapter 7. Installing optional devices and replacing customer replaceable units ...........................149

Server components .......................149

Opening the bezel .......................150

Closing the bezel .......................151

Removing the bezel ......................152

Installing the bezel .......................154

Opening the bezel media door...................155

Closing the bezel media door ...................156

Removing the left-side cover ...................157

Installing the left-side cover ....................158

Opening the power-supply cage ..................158

Closing the power-supply cage ..................160

Turning the stabilizing feet ....................162

Internal cable routing and connectors ................163

Removing the air baffle .....................169

Installing the air baffle ......................170

Removing the fan-cage assembly .................171

Installing the fan-cage assembly ..................172

Removing the battery ......................172

Installing the battery ......................173

Removing a hot-swap power supply.................174

Installing a hot-swap power supply .................175

Installing redundant power supply and fans ..............177

Removing a voltage regulator module ................179

Installing a voltage regulator module ................180

Removing the front adapter-retention bracket .............180


Installing the front adapter-retention bracket ..............181

Removing the rear adapter retention bracket .............182

Installing the rear adapter retention bracket ..............183

Removing an adapter ......................184

Installing an adapter ......................184

Removing the DVD drive .....................187

Installing a DVD (optical) drive...................188

Removing an optional tape drive ..................189

Installing a USB or SATA tape drive .................190


Removing the USB cable and EasyLED panel .............193

Installing the USB cable and EasyLED panel .............194

Removing a 2.5-inch hot-swap hard disk drive .............195

Installing a 2.5-inch hot-swap hard disk drive .............196

Removing a 2.5-inch disk drive backplane ..............198

Installing a 2.5-inch disk drive backplane ...............200

Removing the 2.5-inch disk drive cage................202

Installing the 2.5-inch disk drive cage ................204

Removing the operator information panel assembly ...........205

Installing the operator information panel assembly ...........206

Removing an extender card....................207

Installing an extender card ....................209

Removing a memory module ...................210

Installing a memory module ....................211

Independent channel mode ...................212

Memory mirroring mode ....................212

Removing a hot-swap fan ....................217

Installing a hot-swap fan .....................218

Removing a microprocessor and heat sink ..............218

Installing a microprocessor and heat sink ...............220

Thermal grease .......................226

Removing a heat-sink retention module ...............227

Installing a heat-sink retention module ................228

Removing a microprocessor retention module .............229

Installing a microprocessor retention module .............230

Removing the system board ...................231

Installing the system board ....................232

Completing the installation ....................233

Connecting the cables.....................234

Updating the server configuration.................234

Chapter 8. Parts Listing, TD200x Machine Types 3719, 3821, 3822, and 3823 ...........................237

Power cords .........................247

Chapter 9. Configuring the server.................251

Using the Setup Utility ......................252

Starting the Setup Utility ....................252

Setup Utility menu choices ...................252

Passwords .........................255

Using the Boot Selection Menu program ...............258

RAID controllers ........................258

Using the LSI Configuration Utility program .............259

Using the WebBIOS utility ...................261

Using the ThinkServer EasyStartup DVD...............263

Before you use the ThinkServer EasyStartup DVD..........263

Configuring RAID ......................263


EasyStartup overview .....................264

Installing your operating system without using EasyStartup .......266

Enabling the Broadcom Gigabit Ethernet Utility program .........266

Configuring the Gigabit Ethernet controller ..............266

Updating the firmware ......................267

Using the EasyUpdate Firmware Updater tool ............267

Starting the backup server firmware.................268

Using the Integrated Management Module ..............268

Using the remote presence capability and blue-screen capture .......269

Obtaining the IP address for the Web interface access.........269

Logging on to the Web interface .................270

Advanced Settings Utility program .................270

Installing ThinkServer EasyManage software .............271

Installation requirements ....................271

Installation order .......................271

Installing Windows 2003 components on the Core Server ........272

Installing Windows 2008 32-bit components .............272

Uninstalling the LANDesk Software Agent .............273

Appendix A. Getting help and technical assistance ..........275

Before you call ........................275

Getting help and information from the World Wide Web .........275

Calling for service .......................275

Using other services ......................276

Purchasing additional services...................277

Lenovo product service .....................277

Appendix B. Notices ......................279

Trademarks..........................280

Important notes ........................280

Product recycling and disposal ..................281

Compliance with Republic of Turkey Directive on the Restriction of Hazardous Substances .........................282

Recycling statements for Japan ..................282

Battery return program .....................283

German Ordinance for Work gloss statement .............284

Electronic emission notices ....................284

Federal Communications Commission (FCC) statement ........284

Industry Canada Class A emission compliance statement ........285

Avis de conformité à la réglementation d’Industrie Canada .......285

Australia and New Zealand Class A statement ............285

United Kingdom telecommunications safety requirement ........285

European Union EMC Directive conformance statement ........285

Germany Class A compliance statement ..............285

Japan Voluntary Control Council for Interference (VCCI) statement ....287

Taiwan Class A warning statement ................287

People’s Republic of China Class A warning statement.........287

Korea Class A warning statement ................287

Index ............................289


Chapter 1. About this manual

This Hardware Maintenance Manual contains information to help you solve problems that might occur in your server. It describes the diagnostic tools that come with the server, error codes and suggested actions, and instructions for replacing failing components.

Replaceable components are of three types:

- Self-service customer replaceable unit (CRU): Replacement of self-service CRUs is your responsibility. If Lenovo installs a self-service CRU at your request, you will be charged for the installation.
- Optional-service customer replaceable unit: You may install an optional-service CRU yourself or request Lenovo to install it, at no additional charge, under the type of warranty service that is designated for the server.
- Field replaceable unit (FRU): FRUs must be installed only by trained service technicians.

The most recent version of this document is available at http://www.lenovo.com/support.

Before servicing a Lenovo product, be sure to read the Safety Information. See Chapter 2, “Safety information,” on page 5.

For information about the terms of the warranty and getting service and assistance, see the Warranty and Support Information document.

Important Safety Information

Be sure to read all caution and danger statements in this book before performing any of the instructions.

Veuillez lire toutes les consignes de type DANGER et ATTENTION du présent document avant d’exécuter les instructions.

Lesen Sie unbedingt alle Hinweise vom Typ ″ACHTUNG″ oder ″VORSICHT″ in dieser Dokumentation, bevor Sie irgendwelche Vorgänge durchführen

Leggere le istruzioni introdotte da ATTENZIONE e PERICOLO presenti nel manuale prima di eseguire una qualsiasi delle istruzioni

Certifique-se de ler todas as instruções de cuidado e perigo neste manual antes de executar qualquer uma das instruções

Es importante que lea todas las declaraciones de precaución y de peligro de este manual antes de seguir las instrucciones.


Important information about replacing RoHS compliant FRUs

RoHS, The Restriction of Hazardous Substances in Electrical and Electronic Equipment Directive (2002/95/EC) is a European Union legal requirement affecting the global electronics industry. RoHS requirements must be implemented on Lenovo products placed on the market and sold in the European Union after June 2006. Products on the market before June 2006 are not required to have RoHS compliant parts. If the parts are not compliant originally, replacement parts can also be noncompliant, but in all cases, if the parts are compliant, the replacement parts must also be compliant.

Note: RoHS and non-RoHS FRU part numbers with the same fit and function are identified with unique FRU part numbers.

Lenovo plans to transition to RoHS compliance well before the implementation date and expects its suppliers to be ready to support Lenovo’s requirements and schedule in the EU. Products sold in 2005 will contain some RoHS compliant FRUs. The following statement pertains to these products and any product Lenovo produces containing RoHS compliant parts.

RoHS compliant ThinkCentre parts have unique FRU part numbers. Before or after June, 2006, failed RoHS compliant parts must always be replaced using RoHS compliant FRUs, so only the FRUs identified as compliant in the system HMM or direct substitutions for those FRUs can be used.

Products marketed before June 2006 | | Products marketed after June 2006 |
Current or original part | Replacement FRU | Current or original part | Replacement FRU
Non-RoHS | Can be Non-RoHS | Must be RoHS | Must be RoHS
Non-RoHS | Can be RoHS | |
Non-RoHS | Can sub to RoHS | |
RoHS | Must be RoHS | |

Note: A direct substitution is a part with a different FRU part number that is automatically shipped by the distribution center at the time of order.
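The replacement rules in the table above can be read as a small lookup. The following Python sketch is illustrative only: the names `Compliance` and `allowed_replacements` are assumptions made for this example and do not correspond to any Lenovo tool.

```python
from enum import Enum


class Compliance(Enum):
    ROHS = "RoHS"
    NON_ROHS = "Non-RoHS"


def allowed_replacements(marketed_after_june_2006: bool, current: Compliance) -> set:
    """Return the compliance types a replacement FRU may have (hypothetical helper)."""
    if marketed_after_june_2006:
        # Products marketed after June 2006: the replacement must be RoHS.
        return {Compliance.ROHS}
    if current is Compliance.ROHS:
        # A compliant part is always replaced with a compliant part.
        return {Compliance.ROHS}
    # Non-RoHS part in a product marketed before June 2006:
    # the replacement can be Non-RoHS or RoHS (including a direct substitution).
    return {Compliance.NON_ROHS, Compliance.ROHS}
```

The sketch encodes only the table; the system HMM parts listing remains the authority on which FRU part numbers are identified as compliant.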


Turkish statement of compliance

The Lenovo product meets the requirements of the Republic of Turkey Directive on the Restriction of the Use of Certain Hazardous Substances in Electrical and Electronic Equipment (EEE).

Türkiye EEE Yönetmeliğine Uygunluk Beyanı

Bu Lenovo ürünü, “Elektrik ve Elektronik Eşyalarda Bazı Zararlı Maddelerin Kullanımının Sınırlandırılmasına Dair Yönetmelik (EEE)” direktiflerine uygundur.

T.C. Çevre ve Orman Bakanlığı'nın EEE Yönetmeliğine Uygundur.


Chapter 2. Safety information

Before installing this product, read the Safety Information.

Antes de instalar este produto, leia as Informações de Segurança.

Před instalací tohoto produktu si přečtěte příručku bezpečnostních instrukcí.

Læs sikkerhedsforskrifterne, før du installerer dette produkt.

Lees voordat u dit product installeert eerst de veiligheidsvoorschriften.

Ennen kuin asennat tämän tuotteen, lue turvaohjeet kohdasta Safety Information.

Avant d’installer ce produit, lisez les consignes de sécurité.

Vor der Installation dieses Produkts die Sicherheitshinweise lesen.

Prima di installare questo prodotto, leggere le Informazioni sulla Sicurezza.

Les sikkerhetsinformasjonen (Safety Information) før du installerer dette produktet.

Antes de instalar este produto, leia as Informações sobre Segurança.

Antes de instalar este producto, lea la información de seguridad.

Läs säkerhetsinformationen innan du installerar den här produkten.


Guidelines for trained service technicians

This section contains information for trained service technicians.

Inspecting for unsafe conditions

Use the information in this section to help you identify potential unsafe conditions in a Lenovo product that you are working on. Each Lenovo product, as it was designed and manufactured, has required safety items to protect users and service technicians from injury. The information in this section addresses only those items. Use good judgment to identify potential unsafe conditions that might be caused by non-Lenovo alterations or attachment of non-Lenovo features or options that are not addressed in this section. If you identify an unsafe condition, you must determine how serious the hazard is and whether you must correct the problem before you work on the product.

Consider the following conditions and the safety hazards that they present:
- Electrical hazards, especially primary power. Primary voltage on the frame can cause serious or fatal electrical shock.
- Explosive hazards, such as a damaged CRT face or a bulging capacitor.
- Mechanical hazards, such as loose or missing hardware.

To inspect the product for potential unsafe conditions, complete the following steps:

1. Make sure that the power is off and the power cord is disconnected.

2. Make sure that the exterior cover is not damaged, loose, or broken, and observe any sharp edges.

3. Check the power cord:
   - Make sure that the third-wire ground connector is in good condition. Use a meter to measure third-wire ground continuity for 0.1 ohm or less between the external ground pin and the frame ground.
   - Make sure that the power cord is the correct type.
   - Make sure that the insulation is not frayed or worn.

4. Remove the cover.

5. Check for any obvious non-Lenovo alterations. Use good judgment as to the safety of any non-Lenovo alterations.

6. Check inside the server for any obvious unsafe conditions, such as metal filings, contamination, water or other liquid, or signs of fire or smoke damage.

7. Check for worn, frayed, or pinched cables.

8. Make sure that the power-supply cover fasteners (screws or rivets) have not been removed or tampered with.
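The continuity limit in step 3 can be recorded as a simple pass/fail check. This is a minimal Python sketch; the constant and function names are assumptions for illustration, and only the 0.1 ohm limit comes from the procedure above.

```python
MAX_GROUND_RESISTANCE_OHMS = 0.1  # limit stated in step 3 of the inspection


def ground_continuity_ok(measured_ohms: float) -> bool:
    """Return True if the meter reading meets the third-wire ground limit."""
    if measured_ohms < 0:
        raise ValueError("resistance reading cannot be negative")
    return measured_ohms <= MAX_GROUND_RESISTANCE_OHMS
```

A reading between the external ground pin and the frame ground that fails this check points to a ground connector that is not in good condition.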

Guidelines for servicing electrical equipment

Observe the following guidelines when servicing electrical equipment:
- Check the area for electrical hazards such as moist floors, nongrounded power extension cords, power surges, and missing safety grounds.
- Use only approved tools and test equipment. Some hand tools have handles that are covered with a soft material that does not provide insulation from live electrical currents.
- Regularly inspect and maintain your electrical hand tools for safe operational condition. Do not use worn or broken tools or testers.


- Do not touch the reflective surface of a dental mirror to a live electrical circuit. The surface is conductive and can cause personal injury or equipment damage if it touches a live electrical circuit.
- Some rubber floor mats contain small conductive fibers to decrease electrostatic discharge. Do not use this type of mat to protect yourself from electrical shock.
- Do not work alone under hazardous conditions or near equipment that has hazardous voltages.
- Locate the emergency power-off (EPO) switch, disconnecting switch, or electrical outlet so that you can turn off the power quickly in the event of an electrical accident.
- Disconnect all power before you perform a mechanical inspection, work near power supplies, or remove or install main units.
- Before you work on the equipment, disconnect the power cord. If you cannot disconnect the power cord, have the customer power off the wall box that supplies power to the equipment and lock the wall box in the off position.
- Never assume that power has been disconnected from a circuit. Check it to make sure that it has been disconnected.
- If you have to work on equipment that has exposed electrical circuits, observe the following precautions:
  - Make sure that another person who is familiar with the power-off controls is near you and is available to turn off the power if necessary.
  - When you are working with powered-on electrical equipment, use only one hand. Keep the other hand in your pocket or behind your back to avoid creating a complete circuit that could cause an electrical shock.
  - When you use a tester, set the controls correctly and use the approved probe leads and accessories for that tester.
  - Stand on a suitable rubber mat to insulate you from grounds such as metal floor strips and equipment frames.
- Use extreme care when you measure high voltages.
- To ensure proper grounding of components such as power supplies, pumps, blowers, fans, and motor generators, do not service these components outside of their normal operating locations.
- If an electrical accident occurs, use caution, turn off the power, and send another person to get medical aid.


Safety statements

Important:

Each caution and danger statement in this document is labeled with a number. This number is used to cross reference an English-language caution or danger statement with translated versions of the caution or danger statement in the Safety Information document.

For example, if a caution statement is labeled "Statement 1," translations for that caution statement are in the Safety Information document under "Statement 1."

Be sure to read all caution and danger statements in this document before you perform the procedures. Read any additional safety information that comes with the server or optional device before you install the device.

Attention: Use No. 26 AWG or larger UL-listed or CSA certified telecommunication line cord.


Statement 1:

DANGER

Electrical current from power, telephone, and communication cables is hazardous.

To avoid a shock hazard:
- Do not connect or disconnect any cables or perform installation, maintenance, or reconfiguration of this product during an electrical storm.
- Connect all power cords to a properly wired and grounded electrical outlet.
- Connect to properly wired outlets any equipment that will be attached to this product.
- When possible, use one hand only to connect or disconnect signal cables.
- Never turn on any equipment when there is evidence of fire, water, or structural damage.
- Disconnect the attached power cords, telecommunications systems, networks, and modems before you open the device covers, unless instructed otherwise in the installation and configuration procedures.
- Connect and disconnect cables as described in the following table when installing, moving, or opening covers on this product or attached devices.

To Connect: | To Disconnect:
1. Turn everything OFF. | 1. Turn everything OFF.
2. First, attach all cables to devices. | 2. First, remove power cords from outlet.
3. Attach signal cables to connectors. | 3. Remove signal cables from connectors.
4. Attach power cords to outlet. | 4. Remove all cables from devices.
5. Turn device ON. |


Statement 2:

CAUTION: When replacing the lithium battery, use only a type battery recommended by the manufacturer. If your system has a module containing a lithium battery, replace it only with the same module type made by the same manufacturer. The battery contains lithium and can explode if not properly used, handled, or disposed of.

Do not:

- Throw or immerse into water
- Heat to more than 100°C (212°F)
- Repair or disassemble

Dispose of the battery as required by local ordinances or regulations.


Statement 3:

CAUTION: When laser products (such as CD-ROMs, DVD drives, fiber optic devices, or transmitters) are installed, note the following:

- Do not remove the covers. Removing the covers of the laser product could result in exposure to hazardous laser radiation. There are no serviceable parts inside the device.
- Use of controls or adjustments or performance of procedures other than those specified herein might result in hazardous radiation exposure.

DANGER

Some laser products contain an embedded Class 3A or Class 3B laser diode. Note the following.

Laser radiation when open. Do not stare into the beam, do not view directly with optical instruments, and avoid direct exposure to the beam.

Class 1 Laser Product
Laser Klasse 1
Laser Klass 1
Luokan 1 Laserlaite
Appareil A Laser de Classe 1


Statement 4:

≥ 18 kg (39.7 lb) ≥ 32 kg (70.5 lb) ≥ 55 kg (121.2 lb)

CAUTION: Use safe practices when lifting.

Statement 5:

CAUTION: The power control button on the device and the power switch on the power supply do not turn off the electrical current supplied to the device. The device also might have more than one power cord. To remove all electrical current from the device, ensure that all power cords are disconnected from the power source.


Statement 8:

CAUTION: Never remove the cover on a power supply or any part that has the following label attached.

Hazardous voltage, current, and energy levels are present inside any component that has this label attached. There are no serviceable parts inside these components. If you suspect a problem with one of these parts, contact a service technician.

Statement 26:

CAUTION: Do not place any object on top of rack-mounted devices.

Attention: This server is suitable for use on an IT power distribution system whose maximum phase-to-phase voltage is 240 V under any distribution fault condition.

Important: This product is not suitable for use with visual display workplace devices according to Clause 2 of the German Ordinance for Work with Visual Display Units.


Chapter 3. General information

This chapter provides general information that applies to all machine types supported by this publication.

Features and technologies

The TD200x server offers the following features and technologies:

- UEFI-compliant server firmware

The server firmware offers several features, including Unified Extensible Firmware Interface (UEFI) 2.1 compliance, enhanced RAS capabilities, and BIOS compatibility support. UEFI replaces the basic input/output system (BIOS) and defines a standard interface between the operating system, platform firmware, and external devices. UEFI-compliant servers are capable of starting UEFI-compliant operating systems, BIOS-based operating systems, and BIOS-based adapters as well as UEFI-compliant adapters.

Note: The server does not support DOS.

- Integrated Management Module

The integrated management module (IMM) combines service processor functions, a video controller, and a remote presence function in a single chip. The IMM provides advanced service-processor control, monitoring, and alerting functions. If an environmental condition exceeds a threshold or if a system component fails, the IMM lights LEDs to help you diagnose the problem, records the error in the event log, and alerts you to the problem. The IMM also provides a virtual presence capability for remote server management. The IMM provides remote server management through the following industry-standard interfaces:

  - Intelligent Platform Management Interface (IPMI) version 2.0
  - Simple Network Management Protocol (SNMP) version 3
  - Common Information Model (CIM)
  - Web browser

- Remote presence capability and blue-screen capture

The remote presence feature provides the following functions:

  - Remotely viewing video with graphics resolutions up to 1600 x 1200 at 85 Hz, regardless of the system state
  - Remotely accessing the server, using the keyboard and mouse from a remote client
  - Mapping the CD or DVD drive, diskette drive, and USB flash drive on a remote client, and mapping ISO and diskette image files as virtual drives that are available for use by the server
  - Uploading a diskette image to the IMM memory and mapping it to the server as a virtual drive

The blue-screen capture feature captures the video display contents before the IMM restarts the server when the IMM detects an operating-system hang condition. A system administrator can use the blue-screen capture to assist in determining the cause of the hang condition.

- Preboot diagnostics programs

The preboot diagnostics programs are stored on the integrated USB memory. They collect and analyze system information to aid in diagnosing server problems. The diagnostics programs collect the following information about the server:


  - System configuration
  - Network interfaces and settings
  - Installed hardware
  - EasyLED diagnostics status
  - Service processor status and configuration
  - Vital product data, firmware, and UEFI (formerly BIOS) configuration
  - Hard disk drive health
  - RAID controller configuration
  - Event logs for service processors

The diagnostic programs create a merged log that includes events from all collected logs. The information is collected into a file that you can send to Lenovo service and support. Additionally, you can view the information locally through a generated text report file. You can also copy the log to removable media and view the log from a Web browser.

For additional information about preboot diagnostics, see “Running the diagnostic programs” on page 90.
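One way to picture a merged log is as a time-ordered merge of several logs that are each already sorted. The Python sketch below illustrates that idea only. The event layout (timestamp, source, message), the function name `merge_logs`, and the sample entries are all invented for this example; the manual does not document the actual file format used by the diagnostic programs.

```python
import heapq
from typing import Iterable, List, Tuple

# Hypothetical event layout: (ISO 8601 timestamp, source, message).
Event = Tuple[str, str, str]


def merge_logs(*logs: Iterable[Event]) -> List[Event]:
    """Merge logs that are each already sorted by timestamp into one list."""
    # ISO 8601 timestamps in the same format sort correctly as plain strings.
    return list(heapq.merge(*logs, key=lambda event: event[0]))


# Made-up sample entries, for illustration only.
imm_log = [
    ("2009-07-01T10:00:00", "IMM", "sample event A"),
    ("2009-07-01T10:05:00", "IMM", "sample event C"),
]
post_log = [
    ("2009-07-01T10:02:00", "POST", "sample event B"),
]

merged = merge_logs(imm_log, post_log)
```

In this sketch, `merged` lists the three sample events in timestamp order, interleaving the two sources.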

- EasyStartup DVD

The ThinkServer EasyStartup program guides you through the configuration of the hardware, the RAID controller, and the installation of the operating system and device drivers.

- EasyManage DVD

The ThinkServer EasyManage program helps you manage and administer your servers and clients through remote problem notification as well as monitoring and alerting.

- Integrated network support

The server comes with one integrated Broadcom 5709C series Gigabit Ethernet controller, which supports connection to a 10 Mbps, 100 Mbps, or 1000 Mbps network. For more information, see “Enabling the Broadcom Gigabit Ethernet Utility program” on page 266.

v Intelligent Platform Management Interface (IPMI) 2.0

IPMI 2.0 support provides secure remote power-on/power-off and several standard alerts for components such as fans, voltage, and temperature.

v Large data-storage capacity and hot-swap capability

The server supports up to eight or 16 (depending on your model) 2.5-inch hot-swap hard disk drives in the hot-swap bays. With the hot-swap feature, you can add, remove, or replace hard disk drives without turning off the server.

v Large system-memory capacity

The server supports up to 64 GB of system memory. The memory controller supports error correcting code (ECC) for up to 16 single-sided industry-standard third-generation double-data-rate 3 (DDR3) 800, 1066, and 1333, 240-pin, registered, synchronous dynamic random access memory (SDRAM) dual inline memory modules (DIMMs).

v EasyLED diagnostics

EasyLED diagnostics provides LEDs to help you diagnose problems. For more information, see “EasyLED diagnostics panel” on page 129.

v Memory mirroring

Memory mirroring improves the availability of memory by writing information to the main memory and redundant locations in a mirrored pair of DIMMs.

v PCI-32 adapter capabilities


The server has one slot for a PCI-32 adapter.

v PCI Express x8 adapter capabilities

The server has five slots for PCI Express x8 adapters. Three of these slots accept x8 adapters, but the adapters will operate as x4 adapters.

v PCI Express x16 adapter capabilities

The server has one slot for a PCI Express x16 adapter, which will operate as an x8 adapter.

v Redundant cooling and power capabilities

The server supports up to two 920-watt hot-swap power supplies. If the server came with only one power supply, you can install an additional power supply with three redundant hot-swap cooling fans to add redundant power and cooling capabilities. If the maximum load on the server is less than 920 watts and a problem occurs with one of the power supplies, the other power supply can meet the power requirements. The redundant cooling of the fans enables continued operation if one of the fans fails.

v RAID support

The server supports an internal RAID SAS Controller, which is required for you to use the hot-swap hard disk drives and to create redundant array of independent disks (RAID) configurations.

v Symmetric multiprocessing (SMP)

The server supports up to two Intel Xeon® quad-core microprocessors. If the server comes with only one microprocessor, you can install an additional microprocessor to enhance performance and provide SMP capability.

v Systems-management capabilities

The server contains an Integrated Management Module (IMM), which enables you to manage the functions of the server locally and remotely and provides remote presence and blue-screen capture capability. The IMM also provides system monitoring and event recording.

v TCP/IP offload engine (TOE) support

The Ethernet controllers in the server support TOE, which is a technology that offloads the TCP/IP flow from the microprocessors and I/O subsystem to increase the speed of the TCP/IP flow. When an operating system that supports TOE is running on the server and TOE is enabled, the server supports TOE operation. See the operating-system documentation for information about enabling TOE.

Note: As of the date of this document, the Linux operating system does not support TOE.

Specifications

The following information is a summary of the features and specifications of the server. Depending on the server model, some features might not be available, or some specifications might not apply.


Table 1. Features and specifications

Microprocessor:

v Intel Xeon dual-core or quad-core with integrated memory controller and Quick Path Interconnect (QPI) architecture
v Designed for LGA 1366 socket
v Scalable up to four cores
v 32 KB instruction cache, 32 KB data cache, and 8 MB cache that is shared among the cores
v Support for up to two microprocessors, second microprocessor with pluggable VRM
v Support for Intel Extended Memory 64 Technology (EM64T)

Note: Use the Setup Utility to determine the type and speed of the microprocessors. For a list of supported microprocessors, see http://www.lenovo.com/thinkserver and click Options.

Memory:

v 16 DIMM connectors (eight per microprocessor)
v Minimum: 2 GB DIMM per microprocessor
v Maximum: 64 GB
v Type: Registered ECC DDR3 800, 1066, and 1333 MHz DIMMs only
v Sizes: 1 GB single-rank, 2 GB single-rank or dual-rank, 4 GB dual-rank (PC3-10600R-999)

Drives:

v SATA:
– DVD (standard)
– DVD/CD-RW (optional)
– Maximum of two devices can be installed
v Diskette (optional): External USB 1.44 MB
v Supported hard disk drives:
– Serial Attached SCSI (SAS)

Expansion bays:

v 16 hot-swap SAS 2.5-inch bays
v Three half-high 5.25-inch bays (one DVD drive installed)

Note: Full-high devices such as an optional tape drive will occupy two half-high 5.25-inch bays.

PCI and PCI-X expansion slots:

v Six PCI expansion slots on system board
– Two PCI Express x8 (x4 link)
– Two PCI Express x8 (x8 link)
– One PCI Express x16 (x8 link)
– One PCI 32-bit
v One PCI Express x8 (x4 link) on the extender card

Power supply:

v Standard: One 920-watt 110 V or 240 V ac input dual-rated power supply
v Upgradeable to two 920-watt hot-swap power supplies

Note: To upgrade to two 920-watt hot-swap power supplies, install the redundant power and cooling option kit. Kit includes one hot-swap 920-watt power-supply and three hot-swap fans.

Hot-swap fans:

v Three (standard)
v Upgradeable to six fans (for redundant cooling)

Note: To upgrade to redundant cooling, install the redundant power and cooling option kit. Kit includes one 920-watt hot-swap power-supply and three hot-swap fans.

Size:

v Tower
– Height: 440 mm (17.3 inches)
– Depth: 767 mm (30.2 inches)
– Width: 218 mm (8.6 inches)
– Weight: approximately 38 kg (84 lb.) when fully configured or 20 kg (42 lb.) minimum

Integrated functions:

v Integrated management module (IMM), which provides service processor control and monitoring functions, video controller, remote keyboard, video, mouse, and remote hard disk drive capabilities
v Dedicated or shared management network connections
v Six-port Serial ATA (SATA) controller
v Serial over LAN (SOL) and serial redirection over Telnet or Secure Shell (SSH)
v Support for remote management presence
v One systems-management RJ-45 for connection to a dedicated systems-management network
v EasyLED diagnostics
v Six Universal Serial Bus (USB) ports standard (v2.0 supporting v1.1)
– Four on rear of server
– Two on front of server
v One internal USB tape connector
v One Broadcom dual-port 10/100/1000 Ethernet controller with Wake on LAN support and TCP/IP Offload Engine (TOE) support
v One serial connector, shared with the IMM

Note: In messages and documentation, the term service processor refers to the integrated management module (IMM).

Video controller:

v Matrox G200 video on system board
v Compatible with SVGA and VGA
v 8 MB DDR2 SDRAM video memory

Note: Maximum video resolution 1600 x 1200 at 85 MHz

RAID controllers:

v ServeRAID-BR10i SAS/SATA Controller that supports RAID levels 0, 1, 1E (standard)
v Upgradeable to ServeRAID-MR10i SAS/SATA Controller, which supports RAID levels 0, 1, 5, 6, 10
v Upgradeable to ServeRAID-MR10is SAS/SATA Controller, which supports RAID levels 0, 1, 5, 6, 10

Acoustical noise emissions:

v Sound power, idle: 5.5 bel declared
v Sound power, operating: 6.0 bel declared

Environment:

v Air temperature:
– Server on: 10° to 35° C (50.0° to 95.0° F); altitude: 0 to 914.4 m (3000 ft.)
– Server off: -40° to 60° C (-40.0° to 140.0° F); maximum altitude: 2133.6 m (7000 ft.)
v Humidity:
– Server on: 8% to 80%
– Server off: 8% to 80%

Heat output:

Approximate heat output in British thermal units (Btu) per hour:

v Minimum configuration: 2013 Btu per hour (590 watts)
v Maximum configuration: 3610 Btu per hour (1058 watts)

Electrical input:

v Sine-wave input (50-60 Hz) required
v Input voltage low range:
– Minimum: 100 V ac
– Maximum: 127 V ac
v Input voltage high range:
– Minimum: 200 V ac
– Maximum: 240 V ac
v Approximate input kilovolt-amperes (kVA):
– Minimum: 0.60 kVA
– Maximum: 1.10 kVA

Notes:

1. Power consumption and heat output vary depending on the number and type of optional features that are installed and the power-management optional features that are in use.

2. These levels were measured in controlled acoustical environments according to the procedures that are specified by the American National Standards Institute (ANSI) S12.10 and ISO 7779 and are reported in accordance with ISO 9296. Actual sound-pressure levels in a given location might exceed the average stated values because of room reflections and other nearby noise sources. The declared sound-power levels indicate an upper limit, below which a large number of computers will operate.
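The paired units in Table 1 (watts with Btu per hour, degrees Celsius with degrees Fahrenheit) can be cross-checked with a few lines of arithmetic. The following Python sketch is illustrative only and is not part of any Lenovo tool; the function names are hypothetical, and the factor of 3.412 Btu per hour per watt is the usual rounded conversion, assumed here rather than taken from this manual.

```python
# Illustrative unit checks for the values in Table 1.
# Assumption: 1 watt is approximately 3.412 Btu per hour (rounded factor).
WATTS_TO_BTU_PER_HOUR = 3.412


def watts_to_btu_per_hour(watts: float) -> int:
    """Convert a power figure in watts to whole Btu per hour."""
    return round(watts * WATTS_TO_BTU_PER_HOUR)


def celsius_to_fahrenheit(celsius: float) -> float:
    """Convert a temperature from degrees Celsius to degrees Fahrenheit."""
    return celsius * 9 / 5 + 32


if __name__ == "__main__":
    # Heat output rows: minimum and maximum configuration.
    for label, watts in (("Minimum", 590), ("Maximum", 1058)):
        print(f"{label} configuration: {watts} W is about "
              f"{watts_to_btu_per_hour(watts)} Btu per hour")

    # Operating air temperature limits (server on).
    for celsius in (10, 35):
        print(f"{celsius} C = {celsius_to_fahrenheit(celsius)} F")
```

With these assumptions, the 590-watt and 1058-watt configurations work out to 2013 and 3610 Btu per hour, and the 10° to 35° C operating range works out to 50.0° to 95.0° F, which agrees with the table.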

Software

Lenovo provides software to help get your server up and running.


EasyStartup

The ThinkServer EasyStartup program simplifies the process of configuring your RAID controller and installing supported Microsoft Windows® and Linux operating systems and device drivers on your server. The EasyStartup program is provided with your server on DVD. The DVD is self starting (bootable). The user guide for the EasyStartup program is on the DVD and can be accessed directly from the program interface. For additional information, see “Using the ThinkServer EasyStartup DVD” on page 263.

EasyManage

The ThinkServer EasyManage Core Server provides centralized hardware and software inventory management and secure automated system management through a centralized console. The ThinkServer EasyManage Agent enables other clients on the network to be managed by the centralized console. The ThinkServer EasyManage Core Server is supported on Microsoft Windows Server 2003 and Microsoft Windows Server 2008 (32-bit) products. The ThinkServer EasyManage Agent is supported on 32-bit and 64-bit Windows, Red Hat, and SUSE operating systems.


Chapter 4. General Checkout

You can solve many problems without outside assistance by following the troubleshooting procedures in this Hardware Maintenance Manual and on the Lenovo Web site. This document describes the diagnostic tests that you can perform, troubleshooting procedures, and explanations of error messages and error codes. The documentation that comes with your operating system and software also contains troubleshooting information.

Checkout procedure

The checkout procedure is the sequence of tasks that you should follow to diagnose a problem in the server.

About the checkout procedure

Before you perform the checkout procedure for diagnosing hardware problems, review the following information:

v Read the safety information that begins on page vii.

v The diagnostic programs provide the primary methods of testing the major components of the server, such as the system board, Ethernet controller, keyboard, mouse (pointing device), serial ports, and hard disk drives. You can also use them to test some external devices. If you are not sure whether a problem is caused by the hardware or by the software, you can use the diagnostic programs to confirm that the hardware is working correctly.

v When you run the diagnostic programs, a single problem might cause more than one error message. When this happens, correct the cause of the first error message. The other error messages usually will not occur the next time you run the diagnostic programs.

Exception: If multiple error codes or EasyLED diagnostics LEDs indicate a microprocessor error, the error might be in a microprocessor or in a microprocessor socket. See “Microprocessor problems” on page 69 for information about diagnosing microprocessor problems.

v Before you run the diagnostic programs, you must determine whether the failing server is part of a shared hard disk drive cluster (two or more servers sharing external storage devices). If it is part of a cluster, you can run all diagnostic programs except the ones that test the storage unit (that is, a hard disk drive in the storage unit) or the storage adapter that is attached to the storage unit. The failing server might be part of a cluster if any of the following conditions is true:

– You have identified the failing server as part of a cluster (two or more servers sharing external storage devices).
– One or more external storage units are attached to the failing server and at least one of the attached storage units is also attached to another server or unidentifiable device.
– One or more servers are located near the failing server.

Important: If the server is part of a shared hard disk drive cluster, run one test at a time. Do not run any suite of tests, such as “quick” or “normal” tests, because this might enable the hard disk drive diagnostic tests.


v If the server is halted and a POST error code is displayed, see “POST error codes” on page 30. If the server is halted and no error message is displayed, see “Troubleshooting tables” on page 64 and “Solving undetermined problems” on page 124.

v For information about power-supply problems, see “Solving power problems” on page 123 and “Power-supply LEDs” on page 88.

v For intermittent problems, check the system-event log; see “Event logs” on page 27, “System-event log” on page 38, and “Diagnostic programs, messages, and error codes” on page 90.

Performing the checkout procedure

To perform the checkout procedure, complete the following steps:

1. Is the server part of a cluster?

v No: Go to step 2.
v Yes: Shut down all failing servers that are related to the cluster. Go to step 2.

2. Complete the following steps:

a. Turn off the server and all external devices.
b. Check all cables and power cords.
c. Check all internal and external devices for compatibility: go to http://www.lenovo.com/thinkserver, click Options, and open the Server Options Guide.pdf.
d. Set all display controls to the middle positions.
e. Turn on all external devices.
f. Turn on the server. If the server does not start, see “Troubleshooting tables” on page 64.
g. Check the system-error LED on the operator information panel (see Chapter 6, “Locating Server Controls and connectors,” on page 127). If it is flashing, check the EasyLED diagnostics LEDs (see “EasyLED diagnostics” on page 76).
h. Check for the following results:
v Successful completion of POST
v Successful completion of startup, indicated by a readable display of the operating-system desktop

3. Are there readable instructions on the main menu?

v No: Find the failure symptom in “Troubleshooting tables” on page 64; if necessary, see “Solving undetermined problems” on page 124.
v Yes: Run the diagnostic programs (see “Running the diagnostic programs” on page 90).
– If you receive an error, see “Diagnostic messages” on page 91.
– If the diagnostic programs were completed successfully and you still suspect a problem, see “Solving undetermined problems” on page 124.
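The branching in the checkout procedure can be summarized as a small decision function. The following Python sketch is only a reading aid, under the assumption that the three observations (whether the server starts, whether the main menu is readable, and whether the diagnostic programs reported an error) are entered by the technician; the function and parameter names are hypothetical and do not correspond to any Lenovo software.

```python
from typing import Optional


def next_reference(server_starts: bool,
                   menu_readable: bool,
                   diagnostics_error: Optional[bool] = None) -> str:
    """Return the manual section to consult next.

    diagnostics_error is None until the diagnostic programs have been run,
    True if they reported an error, and False if they completed successfully
    but a problem is still suspected.
    """
    if not server_starts:
        # Step 2f: the server does not start.
        return "Troubleshooting tables"
    if not menu_readable:
        # Step 3, answer No.
        return "Troubleshooting tables; if necessary, Solving undetermined problems"
    if diagnostics_error is None:
        # Step 3, answer Yes: the diagnostics have not been run yet.
        return "Running the diagnostic programs"
    if diagnostics_error:
        return "Diagnostic messages"
    return "Solving undetermined problems"


if __name__ == "__main__":
    print(next_reference(server_starts=True, menu_readable=True))
    print(next_reference(server_starts=True, menu_readable=True,
                         diagnostics_error=True))
```

The sketch leaves out the cluster check in step 1 and the physical checks in step 2, because those are actions to perform rather than branches to choose between.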

Diagnosing a problem

Before you contact Lenovo or an approved warranty service provider, follow these procedures in the order in which they are presented to diagnose a problem with your server:

1. Determine what has changed.


Determine whether any of the following items were added, removed, replaced, or updated before the problem occurred:

v Lenovo ThinkServer Server Firmware (server firmware)
v Device drivers
v Firmware
v Hardware components
v Software

If possible, return the server to the condition it was in before the problem occurred.

2. Collect data.

Thorough data collection is necessary for diagnosing hardware and software problems.

a. Document error codes and system-board LEDs.

v System error codes: See “Viewing the test log” on page 91 for information about error codes.
v Software or operating-system error codes: See the documentation for the software or operating system for information about a specific error code. See the manufacturer's Web site for documentation.
v EasyLED diagnostics LEDs: See “EasyLED diagnostics” on page 76 for information about EasyLED diagnostics LEDs that are lit.
v System-board LEDs: See “System-board LEDs” on page 135 for information about system-board LEDs that are lit.

b. Collect system data.

Run Dynamic System Analysis (DSA) to collect information about the hardware, firmware, software, and operating system. Have this information available when you contact Lenovo or an approved warranty service provider. For instructions for running the DSA program, see “Running the diagnostic programs” on page 90.

If you have to download the latest version of DSA, complete the following steps.

Note: Changes are made periodically to the Lenovo Web site. The actual procedure might vary slightly from what is described in this document.

1) Go to: http://www.lenovo.com/support.

2) Enter your product number (machine type and model number) or select Servers and Storage from the Select your product list.

3) Select Servers and Storage from the Brand list.

4) From the Family list, select ThinkServer TD200x, and click Continue.

5) Click Downloads and drivers and look at the list for the Preboot DSA CD image.

3. Follow the problem-resolution procedures.

The four problem-resolution procedures are presented in the order in which they are most likely to solve your problem. Follow these procedures in the order in which they are presented:

a. Check for and apply code updates.

Most problems that appear to be caused by faulty hardware are actually caused by Lenovo ThinkServer Server Firmware (server firmware), system firmware, device firmware, or device drivers that are not at the latest levels.


Important: Some cluster solutions require specific code levels or coordinated code updates. If the device is part of a cluster solution, verify that the latest level of code is supported for the cluster solution before you update the code.

1) Determine the existing code levels.

In DSA, click Firmware/VPD to view system firmware levels, or click Software to view operating-system levels.

2) Download and install updates of code that is not at the latest level.

To display a list of available updates for your server, complete the following steps.

Note: Changes are made periodically to the Lenovo Web site. The actual procedure might vary slightly from what is described in this document.

a) Go to: http://www.lenovo.com/support.
b) Enter your product number (machine type and model number) or select Servers and Storage from the Select your product list.
c) Select Servers and Storage from the Brand list.
d) From the Family list, select ThinkServer TD200x, and click Continue.
e) Click System TD200x to display the list of downloadable files for the server.

b. Check for and correct an incorrect configuration.

If the server is incorrectly configured, a system function can fail to work when you enable it; if you make an incorrect change to the server configuration, a system function that has been enabled can stop working.

1) Make sure that all installed hardware and software are supported.

See http://www.lenovo.com/thinkserver to verify that the server supports the installed operating system, optional devices, and software levels. If any hardware or software component is not supported, uninstall it to determine whether it is causing the problem. You must remove nonsupported hardware before you contact Lenovo or an approved warranty service provider for support.

2) Make sure that the server, operating system, and software are installed and configured correctly.

Many configuration problems are caused by loose power or signal cables or incorrectly seated adapters. You might be able to solve the problem by turning off the server, reconnecting cables, reseating adapters, and turning the server back on. For information about performing the checkout procedure, see “Checkout procedure” on page 21.

If the problem is associated with a specific function (for example, if a RAID hard disk drive is marked offline in the RAID array), see the documentation for the associated controller and management or controlling software to verify that the controller is correctly configured.

Problem determination information is available for many devices such as RAID and network adapters.

For problems with operating systems or Lenovo software or devices, complete the following steps.


Note: Changes are made periodically to the Lenovo Web site. The actual procedure might vary slightly from what is described in this document.

a) Go to: http://www.lenovo.com/support.
b) Enter your product number (machine type and model number) or select Servers and Storage from the Select your product list.
c) Select Servers and Storage from the Brand list.
d) From the Family list, select ThinkServer TD200x, and click Continue.
e) Under Support & downloads, click Documentation, Install, and Use to search for related documentation.


c. Check for troubleshooting procedures, and hints and tips.

Troubleshooting procedures, and hints and tips document known problems and suggested solutions. To search for troubleshooting procedures, and hints and tips, complete the following steps.


Note: Changes are made periodically to the Lenovo Web site. The actual procedure might vary slightly from what is described in this document.

1) Go to: http://www.lenovo.com/support.

2) Enter your product number (machine type and model number) or select Servers and Storage from the Select your product list.

3) Select Servers and Storage from the Brand list.

4) From the Family list, select ThinkServer TD200x, and click Continue.

5) Under Support & downloads, click Troubleshoot.

6) Select the troubleshooting procedure or hints and tips that applies to your problem:

v Troubleshooting procedures are under Diagnostic.
v Hints and tips are under Troubleshoot.

d. Check for and replace defective hardware.

If a hardware component is not operating within specifications, it can cause unpredictable results. Most hardware failures are reported as error codes in a system or operating-system log. For more information, see “Troubleshooting tables” on page 64 and Chapter 7, “Installing optional devices and replacing customer replaceable units,” on page 149. Hardware errors are also indicated by EasyLED diagnostics LEDs.

A single problem might cause multiple symptoms. Follow the troubleshooting procedure for the most obvious symptom. If that procedure does not diagnose the problem, use the procedure for another symptom, if possible.

If the problem remains, contact Lenovo or an approved warranty service provider for assistance with additional problem determination and possible hardware replacement. Be prepared to provide information about any error codes and collected data.

Undocumented problems

If you have completed the diagnostic procedure and the problem remains, the problem might not have been previously identified by Lenovo. After you have verified that all code is at the latest level, all hardware and software configurations are valid, and no EasyLED diagnostics LEDs or log entries indicate a hardware component failure, contact Lenovo or an approved warranty service provider for assistance. Be prepared to provide information about any error codes and collected data and the problem determination procedures that you have used.


Chapter 5. Diagnostics

This chapter describes the diagnostic tools that are available to help you solve problems that might occur in the server.

If you cannot diagnose and correct a problem by using the information in this chapter, see Appendix A, “Getting help and technical assistance,” on page 275 for more information.

Diagnostic tools

The following tools are available to help you diagnose and solve hardware-related problems:

v POST error messages

The power-on self-test (POST) generates messages to indicate successful test completion or the detection of a problem. See “POST error codes” on page 30 for more information.

v Event logs

For information about the POST event log, the system-event log, the integrated management module (IMM) event log, and the DSA log, see “Event logs” and “System-event log” on page 38.

v Troubleshooting tables

These tables list problem symptoms and actions to correct the problems. See “Troubleshooting tables” on page 64.

v EasyLED diagnostics


Use the EasyLED diagnostics to diagnose system errors quickly. See “EasyLED diagnostics” on page 76 for more information.

v Diagnostic programs, messages, and error codes

The diagnostic programs are the primary method of testing the major components of the server. See “Diagnostic programs, messages, and error codes” on page 90 for more information.

Event logs

Error codes and messages are displayed in the following types of event logs:

v POST event log: This log contains the three most recent error codes and messages that were generated during POST. You can view the POST event log through the Setup utility.

v System-event log: This log contains all IMM, POST, and system management interrupt (SMI) events. You can view the system-event log through the Setup utility and through the Dynamic System Analysis (DSA) program (as the IPMI event log).

The system-event log is limited in size. When it is full, new entries will not overwrite existing entries; therefore, you must periodically save and then clear the system-event log through the Setup utility when the IMM logs an event that indicates that the log is more than 75% full. When you are troubleshooting, you might have to save and then clear the system-event log to make the most recent events available for analysis.

Messages are listed on the left side of the screen, and details about the selected message are displayed on the right side of the screen. To move from one entry to the next, use the Up Arrow (↑) and Down Arrow (↓) keys.


Some IMM sensors cause assertion events to be logged when their setpoints are reached. When a setpoint condition no longer exists, a corresponding deassertion event is logged. However, not all events are assertion-type events.

v Integrated management module (IMM) event log: This log contains a filtered subset of all IMM, POST, and system management interrupt (SMI) events. You can view the IMM event log through the IMM Web interface and through the Dynamic System Analysis (DSA) program (as the ASM event log).

v DSA log: This log is generated by the Dynamic System Analysis (DSA) program, and it is a chronologically ordered merge of the system-event log (as the IPMI event log), the IMM event log (as the ASM event log), and the operating-system event logs. You can view the DSA log through the DSA program.
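To make the idea of a chronologically ordered merge concrete, the following Python sketch merges three already-sorted event lists by timestamp. It is a minimal illustration and does not represent how the DSA program is implemented; the Event type, the source labels, and the sample entries are hypothetical.

```python
import heapq
from datetime import datetime
from typing import Iterable, List, NamedTuple


class Event(NamedTuple):
    timestamp: datetime
    source: str      # for example "IPMI", "ASM", or "OS" (labels are illustrative)
    message: str


def merge_event_logs(*logs: Iterable[Event]) -> List[Event]:
    """Merge several logs into one list ordered by timestamp.

    Each input log must already be sorted by timestamp, because
    heapq.merge only interleaves sorted inputs; it does not sort them.
    """
    return list(heapq.merge(*logs, key=lambda event: event.timestamp))


if __name__ == "__main__":
    # Hypothetical sample entries, for illustration only.
    ipmi_log = [Event(datetime(2009, 7, 1, 10, 0), "IPMI", "Fan 1 speed low"),
                Event(datetime(2009, 7, 1, 10, 5), "IPMI", "Fan 1 speed normal")]
    asm_log = [Event(datetime(2009, 7, 1, 10, 2), "ASM", "Remote login")]
    os_log = [Event(datetime(2009, 7, 1, 9, 59), "OS", "Service started")]

    for event in merge_event_logs(ipmi_log, asm_log, os_log):
        print(event.timestamp.isoformat(), event.source, event.message)
```

Reading the merged list from top to bottom shows events from all sources in the order in which they occurred, which is what makes a merged log useful when a problem spans the hardware and the operating system.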

Viewing event logs through the Setup utility

To view the POST event log or system-event log, complete the following steps:

1. Turn on the server.

2. When the prompt <F1> Setup is displayed, press F1. If you have set both a power-on password and an administrator password, you must type the administrator password to view the event logs.

3. Select System Event Logs and use one of the following procedures:

v To view the POST event log, select POST Event Viewer. v To view the system-event log, select System Event Log.

Viewing event logs without restarting the server

If the server is not hung, methods are available for you to view one or more event logs without having to restart the server.


You can use the DSA Preboot to view the system event log (as the IPMI event log), the IMM event log (as the ASM event log), or the merged DSA log. You must restart the server to use DSA Preboot to view those logs. To install a DSA Preboot CD image, complete the following steps:

Note: Changes are made periodically to the Lenovo Web site. The actual procedure might vary slightly from what is described in this document.

1. Go to: http://www.lenovo.com/support.

2. Enter your product number (machine type and model number) or select Servers and Storage from the Select your product list.

3. Select Servers and Storage from the Brand list.

4. From the Family list, select ThinkServer TD200x, and click Continue.

5. Click Downloads and drivers and look at the list for the Preboot DSA CD image.

You can view the IMM event log through the Event Log link in the integrated management module (IMM) Web interface.

The following table describes the methods that you can use to view the event logs, depending on the condition of the server. The first two conditions generally do not require that you restart the server.


Table 2. Methods for viewing event logs

Condition: The server is not hung and is connected to a network.
Action: Use any of the following methods:
v Run Portable or Installable DSA to view the event logs or create an output file that you can send to Lenovo service and support.
v Type the IP address of the IMM and go to the Event Log page.
v Use IPMItool to view the system-event log.

Condition: The server is not hung and is not connected to a network.
Action: Use IPMItool locally to view the system-event log.

Condition: The server is hung.
Action:
v If DSA Preboot is installed, restart the server and press F2 to start DSA Preboot and view the event logs.
v If DSA Preboot is not installed, insert the DSA Preboot CD and restart the server to start DSA Preboot and view the event logs.
v Alternatively, you can restart the server and press F1 to start the Setup utility and view the POST event log or system-event log. For more information, see “Viewing event logs through the Setup utility” on page 28.
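When IPMItool is available on a running server, a short script can report how full the system-event log is, which relates to the 75% threshold mentioned under “Event logs.” The following Python sketch is a hedged example, not a Lenovo-supplied procedure: it assumes that the ipmitool command is installed and that the output of "ipmitool sel info" contains a line of the form "Percent Used : NN%". The function names and the sample text are hypothetical, and the output format can differ between IPMItool versions and management controllers.

```python
import re
import subprocess
from typing import Optional

SEL_FULL_THRESHOLD = 75  # percent; the IMM logs an event above this level


def parse_percent_used(sel_info_text: str) -> Optional[int]:
    """Extract the 'Percent Used' value from 'ipmitool sel info' output.

    Returns None if the line is missing or not numeric (assumed format).
    """
    match = re.search(r"Percent Used\s*:\s*(\d+)%", sel_info_text)
    return int(match.group(1)) if match else None


def sel_needs_clearing(sel_info_text: str,
                       threshold: int = SEL_FULL_THRESHOLD) -> bool:
    """Return True if the log is more than `threshold` percent full."""
    percent = parse_percent_used(sel_info_text)
    return percent is not None and percent > threshold


def read_sel_info() -> str:
    """Run 'ipmitool sel info' locally and return its text output."""
    result = subprocess.run(["ipmitool", "sel", "info"],
                            capture_output=True, text=True, check=True)
    return result.stdout


if __name__ == "__main__":
    # Hypothetical sample output, used so the sketch runs without a BMC.
    sample = "SEL Information\nEntries          : 412\nPercent Used     : 80%\n"
    print("Percent used:", parse_percent_used(sample))
    print("Save and clear the log:", sel_needs_clearing(sample))
```

The sketch only reads the log status. Saving and clearing the system-event log should still be done through the Setup utility, as described in this chapter.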


POST error codes

When you turn on the server, it performs a series of tests to check the operation of the server components and some optional devices in the server. This series of tests is called the power-on self-test, or POST.

If a power-on password is set, you must type the password and press Enter, when you are prompted, for POST to run.

If POST is completed without detecting any problems, the server startup is completed.

If POST detects a problem, an error message is sent to the POST event log.

The following table describes the POST error codes and suggested actions to correct the detected problems. These errors can appear as severe, warning, or informational.

v Follow the suggested actions in the order in which they are listed in the Action column until the problem is solved.

v See Chapter 8, “Parts Listing, TD200x Machine Types 3719, 3821, 3822, and 3823,” on page 237 to determine which components are customer replaceable units (CRU) and which components are field replaceable units (FRU).

v If an action step is preceded by “(Trained service technician only),” that step must be performed only by a trained service technician.

Error code Description Action

0010002 Microprocessor not supported

1. Reseat the following components one at a time, in the order shown, restarting the server each time:

   a. (Trained service technician only) Microprocessor 1

   b. (Trained service technician only) Microprocessor 2 (if one is installed)

2. (Trained service technician only) Remove microprocessor 2 and restart the server.

3. (Trained service technician only) Remove microprocessor 1 and install microprocessor 2 in the microprocessor 1 connector. Restart the server. If the error is corrected, microprocessor 1 is bad and must be replaced.

4. Replace the following components one at a time, in the order shown, restarting the server each time:

   a. (Trained service technician only) Microprocessor 1

   b. (Trained service technician only) Microprocessor 2

   c. (Trained service technician only) System board

0011000 Invalid microprocessor type

1. Update the firmware (see “Updating the firmware” on page 267).

2. (Trained service technician only) Remove and replace the affected microprocessor (error LED is lit) with a supported type.
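The "one at a time, in the order shown, restarting the server each time" instruction that recurs in the Action column amounts to a simple ordered search. The following Python sketch illustrates that logic only; the function name `first_fix` and the callback are hypothetical and are not part of any Lenovo tool. The callback stands in for the manual work of reseating or replacing a part, restarting the server, and checking whether the error is gone.

```python
from typing import Callable, Iterable, Optional


def first_fix(components: Iterable[str],
              error_cleared_after: Callable[[str], bool]) -> Optional[str]:
    """Work through components in the listed order and stop at the first
    one whose reseat or replacement (followed by a restart) clears the error."""
    for component in components:
        if error_cleared_after(component):
            return component
    return None  # nothing in the list resolved the problem


# Replacement order listed for error code 0010002.
REPLACE_ORDER = ["Microprocessor 1", "Microprocessor 2", "System board"]

# Hypothetical outcome for illustration: the error clears once
# microprocessor 2 has been replaced.
culprit = first_fix(REPLACE_ORDER, lambda part: part == "Microprocessor 2")
print(culprit)
```

Changing one component per restart is what makes the result attributable: if several parts were swapped at once, a cleared error would not identify which part was at fault.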

0011002 Microprocessor mismatch

1. Run the Setup utility and view the microprocessor information to compare the installed microprocessor specifications.

2. (Trained service technician only) Remove and replace one of the microprocessors so that they both match.

0011004 Microprocessor failed BIST

1. Update the firmware (see “Updating the firmware” on page 267).

2. (Trained service technician only) Reseat microprocessor 2.

3. Replace the following components one at a time, in the order shown, restarting the server each time:

   a. (Trained service technician only) Microprocessor

   b. (Trained service technician only) System board

001100A Microcode update failed

1. Update the server firmware (see “Updating the firmware” on page 267).

2. (Trained service technician only) Replace the microprocessor.

0050001 DIMM disabled

1. If the server fails the POST memory test, reseat the DIMMs.

2. Remove and replace any DIMM for which the associated error LED is lit (see “Removing a memory module” on page 210 and “Installing a memory module” on page 211).

3. Run the Setup utility to enable all the DIMMs.

4. Run the DSA memory test.

0051003 Uncorrectable DIMM error

1. If the server failed the POST memory test, reseat the DIMMs.

2. Remove and replace any DIMM for which the associated error LED is lit (see “Removing a memory module” on page 210 and “Installing a memory module” on page 211).

3. Run the Setup utility to enable all the DIMMs.

4. Run the DSA memory test.

0051006 DIMM mismatch detected

Make sure that the DIMMs match and are installed in the correct sequence (see “Installing a memory module” on page 211).

0051009 No memory detected

1. Make sure that the server contains DIMMs.

2. Reseat the DIMMs.

3. Install DIMMs in the correct sequence (see “Installing a memory module” on page 211).

005100A No usable memory detected

1. Make sure that the server contains DIMMs.

2. Reseat the DIMMs.

3. Install DIMMs in the correct sequence (see “Installing a memory module” on page 211).

4. Clear CMOS memory to re-enable all the memory connectors.

0058001 PFA threshold exceeded

1. Update the firmware (see “Updating the firmware” on page 267).

2. Reseat the DIMMs and run the memory test.

3. Replace the failing DIMM, which is indicated by a lit LED on the system board.

0058007 DIMM population is unsupported

1. Reseat the DIMMs, and then restart the server.

2. Remove the lowest-numbered DIMM pair of those that are identified, replace it with an identical pair of known good DIMMs, and then restart the server. Repeat as necessary. If the failures continue, go to step 4.

3. Return the removed DIMMs, one pair at a time, to their original connectors, restarting the server after each pair, until a pair fails. Replace the DIMMs in the failed pair with identical known good DIMMs, restarting the server after each DIMM is installed. Replace the failed DIMM. Repeat this step until you have tested all removed DIMMs.

4. (Trained service technician only) Replace the system board.

0058008 DIMM failed memory test

1. Reseat the DIMMs, and then restart the server.

2. Replace the following components one at a time, in the order shown, restarting the server each time:

   a. DIMMs

   b. (Trained service technician only) System board

00580A1 Invalid DIMM population for mirroring mode

1. If a fault LED is lit, resolve the failure.

2. Install the DIMMs in the correct sequence (see “Installing a memory module” on page 211).

00580A4 Memory population changed

Information only. Memory has been added, moved, or changed.

00580A5 Mirror failover complete

Information only. Memory redundancy has been lost. Check the event log for uncorrected DIMM failure events.

0068002 CMOS battery cleared

1. Reseat the battery.

2. Clear the CMOS memory (see “System-board switches and jumpers” on page 144).

3. Replace the following components one at a time, in the order shown, restarting the server each time:

   a. Battery

   b. (Trained service technician only) System board

2011000 PCI-X PERR

1. Check the extender card LEDs.

2. Reseat all affected adapters and extender cards.

3. Update the PCI device firmware.

4. Remove the adapters from the extender card.

5. Replace the following components one at a time, in the order shown, restarting the server each time:

   a. Extender card

   b. (Trained service technician only) System board

2011001 PCI-X SERR

1. Check the extender-card LEDs.

2. Reseat all affected adapters and extender cards.

3. Update the PCI device firmware.

4. Remove the adapters from the extender card.

5. Replace the following components one at a time, in the order shown, restarting the server each time:

   a. Extender card

   b. (Trained service technician only) System board

2018001 PCI Express uncorrected or uncorrectable error

1. Check the extender-card LEDs.

2. Reseat all affected adapters and extender cards.

3. Update the PCI device firmware.

4. Remove both adapters from the extender card.

5. Replace the following components one at a time, in the order shown, restarting the server each time:

   a. Extender card

   b. (Trained service technician only) System board

2018002 Option ROM resource allocation failure

Informational message that some devices might not be initialized.

1. If possible, rearrange the order of the adapters in the PCI slots to change the load order of the optional-device ROM code.

2. Run the Setup utility, select Start Options, and change the boot priority to change the load order of the optional-device ROM code.

3. Run the Setup utility and disable some other resources, if their functions are not being used, to make more space available. Select Devices and I/O Ports to disable any of the integrated devices.

4. Replace the following components one at a time, in the order shown, restarting the server each time:

   a. Each adapter

   b. (Trained service technician only) System board

3xx0007 (xx can be 00 - 19) Firmware fault detected, system halted

1. Recover the server firmware to the latest level.

2. Undo any recent configuration changes, or clear CMOS memory to restore the settings to the default values.

3. Remove any recently installed hardware.

3038003 Firmware corrupted

1. Run the Setup utility, select Load Default Settings, and save the settings to recover the server firmware.

2. (Trained service technician only) Replace the system board.

3048005 Booted secondary (backup) server firmware image

Information only. The backup switch was used to boot the secondary bank.

3048006 Booted secondary (backup) server firmware image because of ABR

1. Run the Setup utility, select Load Default Settings, and save the settings to recover the primary server firmware settings.

2. Turn off the server and remove it from the power source.

3. Reconnect the server to the power source, and then turn on the server.

305000A RTC date/time is incorrect

1. Adjust the date and time settings in the Setup utility, and then restart the server.

2. Reseat the battery.

3. Replace the following components one at a time, in the order shown, restarting the server each time:

   a. Battery

   b. (Trained service technician only) System board

3058001 System configuration invalid

1. Run the Setup utility, and select Save Settings.

2. Run the Setup utility, select Load Default Settings, and save the settings.

3. Reseat the following components one at a time in the order shown, restarting the server each time:

   a. Battery

   b. Failing device (if the device is a FRU, it must be reseated by a trained service technician only)

4. Replace the following components one at a time, in the order shown, restarting the server each time:

   a. Battery

   b. Failing device (if the device is a FRU, it must be replaced by a trained service technician only)

   c. (Trained service technician only) System board

3058004 Three boot failures

1. Undo any recent system changes, such as new settings or newly installed devices.

2. Make sure that the server is attached to a reliable power source.

3. Remove all hardware that is not listed on the ThinkServer ready Web site.

4. Make sure that the operating system is not corrupted.

5. Run the Setup utility, save the configuration, and then restart the server.

3108007 System configuration restored to default settings

Information only. This message is usually associated with the CMOS battery clear event.

3138002 Boot configuration error

1. Remove any recent configuration changes that you made in the Setup utility.

2. Run the Setup utility, select Load Default Settings, and save the settings.

3808000 IMM communication failure

1. Remove power from the server for 30 seconds, and then reconnect the server to power and restart it.

2. Update the IMM firmware.

3. (Trained service technician only) Replace the system board.

3808002 Error updating system configuration to IMM

1. Remove power from the server, and then reconnect the server to power and restart it.

2. Run the Setup utility and select Save Settings.

3. Update the firmware.

3808003 Error retrieving system configuration from IMM

1. Remove power from the server, and then reconnect the server to power and restart it.

2. Run the Setup utility and select Save Settings.

3. Update the IMM firmware.

3808004 IMM system event log full

• When out-of-band, use the IMM Web interface or IPMItool to clear the logs from the operating system.

• When using the local console:

   1. Run the Setup utility.

   2. Select System Event Logs.

   3. Select Clear System Event Log.

   4. Restart the server.
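For the out-of-band path with IPMItool, the log is typically cleared with the `sel clear` subcommand. The Python sketch below only assembles and runs that command; it assumes that `ipmitool` is installed on the management workstation and that the IMM accepts IPMI-over-LAN sessions through the `lanplus` interface, neither of which this manual states. The host address and credentials in the usage comment are placeholders.

```python
import subprocess
from typing import List


def build_sel_clear_command(host: str, user: str, password: str) -> List[str]:
    """Build an ipmitool command line that clears the system event log (SEL)
    on a remote management controller."""
    return [
        "ipmitool",
        "-I", "lanplus",   # IPMI v2.0 LAN interface (assumed to be enabled on the IMM)
        "-H", host,        # address of the management controller
        "-U", user,
        "-P", password,    # visible in the process list; acceptable for a sketch only
        "sel", "clear",
    ]


def clear_sel(host: str, user: str, password: str) -> None:
    """Run the command and raise CalledProcessError if ipmitool reports a failure."""
    subprocess.run(build_sel_clear_command(host, user, password), check=True)


# Example (placeholder values, not run here):
# clear_sel("192.0.2.10", "admin-user", "admin-password")
```

Running `sel list` with the same connection options before clearing is a reasonable way to save the entries first, because clearing the log discards them.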

3818001 Core Root of Trust Measurement (CRTM) update failed

1. Run the Setup utility, select Load Default Settings, and save the settings.

2. (Trained service technician only) Replace the system board.

3818002 Core Root of Trust Measurement (CRTM) update aborted

1. Run the Setup utility, select Load Default Settings, and save the settings.

2. (Trained service technician only) Replace the system board.

3818003 Core Root of Trust Measurement (CRTM) flash lock failed

1. Run the Setup utility, select Load Default Settings, and save the settings.

2. (Trained service technician only) Replace the system board.

3818004 Core Root of Trust Measurement (CRTM) system error

1. Run the Setup utility, select Load Default Settings, and save the settings.

2. (Trained service technician only) Replace the system board.

3818005 Current Bank Core Root of Trust Measurement (CRTM) capsule signature invalid

1. Run the Setup utility, select Load Default Settings, and save the settings.

2. (Trained service technician only) Replace the system board.

3818006 Opposite bank CRTM capsule signature invalid

1. Switch the firmware bank to the backup bank.

2. Run the Setup utility, select Load Default Settings, and save the settings.

3. Switch the bank back to the current bank.

4. (Trained service technician only) Replace the system board.

3818007 CRTM update capsule signature invalid

1. Run the Setup utility, select Load Default Settings, and save the settings.

2. (Trained service technician only) Replace the system board.
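When POST codes are collected from many servers, a small lookup helper can turn a code into its description from the table above. The Python sketch below is illustrative only: it covers a handful of the codes listed in this section, the names `POST_CODES` and `describe` are hypothetical, and it assumes that the "xx" in 3xx0007 is a two-digit decimal value from 00 to 19.

```python
import re
from typing import Optional

# A subset of the POST error codes and descriptions from the table above.
POST_CODES = {
    "0010002": "Microprocessor not supported",
    "0011000": "Invalid microprocessor type",
    "0011002": "Microprocessor mismatch",
    "001100A": "Microcode update failed",
    "0051009": "No memory detected",
    "3038003": "Firmware corrupted",
    "3808004": "IMM system event log full",
}

# 3xx0007, where xx is assumed to be decimal 00 through 19.
FIRMWARE_FAULT = re.compile(r"3[01][0-9]0007")


def describe(code: str) -> Optional[str]:
    """Return the table description for a POST error code, or None if unknown."""
    code = code.strip().upper()
    if code in POST_CODES:
        return POST_CODES[code]
    if FIRMWARE_FAULT.fullmatch(code):
        return "Firmware fault detected, system halted"
    return None


print(describe("3120007"))
print(describe("001100a"))
```

Codes that the helper does not recognize return `None`, so that an unknown code is not silently mapped to a wrong description.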

System-event log

The system-event log contains messages of three types:

Information

Information messages do not require action; they record significant system-level events, such as when the server is started.

Warning

Warning messages do not require immediate action; they indicate possible problems, such as when the recommended maximum ambient temperature is exceeded.

Error

Error messages might require action; they indicate system errors, such as when a fan is not detected.

Each message contains date and time information, and it indicates the source of the message (POST or the IMM).
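The three message types and the per-message fields described above map naturally onto a small data model. The following Python sketch is a hypothetical representation for scripts that post-process exported log entries; the class and field names are assumptions and do not reflect the IMM's own log format. It keeps only the entries that might need attention and lists errors ahead of warnings.

```python
from dataclasses import dataclass
from datetime import datetime
from enum import Enum
from typing import Iterable, List


class Severity(Enum):
    INFORMATION = "Information"  # no action required
    WARNING = "Warning"          # no immediate action required
    ERROR = "Error"              # might require action


@dataclass
class LogEntry:
    timestamp: datetime
    source: str        # "POST" or "IMM"
    severity: Severity
    text: str


# Review order: errors first, then warnings; information entries are skipped.
_REVIEW_ORDER = {Severity.ERROR: 0, Severity.WARNING: 1}


def entries_to_review(entries: Iterable[LogEntry]) -> List[LogEntry]:
    """Return error and warning entries, errors first, otherwise in log order."""
    relevant = [e for e in entries if e.severity in _REVIEW_ORDER]
    return sorted(relevant, key=lambda e: _REVIEW_ORDER[e.severity])


# Sample entries with made-up timestamps and wording, for illustration only.
sample = [
    LogEntry(datetime(2009, 7, 1, 8, 0), "POST", Severity.INFORMATION, "Server started"),
    LogEntry(datetime(2009, 7, 1, 8, 5), "IMM", Severity.WARNING, "Ambient temperature high"),
    LogEntry(datetime(2009, 7, 1, 8, 6), "IMM", Severity.ERROR, "Fan not detected"),
]
for entry in entries_to_review(sample):
    print(entry.severity.value, entry.source, entry.text)
```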

Integrated management module error messages

The following table describes the IMM error messages and suggested actions to correct the detected problems. For more information about IMM, see the IMM User’s Guide on the Web.

• Follow the suggested actions in the order in which they are listed in the Action column until the problem is solved.

• See Chapter 8, “Parts Listing, TD200x Machine Types 3719, 3821, 3822, and 3823,” on page 237 to determine which components are customer replaceable units (CRU) and which components are field replaceable units (FRU).

• If an action step is preceded by “(Trained service technician only),” that step must be performed only by a trained service technician.

Message Severity Description Action

Numeric sensor Ambient Temp going high (upper critical) has asserted.

Error: An upper critical sensor going high has asserted.

Reduce the ambient temperature.

Numeric sensor Ambient Temp going high (upper non-recoverable) has asserted.

Error: An upper nonrecoverable sensor going high has asserted.

Reduce the ambient temperature.

Numeric sensor Planar 3.3V going low (lower critical) has asserted.

Error: A lower critical sensor going low has asserted.

(Trained service technician only) Replace the system board.

Numeric sensor Planar 3.3V going high (upper critical) has asserted.

Error: An upper critical sensor going high has asserted.

(Trained service technician only) Replace the system board.

Numeric sensor Planar 5V going low (lower critical) has asserted.

Error: A lower critical sensor going low has asserted.

(Trained service technician only) Replace the system board.

Numeric sensor Planar 5V going high (upper critical) has asserted.

Error: An upper critical sensor going high has asserted.

(Trained service technician only) Replace the system board.

Numeric sensor Planar 12V going low (lower critical) has asserted.

Error: A lower critical sensor going low has asserted.

Check the power-supply LED on the EasyLED panel (see “EasyLED diagnostics” on page 76).

Numeric sensor Planar 12V going high (upper critical) has asserted.

Error: An upper critical sensor going high has asserted.

Check the power-supply LED on the EasyLED panel (see “EasyLED diagnostics” on page 76).
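The numeric-sensor messages in these rows share one fixed wording, so the sensor name, direction, and threshold can be pulled out with a single pattern. The Python sketch below is an illustration based only on the message strings shown in this table; the names `SensorEvent` and `parse_numeric_sensor` are hypothetical, and other IMM firmware levels might word the messages differently.

```python
import re
from typing import NamedTuple, Optional


class SensorEvent(NamedTuple):
    sensor: str      # for example "Planar 3.3V" or "Ambient Temp"
    direction: str   # "high" or "low"
    threshold: str   # for example "upper critical"


_PATTERN = re.compile(
    r"Numeric sensor (?P<sensor>.+?) going (?P<direction>high|low) "
    r"\((?P<threshold>[^)]+)\) has asserted\."
)


def parse_numeric_sensor(message: str) -> Optional[SensorEvent]:
    """Split a numeric-sensor message into its parts; return None for other messages."""
    match = _PATTERN.fullmatch(message.strip())
    if match is None:
        return None
    return SensorEvent(**match.groupdict())


event = parse_numeric_sensor(
    "Numeric sensor Planar 3.3V going low (lower critical) has asserted."
)
print(event)
```

Grouping parsed events by `sensor` makes it easier to see, for example, that several voltage rails tripped at the same time, which points toward a shared cause rather than separate faults.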

Numeric sensor Planar VBAT going low (lower critical) has asserted.

Error: A lower critical sensor going low has asserted.

Replace the 3 V battery.

Numeric sensor Fan n Tach going low (lower critical) has asserted. (n = fan number)

Error: A lower critical sensor going low has asserted.

1. Reseat the failing fan n, which is indicated by a lit LED on the fan.

2. Replace the failing fan.

(n = fan number)

The Processor CPU n Status has Failed with IERR. (n = microprocessor number)

Error: A processor failed - IERR condition has occurred.

1. Make sure that the latest levels of firmware and device drivers are installed for all adapters and standard devices, such as Ethernet, SCSI, and SAS. Important: Some cluster solutions require specific code levels or coordinated code updates. If the device is part of a cluster solution, verify that the latest level of code is supported for the cluster solution before you update the code.

2. Run the DSA program for the hard disk drives and other I/O devices.

3. (Trained service technician only) Replace microprocessor n.

(n = microprocessor number)

An Over-Temperature Condition has been detected on the Processor CPU n Status. (n = microprocessor number)

Error: An overtemperature condition has occurred for microprocessor n. (n = microprocessor number)

1. Make sure that the fans are operating, that there are no obstructions to the airflow, that the air baffle is in place and correctly installed, and that the server cover is installed and completely closed.

2. Make sure that the heat sink for microprocessor n is installed correctly.

3. (Trained service technician only) Replace microprocessor n.

(n = microprocessor number)

The Processor CPU n Status has Failed with FRB1/BIST condition. (n = microprocessor number)

Error: A processor failed - FRB1/BIST condition has occurred.

1. Check for a server firmware update. Important: Some cluster solutions require specific code levels or coordinated code updates. If the device is part of a cluster solution, verify that the latest level of code is supported for the cluster solution before you update the code.

2. Make sure that the installed microprocessors are compatible with each other (see “Installing a microprocessor and heat sink” on page 220 for information about microprocessor requirements).

3. (Trained service technician only) Reseat microprocessor n.

4. (Trained service technician only) Replace microprocessor n.

(n = microprocessor number)

The Processor CPU n Status has a Configuration Mismatch. (n = microprocessor number)

Error: A processor configuration mismatch has occurred.

1. Make sure that the installed microprocessors are compatible with each other (see “Installing a microprocessor and heat sink” on page 220 for information about microprocessor requirements).

2. (Trained service technician only) Replace the incompatible microprocessor.

An SM BIOS Uncorrectable CPU complex error for Processor CPU n Status has asserted. (n = microprocessor number)

Error: An SMBIOS uncorrectable CPU complex error has asserted.

2. Make sure that the installed microprocessors are compatible with each other (see “Installing a microprocessor and heat sink” on page 220 for information about microprocessor requirements).

3. (Trained service technician only) Reseat microprocessor n.

4. (Trained service technician only) Replace microprocessor n.

(n = microprocessor number)

Sensor CPU n OverTemp has transitioned to critical from a less severe state. (n = microprocessor number)

Error: A sensor has changed to Critical state from a less severe state.

2. Make sure that the heat sink for microprocessor n is installed correctly.

3. (Trained service technician only) Replace microprocessor n.

(n = microprocessor number)

Sensor CPU n OverTemp has transitioned to non-recoverable from a less severe state. (n = microprocessor number)

Error: A sensor has changed to Nonrecoverable state from a less severe state.

2. Make sure that the heat sink for microprocessor n is installed correctly.

3. (Trained service technician only) Replace microprocessor n.

(n = microprocessor number)

Sensor CPU n OverTemp has transitioned to critical from a non-recoverable state. (n = microprocessor number)

Error: A sensor has changed to Critical state from Nonrecoverable state.

2. Make sure that the heat sink for microprocessor n is installed correctly.

3. (Trained service technician only) Replace microprocessor n.

(n = microprocessor number)

Sensor CPU n OverTemp has transitioned to non-recoverable. (n = microprocessor number)

Error: A sensor has changed to Nonrecoverable state.

2. Make sure that the heat sink for microprocessor n is installed correctly.

3. (Trained service technician only) Replace microprocessor n.

(n = microprocessor number)

A diagnostic interrupt has occurred on system %1. (%1 = CIM_ComputerSystem.ElementName)

Error: An operator information panel NMI/diagnostic interrupt has occurred.

If the NMI button on the system board has not been pressed, complete the following steps:

1. Make sure that the NMI button is not pressed.

2. Replace the operator information panel cable.

3. Replace the operator information panel.

A bus timeout has occurred on system %1. (%1 = CIM_ComputerSystem.ElementName)

Error: A bus timeout has occurred.

1. Remove the adapter from the PCI slot that is indicated by a lit LED.

2. Replace the extender card.

3. Remove all PCI adapters.

4. (Trained service technician only) Replace the system board.

A software NMI has occurred on system %1. (%1 = CIM_ComputerSystem.ElementName)

Error: A software NMI has occurred.

1. Check the device driver.

2. Reinstall the device driver.

The System %1 encountered a POST Error. (%1 = CIM_ComputerSystem.ElementName)

Error: A POST error has occurred. (Sensor = ABR Status)

1. Recover the server firmware from the backup page (see “Recovering from a Lenovo ThinkServer Server Firmware update failure” on page 122).

2. Update the server firmware to the latest level. Important: Some cluster solutions require specific code levels or coordinated code updates. If the device is part of a cluster solution, verify that the latest level of code is supported for the cluster solution before you update the code.
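Several messages in this table contain the placeholder %1, which stands for the CIM_ComputerSystem.ElementName value of the server. When matching logged messages against this table, it can help to render the table text with the actual system name first. The Python sketch below shows one way to do that; the function name `render` and the system name "lab-server-01" are hypothetical, and how the IMM itself performs the substitution is not described in this manual.

```python
import re


def render(template: str, *values: str) -> str:
    """Replace %1, %2, ... in a message template with positional values.
    Placeholders without a matching value are left unchanged."""
    def substitute(match: "re.Match[str]") -> str:
        index = int(match.group(1)) - 1
        if 0 <= index < len(values):
            return values[index]
        return match.group(0)

    return re.sub(r"%(\d+)", substitute, template)


message = render("A bus timeout has occurred on system %1.", "lab-server-01")
print(message)
```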

The System %1 encountered a POST Error. (%1 = CIM_ComputerSystem.ElementName)

Error: A POST error has occurred. (Sensor = Firmware Error)

1. Update the server firmware on the primary page. Important: Some cluster solutions require specific code levels or coordinated code updates. If the device is part of a cluster solution, verify that the latest level of code is supported for the cluster solution before you update the code.

2. (Trained service technician only) Replace the system board.

A Uncorrectable Bus Error has occurred on system %1. (%1 = CIM_ComputerSystem.ElementName)

Error: A bus uncorrectable error has occurred. (Sensor = Critical Int PCI)

1. Check the system-event log.

2. Check the PCI error LEDs.

3. Remove the adapter from the indicated PCI slot.

4. Check for a server firmware update. Important: Some cluster solutions require specific code levels or coordinated code updates. If the device is part of a cluster solution, verify that the latest level of code is supported for the cluster solution before you update the code.

5. (Trained service technician only) Replace the system board.

A Uncorrectable Bus Error has occurred on system %1. (%1 = CIM_ComputerSystem.ElementName)

Error: A bus uncorrectable error has occurred. (Sensor = Critical Int CPU)

1. Check the system-event log.

2. Check the microprocessor error LEDs.

3. Remove the failing microprocessor from the system board.

5. Make sure that the two microprocessors are matching.

6. (Trained service technician only) Replace the system board.

A Uncorrectable Bus Error has occurred on system %1. (%1 = CIM_ComputerSystem.ElementName)

Error: A bus uncorrectable error has occurred. (Sensor = Critical Int DIM)

1. Check the system-event log.

2. Check the DIMM error LEDs.

3. Remove the failing DIMM from the system board.

5. Make sure that the installed DIMMs are supported and configured correctly.

6. (Trained service technician only) Replace the system board.

Message: Sensor Sys Board Fault has transitioned to critical from a less severe state.
Severity: Error
Description: A sensor has changed to Critical state from a less severe state.
Action:
1. Check the system-event log.
2. Check for an error LED on the system board.
3. Replace any failing device.
5. (Trained service technician only) Replace the system board.

Message: The Power Supply (Power Supply: n) has Failed. (n = power supply number)
Severity: Error
Description: Power supply n has failed. (n = power supply number)
Action:
1. If the power-on LED is lit, complete the following steps:
   a. Reduce the server to the minimum configuration.
   b. Reinstall the components one at a time, restarting the server each time.
   c. If the error recurs, replace the component that you just reinstalled.
2. Reseat power supply n.
3. Replace power supply n.
(n = power supply number)

Message: Sensor PS n Fan Fault has transitioned to critical from a less severe state. (n = power supply number)
Severity: Error
Description: A sensor has changed to Critical state from a less severe state.
Action:
1. Make sure that there are no obstructions, such as bundled cables, to the airflow from the power-supply fan.
2. Replace power supply n.
(n = power supply number)
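The minimum-configuration procedure in the power-supply action above (reinstall one component, restart, and replace the component that brings the error back) can be sketched in Python. This is an illustrative sketch only: the component names and the `error_recurs` callback are hypothetical stand-ins for physically reinstalling a part and restarting the server, and nothing here is part of any Lenovo tool.

```python
from typing import Callable, List, Optional, Sequence


def isolate_failing_component(
    components: Sequence[str],
    error_recurs: Callable[[List[str]], bool],
) -> Optional[str]:
    """Return the first component whose reinstallation makes the error recur."""
    installed: List[str] = []  # the server starts at the minimum configuration
    for component in components:
        installed.append(component)  # reinstall one component
        # Hypothetical check that stands in for restarting the server.
        if error_recurs(list(installed)):
            return component  # replace the component that was just reinstalled
    return None  # the error did not recur with any component


def simulated_restart(installed: List[str]) -> bool:
    # Assumption for the demo: the error recurs whenever this part is present.
    return "adapter in slot 2" in installed


suspect = isolate_failing_component(
    ["DIMM 3", "adapter in slot 2", "optical drive"], simulated_restart
)
```

Because only one component is added between restarts, a recurring error points at the component that was added last, which is why the procedure asks for a restart after each reinstallation.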


Message: Sensor Pwr Rail A Fault has transitioned to non-recoverable.
Severity: Error
Description: A sensor has changed to Nonrecoverable state.
Action:
1. Turn off the server and disconnect it from power.
2. (Trained service technician only) Remove the PCI adapter and microprocessor
1. Reinstall the microprocessor in socket 1 and restart the server.
3. Restart the server.
4. Reinstall each device, one at a time, starting the server each time to isolate the failing device.
5. Replace the failing device.
6. (Trained service technician only) Replace the system board.

Message: Sensor Pwr Rail B Fault has transitioned to non-recoverable.
Severity: Error
Description: A sensor has changed to Nonrecoverable state.
Action:
1. Turn off the server and disconnect it from power.
2. (Trained service technician only) Remove the PCI adapter and microprocessor
3. Restart the server.
4. Reinstall each device, one at a time, starting the server each time to isolate the failing device.
5. Replace the failing device.
6. (Trained service technician only) Replace the system board.


Message: Sensor Pwr Rail C Fault has transitioned to non-recoverable.
Severity: Error
Description: A sensor has changed to Nonrecoverable state.
Action:
1. Turn off the server and disconnect it from power.
2. Remove the hard disk drives, hard disk drive backplanes, and DIMMs in connectors 1 through 8.
3. Restart the server.
4. Reinstall each device, one at a time, starting the server each time to isolate the failing device.
5. Replace the failing device.
6. (Trained service technician only) Replace the system board.

Message: Sensor Pwr Rail D Fault has transitioned to non-recoverable.
Severity: Error
Description: A sensor has changed to Nonrecoverable state.
Action:
1. Turn off the server and disconnect it from power.
2. Remove the optical drive and the DIMMs in connectors 9 through 16.
3. Restart the server.
4. Reinstall the microprocessor in socket 1 and restart the server.
5. (Trained service technician only) Replace the failing microprocessor.
6. (Trained service technician only) Replace the system board.

Message: Sensor Pwr Rail E Fault has transitioned to non-recoverable.
Severity: Error
Description: A sensor has changed to Nonrecoverable state.
Action:
1. Turn off the server and disconnect it from power.
2. (Trained service technician only) Remove the optical drive and the PCI adapter.
3. Restart the server.
4. Reinstall each device, one at a time, starting the server each time to isolate the failing device.
5. Replace the failing device.
6. (Trained service technician only) Replace the system board.


Message: Sensor Pwr Rail F Fault has transitioned to non-recoverable.
Severity: Error
Description: A sensor has changed to Nonrecoverable state.
Action:
1. Turn off the server and disconnect it from power.
2. Remove the hard disk drives and the hard disk drive backplanes.
3. Restart the server.
4. Reinstall each device, one at a time, starting the server each time to isolate the failing device.
5. Replace the failing device.
6. (Trained service technician only) Replace the system board.

Message: Sensor PS n Therm Fault has transitioned to critical from a less severe state. (n = power supply number)
Severity: Error
Description: A sensor has changed to Critical state from a less severe state.
Action:
1. Make sure that there are no obstructions, such as bundled cables, to the airflow from the power-supply fan.
2. Replace power supply n.
(n = power supply number)

Message: Sensor PSn 12V OV Fault has transitioned to non-recoverable. (n = power supply number)
Severity: Error
Description: A sensor has changed to Nonrecoverable state.
Action:
1. Check the power-supply LED on the EasyLED panel (see “EasyLED diagnostics” on page 76).
2. Remove the power supplies.
3. Replace power supply n.
4. (Trained service technician only) Replace the system board.
(n = power supply number)

Message: Sensor PSn 12V UV Fault has transitioned to non-recoverable.
Severity: Error
Description: A sensor has changed to Nonrecoverable state.
Action:
1. Check the power-supply LED on the EasyLED panel (see “EasyLED diagnostics” on page 76).
2. Remove the power supplies.
3. Replace power supply n.
4. (Trained service technician only) Replace the system board.
(n = power supply number)


Message: Sensor PSn 12V OC Fault has transitioned to non-recoverable. (n = power supply number)
Severity: Error
Description: A sensor has changed to Nonrecoverable state.
Action:
1. Check the power-supply LED on the EasyLED panel (see “EasyLED diagnostics” on page 76).
2. Remove the power supplies.
3. Replace power supply n.
4. (Trained service technician only) Replace the system board.
(n = power supply number)

Message: Sensor PS n VCO Fault has transitioned to non-recoverable. (n = power supply number)
Severity: Error
Description: A sensor has changed to Nonrecoverable state.
Action:
1. Check the power-supply LED on the EasyLED panel (see “EasyLED diagnostics” on page 76).
2. Replace the failing power supply.
(n = power supply number)

Message: Redundancy Power Unit has been reduced.
Severity: Error
Description: Redundancy has been lost and is insufficient to continue operation.
Action:
1. Check the LEDs for both power supplies.
2. Follow the actions in “Power-supply LEDs” on page 88.

Message: Redundancy Cooling Zone 1 has been reduced.
Severity: Error
Description: Redundancy has been lost and is insufficient to continue operation.
Action:
1. Make sure that the connector on fan 1 and fan 4 (if installed) is not damaged.
2. Make sure that the fan connectors on the system board are not damaged.
3. Make sure that the fan cage is correctly installed.
4. Reseat the fan.
5. Replace the fan.

Message: Redundancy Cooling Zone 2 has been reduced.
Severity: Error
Description: Redundancy has been lost and is insufficient to continue operation.
Action:
1. Make sure that the connector on fan 2 and fan 5 (if installed) is not damaged.
2. Make sure that the fan connectors on the system board are not damaged.
3. Make sure that the fan cage is correctly installed.
4. Reseat the fan.
5. Replace the fan.


Message: Redundancy Cooling Zone 3 has been reduced.
Severity: Error
Description: Redundancy has been lost and is insufficient to continue operation.
Action:
1. Make sure that the connector on fan 3 and fan 6 (if installed) is not damaged.
2. Make sure that the fan connectors on the system board are not damaged.
3. Make sure that the fan cage is correctly installed.
4. Reseat the fan.
5. Replace the fan.
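The three Cooling Zone messages each name a pair of fans to inspect: zone 1 uses fans 1 and 4, zone 2 uses fans 2 and 5, and zone 3 uses fans 3 and 6. The following Python sketch shows one way a log-review script might look up that pair from the message text. The function name and the parsing approach are assumptions made for illustration; only the zone-to-fan pairing comes from this manual.

```python
import re
from typing import Tuple

# Zone-to-fan pairing taken from the Action entries for Cooling Zones 1 to 3.
COOLING_ZONE_FANS = {1: (1, 4), 2: (2, 5), 3: (3, 6)}


def fans_for_zone_message(message: str) -> Tuple[int, int]:
    """Return the fan numbers to check for a 'Redundancy Cooling Zone n' message."""
    match = re.search(r"Redundancy Cooling Zone (\d+) has been reduced", message)
    if match is None:
        raise ValueError("not a cooling-zone redundancy message")
    zone = int(match.group(1))
    if zone not in COOLING_ZONE_FANS:
        raise ValueError(f"unknown cooling zone: {zone}")
    return COOLING_ZONE_FANS[zone]


fans = fans_for_zone_message("Redundancy Cooling Zone 3 has been reduced.")
```

The second fan in each pair is described as "if installed", so a script that uses this lookup should not assume that both fans are present.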

Message: Sensor RAID Error has transitioned to critical from a less severe state.
Severity: Error
Description: A sensor has changed to Critical state from a less severe state.
Action:
1. Check the hard disk drive LEDs.
2. Reseat the hard disk drive for which the status LED is lit.
3. Replace the defective hard disk drive.

Message: The Drive n Status has been removed from unit Drive 0 Status. (n = hard disk drive number)
Severity: Error
Description: A drive has been removed.
Action: Reseat hard disk drive n. (n = hard disk drive number)

Message: The Drive n Status has been disabled due to a detected fault. (n = hard disk drive number)
Severity: Error
Description: A drive has been disabled because of a fault.
Action:
1. Run the hard disk drive diagnostic test on drive n.
2. Reseat the following components:
   a. Hard disk drive
   b. Cable from the system board to the backplane
3. Replace the following components one at a time, in the order shown, restarting the server each time:
   a. Hard disk drive
   b. Cable from the system board to the backplane
   c. Hard disk drive backplane
(n = hard disk drive number)

Message: Array %1 is in critical condition. (%1 = CIM_ComputerSystem.ElementName)
Severity: Error
Description: An array is in Critical state. (Sensor = Drive n Status) (n = hard disk drive number)

Message: Array %1 has failed. (%1 = CIM_ComputerSystem.ElementName)
Severity: Error
Description: An array is in Failed state. (Sensor = Drive n Status) (n = hard disk drive number)

Action: Replace the hard disk drive that is indicated by a lit status LED.


Message: Memory uncorrectable error detected for DIMM All DIMMs on Memory Subsystem All DIMMs.
Severity: Error
Description: A memory uncorrectable error has occurred.
Action:
1. If the server failed the POST memory test, reseat the DIMMs.
2. Replace any DIMM that is indicated by a lit error LED. Note: You do not have to replace DIMMs by pairs.
3. Run the Setup utility to enable all the DIMMs.
4. Run the DSA memory test.

Message: Memory Logging Limit Reached for DIMM All DIMMs on Memory Subsystem All DIMMs.
Severity: Error
Description: The memory logging limit has been reached.
Action:
1. Update the server firmware to the latest level. Important: Some cluster solutions require specific code levels or coordinated code updates. If the device is part of a cluster solution, verify that the latest level of code is supported for the cluster solution before you update the code.
2. Reseat the DIMMs and run the DSA memory test.
3. Replace any DIMM that is indicated by a lit error LED.

Message: Memory DIMM Configuration Error for All DIMMs on Memory Subsystem All DIMMs.
Severity: Error
Description: A DIMM configuration error has occurred.
Action: Make sure that DIMMs are installed in the correct sequence and have the same size, type, speed, and technology.

Message: Memory uncorrectable error detected for DIMM One of the DIMMs on Memory Subsystem One of the DIMMs.
Severity: Error
Description: A memory uncorrectable error has occurred.
Action:
1. If the server failed the POST memory test, reseat the DIMMs.
2. Replace any DIMM that is indicated by a lit error LED. Note: You do not have to replace DIMMs by pairs.
3. Run the Setup utility to enable all the DIMMs.
4. Run the DSA memory test.
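The DIMM configuration action above asks for DIMMs that have the same size, type, speed, and technology. As a rough illustration, the Python sketch below reports which of those four attributes differ across an inventory. The `Dimm` record, its field names, and the sample values are hypothetical and are not read from the server. The installation sequence is not checked, because the required sequence is not given in this section.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Dimm:
    # Hypothetical inventory record; field names are illustrative only.
    connector: int
    size_gb: int
    dimm_type: str
    speed_mhz: int
    technology: str


def mismatched_attributes(dimms: List[Dimm]) -> List[str]:
    """Return the attributes that are not identical across all DIMMs."""
    attributes = ["size_gb", "dimm_type", "speed_mhz", "technology"]
    return [
        name
        for name in attributes
        if len({getattr(dimm, name) for dimm in dimms}) > 1
    ]


inventory = [
    Dimm(connector=1, size_gb=4, dimm_type="RDIMM", speed_mhz=1333, technology="DDR3"),
    Dimm(connector=2, size_gb=4, dimm_type="RDIMM", speed_mhz=1066, technology="DDR3"),
]
problems = mismatched_attributes(inventory)
```

An empty result means the four listed attributes match. It does not confirm that the DIMMs are supported or installed in the correct connectors.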


Message: Memory Logging Limit Reached for DIMM One of the DIMMs on Memory Subsystem One of the DIMMs.
Severity: Error
Description: The memory logging limit has been reached.
Action:
2. Reseat the DIMMs and run the DSA memory test.
3. Replace any DIMM that is indicated by a lit error LED.

Message: Memory DIMM Configuration Error for One of the DIMMs on Memory Subsystem One of the DIMMs.
Severity: Error
Description: A DIMM configuration error has occurred.
Action: Make sure that DIMMs are installed in the correct sequence and have the same size, type, speed, and technology.

Message: Memory uncorrectable error detected for DIMM n Status on Memory Subsystem DIMM n Status. (n = DIMM number)
Severity: Error
Description: A memory uncorrectable error has occurred.
Action:
1. If the server failed the POST memory test, reseat the DIMMs.
2. Replace any DIMM that is indicated by a lit error LED. Note: You do not have to replace DIMMs by pairs.
3. Run the Setup utility to enable all the DIMMs.
4. Run the DSA memory test.
5. (Trained service technician only) Replace the system board.

Message: Memory Logging Limit Reached for DIMM n Status on Memory Subsystem DIMM n Status. (n = DIMM number)
Severity: Error
Description: The memory logging limit has been reached.
Action:
2. Reseat the DIMMs and run the DSA memory test.
3. Replace any DIMM that is indicated by a lit error LED.


Message: Memory DIMM Configuration Error for DIMM n Status on Memory Subsystem DIMM n Status. (n = DIMM number)
Severity: Error
Description: A DIMM configuration error has occurred.
Action: Make sure that DIMMs are installed in the correct sequence and have the same size, type, speed, and technology.

Message: Sensor DIMM n Temp has transitioned to critical from a less severe state. (n = DIMM number)
Severity: Error
Description: A sensor has changed to Critical state from a less severe state.
Action:
1. Make sure that the fans are operating, that there are no obstructions to the airflow, that the air baffles are in place and correctly installed, and that the server cover is installed and completely closed.
2. If a fan has failed, complete the action for a fan failure.
3. Replace DIMM n.
(n = DIMM number)

Message: A PCI PERR has occurred on system %1. (%1 = CIM_ComputerSystem.ElementName)
Severity: Error
Description: A PCI PERR has occurred. (Sensor = PCI Slot n; n = PCI slot number)
Action:
1. Check the extender-card LEDs.
2. Reseat the affected adapters and extender card.
3. Update the server and adapter firmware (UEFI and IMM). Important: Some cluster solutions require specific code levels or coordinated code updates. If the device is part of a cluster solution, verify that the latest level of code is supported for the cluster solution before you update the code.
4. Remove the adapter from slot n.
5. Replace the PCIe adapter.
6. Replace extender card n.
(n = PCI slot number)


Message: A PCI SERR has occurred on system %1. (%1 = CIM_ComputerSystem.ElementName)
Severity: Error
Description: A PCI SERR has occurred. (Sensor = PCI Slot n; n = PCI slot number)
Action:
1. Check the extender-card LEDs.
2. Reseat the affected adapters and extender card.
4. Remove the adapter from slot n.
5. Replace the PCIe adapter.
6. Replace extender card n.
(n = PCI slot number)

Message: A PCI PERR has occurred on system %1. (%1 = CIM_ComputerSystem.ElementName)
Severity: Error
Description: A PCI PERR has occurred. (Sensor = One of PCI Err)
Action:
1. Check the extender-card LEDs.
2. Reseat the affected adapters and riser card.
4. Remove both adapters.
5. Replace the PCIe adapter.
6. Replace the extender card.
7. (Trained service technician only) Replace the system board.


Message: A PCI SERR has occurred on system %1. (%1 = CIM_ComputerSystem.ElementName)
Severity: Error
Description: A PCI SERR has occurred. (Sensor = One of PCI Err)
Action:
1. Check the extender-card LEDs.
2. Reseat the affected adapters and extender card.
4. Remove both adapters.
5. Replace the PCIe adapter.
6. Replace the extender card.
7. (Trained service technician only) Replace the system board.

Message: Fault in slot System board on system %1. (%1 = CIM_ComputerSystem.ElementName)
Severity: Error
Action:
1. Check the extender-card LEDs.
2. Reseat the affected adapters and extender card.
4. Remove both adapters.
5. Replace the PCIe adapter.
6. Replace the extender card.
7. (Trained service technician only) Replace the system board.


Message: Redundancy Bckup Mem Status has been reduced.
Severity: Error
Description: Redundancy has been lost and is insufficient to continue operation.
Action:
1. Check the system-event log for DIMM failure events (uncorrectable or PFA) and correct the failures.
2. Re-enable mirroring in the Setup utility.

Message: IMM Network Initialization Complete.
Severity: Info
Description: An IMM network has completed initialization.
Action: No action; information only.

Message: Certificate Authority %1 has detected a %2 Certificate Error. (%1 = Lenovo_CertificateAuthority.CADistinguishedName; %2 = CIM_PublicKeyCertificate.ElementName)
Severity: Error
Description: A problem has occurred with the SSL Server, SSL Client, or SSL Trusted CA certificate that has been imported into the IMM. The imported certificate must contain a public key that corresponds to the key pair that was previously generated by the Generate a New Key and Certificate Signing Request link.
Action:
1. Make sure that the certificate that you are importing is correct.
2. Try importing the certificate again.

Message: Ethernet Data Rate modified from %1 to %2 by user %3. (%1 = CIM_EthernetPort.Speed; %2 = CIM_EthernetPort.Speed; %3 = user ID)
Severity: Info
Description: A user has modified the Ethernet port data rate.
Action: No action; information only.

Message: Ethernet Duplex setting modified from %1 to %2 by user %3. (%1 = CIM_EthernetPort.FullDuplex; %2 = CIM_EthernetPort.FullDuplex; %3 = user ID)
Severity: Info
Description: A user has modified the Ethernet port duplex setting.
Action: No action; information only.

Message: Ethernet MTU setting modified from %1 to %2 by user %3. (%1 = CIM_EthernetPort.ActiveMaximumTransmissionUnit; %2 = CIM_EthernetPort.ActiveMaximumTransmissionUnit; %3 = user ID)
Severity: Info
Description: A user has modified the Ethernet port MTU setting.
Action: No action; information only.

Message: Ethernet Duplex setting modified from %1 to %2 by user %3. (%1 = CIM_EthernetPort.NetworkAddresses; %2 = CIM_EthernetPort.NetworkAddresses; %3 = user ID)
Severity: Info
Description: A user has modified the Ethernet port MAC address setting.
Action: No action; information only.

Message: Ethernet interface %1 by user %2. (%1 = CIM_EthernetPort.EnabledState; %2 = user ID)
Severity: Info
Description: A user has enabled or disabled the Ethernet interface.
Action: No action; information only.
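The messages in this table use numbered placeholders (%1, %2, %3) that are replaced with values such as a port speed or a user ID. The Python sketch below shows how such a template could be filled in when reading or testing against these messages. The `fill_template` helper and the sample values are assumptions for illustration; the sketch does not describe how the IMM firmware builds its messages.

```python
import re


def fill_template(template: str, *values: str) -> str:
    """Replace %1, %2, ... with the corresponding positional value."""

    def substitute(match: "re.Match[str]") -> str:
        index = int(match.group(1)) - 1  # %1 maps to values[0]
        if 0 <= index < len(values):
            return values[index]
        return match.group(0)  # leave placeholders without a value untouched

    return re.sub(r"%(\d+)", substitute, template)


# Hypothetical values; real entries carry whatever the IMM recorded.
text = fill_template(
    "Ethernet Data Rate modified from %1 to %2 by user %3.",
    "100",
    "1000",
    "admin",
)
```

A percent sign that is not followed by a digit, as in "75% full", does not match the pattern and is left unchanged.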


Message: Hostname set to %1 by user %2. (%1 = CIM_DNSProtocolEndpoint.Hostname; %2 = user ID)
Severity: Info
Description: A user has modified the host name of the IMM.
Action: No action; information only.

Message: IP address of network interface modified from %1 to %2 by user %3. (%1 = CIM_IPProtocolEndpoint.IPv4Address; %2 = CIM_StaticIPAssignmentSettingData.IPAddress; %3 = user ID)
Severity: Info
Description: A user has modified the IP address of the IMM.
Action: No action; information only.

Message: IP subnet mask of network interface modified from %1 to %2 by user %3s. (%1 = CIM_IPProtocolEndpoint.SubnetMask; %2 = CIM_StaticIPAssignmentSettingData.SubnetMask; %3 = user ID)
Severity: Info
Description: A user has modified the IP subnet mask of the IMM.
Action: No action; information only.

Message: IP address of default gateway modified from %1 to %2 by user %3s. (%1 = CIM_IPProtocolEndpoint.GatewayIPv4Address; %2 = CIM_StaticIPAssignmentSettingData.DefaultGatewayAddress; %3 = user ID)
Severity: Info
Description: A user has modified the default gateway IP address of the IMM.
Action: No action; information only.

Message: OS Watchdog response %1 by %2. (%1 = Enabled or Disabled; %2 = user ID)
Severity: Info
Description: A user has enabled or disabled an OS Watchdog.
Action: No action; information only.

Message: DHCP[%1] failure, no IP address assigned. (%1 = IP address, xxx.xxx.xxx.xxx)
Severity: Info
Description: A DHCP server has failed to assign an IP address to the IMM.
Action:
1. Make sure that the network cable is connected.
2. Make sure that there is a DHCP server on the network that can assign an IP address to the IMM.

Message: Remote Login Successful. Login ID: %1 from %2 at IP address %3. (%1 = user ID; %2 = ValueMap(CIM_ProtocolEndpoint.ProtocolIFType); %3 = IP address, xxx.xxx.xxx.xxx)
Severity: Info
Description: A user has successfully logged in to the IMM.
Action: No action; information only.


Message: Attempting to %1 server %2 by user %3. (%1 = Power Up, Power Down, Power Cycle, or Reset; %2 = Lenovo_ComputerSystem.ElementName; %3 = user ID)
Severity: Info
Description: A user has used the IMM to perform a power function on the server.
Action: No action; information only.

Message: Security: Userid: '%1' had %2 login failures from WEB client at IP address %3. (%1 = user ID; %2 = MaximumSuccessiveLoginFailures (currently set to 5 in the firmware); %3 = IP address, xxx.xxx.xxx.xxx)
Severity: Error
Description: A user has exceeded the maximum number of unsuccessful login attempts from a Web browser and has been prevented from logging in for the lockout period.
Action:
1. Make sure that the correct login ID and password are being used.
2. Have the system administrator reset the login ID or password.

Message: Security: Login ID: '%1' had %2 login failures from CLI at %3. (%1 = user ID; %2 = MaximumSuccessiveLoginFailures (currently set to 5 in the firmware); %3 = IP address, xxx.xxx.xxx.xxx)
Severity: Error
Description: A user has exceeded the maximum number of unsuccessful login attempts from the command-line interface and has been prevented from logging in for the lockout period.
Action:
1. Make sure that the correct login ID and password are being used.
2. Have the system administrator reset the login ID or password.

Message: Remote access attempt failed. Invalid userid or password received. Userid is '%1' from WEB browser at IP address %2. (%1 = user ID; %2 = IP address, xxx.xxx.xxx.xxx)
Severity: Error
Description: A user has attempted to log in from a Web browser by using an invalid login ID or password.
Action:
1. Make sure that the correct login ID and password are being used.
2. Have the system administrator reset the login ID or password.

Message: Remote access attempt failed. Invalid userid or password received. Userid is '%1' from TELNET client at IP address %2. (%1 = user ID; %2 = IP address, xxx.xxx.xxx.xxx)
Severity: Error
Description: A user has attempted to log in from a Telnet session by using an invalid login ID or password.
Action:
1. Make sure that the correct login ID and password are being used.
2. Have the system administrator reset the login ID or password.

Message: The Chassis Event Log (CEL) on system %1 cleared by user %2. (%1 = CIM_ComputerSystem.ElementName; %2 = user ID)
Severity: Info
Description: A user has cleared the IMM event log.
Action: No action; information only.

Message: IMM reset was initiated by user %1. (%1 = user ID)
Severity: Info
Description: A user has initiated a reset of the IMM.
Action: No action; information only.
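The two Security messages report that a user reached MaximumSuccessiveLoginFailures, which this manual gives as 5. The Python sketch below models a counter of successive failures per user. It is a simplified model built on stated assumptions: that a successful login resets the count, and that the limit is reached on the fifth failure in a row. The class name is hypothetical, the lockout period is not modeled because its length is not given here, and the actual IMM firmware logic may differ.

```python
from typing import Dict

# Value quoted in the message descriptions ("currently set to 5 in the firmware").
MAX_SUCCESSIVE_LOGIN_FAILURES = 5


class LoginFailureCounter:
    """Track successive login failures per user ID (illustrative model only)."""

    def __init__(self, limit: int = MAX_SUCCESSIVE_LOGIN_FAILURES) -> None:
        self.limit = limit
        self.failures: Dict[str, int] = {}

    def record(self, user_id: str, success: bool) -> bool:
        """Record one login attempt; return True when the limit has been reached."""
        if success:
            self.failures[user_id] = 0  # assumption: a success resets the count
            return False
        self.failures[user_id] = self.failures.get(user_id, 0) + 1
        return self.failures[user_id] >= self.limit


counter = LoginFailureCounter()
results = [counter.record("admin", success=False) for _ in range(5)]
```

In this model the first four failed attempts return False and the fifth returns True, which corresponds to the point at which a Security message would be expected.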


Message: ENET[0] DHCP-HSTN=%1, DN=%2, IP@=%3, SN=%4, GW@=%5, DNS1@=%6. (%1 = CIM_DNSProtocolEndpoint.Hostname; %2 = CIM_DNSProtocolEndpoint.DomainName; %3 = CIM_IPProtocolEndpoint.IPv4Address; %4 = CIM_IPProtocolEndpoint.SubnetMask; %5 = IP address, xxx.xxx.xxx.xxx; %6 = IP address, xxx.xxx.xxx.xxx)
Severity: Info
Description: The DHCP server has assigned an IMM IP address and configuration.
Action: No action; information only.

Message: ENET[0] IP-Cfg:HstName=%1, IP@%2, NetMsk=%3, GW@=%4. (%1 = CIM_DNSProtocolEndpoint.Hostname; %2 = CIM_StaticIPSettingData.IPv4Address; %3 = CIM_StaticIPSettingData.SubnetMask; %4 = CIM_StaticIPSettingData.DefaultGatewayAddress)
Severity: Info
Description: An IMM IP address and configuration have been assigned using client data.
Action: No action; information only.

Message: LAN: Ethernet[0] interface is no longer active.
Severity: Info
Description: The IMM Ethernet interface has been disabled.
Action: No action; information only.

Message: LAN: Ethernet[0] interface is now active.
Severity: Info
Description: The IMM Ethernet interface has been enabled.
Action: No action; information only.

Message: DHCP setting changed to by user %1. (%1 = user ID)
Severity: Info
Description: A user has changed the DHCP mode.
Action: No action; information only.

Message: IMM: Configuration %1 restored from a configuration file by user %2. (%1 = CIM_ConfigurationData.ConfigurationName; %2 = user ID)
Severity: Info
Description: A user has restored the IMM configuration by importing a configuration file.
Action: No action; information only.

Message: Watchdog %1 Screen Capture Occurred. (%1 = OS Watchdog or Loader Watchdog)
Severity: Error
Description: An operating-system error has occurred, and the screen capture was successful.
Action:
1. Reconfigure the watchdog timer to a higher value.
2. Make sure that the IMM Ethernet over USB interface is enabled.
3. Reinstall the RNDIS or cdc_ether device driver for the operating system.
4. Disable the watchdog.
5. Check the integrity of the installed operating system.


Message: Watchdog %1 Failed to Capture Screen. (%1 = OS Watchdog or Loader Watchdog)
Severity: Error
Description: An operating-system error has occurred, and the screen capture failed.
Action:
1. Reconfigure the watchdog timer to a higher value.
2. Make sure that the IMM Ethernet over USB interface is enabled.
3. Reinstall the RNDIS or cdc_ether device driver for the operating system.
4. Disable the watchdog.
5. Check the integrity of the installed operating system.
6. Update the IMM firmware. Important: Some cluster solutions require specific code levels or coordinated code updates. If the device is part of a cluster solution, verify that the latest level of code is supported for the cluster solution before you update the code.

Message: Running the backup IMM main application.
Severity: Error
Description: The IMM has resorted to running the backup main application.
Action: Update the IMM firmware. Important: Some cluster solutions require specific code levels or coordinated code updates. If the device is part of a cluster solution, verify that the latest level of code is supported for the cluster solution before you update the code.

Message: Please ensure that the IMM is flashed with the correct firmware. The IMM is unable to match its firmware to the server.
Severity: Error
Description: The server does not support the installed IMM firmware version.
Action: Update the IMM firmware to a version that the server supports. Important: Some cluster solutions require specific code levels or coordinated code updates. If the device is part of a cluster solution, verify that the latest level of code is supported for the cluster solution before you update the code.

Message: IMM reset was caused by restoring default values.
Severity: Info
Description: The IMM has been reset because a user has restored the configuration to its default settings.
Action: No action; information only.

Message: IMM clock has been set from NTP server %1. (%1 = Lenovo_NTPService.ElementName)
Severity: Info
Description: The IMM clock has been set to the date and time that is provided by the Network Time Protocol server.
Action: No action; information only.


Message: SSL data in the IMM configuration data is invalid. Clearing configuration data region and disabling SSL.
Severity: Error
Description: There is a problem with the certificate that has been imported into the IMM. The imported certificate must contain a public key that corresponds to the key pair that was previously generated through the Generate a New Key and Certificate Signing Request link.
Action:
1. Make sure that the certificate that you are importing is correct.
2. Try to import the certificate again.

Message: Flash of %1 from %2 succeeded for user %3. (%1 = CIM_ManagedElement.ElementName; %2 = Web or LegacyCLI; %3 = user ID)
Severity: Info
Description: A user has successfully updated one of the following firmware components:
• IMM main application
• IMM boot ROM
• Server firmware
• Diagnostics
• Integrated service processor
Action: No action; information only.

Message: Flash of %1 from %2 failed for user %3. (%1 = CIM_ManagedElement.ElementName; %2 = Web or LegacyCLI; %3 = user ID)
Severity: Info
Description: An attempt to update a firmware component from the interface and IP address has failed.
Action: Try to update the firmware again.

Message: The Chassis Event Log (CEL) on system %1 is 75% full. (%1 = CIM_ComputerSystem.ElementName)
Severity: Info
Description: The IMM event log is 75% full. When the log is full, older log entries are replaced by newer ones.
Action: To avoid losing older log entries, save the log as a text file and clear the log.

Message: The Chassis Event Log (CEL) on system %1 is 100% full. (%1 = CIM_ComputerSystem.ElementName)
Severity: Info
Description: The IMM event log is full. When the log is full, older log entries are replaced by newer ones.
Action: To avoid losing older log entries, save the log as a text file and clear the log.

Message: %1 Platform Watchdog Timer expired for %2. (%1 = OS Watchdog or Loader Watchdog; %2 = OS Watchdog or Loader Watchdog)
Severity: Error
Description: A Platform Watchdog Timer Expired event has occurred.
Action:
1. Reconfigure the watchdog timer to a higher value.
2. Make sure that the IMM Ethernet over USB interface is enabled.
3. Reinstall the RNDIS or cdc_ether device driver for the operating system.
4. Disable the watchdog.
5. Check the integrity of the installed operating system.

Message: IMM Test Alert Generated by %1. (%1 = user ID)
Severity: Info
Description: A user has generated a test alert from the IMM.
Action: No action; information only.
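The message texts in this table use numbered placeholders (%1, %2, %3). When an event log has been saved as a text file, the placeholders can be turned into a pattern for picking individual events out of the file. The following is only a minimal sketch under stated assumptions: the function names and the sample log line are hypothetical, each placeholder is assumed to appear once per template, and the layout of a real saved log may differ.

```python
import re

def template_to_regex(template):
    """Convert a message template with %1, %2, ... into a compiled regex.

    Assumes each placeholder number appears only once in the template.
    """
    parts = re.split(r"%(\d+)", template)
    pattern = ""
    for index, part in enumerate(parts):
        if index % 2 == 0:
            # Literal text between placeholders: match it exactly.
            pattern += re.escape(part)
        else:
            # Placeholder number: capture it as a named group p1, p2, ...
            pattern += "(?P<p%s>.+?)" % part
    return re.compile("^" + pattern + "$")

# Template text taken from the table above.
FLASH_OK = template_to_regex("Flash of %1 from %2 succeeded for user %3.")

def parse_flash_event(line):
    """Return the placeholder values for a matching line, or None."""
    match = FLASH_OK.match(line.strip())
    return match.groupdict() if match else None

# Hypothetical sample line, for illustration only.
sample = "Flash of IMM main application from Web succeeded for user USERID."
print(parse_flash_event(sample))
```

The same helper can be applied to the other templates in the table, for example the "failed" variant of the flash message, by passing a different template string.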


Message: Security: Userid: '%1' had %2 login failures from an SSH client at IP address %3. (%1 = user ID; %2 = MaximumSuccessiveLoginFailures (currently set to 5 in the firmware); %3 = IP address, xxx.xxx.xxx.xxx)
Severity: Error
Description: A user has exceeded the maximum number of unsuccessful login attempts from SSH and has been prevented from logging in for the lockout period.
Action:
1. Make sure that the correct login ID and password are being used.
2. Have the system administrator reset the login ID or password.


Troubleshooting tables

Use the troubleshooting tables to find solutions to problems that have identifiable symptoms.

If you cannot find a problem in these tables, see “Running the diagnostic programs” on page 90 for information about testing the server.


If you have just added new software or a new optional device and the server is not working, complete the following steps before you use the troubleshooting tables:

1. Check the operator information panel and the EasyLED diagnostics LEDs (see “EasyLED diagnostics” on page 76).

2. Remove the software or device that you just added.

3. Run the diagnostic tests to determine whether the server is running correctly.

4. Reinstall the new software or new device.

DVD drive problems


Symptom: The DVD drive is not recognized.
Action:
1. Make sure that:
   • The SATA channel to which the DVD drive is attached (primary or secondary) is enabled in the Setup utility.
   • All cables and jumpers are installed correctly.
   • The signal cable and connector are not damaged and the connector pins are not bent.
   • The correct device driver is installed for the DVD drive.
2. Run the DVD drive diagnostic programs.
3. Reseat the following components:
   a. DVD drive
   b. DVD drive cables
4. Replace the following components one at a time, in the order shown, restarting the server each time:
   a. DVD drive
   b. DVD drive and cables
   c. (Trained service technician only) System board

Symptom: A DVD is not working correctly.
Action:
1. Clean the DVD.
2. Run the DVD drive diagnostic programs.
3. Reseat the DVD drive.
4. Replace the DVD drive.


Symptom: The DVD drive tray is not working.
Action:
1. Make sure that the server is turned on.
2. Insert the end of a straightened paper clip into the manual tray-release opening.
3. Reseat the DVD drive.
4. Replace the DVD drive.

General problems


Symptom: A cover lock is broken, an LED is not working, or a similar problem has occurred.
Action: If the part is a CRU, replace it. If the part is a FRU, the part must be replaced by a trained service technician.

Hard disk drive problems


Symptom: Not all drives are recognized by the hard disk drive diagnostic tests.
Action: Remove the drive that is indicated by the diagnostic tests; then, run the hard disk drive diagnostic tests again. If the remaining drives are recognized, replace the drive that you removed with a new one.

Symptom: The server stops responding during the hard disk drive diagnostic test.
Action: Remove the hard disk drive that was being tested when the server stopped responding, and run the diagnostic test again. If the hard disk drive diagnostic test runs successfully, replace the drive that you removed with a new one.

Symptom: A hard disk drive was not detected while the operating system was being started.
Action: Reseat all hard disk drives and cables; then, run the hard disk drive diagnostic tests again.


Symptom: A hard disk drive passes the diagnostic Fixed Disk Test, but the problem remains.
Action: Run the diagnostic SCSI Fixed Disk Test (see “Running the diagnostic programs” on page 90).
Note: This test is not available on servers that have RAID arrays or servers that have SATA hard disk drives.

Intermittent problems


Symptom: A problem occurs only occasionally and is difficult to diagnose.
Action:
1. Make sure that:
   • All cables and cords are connected securely to the rear of the server and attached devices.
   • When the server is turned on, air is flowing from the fan grille. If there is no airflow, the fan is not working. This can cause the server to overheat and shut down.
2. Check the system-event log or IMM log (see “Event logs” on page 27).
3. See “Solving undetermined problems” on page 124.


Keyboard, mouse, or pointing-device problems


Symptom: All or some keys on the keyboard do not work.
Action:
1. Make sure that:
   • The keyboard cable is securely connected.
   • The server and the monitor are turned on.
2. See http://www.lenovo.com/thinkserver and then click Options. Open the Server Options Guide.pdf for keyboard compatibility.
3. If you are using a USB keyboard, run the Setup utility and enable keyboardless operation to prevent the 301 POST error message from being displayed during startup.
4. If you are using a USB keyboard and it is connected to a USB hub, disconnect the keyboard from the hub and connect it directly to the server.
5. Replace the following components one at a time, in the order shown, restarting the server each time:
   a. Keyboard
   b. (Trained service technician only) System board

Symptom: The mouse or pointing device does not work.
Action:
1. Make sure that:
   • The mouse or pointing device is compatible with the server. See http://www.lenovo.com/thinkserver and then click Options. Open the Server Options Guide.pdf.
   • The mouse or pointing-device cable is securely connected to the server.
   • The mouse or pointing-device device drivers are installed correctly.
   • The server and the monitor are turned on.
   • The mouse is enabled in the Setup utility.
2. If you are using a USB mouse or pointing device and it is connected to a USB hub, disconnect the mouse or pointing device from the hub and connect it directly to the server.
3. Replace the following components one at a time, in the order shown, restarting the server each time:
   a. Mouse or pointing device
   b. (Trained service technician only) System board


Memory problems


Symptom: The amount of system memory that is displayed is less than the amount of installed physical memory.
Action:
1. Make sure that:
   • No error LEDs are lit on the operator information panel or on the DIMM.
   • Memory mirroring does not account for the discrepancy.
   • The memory modules are seated correctly.
   • You have installed the correct type of memory.
   • If you changed the memory, you updated the memory configuration in the Setup utility.
   • All banks of memory are enabled. The server might have automatically disabled a memory bank when it detected a problem, or a memory bank might have been manually disabled.
2. Check the POST event log for DIMM error messages:
   • If a DIMM was disabled by a system-management interrupt (SMI), replace the DIMM.
   • If a DIMM was disabled by the user or by POST, run the Setup utility and enable the DIMM.
3. Run memory diagnostics (see “Running the diagnostic programs” on page 90).
4. Make sure that there is no memory mismatch when the server is at the minimum memory configuration (two 512 MB DIMMs; see the information about the minimum required configuration in “Solving undetermined problems” on page 124).
5. Add one pair of DIMMs at a time, making sure that the DIMMs in each pair are matching.
6. Reseat the DIMMs.
7. Replace the components in step 6, one at a time, in the order shown, restarting the server each time.

Symptom: Multiple rows of DIMMs in a branch are identified as failing.
Action:
1. Reseat the DIMMs; then, restart the server.
2. Replace the lowest-numbered DIMMs with identical known good DIMMs; then, restart the server. Repeat as necessary. If the failures continue after all identified pairs are replaced, go to step 4.
3. Return the removed DIMMs, one pair at a time, to their original connectors, restarting the server after each pair, until a pair fails. Replace each DIMM in the failed pair with an identical known good DIMM, restarting the server after you reinstall each DIMM. Replace the failed DIMM. Repeat step 3 until you have tested all removed DIMMs.
4. (Trained service technician only) Replace the system board.


Microprocessor problems


Symptom: The server emits a continuous beep during POST, indicating that the startup (boot) microprocessor is not working correctly.
Action:
1. Correct any errors that are indicated by the EasyLED diagnostics LEDs (see “EasyLED diagnostics” on page 76).
2. Make sure that the server supports all the microprocessors and that the microprocessors match in speed and cache size.
3. (Trained service technician only) Reseat microprocessor 1.
4. (Trained service technician only) If there is no indication of which microprocessor has failed, isolate the error by testing with one microprocessor at a time.
5. Replace the following components one at a time, in the order shown, restarting the server each time:
   a. (Trained service technician only) Microprocessor 2
   b. VRM 2
   c. (Trained service technician only) System board
6. (Trained service technician only) If multiple error codes or EasyLED diagnostics LEDs indicate a microprocessor error, reverse the locations of two microprocessors to determine whether the error is associated with a microprocessor or with a microprocessor socket.
   • If the error is associated with a microprocessor, replace the microprocessor.
   • If the error is associated with a VRM, replace the VRM.
   • If the error is associated with a microprocessor socket, replace the system board.


Monitor problems

Some Lenovo monitors have their own self-tests. If you suspect a problem with your monitor, see the documentation that comes with the monitor for instructions for testing and adjusting the monitor.


Symptom: Testing the monitor
Action:
1. Make sure that the monitor cables are firmly connected.
2. Try using a different monitor on the server, or try using the monitor that is being tested on a different server.
3. Run the diagnostic programs. If the monitor passes the diagnostic programs, the problem might be a video device driver.
4. (Trained service technician only) Replace the system board.

Symptom: The screen is blank.
Action:
1. If the server is attached to a KVM switch, bypass the KVM switch to eliminate it as a possible cause of the problem: connect the monitor cable directly to the correct connector on the rear of the server.
2. Make sure that:
   • The server is turned on. If there is no power to the server, see “Power problems” on page 73.
   • The monitor cables are connected correctly.
   • The monitor is turned on and the brightness and contrast controls are adjusted correctly.
   • No POST errors are generated when the server is turned on.
3. Make sure that the correct server is controlling the monitor, if applicable.
4. See “Solving undetermined problems” on page 124.

Symptom: The monitor works when you turn on the server, but the screen goes blank when you start some application programs.
Action:
1. Make sure that:
   • The application program is not setting a display mode that is higher than the capability of the monitor.
   • You installed the necessary device drivers for the application.
2. Run video diagnostics (see “Running the diagnostic programs” on page 90).
   • If the server passes the video diagnostics, the video is good; see “Solving undetermined problems” on page 124.
   • (Trained service technician only) If the server fails the video diagnostics, replace the system board.


Symptom: The monitor has screen jitter, or the screen image is wavy, unreadable, rolling, or distorted.
Action:
1. If the monitor self-tests show that the monitor is working correctly, consider the location of the monitor. Magnetic fields around other devices (such as transformers, appliances, fluorescent lights, and other monitors) can cause screen jitter or wavy, unreadable, rolling, or distorted screen images. If this happens, turn off the monitor.
   Attention: Moving a color monitor while it is turned on might cause screen discoloration.
   Move the device and the monitor at least 305 mm (12 in.) apart, and turn on the monitor.
   Notes:
   a. To prevent diskette drive read/write errors, make sure that the distance between the monitor and any external diskette drive is at least 76 mm (3 in.).
   b. Non-Lenovo monitor cables might cause unpredictable problems.
2. Reseat the monitor.
3. Replace the following components one at a time, in the order shown, restarting the server each time:
   a. Monitor
   b. (Trained service technician only) System board

Symptom: Wrong characters appear on the screen.
Action:
1. If the wrong language is displayed, update the server firmware with the correct language (see “Updating the firmware” on page 267).
2. Reseat the monitor.
3. Replace the following components one at a time, in the order shown, restarting the server each time:
   a. Monitor
   b. (Trained service technician only) System board


Optional-device problems


Symptom: An optional device that was just installed does not work.
Action:
1. Make sure that:
   • The device is designed for the server (see http://www.lenovo.com/thinkserver and then click Options. Open the Server Options Guide.pdf).
   • You followed the installation instructions that came with the device and the device is installed correctly.
   • You have not loosened any other installed devices or cables.
   • You updated the configuration information in the Setup utility. Whenever memory or any other device is changed, you must update the configuration.
2. Reseat the device that you just installed.
3. Replace the device that you just installed.

Symptom: An optional device that used to work does not work now.
Action:
1. Make sure that all of the hardware and cable connections for the device are secure.
2. If the device comes with test instructions, use those instructions to test the device.
3. If the failing device is a SCSI device, make sure that:
   • The cables for all external SCSI devices are connected correctly.
   • The last device in each SCSI chain, or the end of the SCSI cable, is terminated correctly.
   • Any external SCSI device is turned on. You must turn on an external SCSI device before you turn on the server.
4. Reseat the failing device.
5. Replace the failing device.


Power problems


Symptom: The power-control button does not work (the server does not start).
Note: The power-control button will not function until 3 minutes after the server has been connected to ac power.
Action:
1. Make sure that the power-control button is working correctly:
   a. Disconnect the server power cords.
   b. Reconnect the power cords.
   c. (Trained service technician only) Reseat the operator information panel cables, and then repeat steps 1a and 1b. If the server starts, reseat the operator information panel. If the problem remains, replace the operator information panel.
2. Make sure that:
   • The power cords are correctly connected to the server and to a working electrical outlet.
   • The type of memory that is installed is correct.
   • The DIMM is fully seated.
   • The LEDs on the power supply do not indicate a problem.
   • The microprocessors are installed in the correct sequence.
3. Reseat the following components:
   a. DIMMs
   b. (Trained service technician only) Power switch connector
   c. (Trained service technician only) Power backplane
4. Replace the following components one at a time, in the order shown, restarting the server each time:
   a. DIMMs
   b. (Trained service technician only) Power switch connector
   c. (Trained service technician only) Power backplane
   d. (Trained service technician only) System board
5. If you just installed an optional device, remove it, and restart the server. If the server now starts, you might have installed more devices than the power supply supports.
6. See “Power-supply LEDs” on page 88.
7. See “Solving undetermined problems” on page 124.

Symptom: The server does not turn off.
Action:
1. Determine whether you are using an Advanced Configuration and Power Interface (ACPI) or a non-ACPI operating system. If you are using a non-ACPI operating system, complete the following steps:
   a. Press Ctrl+Alt+Delete.
   b. Turn off the server by pressing the power-control button for 5 seconds.
   c. Restart the server.
   d. If the server fails POST and the power-control button does not work, disconnect the power cord for 20 seconds; then, reconnect the power cord and restart the server.
2. If the problem remains or if you are using an ACPI-aware operating system, suspect the system board.


Symptom: The server unexpectedly shuts down, and the LEDs on the operator information panel are not lit.
Action: See “Solving undetermined problems” on page 124.

Serial port problems


Symptom: The number of serial ports that are identified by the operating system is less than the number of installed serial ports.
Action:
1. Make sure that:
   • Each port is assigned a unique address in the Setup utility and none of the serial ports is disabled.
   • The serial port adapter (if one is present) is seated correctly.
2. Reseat the serial port adapter.
3. Replace the serial port adapter.

Symptom: A serial device does not work.
Action:
1. Make sure that:
   • The device is compatible with the server.
   • The serial port is enabled and is assigned a unique address.
   • The device is connected to the correct connector.
2. Reseat the following components:
   a. Failing serial device
   b. Serial cable
3. Replace the following components one at a time, in the order shown, restarting the server each time:
   a. Failing serial device
   b. Serial cable
   c. (Trained service technician only) System board


Software problems


Symptom: You suspect a software problem.
Action:
1. To determine whether the problem is caused by the software, make sure that:
   • The server has the minimum memory that is needed to use the software. For memory requirements, see the information that comes with the software. If you have just installed an adapter or memory, the server might have a memory-address conflict.
   • The software is designed to operate on the server.
   • Other software works on the server.
   • The software works on another server.
2. If you receive any error messages while you use the software, see the information that comes with the software for a description of the messages and suggested solutions to the problem.
3. Contact the software vendor.

Universal Serial Bus (USB) port problems


Symptom: A USB device does not work.
Action:
1. Run USB diagnostics (see “Running the diagnostic programs” on page 90).
2. Make sure that:
   • The correct USB device driver is installed.
   • The operating system supports USB devices.
   • A standard PS/2 keyboard or mouse is not connected to the server. If it is, a USB keyboard or mouse will not work during POST.
3. Make sure that the USB configuration optional devices are set correctly in the Setup utility (see “Setup Utility menu choices” on page 252 for more information).
4. If you are using a USB hub, disconnect the USB device from the hub and connect it directly to the server.


EasyLED diagnostics

EasyLED diagnostics is a system of LEDs on various external and internal components of the server. When an error occurs, LEDs are lit throughout the server. By viewing the LEDs in a particular order, you can often identify the source of the error.

When LEDs are lit to indicate an error, they remain lit when the server is turned off, provided that the server is still connected to power and the power supply is operating correctly.

Before you work inside the server to view the EasyLED diagnostics LEDs, read the safety information that begins on page 5.

If an error occurs, view the EasyLED diagnostics LEDs in the following order:

1. Look at the operator information panel LEDs on the front of the server.
   • If an operator information panel LED is lit, it indicates that information about a suboptimal condition in the server is available in the system-event log.
   • If the system-error LED is lit, it indicates that an error has occurred; go to step 2 on page 77.

The following illustration shows the operator information panel LEDs that are visible through the bezel.

1 System power-on LED
2 Hard disk drive activity LED
3 System-locator LED
4 System-information LED
5 System-error LED

The following table lists the operator information panel LEDs, the problems that they indicate, and actions to solve the problems.


LED: System power (green)
Description:
- Off: AC power is not present, or the power supply or the LED itself has failed.
- Flashing rapidly (4 times per second): The server is turned off and is not ready to be turned on. The power-control button is disabled. Approximately 3 minutes after the server is connected to ac power, the power-control button becomes active.
- Flashing slowly (once per second): The server is turned off and is ready to be turned on. You can press the power-control button to turn on the server.
- Lit: The server is turned on.
- Fading on and off: The server is in a reduced-power state. To wake the server, press the power-control button or use the IMM Web interface.

LED: Hard disk drive activity (green)
Description: When this LED is flashing rapidly, it indicates that there is activity on a hard disk drive.

LED: System information (amber)
Description: When this amber LED is lit, it indicates that information about a suboptimal condition in the server is available in the IMM event log or in the system-event log. Check the EasyLED panel for more information.

LED: System error (amber)
Description: When this LED is lit, it indicates that a system error has occurred. Use the EasyLED panel and the system service label to further isolate the error.
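For technicians who record LED observations in a script or checklist tool, the system power LED states described above can be captured as a small lookup. The following Python sketch is illustrative only: the names `PowerLed`, `POWER_LED_MEANING`, and `ready_to_power_on` are hypothetical and are not part of any Lenovo utility, and the meanings are paraphrased from the table.

```python
from enum import Enum


class PowerLed(Enum):
    """Observed behavior of the green system power LED (hypothetical names)."""
    OFF = "off"
    FLASHING_RAPIDLY = "flashing rapidly"  # about 4 times per second
    FLASHING_SLOWLY = "flashing slowly"    # about once per second
    LIT = "lit"
    FADING = "fading on and off"


# Meanings paraphrased from the operator information panel table.
POWER_LED_MEANING = {
    PowerLed.OFF: "AC power is not present, or the power supply or the LED has failed.",
    PowerLed.FLASHING_RAPIDLY: "Server is off and not ready to be turned on; "
                               "the power-control button is disabled.",
    PowerLed.FLASHING_SLOWLY: "Server is off and ready to be turned on.",
    PowerLed.LIT: "Server is turned on.",
    PowerLed.FADING: "Server is in a reduced-power state.",
}


def ready_to_power_on(state: PowerLed) -> bool:
    """True only for the state in which the table says the server is ready to be turned on."""
    return state is PowerLed.FLASHING_SLOWLY


if __name__ == "__main__":
    for state in PowerLed:
        print(f"{state.value}: {POWER_LED_MEANING[state]}")
```

Keeping the states in an enumeration rather than free text makes it harder to log an observation that the table does not define.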

2. Look at the EasyLED panel on the front of the server. Lit LEDs on the EasyLED panel indicate the type of error that has occurred.

The following illustration shows the EasyLED panel LEDs that are visible through the bezel.


1 Service processor bus
2 Microprocessor
3 VRM
4 Microprocessor/memory configuration
5 Memory
6 NMI
7 Hard disk drive/RAID
8 Power supply
9 Fan
10 PCI bus
11 System board
12 Temperature
13 System-event log
14 USB ports

The following table lists the EasyLED diagnostics LEDs, the problems that they indicate, and actions to solve the problems.


Each entry lists an EasyLED diagnostics LED that is lit while the system-error or information LED is also lit, followed by its description and the suggested action.

LED: System-event log (LOG)
Description: A system error occurred.
Action: View the contents of the system-event log (see “Event logs” on page 27).

LED: Temperature
Description: The system temperature has exceeded a threshold level.
Action:
1. See the system-event log for the source of the fault (see “System-event log” on page 38).
2. Make sure that the airflow in the server is not blocked.
3. Make sure that the room temperature is neither too hot nor too cold (see “Environment” in “Specifications” on page 17).


LED: System board (BRD)
Description: An error occurred on the system board.
Action:
1. Check the LEDs on the system board to identify the component that is causing the error. The BRD LED can be lit for the following conditions:
   - Failed or missing battery
   - Failed voltage regulator
2. Check the system-event log for information about the error.
3. Replace any failed or missing replaceable components, such as the battery.
4. (Trained service technician only) If a voltage regulator has failed, replace the system board.

LED: PCI bus
Description: A PCI adapter has failed.
Action:
1. See the system-event log (see “System-event log” on page 38).
2. Check the LEDs on the PCI slots to identify the component that is causing the error, and reseat the failing adapter.
3. Replace the following components one at a time, in the order shown, restarting the server each time:
   a. Failing adapter
   b. (Trained service technician only) System board

LED: Fan
Description: A fan has failed or is operating too slowly.
Action:
1. Reinstall the removed fan.
2. If an individual fan LED is lit, replace the fan.
3. (Trained service technician only) Replace the system board.

LED: Power supply
Description: A power supply has failed or has been removed.
Note: In a redundant power configuration, the dc power LED on one power supply might be off.
Action:
1. Check the individual power-supply LEDs.
2. Reseat the following components:
   a. Power supply
   b. (Trained service technician only) Power-supply cage cables
3. Replace the following components one at a time, in the order shown, restarting the server each time:
   a. Power supply
   b. (Trained service technician only) Power-supply cage


LED: DASD/RAID
Description: A hard disk drive, SAS controller, or RAID adapter error has occurred.
Notes:
1. This LED is also lit when a hard disk drive is removed from the server.
2. The error LED on the failing hard disk drive is also lit.
3. Check the system-event log for a RAID error.
Action:
1. Reinstall the removed drive.
2. Reseat the following components:
   a. Failing hard disk drive
   b. SAS hard disk drive backplane
   c. SAS signal and power cables
   d. System board
   e. ServeRAID adapter
3. Replace the components listed in step 2 one at a time, in the order shown, restarting the server each time.

LED: NMI
Description: A hardware error has been reported to the operating system.
Action:
1. See the system-event log (see “System-event log” on page 38).
2. If the PCI LED is lit, follow the instructions for that LED.
3. If the MEM LED is lit, follow the instructions for that LED.
4. Restart the server.


LED: Memory (MEM)
Description: A memory error has occurred.
Note: The error LED on the DIMM is also lit.
Action:
1. Determine whether the CNFG LED is also lit, which indicates that the memory configuration is invalid. Reinstall the DIMMs in a supported configuration.
2. If the CNFG LED is not lit, one of the following conditions might be present:
   - The server did not start and a failing DIMM LED is lit:
     a. Check for a PFA log event in the system-event log.
     b. Reseat the DIMM.
     c. Move the DIMM to a different slot or replace the DIMM.
     d. (Trained service technician only) Replace the system board.
   - The server started, the failing DIMM is disabled, and the LED is lit:
     a. If the LEDs are lit by two DIMMs, check the system-event log for a PFA event on one of the DIMMs, and then replace that DIMM. Otherwise, replace both DIMMs.
     b. If the LED is lit by only one DIMM, replace that DIMM.
     c. Re-enable the DIMM, using the Setup utility.

LED: Microprocessor/Memory Configuration (CNFG)
Description: A hardware configuration error has occurred. (This LED is used with the MEM, VRM, and CPU LEDs.)
Action:
1. (The system error LED, CPU LED, and this LED are lit when POST detects a microprocessor mismatch.) Remove and install two microprocessors of the same cache size, type, and clock speed.
2. (The system error LED, MEM LED, and this LED are lit when POST detects an invalid memory configuration.) Remove and install supported DIMMs (see “Installing a memory module” on page 211).
3. (The system error LED, VRM LED, and this LED are lit when POST detects a missing VRM.) Install a VRM for microprocessor 2 (see “Installing a voltage regulator module” on page 180).
4. Check the system error log for information indicating incompatible components.


LED: VRM
Description: A VRM has failed.
Action:
1. Check the system-event log to determine the reason for the lit LED (for a VRM).
2. Determine whether the CNFG LED is also lit. If the CNFG LED is lit, the memory configuration is invalid. Reseat the VRM.
3. If the CNFG LED is not lit, reseat the following components:
   a. Failing VRM
   b. (Trained service technician only) Microprocessor associated with the VRM
4. Replace the following components one at a time, in the order shown, restarting the server each time:
   a. Failing VRM
   b. (Trained service technician only) Microprocessor associated with the VRM
   c. (Trained service technician only) System board

LED: Microprocessor (CPU)
Description: A microprocessor has failed, or an invalid microprocessor configuration is installed.
Note: (Trained service technician only) Make sure that the microprocessors are installed in the correct sequence.
Action:
1. Check the system-event log to determine the reason for the lit LED.
2. Determine whether the CNFG LED is also lit. If the CNFG LED is not lit, a microprocessor has failed.
   a. Make sure that the failing microprocessor, which is indicated by the CPU1 or CPU2 error LED on the system board, is installed correctly.
   b. Replace the following components one at a time, in the order shown, restarting the server each time:
      1) (Trained service technician only) Failing microprocessor
      2) (Trained service technician only) System board
   c. If the CNFG LED is lit and the CPU mismatch LED on the system board is also lit, an invalid microprocessor configuration is installed:
      1) Make sure that the microprocessors are compatible with each other. They must match in speed and cache size. Use the Setup utility to compare the microprocessor information.
      2) (Trained service technician only) Replace the incompatible microprocessor.


LED: Service processor bus (SP BUS)
Description: The IMM detects an internal error.
Action:
1. Disconnect the server from ac power; then, reconnect the server to power and restart the server.
2. Update the IMM firmware.

Look at the system service label on the top of the server, which gives an overview of internal components that correspond to the LEDs on the EasyLED panel. This information can often provide enough information to diagnose the error.
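The EasyLED panel entry for the CNFG LED explains that it is read together with the CPU, MEM, or VRM LED. As a rough illustration of that pairing, the following Python sketch maps a set of lit panel LEDs to the configuration problem that the table associates with them. The function name `explain_cnfg` and the dictionary are hypothetical helpers written for this example; the panel itself exposes no such interface.

```python
from __future__ import annotations

# LED names follow the EasyLED panel labels used in the table (CNFG, CPU, MEM, VRM).
# Each value paraphrases what the table says POST detected for that pairing.
CNFG_COMBINATIONS = {
    "CPU": "POST detected a microprocessor mismatch.",
    "MEM": "POST detected an invalid memory configuration.",
    "VRM": "POST detected a missing VRM.",
}


def explain_cnfg(lit_leds: set[str]) -> list[str]:
    """Return the configuration problems implied by the lit panel LEDs.

    Returns an empty list unless CNFG is lit, because the pairings in the
    table apply only when the CNFG LED is part of the combination.
    """
    if "CNFG" not in lit_leds:
        return []
    causes = [text for led, text in CNFG_COMBINATIONS.items() if led in lit_leds]
    # With no paired LED, fall back to the last action listed for CNFG.
    return causes or ["Check the system error log for incompatible components."]


if __name__ == "__main__":
    print(explain_cnfg({"CNFG", "MEM"}))
```

The sketch covers only the CNFG pairings; the other panel LEDs still need the full action lists from the table.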


3. Remove the server cover and look inside the server for lit LEDs. Certain components inside the server have LEDs that are lit to indicate the location of a problem.

The following illustration shows the LEDs on the system board.

1 PCI slot 1 error LED
2 PCI slot 2 error LED
3 PCI slot 3 error LED
4 H8 heartbeat LED
5 PCI slot 4 error LED
6 PCI slot 5 error LED
7 PCI slot 6 error LED
8 IMM heartbeat LED
9 Battery error LED
10 System-board error LED
11 VRM fail LED
12 CPU 1 error LED
13 DIMMs 1-8 error LEDs (starting from the bottom)
14 DIMMs 9-16 error LEDs (starting from the bottom)
15 CPU 2 error LED
16 CPU mismatch LED


The system board is equipped with a PCI extender card that provides either one or two additional expansion slots. The following illustration shows the LEDs on the PCI Express extender card, if one is installed.

The following illustration shows the LEDs on the PCI-X extender card, if one is installed.

The following table describes the LEDs on the system board and extender card and suggested actions to correct the detected problems.


Each entry lists a system-board or extender-card LED that is lit while the system-error or information LED is also lit, followed by its description and the suggested action.

LED: DIMM 1 to DIMM 16 error LEDs
Description: A DIMM has failed or is incorrectly installed.
Action:
1. Remove the DIMM that is indicated by a lit error LED.
2. Reseat the DIMM.
3. Replace the following components one at a time, in the order shown, restarting the server each time:
   a. DIMM
   b. (Trained service technician only) System board

LED: CPU 1 error LED
Description: Microprocessor 1 has failed, is missing, or has been incorrectly installed.
Note: (Trained service technician only) Make sure that the microprocessors are installed in the correct sequence; see “Installing a microprocessor and heat sink” on page 220.
Action:
1. Check the system-event log to determine the reason for the lit LED.
2. (Trained service technician only) Reseat the failing microprocessor.
3. Replace the following components one at a time, in the order shown, restarting the server each time:
   a. (Trained service technician only) Failing microprocessor
   b. (Trained service technician only) System board


LED: CPU 2 error LED
Description: Microprocessor 2 has failed, is missing, or has been incorrectly installed.
Action:
1. Check the system-event log to determine the reason for the lit LED.
2. Find the failing, missing, or mismatched microprocessor by checking the LEDs on the system board.
3. (Trained service technician only) Reseat the failing microprocessor.
4. Replace the following components one at a time, in the order shown, restarting the server each time:
   a. (Trained service technician only) Failing microprocessor
   b. (Trained service technician only) System board

LED: CPU mismatch LED
Description: A mismatched microprocessor has been installed.
Note: All microprocessors must have the same speed and cache size.
Action:
1. Run the Setup utility and view the microprocessor information to compare the installed microprocessor specifications.
2. (Trained service technician only) Remove and replace one of the microprocessors so that they both match.

LED: VRM failure LED
Description: Microprocessor 2 VRM has failed or is incorrectly installed.
Action:
1. Reseat the VRM.
2. Replace the following components one at a time, in the order shown, restarting the server each time:
   a. VRM
   b. (Trained service technician only) System board
3. Replace the VRM.

LED: System-board error LED
Description: System-board CPU VRD, power voltage regulators, or both have failed.
Action: (Trained service technician only) Replace the system board.

LED: Battery failure LED
Description: Battery low.
Action:
1. Replace the CMOS lithium battery, if necessary.
2. (Trained service technician only) Replace the system board.


LED: IMM heartbeat LED
Description: Indicates the status of the boot process of the IMM. When the server is connected to power, this LED flashes quickly to indicate that the IMM code is loading. When the loading is complete, the LED stops flashing briefly and then flashes slowly to indicate that the IMM is fully operational and you can press the power-control button to start the server.
Action: If the LED does not begin flashing within 30 seconds of when the server is connected to power, complete the following steps:
1. (Trained service technician only) Use the IMM recovery switch to recover the firmware (see Table 10 on page 144).
2. (Trained service technician only) Replace the system board.

LED: PCI slot 1 to PCI slot 8 error LEDs
Description: An error has occurred on a PCI bus or on the system board. An additional LED is lit next to a failing PCI slot.
Action:
1. Check the system-event log for information about the error.
2. If you cannot isolate the failing adapter through the LEDs and the information in the system-event log, remove one adapter at a time, and restart the server after each adapter is removed.
3. If the failure remains, go to http://www.lenovo.com/support for additional troubleshooting information.

LED: H8 heartbeat LED
Description: Indicates the status of power-on and power-off sequencing.
Action:
1. If the H8 heartbeat LED is blinking at a 1 Hz rate, no action is necessary.
2. (Trained service technician only) If the H8 heartbeat LED is not blinking, replace the system board.

Remind button

You can use the remind button on the EasyLED panel to put the system-error LED on the operator information panel into Remind mode. When you press the remind button, you acknowledge the error but indicate that you will not take immediate action. The system-error LED flashes while it is in Remind mode and stays in Remind mode until one of the following conditions occurs:

- All known errors are corrected.
- The server is restarted.
- A new error occurs, causing the system-error LED to be lit again.
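The Remind mode behavior described above amounts to a small state machine. The following Python sketch models it under stated assumptions: the class `SystemErrorLed` and its method names are hypothetical, and the manual does not say what the LED shows after a restart while an error is still uncorrected, so the sketch assumes that it is lit again.

```python
class SystemErrorLed:
    """Toy model of the system-error LED and Remind mode (illustrative only)."""

    def __init__(self) -> None:
        self.state = "off"      # "off", "lit", or "flashing" (Remind mode)
        self.open_errors = 0

    def new_error(self) -> None:
        # A new error lights the LED again, which also ends Remind mode.
        self.open_errors += 1
        self.state = "lit"

    def press_remind(self) -> None:
        # Acknowledge the error without taking immediate action: the LED flashes.
        if self.state == "lit":
            self.state = "flashing"

    def correct_error(self) -> None:
        # Remind mode ends once all known errors are corrected.
        self.open_errors = max(0, self.open_errors - 1)
        if self.open_errors == 0:
            self.state = "off"

    def restart(self) -> None:
        # A restart ends Remind mode. Assumption (not stated in the manual):
        # an uncorrected error lights the LED again after the restart.
        self.state = "lit" if self.open_errors else "off"


if __name__ == "__main__":
    led = SystemErrorLed()
    led.new_error()
    led.press_remind()
    print(led.state)  # Remind mode
```

The model tracks only a count of open errors; it does not represent which component raised each error.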


Power-supply LEDs

The following illustration shows the power-supply LEDs on the rear of the server.

1 ac power LED
2 dc power LED
3 Power error LED

The following table describes the problems that are indicated by various combinations of the power-supply LEDs and the system power LED on the operator information panel and suggested actions to correct the detected problems.


Table 3. Power-supply LEDs

Power-supply LEDs: AC Off, DC Off, Error Off
Description: No ac power to the server or a problem with the ac power source
Action:
1. Check the ac power to the server.
2. Make sure that the power cord is connected to a functioning power source.
3. Turn the server off and then turn the server back on.
4. If the problem remains, replace the power supply.
Notes: This is a normal condition when no ac power is present.

Power-supply LEDs: AC Off, DC Off, Error On
Description: No ac power to the server or a problem with the ac power source and the power supply had detected an internal problem
Action:
1. Replace the power supply.
2. Make sure that the power cord is connected to a functioning power source.
Notes: This happens only when a second power supply is providing power to the server.

Power-supply LEDs: AC Off, DC On, Error Off
Description: Faulty power supply
Action: Replace the power supply.

Power-supply LEDs: AC Off, DC On, Error On
Description: Faulty power supply
Action: Replace the power supply.

Power-supply LEDs: AC On, DC Off, Error Off
Description: Power supply not fully seated, faulty system board, or faulty power supply
Action:
1. Reseat the power supply.
2. If the system-board error LED is not lit, replace the power supply.
3. (Trained service technician only) If the system-board error LED is lit, replace the system board.
Notes: Typically indicates that a power supply is not fully seated.

Power-supply LEDs: AC On, DC Off or Flashing, Error On
Description: Faulty power supply
Action: Replace the power supply.

Power-supply LEDs: AC On, DC On, Error Off
Description: Normal operation

Power-supply LEDs: AC On, DC On, Error On
Description: Power supply is faulty but still operational
Action: Replace the power supply.
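Because Table 3 is a lookup keyed on three LED states, it translates directly into a dictionary. The following Python sketch encodes only the Description column; the names `POWER_SUPPLY_LED_TABLE` and `diagnose_power_supply` are hypothetical, and the lowercase state strings are a convention chosen for this example rather than anything defined by the manual.

```python
# Keys are (AC, DC, Error) LED states; values are the Table 3 descriptions.
# The "DC Off or Flashing" row is expanded into two keys.
POWER_SUPPLY_LED_TABLE = {
    ("off", "off", "off"): "No ac power to the server or a problem with the ac power source",
    ("off", "off", "on"): "No ac power to the server or a problem with the ac power source "
                          "and the power supply had detected an internal problem",
    ("off", "on", "off"): "Faulty power supply",
    ("off", "on", "on"): "Faulty power supply",
    ("on", "off", "off"): "Power supply not fully seated, faulty system board, "
                          "or faulty power supply",
    ("on", "off", "on"): "Faulty power supply",
    ("on", "flashing", "on"): "Faulty power supply",
    ("on", "on", "off"): "Normal operation",
    ("on", "on", "on"): "Power supply is faulty but still operational",
}


def diagnose_power_supply(ac: str, dc: str, error: str) -> str:
    """Look up the Table 3 description for one power supply's LED states."""
    key = (ac.lower(), dc.lower(), error.lower())
    return POWER_SUPPLY_LED_TABLE.get(key, "Combination not listed in Table 3")


if __name__ == "__main__":
    print(diagnose_power_supply("On", "Off", "Off"))
```

A combination that the table does not list returns an explicit "not listed" message instead of a guess, so the caller knows to go back to the manual.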


Diagnostic programs, messages, and error codes

The diagnostic programs are the primary method of testing the major components of the server. As you run the diagnostic programs, text messages and error codes are displayed on the screen and are saved in the test log. A diagnostic text message or error code indicates that a problem has been detected; to determine what action you should take as a result of a message or error code, see the table in “Diagnostic messages” on page 91.

Running the diagnostic programs

To run the diagnostic programs, complete the following steps:

1. If the server is running, turn off the server and all attached devices.

2. Turn on all attached devices; then, turn on the server.

3. When the prompt Press F2 for Dynamic System Analysis (DSA) is displayed, press F2.

Note: The DSA Preboot diagnostic program might appear to be unresponsive for an unusual length of time when you start the program. This is normal operation while the program loads.

4. Optionally, select Quit to DSA to exit from the stand-alone memory diagnostic program.

Note: After you exit from the stand-alone memory diagnostic environment, you must restart the server to access the stand-alone memory diagnostic environment again.

5. Select gui to display the graphical user interface, or select cmd to display the DSA interactive menu.

6. Follow the instructions on the screen to select the diagnostic test to run.

If the diagnostic programs do not detect any hardware errors but the problem remains during normal server operations, a software error might be the cause. If you suspect a software problem, see the information that comes with your software.

A single problem might cause more than one error message. When this happens, correct the cause of the first error message. The other error messages usually will not occur the next time you run the diagnostic programs.

If the server stops during testing and you cannot continue, restart the server and try running the diagnostic programs again. If the problem remains, replace the component that was being tested when the server stopped.

Diagnostic text messages

Diagnostic text messages are displayed while the tests are running. A diagnostic text message contains one of the following results:

Passed: The test was completed without any errors.

Failed: The test detected an error.


User Aborted: You stopped the test before it was completed.

Not Applicable: You attempted to test a device that is not present in the server.

Aborted: The test could not proceed because of the server configuration.

Warning: The test could not be run. There was no failure of the hardware that was being tested, but there might be a hardware failure elsewhere, or another problem prevented the test from running; for example, there might be a configuration problem, or the hardware might be missing or is not being recognized.

The result is followed by an error code or other additional information about the error.
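If saved test logs are post-processed with a script, the six result keywords above can be modeled as an enumeration. The following Python sketch is a minimal illustration: the `Result: details` line layout assumed by `parse_result` is hypothetical (the manual states only that the result is followed by an error code or other information), and `needs_follow_up` reflects one reading of the definitions above, not a rule from the manual.

```python
from enum import Enum


class DiagResult(Enum):
    """Result keywords listed for diagnostic text messages."""
    PASSED = "Passed"
    FAILED = "Failed"
    USER_ABORTED = "User Aborted"
    NOT_APPLICABLE = "Not Applicable"
    ABORTED = "Aborted"
    WARNING = "Warning"


def parse_result(line: str) -> DiagResult:
    """Extract the result keyword from a line such as 'Failed: <details>'.

    The 'keyword: details' layout is an assumption made for this sketch.
    Raises ValueError if the keyword is not one of the six listed results.
    """
    keyword = line.split(":", 1)[0].strip()
    return DiagResult(keyword)


def needs_follow_up(result: DiagResult) -> bool:
    """Interpretation: these results mean an error or a test that could not run."""
    return result in {DiagResult.FAILED, DiagResult.ABORTED, DiagResult.WARNING}


if __name__ == "__main__":
    print(parse_result("Warning: test could not be run"))
```

Raising an error on an unknown keyword keeps a malformed or truncated log line from being silently counted as a pass.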

Viewing the test log

To view the DSA log when the tests are completed, select Utility from the top of the screen and then select View Test Log. To view a detailed test log, press Tab while you view the DSA log. The DSA log data is maintained only while you are running the diagnostic programs. When you exit from the diagnostic programs, the DSA log is cleared.

To save the DSA log to a file on a diskette or to the hard disk, click Save Log on the diagnostic programs screen and specify a location and name for the saved log file.

Notes:

1. To create and use a diskette, you must add an optional external diskette drive to the server.

2. To save the test log to a diskette, you must use a diskette that you have formatted yourself; this function does not work with preformatted diskettes. If the diskette has sufficient space for the test log, the diskette can contain other data.

Diagnostic messages

The following table describes the messages that the diagnostic programs might generate and suggested actions to correct the detected problems. Follow the suggested actions in the order in which they are listed in the Action column.

Table 4. DSA messages


Message number: 089-000-xxx
Component: CPU
Test: CPU Stress Test
State: Pass
Description: CPU passed stress test
Action: No action required.


Message number: 089-801-xxx
Component: CPU
Test: CPU Stress Test
State: Aborted
Description: Internal program error.
Action:
1. Turn off and restart the system.
2. Make sure that the DSA code is at the latest level.
3. Run the test again.
4. Make sure that the system firmware is at the latest level. The installed firmware level is shown in the DSA log in the Firmware/VPD section for this component. For more information, see “Updating the firmware” on page 267.
5. Run the test again.
6. Turn off and restart the system if necessary to recover from a hung state.
7. Run the test again.
8. Replace the following components one at a time, in the order shown, and run this test again to determine whether the problem has been solved:
   a. (Trained service technician only) Microprocessor board
   b. (Trained service technician only) Microprocessor
9. If the failure remains, go to the Lenovo Web site for more troubleshooting information at http://www.lenovo.com/support.

Message number: 089-802-xxx
Component: CPU
Test: CPU Stress Test
State: Aborted
Description: System resource availability error.
Action:
1. Turn off and restart the system.
2. Make sure that the DSA code is at the latest level.
3. Run the test again.
4. Make sure that the system firmware is at the latest level. The installed firmware level is shown in the DSA log in the Firmware/VPD section for this component. For more information, see “Updating the firmware” on page 267.
5. Run the test again.
6. Turn off and restart the system if necessary to recover from a hung state.
7. Run the test again.
8. Replace the following components one at a time, in the order shown, and run this test again to determine whether the problem has been solved:
   a. (Trained service technician only) Microprocessor board
   b. (Trained service technician only) Microprocessor
9. If the failure remains, go to the Lenovo Web site for more troubleshooting information at http://www.lenovo.com/support.
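When many DSA message numbers have to be matched against Table 4, a small lookup can save time. The following Python sketch covers only the three CPU Stress Test rows shown in this part of the table. The function name `lookup_dsa_message` is hypothetical, and treating the first two groups of the number as the lookup key, with the trailing `xxx` as a variable suffix, is an assumption based on how the table prints the numbers.

```python
import re

# Rows from the part of Table 4 shown above, keyed on the first two groups
# of the message number: (component, test, state, description).
DSA_MESSAGES = {
    "089-000": ("CPU", "CPU Stress Test", "Pass", "CPU passed stress test"),
    "089-801": ("CPU", "CPU Stress Test", "Aborted", "Internal program error."),
    "089-802": ("CPU", "CPU Stress Test", "Aborted", "System resource availability error."),
}

# Assumed layout: three digits, three digits, then a three-character suffix.
_MESSAGE_NUMBER = re.compile(r"^(\d{3}-\d{3})-(\w{3})$")


def lookup_dsa_message(number: str):
    """Return the table row for a message number, or None if it is not listed here."""
    match = _MESSAGE_NUMBER.match(number.strip())
    if match is None:
        raise ValueError(f"not in the assumed NNN-NNN-xxx layout: {number!r}")
    return DSA_MESSAGES.get(match.group(1))


if __name__ == "__main__":
    print(lookup_dsa_message("089-802-xxx"))
```

A number that parses but is not in the dictionary returns `None`, which signals that the full table in the manual should be consulted.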
