Oracle X5-8 Service Manual


Oracle® Server X5-8 Service Manual

Part No: E56311-03
December 2015
Copyright © 2015, Oracle and/or its affiliates. All rights reserved.
The information contained herein is subject to change without notice and is not warranted to be error-free. If you find any errors, please report them to us in writing.
If this is software or related documentation that is delivered to the U.S. Government or anyone licensing it on behalf of the U.S. Government, then the following notice is applicable:
U.S. GOVERNMENT END USERS: Oracle programs, including any operating system, integrated software, any programs installed on the hardware, and/or documentation, delivered to U.S. Government end users are "commercial computer software" pursuant to the applicable Federal Acquisition Regulation and agency-specific supplemental regulations. As such, use, duplication, disclosure, modification, and adaptation of the programs, including any operating system, integrated software, any programs installed on the hardware, and/or documentation, shall be subject to license terms and license restrictions applicable to the programs. No other rights are granted to the U.S. Government.
This software or hardware is developed for general use in a variety of information management applications. It is not developed or intended for use in any inherently dangerous applications, including applications that may create a risk of personal injury. If you use this software or hardware in dangerous applications, then you shall be responsible to take all appropriate fail-safe, backup, redundancy, and other measures to ensure its safe use. Oracle Corporation and its affiliates disclaim any liability for any damages caused by use of this software or hardware in dangerous applications.
Oracle and Java are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of their respective owners.
Intel and Intel Xeon are trademarks or registered trademarks of Intel Corporation. All SPARC trademarks are used under license and are trademarks or registered trademarks of SPARC International, Inc. AMD, Opteron, the AMD logo, and the AMD Opteron logo are trademarks or registered trademarks of Advanced Micro Devices. UNIX is a registered trademark of The Open Group.
This software or hardware and documentation may provide access to or information about content, products, and services from third parties. Oracle Corporation and its affiliates are not responsible for and expressly disclaim all warranties of any kind with respect to third-party content, products, and services unless otherwise set forth in an applicable agreement between you and Oracle. Oracle Corporation and its affiliates will not be responsible for any loss, costs, or damages incurred due to your access to or use of third-party content, products, or services, except as set forth in an applicable agreement between you and Oracle.
Access to Oracle Support
Oracle customers that have purchased support have access to electronic support through My Oracle Support. For information, visit http://www.oracle.com/pls/topic/lookup?ctx=acc&id=info or visit http://www.oracle.com/pls/topic/lookup?ctx=acc&id=trs if you are hearing impaired.
Part No: E56311-03
Copyright © 2015, Oracle and/or its affiliates. All rights reserved.
This software and the accompanying documentation are protected by intellectual property laws. They are provided under license and are subject to restrictions on use and disclosure. Except as expressly permitted in your license agreement or allowed by law, you may not copy, reproduce, translate, broadcast, modify, license, transmit, distribute, exhibit, perform, publish, or display the software, even in part, in any form or by any means. In addition, reverse engineering, disassembly, or decompilation of the software is prohibited, except for purposes of interoperability with third-party software or as required by law.
The information provided in this document is subject to change without notice. In addition, Oracle Corporation does not warrant that it is error-free and invites you to report any errors in writing.
If this software, or the accompanying documentation, is delivered under license to the U.S. Government or to anyone licensing this software on behalf of the U.S. Government, the following notice applies:
U.S. GOVERNMENT END USERS: Oracle programs, including any operating system, integrated software, any programs installed on the hardware, and/or documentation, delivered to U.S. Government end users are "commercial computer software" pursuant to the applicable Federal Acquisition Regulation and agency-specific supplemental regulations. As such, use, duplication, disclosure, modification, and adaptation of the programs, including any operating system, integrated software, any programs installed on the hardware, and/or documentation, shall be subject to license terms and license restrictions applicable to the programs. No other rights are granted to the U.S. Government.
This software or hardware is developed for general use in information management applications. It is not designed or intended for use in dangerous applications, including applications that may create a risk of personal injury. If you use this software or hardware in dangerous applications, you are responsible for taking all fail-safe, backup, redundancy, and other measures necessary to ensure its safe use. Oracle Corporation and its affiliates disclaim any liability for damages caused by use of this software or hardware in dangerous applications.
Oracle and Java are registered trademarks of Oracle Corporation and/or its affiliates. Other names mentioned may be trademarks of owners other than Oracle.
Intel and Intel Xeon are trademarks or registered trademarks of Intel Corporation. All SPARC trademarks are used under license and are trademarks or registered trademarks of SPARC International, Inc. AMD, Opteron, the AMD logo, and the AMD Opteron logo are trademarks or registered trademarks of Advanced Micro Devices. UNIX is a registered trademark of The Open Group.
This software or hardware and the accompanying documentation may provide information about, or links to, content, products, and services from third parties. Oracle Corporation and its affiliates disclaim all liability and express warranties with respect to third-party content, products, or services, unless otherwise set forth in an agreement between you and Oracle. Under no circumstances will Oracle Corporation and its affiliates be held responsible for any loss, costs, or damages caused by access to or use of third-party content, products, or services, unless otherwise set forth in an agreement between you and Oracle.
Access to Oracle Support
Oracle customers that have purchased a support contract have access to electronic support through My Oracle Support. For more information, visit http://www.oracle.com/pls/topic/lookup?ctx=acc&id=info or visit http://www.oracle.com/pls/topic/lookup?ctx=acc&id=trs if you are hearing impaired.

Contents

Using This Documentation ........ 11
About This Document ........ 13
Service Notes ........ 15
  Intended Audience ........ 15
  Warning Label ........ 15
Server and Components Overview ........ 17
  Server Overview ........ 18
  Chassis Overview ........ 19
    Chassis Front Side Components ........ 19
    Chassis Internal Components ........ 20
    Chassis Backside Components ........ 22
  CPU Module (CMOD) Overview ........ 23
    Processor and Memory Overview ........ 23
    CMOD Configuration Options ........ 23
    CMOD Population Rules ........ 26
    CMOD Layout ........ 27
    CMOD and Fan Module Power ........ 28
  System Module (SMOD) Overview ........ 28
    SMOD Overview ........ 29
    Storage Drives ........ 30
    SMOD Motherboard ........ 31
    Service Processor (SP) ........ 31
    Storage Drive Backplanes ........ 32
    SAS Host Bus Adapter (HBA) Card, Riser, and Cables ........ 32
    Internal USB Ports ........ 34
    Energy Storage Module and Cable ........ 34
  Server Subsystems ........ 34
    Controls and Indicators ........ 34
    Server Management Software ........ 43
    Storage and IO Subsystem ........ 45
    Chassis Cooling Subsystem ........ 47
    Power Subsystem ........ 50
  Server Block Diagram ........ 50
Troubleshooting and Diagnostics ........ 53
  Troubleshooting Server Hardware Component Faults ........ 53
    Troubleshooting Server Hardware Faults ........ 54
    Troubleshooting and Diagnostic Information ........ 58
    Troubleshooting Indicators ........ 58
    Indicator Blink Rates ........ 62
    The CMOD Fault Remind Test Circuit ........ 68
    Troubleshooting System Cooling Issues ........ 68
    Troubleshooting Power Issues ........ 69
  Troubleshooting With Diagnostic Tools ........ 70
    Diagnostic Tools ........ 71
    Diagnostic Tool Documentation ........ 72
  Attaching Devices to the Server ........ 72
    ▼ Attach Devices to the Server ........ 73
    Configuring Serial Port Sharing ........ 73
    Ethernet Port Device Naming ........ 75
  Getting Help ........ 75
    Contacting Support ........ 76
    Locating the Chassis Serial Number ........ 76
Servicing the Server ........ 79
  Component Locations and Designations ........ 79
    Component Serviceability Requirements ........ 79
    Component Locations ........ 80
    Component Designations ........ 82
    Component Network Access Control (NAC) Names ........ 89
  Performing Electrostatic Discharge and Static Prevention Measures ........ 90
    Using an Anti-static Wrist Strap ........ 90
    Using an Anti-static Mat ........ 90
  Tools and Equipment ........ 91
  Component Filler Panels and Non-Powered Components ........ 91
  ▼ Clear Hardware Fault Messages ........ 92
Preparing for Service ........ 95
  ▼ Prepare the Server for Hot Service ........ 95
  ▼ Prepare the Server for Warm Service ........ 96
  ▼ Prepare the Server for Cold Service ........ 98
  Powering Off the Server ........ 100
    ▼ Power Off the Server Using the Server OS ........ 101
    ▼ Power Off, Graceful (Power Button) ........ 101
    ▼ Power Off, Immediate (Power Button) ........ 102
    ▼ Power Off, Remote (Oracle ILOM CLI) ........ 103
    ▼ Power Off, Remote (Oracle ILOM Web Interface) ........ 104
    ▼ Remove Power ........ 105
    Power Modes, Shutdowns, and Resets ........ 107
  Managing the Locator Indicator ........ 108
    ▼ Turn On the Locator Indicator Remotely (Oracle ILOM CLI) ........ 109
    ▼ Turn On the Locator Indicator Remotely (Oracle ILOM Web Interface) ........ 110
    ▼ Control the Locator Indicator Locally ........ 110
Servicing Components ........ 113
  ▼ Upgrade the Server from Four to Eight CMODs ........ 113
  Servicing Fan Modules and Fan Frames ........ 117
    ▼ Remove a Fan Module ........ 117
    ▼ Install a Fan Module ........ 120
    ▼ Remove a Fan Frame ........ 122
    ▼ Install a Fan Frame ........ 124
  Servicing Power Supply Units (PSUs) ........ 126
    ▼ Remove a PSU ........ 126
    ▼ Install a PSU ........ 129
  Servicing the Front Indicator Module (FIM) ........ 132
    ▼ Remove the FIM ........ 133
    ▼ Install the FIM ........ 134
  Servicing the CPU Module (CMOD) Components ........ 136
    ▼ Remove a CMOD ........ 137
    ▼ Remove and Install the CMOD Cover ........ 140
    ▼ Install a CMOD ........ 143
    ▼ Replace a Failed DIMM ........ 146
    ▼ Install a DIMM ........ 152
    ▼ Remove a DIMM ........ 155
    Memory and DIMM Reference ........ 158
    ▼ Remove a Heatsink and Processor (FRU) ........ 161
    ▼ Install a Heatsink and Processor (FRU) ........ 172
  Servicing Storage Drives ........ 180
    ▼ Remove a Storage Drive ........ 180
    ▼ Install a Storage Drive ........ 182
    Storage Drive Reference ........ 184
  Servicing PCIe Cards and the Dual PCIe Card Carriers (DPCCs) ........ 185
    ▼ Remove a DPCC ........ 185
    ▼ Remove a PCIe Card ........ 188
    ▼ Install a PCIe Card ........ 190
    ▼ Install a DPCC ........ 193
    ▼ Replace a DPCC ........ 196
    PCIe Card and DPCC Reference ........ 196
  Servicing System Module (SMOD) Components ........ 197
    ▼ Remove the SMOD ........ 197
    ▼ Install the SMOD ........ 200
    Servicing the Host Bus Adapter (HBA) Card ........ 202
    Servicing the Energy Storage Module and Cables ........ 209
    Servicing the SAS Cable ........ 214
    Servicing the Internal USB Flash Drives ........ 215
    ▼ Replace the Real Time Clock (System) Battery ........ 219
  ▼ Replace the Midplane Assembly ........ 223
Returning the Server to Operation ........ 235
  ▼ Prepare the Server for Operation ........ 235
  ▼ Power On the Server ........ 236
BIOS Setup Utility ........ 237
  ▼ Access the BIOS Setup Utility ........ 237
  BIOS Setup Utility Screens ........ 238
    Main Screen (Legacy) ........ 238
    Advanced Screen (Legacy) ........ 244
    Advanced - Processor Configuration ........ 245
    Advanced - CPU Power Management Configuration ........ 252
    Memory Configuration ........ 253
    Advanced - USB Ports ........ 254
    Advanced - Serial Port Console Redirection ........ 255
    Advanced - Trusted Computing ........ 259
    Advanced - Network Stack ........ 261
    Advanced - Legacy iSCSI ........ 262
    Advanced - BMC Network Configuration ........ 263
    IO Screen ........ 266
    Boot Screens ........ 272
    Boot Screen in Legacy Boot Mode ........ 272
Index ........ 275

Using This Documentation

This section describes how to get the latest firmware, software, and documentation for the Oracle Server X5-8. It also provides feedback links and a document change history.
“Oracle Server X5-8 Model Naming Convention” on page 11
“Getting the Latest Firmware and Software” on page 11
“Documentation and Feedback” on page 12
“Contributors” on page 12
“Change History” on page 12
The information in this documentation set is presented in topic-based format (similar to online help) and therefore does not include chapters, appendixes, or section numbering.

Oracle Server X5-8 Model Naming Convention

The Oracle Server X5-8 name identifies the following:
X identifies an x86 product.
The first number, 5, identifies the generation of the server.
The second number, 8, identifies the maximum number of processors.
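The naming convention above can be expressed as a small parser. The following Python sketch is illustrative only: the `ModelName` type and the `parse_model_name` function are hypothetical names, and the pattern is an assumption that covers only names of the form described here (an X, a generation number, a hyphen, and a processor count).

```python
import re
from typing import NamedTuple


class ModelName(NamedTuple):
    """The three facts encoded in a name such as 'X5-8'."""
    architecture: str      # 'x86' for names that start with X
    generation: int        # first number in the name
    max_processors: int    # second number in the name


def parse_model_name(name: str) -> ModelName:
    """Split a model name of the form X<generation>-<processors>.

    Assumption: only this simple form is handled. Names with extra
    suffixes are rejected rather than guessed at.
    """
    match = re.fullmatch(r"X(\d+)-(\d+)", name)
    if match is None:
        raise ValueError(f"unrecognized model name: {name!r}")
    return ModelName("x86", int(match.group(1)), int(match.group(2)))


if __name__ == "__main__":
    print(parse_model_name("X5-8"))
```

For `X5-8`, the parsed fields are an x86 product, generation 5, and a maximum of 8 processors.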

Getting the Latest Firmware and Software

Firmware, drivers, and other hardware-related software for each Oracle x86 server are updated periodically.
You can obtain the latest version in the following ways:
Oracle System Assistant: This is a factory-installed option for Oracle x86 servers. It has all the tools and drivers you need and resides on a USB drive installed in most servers.
You can download updates from My Oracle Support: https://support.oracle.com

Documentation and Feedback

Documentation | Link
All Oracle products | http://docs.oracle.com
Oracle Server X5-8 | http://www.oracle.com/goto/X5-8/docs-videos
Oracle Integrated Lights Out Manager (ILOM). Refer to the Oracle ILOM documentation. | http://www.oracle.com/goto/ILOM/docs
Oracle Hardware Management Pack. Refer to the documentation for your supported version of Oracle HMP as listed in the Product Notes. | http://www.oracle.com/goto/ohmp/docs
Provide feedback on this documentation at: http://www.oracle.com/goto/docfeedback

Contributors

Primary Authors: Michael Bechler, Cynthia Chin-Lee, Mark McGothigan.
Contributors: William Schweickert, Anthony Villamor, Mick Tabor, Richard Masoner, Ray Angelo, Tamra Smith-Wasel, Denise Silverman.

Change History

The following lists the release history of this documentation set:
December 2015. Technical updates.
September 2015. Editorial improvements.
July 2015. Initial publication.

About This Document

This document provides service, maintenance, and component replacement procedures for the Oracle Server X5-8.
The following table describes the major sections of this document.
Section Description | Link
Important service information | “Service Notes” on page 15
Server component and subsystem overviews | “Server and Components Overview” on page 17
Troubleshooting procedures and information | “Troubleshooting and Diagnostics” on page 53
General information and procedures for servicing the server | “Servicing the Server” on page 79
Information and procedures for preparing the server for service | “Preparing for Service” on page 95
Procedures and information for removing and installing components | “Servicing Components” on page 113
Procedures and information for returning the server to operation after performing service procedures | “Returning the Server to Operation” on page 235
Server BIOS Setup Utility information and screen captures | “BIOS Setup Utility” on page 237

Service Notes

This section contains preliminary service information:

Intended Audience

This guide is intended for trained technicians and authorized service personnel who have been instructed on the hazards within the equipment and who are qualified to replace and install hardware.

Warning Label

The following warning label is visible from the front of the server when you remove a fan module. It warns you not to insert your hands or any object into the space left vacant by the removal of the fan module. Fan modules are hot-swap components. Removing a fan module from a fully powered server exposes open and active power connectors that can cause electric shock.

Server and Components Overview

This section describes the server and its subsystems. It includes:
Section Description | Link
List of server features | “Server Overview” on page 18
Chassis front, internal, and back components | “Chassis Overview” on page 19
Features and components of the CPU module (CMOD) | “CPU Module (CMOD) Overview” on page 23
Features and components of the system module (SMOD) | “System Module (SMOD) Overview” on page 28
Server subsystems, their functions, and related components | “Server Subsystems” on page 34
Schematic-type block diagram of the server interconnects | “Server Block Diagram” on page 50

Server Overview

The Oracle Server X5-8 is a 5 rack-unit (RU) server with the following features:
Four- and eight-socket configurations that use Intel EX Xeon® E7-8895 v3 processors for a total of 72 or 144 cores.
Maximum memory: 3 TB (four-socket) and 6 TB (eight-socket) of DDR3 1333 memory.
Eight backside-accessible SAS3 or SATA storage drive bays.
Expandable IO: eight 16-lane and eight 8-lane PCIe Gen3 slots and one 4-lane PCIe Gen2 HBA slot.
One Emulex Pilot 3 service processor (SP) with 256 MB DDR3 memory, 256 MB of flash memory, and Oracle ILOM.
Four (N+N) hot-swap power supplies (PSUs).
Eight hot-swap redundant 100-watt cooling fan modules.
Note - For server specification information, see “Server Specifications” in Oracle Server X5-8 Installation Guide.
The following sections provide overviews of the main server components:
Component | Link
Chassis | “Chassis Overview” on page 19
CMODs | “CPU Module (CMOD) Overview” on page 23
SMOD | “System Module (SMOD) Overview” on page 28

Chassis Overview

The chassis consists of the front accessible components, internal components, and components accessible from the back of the server:
“Chassis Front Side Components” on page 19
“Chassis Internal Components” on page 20
“Chassis Backside Components” on page 22

Chassis Front Side Components

The following figure shows the front side components:
The front-side components include:
Call Out | Component | Link
1 | Front indicator module (FIM) | “Controls and Indicators” on page 34
2 | Four power supplies | “Power Subsystem” on page 50
3 and 4 | Eight fan modules (FMs) in two fan frames | “Chassis Cooling Subsystem” on page 47
5 | Two internal CMOD bays | “CPU Module (CMOD) Overview” on page 23

Chassis Internal Components

The following figure shows the chassis internal components:
The chassis internal components include:
Call Out | Component | Description
1 | CPU module (CMOD) bays | CMOD bays can support either four or eight CMODs. Servicing CMODs requires warm or cold service. For information about the CMODs, see “CPU Module (CMOD) Overview” on page 23.
2 | Midplane/busbar | The midplane assembly provides an interconnect between the backside components and the front-side components. This component requires cold service.

Chassis Backside Components

The following figure shows the chassis backside components:
The chassis backside components include:
Call Out | Component | Description
1 | System Module (SMOD) | The SMOD has internal components that can only be accessed by removing it from the backside of the server. For more information, see “System Module (SMOD) Overview” on page 28.
2 | Dual PCIe card carrier (DPCC) bay | The DPCC bay contains eight DPCCs and up to 16 PCIe cards. For more information, see “Storage and IO Subsystem” on page 45.
3 | AC power block | The AC power block has four AC power inlet connectors. The power block is not a removable component. For more information, see “Power Subsystem” on page 50.

CPU Module (CMOD) Overview

CPU modules (CMODs) contain the processors (CPUs) and the system memory, and supply power to the fan modules and the DPCCs.
CMODs are internal warm or cold-service components. To access the CMODs, you must remove the fan modules and the fan frames.
The following sections describe the CMOD configuration options and the internal layout of components:
“Processor and Memory Overview” on page 23
“CMOD Configuration Options” on page 23
“CMOD Layout” on page 27

Processor and Memory Overview

Each CMOD contains one Intel Xeon® E7-8895 v3 (18-core 2.6 GHz) processor.
The maximum system memory with DDR3 1333 32 GB DIMMs is:
Four CMODs: 3 TB
Eight CMODs: 6 TB
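The core and memory totals follow from the per-CMOD figures in this manual: one 18-core processor and 24 DIMM slots per CMOD. The Python sketch below reproduces the arithmetic under two stated assumptions: every DIMM slot holds a 32 GB DIMM, and 1 TB is counted as 1024 GB. The function and constant names are hypothetical.

```python
from typing import Tuple

CORES_PER_PROCESSOR = 18   # Intel Xeon E7-8895 v3, one processor per CMOD
DIMM_SLOTS_PER_CMOD = 24   # per the CMOD layout described in this manual
DIMM_SIZE_GB = 32          # assumption: every slot holds a 32 GB DIMM
GB_PER_TB = 1024           # assumption: binary terabytes


def server_totals(cmod_count: int) -> Tuple[int, float]:
    """Return (total cores, maximum memory in TB) for a CMOD count."""
    if cmod_count not in (4, 8):
        raise ValueError("the server supports only four or eight CMODs")
    cores = cmod_count * CORES_PER_PROCESSOR
    memory_tb = cmod_count * DIMM_SLOTS_PER_CMOD * DIMM_SIZE_GB / GB_PER_TB
    return cores, memory_tb


if __name__ == "__main__":
    for count in (4, 8):
        cores, memory_tb = server_totals(count)
        print(f"{count} CMODs: {cores} cores, {memory_tb:g} TB")
```

With four CMODs this gives 72 cores and 3 TB; with eight CMODs it gives 144 cores and 6 TB, matching the figures listed in this section.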

CMOD Configuration Options

The server supports four- and eight-CMOD configurations. In the four-CMOD configuration, the first four slots on the left (slots 0-3) are occupied and the four slots on the right (4-7) are unoccupied.
The following illustration shows a server with a four-CMOD configuration. In the illustration, the left-side fan modules and fan frame have been removed to show the four CMODs. Call out 1 identifies the group of four CMODs.
The following illustration shows a server with a four-CMOD configuration with all eight fan modules and both fan frames removed, exposing the empty CMOD bay on the right. The four right-side fan modules are not powered; however, in a four-CMOD configuration, these fan modules must be installed. Call out 1 identifies the group of four CMODs installed on the left side of the server. Call out 2 identifies the empty CMOD bay on the right side of the server.
24 Oracle Server X5-8 Service Manual • December 2015
Page 25
CPU Module (CMOD) Overview
The following illustration shows a server with an eight (full) CMOD configuration. Call out 1 identifies the group of four CMODs installed on the left side of the server, and call out 2 identifies the second group of four CMODs installed on the right side of the server.
In both CMOD configurations, the system includes four power supplies, eight fan modules, and eight DPCCs. However, fan modules and DPCCs receive power from the CMODs, so in a four-CPU configuration, only fan modules 0-3 and DPCCs 0-3 are active. Fan modules 4-7 and DPCCs 4-7 are not powered and not active.

CMOD Population Rules

The Oracle Server X5-8 supports four- and eight-CMOD configurations. Each CMOD supports a single socket containing a single Intel EX Xeon E7-8895 v3 processor.
For the four-socket server configuration:
CPU modules (CMODs) must be installed in slots 0-3.
DPCC slots 0-3 are active; however, DPCCs 4-7 must be installed.
Both fan frames must be installed.
All eight fan modules (FMs) must be installed but only FMs 0-3 are active.
For the eight-socket server configuration:
CMODs must be installed in slots 0-7.
DPCC slots 0-7 are active.
Both fan frames must be installed.
All eight fan modules (FMs) must be installed and all FMs are active.
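The population rules above can be expressed as a simple check. The sketch below is illustrative only: the function names, argument shapes, and message strings are assumptions made for this example and do not correspond to any Oracle ILOM or Hardware Management Pack interface.

```python
ALL_SLOTS = set(range(8))   # slots 0-7
FOUR_CMOD = set(range(4))   # slots 0-3


def check_population(cmods, dpccs, fan_modules, fan_frames):
    """Return a list of rule violations; an empty list means the layout is valid.

    cmods, dpccs, fan_modules: iterables of installed slot numbers.
    fan_frames: number of installed fan frames.
    """
    problems = []
    if set(cmods) not in (FOUR_CMOD, ALL_SLOTS):
        problems.append("CMODs must occupy slots 0-3 or slots 0-7")
    if set(dpccs) != ALL_SLOTS:
        problems.append("all eight DPCCs must be installed")
    if set(fan_modules) != ALL_SLOTS:
        problems.append("all eight fan modules must be installed")
    if fan_frames != 2:
        problems.append("both fan frames must be installed")
    return problems


def active_slots(cmods):
    """Return the FM and DPCC slot numbers that are active for a valid layout."""
    installed = set(cmods)
    if installed not in (FOUR_CMOD, ALL_SLOTS):
        raise ValueError("CMODs must occupy slots 0-3 or slots 0-7")
    return set(installed)


print(check_population(range(4), range(8), range(8), 2))  # []
print(sorted(active_slots(range(4))))                     # [0, 1, 2, 3]
```

Keeping the "installed" check separate from the "active" lookup mirrors the rules: components in slots 4-7 must be present in a four-CMOD server even though they are not powered.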

CMOD Layout

Each CMOD contains the following components:
Heatsink and processor assembly
24 DIMM slots arranged in four groups of six
DIMM test circuit, which helps you locate failed DIMMs and verify a failed CPU
Fault Remind button
Circuit Charge Status indicator
24 DIMM slot fault indicators
CPU fault indicator
The following illustration shows the location of the CMOD components.
Call Out Description
1 Fault Remind button
2 Circuit Charge Status indicator
3 DIMM slots (24, four banks of six each)
4 DIMM slot fault indicators (24, one for each slot)
5 Heatsink and CPU assembly
6 CPU fault indicator
For component serviceability, locations, and designations, see “Component Locations, and Designations” on page 79.

CMOD and Fan Module Power

Fan modules (FMs) get power from CMODs. However, only CMODs in even-numbered slots supply power to fan modules. The following table shows which CMOD slots provide FM power.
Power Slots Fan Modules Powered
CMOD 0 FMs 0 and 1
CMOD 2 FMs 2 and 3
CMOD 4 FMs 4 and 5
CMOD 6 FMs 6 and 7
CMODs in slots 1, 3, 5, and 7 do not supply FM power.
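The table above reduces to a simple rule: each even-numbered CMOD slot powers the fan module with the same number and the next one up. A minimal sketch of that mapping follows; the function names are hypothetical and chosen only for this illustration.

```python
def fans_powered_by(cmod_slot: int) -> tuple:
    """Return the fan modules powered by the CMOD in the given slot (0-7)."""
    if not 0 <= cmod_slot <= 7:
        raise ValueError("CMOD slot must be 0-7")
    if cmod_slot % 2:            # odd-numbered slots supply no FM power
        return ()
    return (cmod_slot, cmod_slot + 1)


def powering_cmod(fan_module: int) -> int:
    """Return the CMOD slot that supplies power to the given fan module (0-7)."""
    if not 0 <= fan_module <= 7:
        raise ValueError("fan module must be 0-7")
    return fan_module - (fan_module % 2)


print(fans_powered_by(2))  # (2, 3)
print(fans_powered_by(3))  # ()
print(powering_cmod(5))    # 4
```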

System Module (SMOD) Overview

This section provides information about the server system module (SMOD) and its components. It includes:
“SMOD Overview” on page 29
“Storage Drives” on page 30
“SMOD Motherboard” on page 31
“Service Processor (SP)” on page 31
“Storage Drive Backplanes” on page 32
“SAS Host Bus Adapter (HBA) Card, Riser, and Cables” on page 32
“Internal USB Ports” on page 34
“Energy Storage Module and Cable” on page 34

SMOD Overview

The SMOD components include:
Externally-accessible:
Server storage drives (HDD/SSD)
IO ports and two external USB ports
Internally accessible:
SMOD motherboard
Service processor (SP)
Storage drive backplane
SAS host bus adapter (HBA)
Internal USB ports (2)
Energy storage module (ESM)
Real time clock battery
The SMOD is located at the back of the server. It includes two removal and installation levers with green lock release tabs.
The following illustration shows the SMOD and the two release levers:
Call Out Description
1 Removal and installation levers (2)
2 SMOD

Storage Drives

In the following illustration, call out 1 shows the location of the eight storage drive slots, which are arranged in two rows of four each.
For component serviceability, locations, and designations, see “Component Locations, and Designations” on page 79.

SMOD Motherboard

The SMOD motherboard hosts the service processor (SP), two disk backplanes (for the externally accessible server storage drives), the system real time clock battery, and an energy storage module for the HBA. It also has a PCIe riser for the server storage HBA, and two internal USB ports. The PCIe riser and the internal USB ports are located on the bottom of the SMOD.
For component serviceability, locations, and designations, see “Component Locations, and Designations” on page 79.

Service Processor (SP)

The system Emulex Pilot 3 service processor (SP) is located on the SMOD motherboard and is accessible locally and remotely through management ports on the front of the SMOD. The SP contains Oracle ILOM, an embedded server management tool. The SP is not removable.

Storage Drive Backplanes

The externally accessible server storage drives on the SMOD connect to two backplanes mounted on the SMOD motherboard. A SAS cable also connects the backplane to the HBA card that is installed in a riser slot on the bottom of the SMOD. The backplanes are not removable or replaceable.
For component serviceability, locations, and designations, see “Component Locations, and Designations” on page 79.

SAS Host Bus Adapter (HBA) Card, Riser, and Cables

The server requires one internal HBA (Oracle Storage 12 Gb/s RAID HBA, Internal) for the externally-accessible SAS (or SATA) SMOD server storage drives. The HBA is installed in a riser slot on the underside of the SMOD motherboard and is connected to the backplanes by two mini-SAS4I connector cables.
The following illustration shows the HBA card installed on the underside (bottom) of the SMOD, the two SAS cables that connect the HBA to the server storage backplanes, and the cable that connects the energy storage module to the HBA.
Call Out Description
1 SMOD motherboard
2 Cable from HBA to ESM
3 HBA
4 SAS cables
5 Backplanes
For component serviceability, locations, and designations, see “Component Locations, and Designations” on page 79.

Internal USB Ports

The SMOD has two internal USB ports on the underside of the SMOD motherboard next to the PCIe card riser slot. The ports are designated as P0 and P1.
Unless you have opted out, port P0 has a factory-installed flash drive that contains Oracle System Assistant, a bootable server set-up, provisioning, and update tool. Port P0 can only be used to support Oracle System Assistant. It cannot be used to boot an OS or store files unrelated to Oracle System Assistant.

Energy Storage Module and Cable

The Energy Storage Module (ESM) provides backup power for the HBA. It sits in a holder in the top center of the SMOD, and has a cable that connects it to the HBA.

Server Subsystems

This section contains overviews of the server subsystems:
“Controls and Indicators” on page 34
“Server Management Software” on page 43
“Storage and IO Subsystem” on page 45
“Chassis Cooling Subsystem” on page 47
“Power Subsystem” on page 50

Controls and Indicators

The system management subsystem includes the buttons, switches, and indicators on the front and back of the server, and the embedded server management software, Oracle System Assistant and Oracle ILOM:
“Front Indicator Module (FIM) Panel” on page 35
“Power Supply Unit (PSU) Indicators” on page 36
“Fan Module (FM) Indicators” on page 37
“Storage Drive Unit Indicators” on page 38
“Back Indicator Panel” on page 39
“Dual PCIe Card Carrier (DPCC) Indicators” on page 40
“AC Power Inlet Indicators” on page 41
“Switches and Buttons” on page 42
Front Indicator Module (FIM) Panel
The front indicator module (FIM) panel is located at the top left corner of the server (as viewed from the front of the server). It contains indicators and buttons that allow you to manage the server and determine its status.
The following illustration shows the buttons and indicators on the FIM.
Call Out Description Details
1 Locator indicator and button When activated remotely, it helps you find the server in a rack or room of servers. For more information about managing the Locator indicator remotely and locally, see “Managing the Locator Indicator” on page 108.
2 Service Action Required indicator When lit, it indicates that a system fault has occurred. Other amber (fault) indicators might also be lit, which can help you isolate the fault to a particular subsystem. For more information about using the Service Action Required and subsystem fault indicators, see “Troubleshooting Indicators” on page 58.
3 Power OK indicator Along with the SP indicator (below), it provides the status of the system power. For more information about using the Power and SP indicators to determine power state, see “Troubleshooting Indicators” on page 58.
4 Power on and off button Use it to manage power locally, when at the server. The duration of the button press determines the type of power off (graceful or immediate). For more information about using the Power button, see “Powering Off the Server” on page 100 and “Power On the Server” on page 236.
5 SP OK indicator Along with the Power OK indicator (above), it provides the status of the system power. For more information about using the Power and SP indicators to determine power state, see “Troubleshooting Indicators” on page 58.
6 Server over-temperature indicator When lit, it indicates that a fault has occurred in the cooling subsystem. The system Service Action Required indicator might also be lit. For more information about using the subsystem fault indicators and the system Service Action Required indicator, see “Troubleshooting Indicators” on page 58.
7 Rear (Back) Service Action Required indicator When lit, it indicates that a fault has occurred to one of the components on the server backside (SMOD, DPCC, PCIe card, or HBA). Other indicators (status and fault) might also be lit or in non-normal operating condition state (for example, if the backside SP Service Action Required indicator is lit, the system might not be able to boot and the Power OK indicator might not turn on). For more information about using the subsystem indicators and the system Service Action Required indicator, see “Troubleshooting Indicators” on page 58.
8 CMOD service action required indicators (0-7) These light if the corresponding CMOD is in a fault state.
Power Supply Unit (PSU) Indicators
Each power supply unit (PSU) has three indicators arranged in a single row from left to right.
Call Out Description Function
1 Service Action Required/Locate (amber) Lights steady on when the power supply is in a fault state.
2 Status OK indicator (green) Lights steady on when the PSU is powered on and in a normal functioning state (in this state, the AC indicator is also lit)
3 AC OK indicator (green) Lights steady on when the PSU is connected to a properly rated AC power source
4 Release lever Used to release the power supply from the chassis
Fan Module (FM) Indicators
Each fan module (FM) has two indicators arranged in a single row from left to right.
The following illustration shows the front of the FM.
Call Out Description Function
1 Service Action Required indicator (amber) Lights steady on when the FM is in a fault state.
2 Status OK indicator (green) Lights steady on when the FM is powered on and functioning properly.
3 Release button Press to release the fan module so you can remove it.
Storage Drive Unit Indicators
Storage drives are installed in carriers. Each storage drive carrier has three indicators stacked in a single column, from bottom to top.
The following illustration shows the front of the storage drive carrier and the storage drive indicators.
Call Out Description Function
1 Ready to Remove indicator (blue) Lights when the storage drive is ready to be removed from the server in response to an action initiated from the server OS.
2 Service Action Required indicator (amber) Lights steady on when the drive is in a fault state.
3 OK indicator (green) Lights when the storage drive is functioning normally and blinks to show activity.
Note - The storage drive indicators blink at various rates depending on the activity. For more information on blink rates, see “Indicator Blink Rates” on page 62.
Back Indicator Panel
The back indicator panel, located on the SMOD, allows you to manage the server and determine its status. It includes some indicators and buttons not found on the front indicator module (FIM), including reset switches and indicators for SMOD components.
The following figure shows the back indicator panel:
Call Out Description Details
1 Non-Maskable Interrupt (NMI) button Service personnel only. Do not press. This button requires a stylus.
2 Host reset button (recessed) This button performs an immediate host reboot. This button requires a stylus.
3 Locator indicator and button When activated remotely, it helps you find the server. Locally it can be pressed to prove physical presence. For more information about managing the Locator indicator remotely and locally, see “Managing the Locator Indicator” on page 108.
4 System Service Action Required indicator When lit, it indicates that a system fault has occurred. Other amber (fault) indicators might also be lit, which can help you isolate the fault to a particular subsystem. For more information about using the Service Action Required and subsystem fault indicators, see “Troubleshooting Indicators” on page 58.
5 Power OK indicator Along with the SP indicator (below), it provides the status of the system power. For more information about using the Power and SP indicators to determine power state, see “Troubleshooting Indicators” on page 58.
6 SP OK indicator Along with the Power OK indicator (above), it provides the status of the system power. For more information about using the Power and SP indicators to determine power state, see “Troubleshooting Indicators” on page 58.
7 SP reset button (recessed) Press to manually reset the service processor if it becomes unresponsive, requires a reset, or fails to boot to standby power. This button requires a stylus.
8 SMOD Service Action Required indicator Lights when the SMOD requires service.
9 HBA Service Action Required indicator Lights when the HBA requires service.
Dual PCIe Card Carrier (DPCC) Indicators
Each DPCC has two indicator panels, one for each PCIe slot inside the server. Each panel contains a green OK indicator, an amber Service Action Required indicator, and a recessed pinhole Attention (ATTN) button.
Call Out Description
1 Recessed pinhole button
2 Service Action Required/Locator indicator
3 OK indicator
AC Power Inlet Indicators
Each power inlet on the AC power block at the back of the server has a single green OK indicator that turns on steady only when the power at the connector is sufficient for the power supply unit. In the following illustration, call out 1 shows the OK indicator for inlet AC 0.
Switches and Buttons
When you are at the server, the following switches and buttons are accessible:
Front panel Power button
Allows you to control server power while local to (at) the server. For power off information, see “Powering Off the Server” on page 100. For power on information, see “Power On the Server” on page 236.
Two Locator indicator buttons (one on the front of the server and one on the back)
The buttons allow you to manage the Locator indicator locally. To deactivate (or activate) the Locator indicator, press and release the button (see “Managing the Locator Indicator” on page 108).
Service processor (SP) pinhole reset button on the back of the server
The SP reset button allows you to manually reset the SP. Use the reset button if the SP becomes unresponsive, requires a reset, or fails to boot into standby power mode (activating the button requires the use of a stylus).
For location information, see “Back Indicator Panel” on page 39.
Host pinhole reset button on the back of the server
The Host Reset button allows you to perform an immediate reboot of the server (activating the button requires the use of a stylus).
For location information, see “Back Indicator Panel” on page 39.
NMI pinhole button on the back of the server
The NMI button is used by Service personnel only. Do not press.
For location information, see “Back Indicator Panel” on page 39.
CMOD Fault Remind button
Each CMOD has a motherboard-mounted Fault Remind button. The button is part of the CMOD Fault Remind circuit. The circuit is charged and allows you to identify a failed DIMM or CPU after the CMOD has been removed from the server.
For button location information, see “CMOD Layout” on page 27.
Sixteen (16) recessed ATTN (attention) buttons (two on each DPCC)
The buttons are used to initiate DPCC removal and installation. Before removing a DPCC, use a stylus to press both ATTN buttons. After installing a DPCC that contains a PCIe card, press the button again.
For button location information, see “Front Indicator Module (FIM) Panel” on page 35 and “Back Indicator Panel” on page 39.

Server Management Software

The system management software includes:
“Service Processor (SP) Oracle ILOM” on page 43
“Oracle System Assistant” on page 44
“Oracle Hardware Management Pack” on page 44
Service Processor (SP) Oracle ILOM
The server System Module (SMOD) includes an Emulex Pilot 3 service processor (SP) that runs Oracle ILOM. Oracle ILOM allows you to manage and monitor the server locally or remotely in full power or standby power modes. Local and remote interface and control connections to the SP are on the back of the server and include an RJ45 10/100/1000 GigabitEthernet port (remote access) and an RJ45 serial connector and DB15 VGA connector (local access). For information about Oracle ILOM, including initial release version and update information, see http://www.oracle.com/goto/ILOM/docs.
Oracle System Assistant
Your server might also come equipped with Oracle System Assistant, a server provisioning and update tool that assists in initial server set up and OS installation and allows you to easily manage server updates. A server-specific version of Oracle System Assistant is installed on the internal SMOD USB slot P0 at the factory.
You can start Oracle System Assistant from the server boot screen or from Oracle ILOM.
With Oracle System Assistant, you can:
Get a single server-specific bundle of the latest available BIOS, Oracle ILOM, and hardware firmware and the latest tools and OS drivers from the Oracle support site.
Update OS drivers and component firmware and configure RAID.
Install supported operating systems with the latest drivers and supported tools.
Configure a subset of Oracle ILOM settings.
Save and restore customized BIOS settings or revert the BIOS to the factory defaults.
Display system overview and detailed hardware inventory information.
For more information, refer to the Oracle X5 Series Servers Administration Guide at: http://www.oracle.com/goto/x86AdminDiag/docs
Oracle Hardware Management Pack
Oracle Hardware Management Pack (HMP) provides a family of command-line interface (CLI) tools for managing your servers, and an SNMP monitoring agent.
You can use the Oracle Server CLI tools to configure Oracle servers. The CLI tools work with Oracle Solaris, Oracle Linux, Oracle VM, other variants of Linux, and Windows operating systems. They can be scripted to support multiple servers, as long as the servers are of the same type.
With the Hardware Management Agent SNMP Plugins, you can use SNMP to monitor Oracle servers from the operating system using a single host IP address. This prevents you from having to connect to two management points (Oracle ILOM and the host).
The Hardware Management Agent fetches and pushes information to and from Oracle ILOM. The SNMP Plugins provide an industry-standard SNMP user interface.
Oracle Linux Fault Management Architecture (FMA) allows you to manage faults at the operating system level using commands similar to those in the Oracle ILOM Fault Management shell on systems with Oracle Linux 6.5 or newer. Oracle Linux FMA is available on Hardware Management Pack 2.3.
For more details on Oracle Hardware Management Pack, refer to:
http://www.oracle.com/goto/OHMP/docs

Storage and IO Subsystem

The server storage and input/output subsystem consists of the following:
8 or 16 PCIe Gen3 IO slots (up to eight 16-lane + eight 8-lane)
8 SAS2/SATA3 HDD or SSD SFF drives
Two 1G/100/10 Ethernet ports
4 USB 2.0 ports (2 external, 2 internal)
Back Panel Ports and Connectors
The following figure shows the back panel ports and connectors.
Call Out Description
1 Video DB-15
2 USB 2.0 port
3 USB 2.0 port
4 1 RJ45 10/100/1000 Ethernet service processor (SP) port (NET MGT)
5 1 RJ45 RS-232 serial console port (SER MGT)
6 RJ45 Host GigabitEthernet port (NET 0)
7 RJ45 Host GigabitEthernet port (NET 1)
8 AC power inlets
Dual PCIe Card Carrier (DPCC)
In the following illustration, call out 1 shows the location of the dual PCIe card carriers (DPCCs). The eight DPCCs are directly accessible from the back of the server and are located below the SMOD. Each DPCC holds two PCIe cards.
For component serviceability, locations, and designations, see “Component Locations, and Designations” on page 79.
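The "8 or 16" PCIe slot count listed for the storage and IO subsystem follows from the CMOD configuration: each DPCC holds two PCIe cards, and only DPCCs 0-3 are active in a four-CMOD server. The short sketch below shows the arithmetic; the names are illustrative assumptions, not part of any Oracle interface.

```python
CARDS_PER_DPCC = 2  # each DPCC holds two PCIe cards


def active_pcie_slots(cmod_count: int) -> int:
    """Return the number of usable PCIe slots for a four- or eight-CMOD server."""
    if cmod_count not in (4, 8):
        raise ValueError("only four- and eight-CMOD configurations are supported")
    active_dpccs = cmod_count  # DPCCs 0-3 or DPCCs 0-7 are active
    return active_dpccs * CARDS_PER_DPCC


print(active_pcie_slots(4))  # 8
print(active_pcie_slots(8))  # 16
```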

Chassis Cooling Subsystem

System cooling air flows from front to back. Primary cooling is provided by eight redundant front-side accessible 100 watt hot-swappable cooling fan modules. To maintain the integrity of the cooling system, ensure that:
All CMOD processors have a heat sink.
Each drive bay contains a storage device or a drive slot filler.
Every DPCC is installed regardless of whether it contains a card or not.
Both fan frames are populated with fan modules.
Cooling Zones
The server has five cooling zones. The cooling zones are designated from left to right (from the front of the server) as zone 0 - zone 4. The airflow cooling in zone 0 is concentrated through the power supplies (PSUs) and is provided by the internal PSU fan modules.
The fan modules (FMs) provide the airflow cooling for zones 1-4. Each zone has a pair of dedicated FMs:
Zone 1 airflow cooling is concentrated on the CPU modules (CMODs) 0 and 1 and is provided by FMs 0 and 1.
Zone 2 airflow cooling is concentrated on CMODs 2 and 3 and is provided by FMs 2 and 3.
Zone 3 airflow cooling is concentrated on CMODs 4 and 5 and is provided by FMs 4 and 5.
Zone 4 airflow is concentrated on CMODs 6 and 7 and is provided by FMs 6 and 7.
Note - In a four-CMOD configured server, the fan modules for cooling zones 3 and 4 are not powered. However, to maintain the integrity of the cooling subsystem, FMs 4-7 must be installed in the server.
Call Out Description
0 Zone 0: Power supplies
Cooling provided by power supply fans
1 Zone 1: CMODs 0 and 1
Cooling provided by FMs 0 and 1
2 Zone 2: CMODs 2 and 3
Cooling provided by FMs 2 and 3
3 Zone 3: CMODs 4 and 5
Cooling provided by FMs 4 and 5
4 Zone 4: CMODs 6 and 7
Cooling provided by FMs 6 and 7
Cooling Fan Power
Power for the internal PSU cooling fans (zone 0) is provided by the PSUs. Power for the fan modules (zones 1-4) is supplied by CMODs 0, 2, 4, and 6.
The chassis cooling fans operate only when the chassis is in full power mode (see “Full Power Mode” on page 107).
The PSU fans operate when the system is in full power or standby power mode.
The following table shows the CMODs and the fan modules to which they supply power:
CMOD Fan Modules Powered
CMOD 0 FMs 0 and 1
CMOD 2 FMs 2 and 3
CMOD 4 FMs 4 and 5
CMOD 6 FMs 6 and 7
Note - The fan power connectors for CMODs in slots 1, 3, 5, and 7 are not used.
Fan Module Redundancy
The eight fan modules (FMs) provide airflow for chassis cooling zones 1-4. For redundancy, each zone has two dedicated FMs. If an FM fails, replace it immediately. The FMs are hot-serviceable.
Caution - Data Loss. Do not remove more than one fan module from a column while the system is in full power mode. This action removes power from the CMODs and causes an immediate shutdown. On an eight-CMOD system, this applies to all fan modules. On a four-CMOD system, this applies to the fan modules in the left-hand fan frame.
For FM reference and servicing information, see “Servicing Fan Modules and Fan Frames” on page 117.
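Because each fan-cooled zone has exactly two dedicated FMs, the effect of a failed FM on zone redundancy can be worked out from the FM number alone. The following sketch assumes an eight-CMOD server in which all four zones are powered. The function name and the status labels are hypothetical and used only for this illustration; they are not Oracle ILOM terms.

```python
def zone_status(failed_fms):
    """Map each fan-cooled zone (1-4) to a status based on failed fan modules.

    failed_fms: iterable of failed fan module numbers (0-7).
    """
    failed = set(failed_fms)
    if not failed <= set(range(8)):
        raise ValueError("fan module numbers must be 0-7")
    labels = ("ok", "redundancy lost", "both fans failed")
    status = {}
    for zone in range(1, 5):
        first = 2 * (zone - 1)           # zone 1 -> FMs 0 and 1, zone 2 -> FMs 2 and 3, ...
        pair = {first, first + 1}
        status[zone] = labels[len(pair & failed)]
    return status


print(zone_status([2]))
# {1: 'ok', 2: 'redundancy lost', 3: 'ok', 4: 'ok'}
```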

Power Subsystem

Chassis power is provided by four hot-serviceable front-side accessible power supply units (PSUs). The four PSUs provide dual (2+2) redundancy. Therefore, the minimum PSU configuration is two. To ensure redundancy, power for the server should come from at least two separate circuits.
When the AC power cords are connected to AC inputs at the back of the chassis, the PSUs supply power to the Ethernet ports, the system sensors and inventory circuits, and the service processor (SP). When power is supplied to the SP, it boots, and the server enters the low-power standby power mode.
Once the SP boots into standby power mode, full power mode is initiated by pressing and releasing the chassis front-panel Power button or by powering on the server remotely from Oracle ILOM.
For more information about power modes, see “Power Modes, Shutdowns, and Resets” on page 107.
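The 2+2 redundancy described in this section means that two of the four PSUs are needed to power the chassis and two are spares. As a small illustration (the names are assumptions made for this sketch), the remaining failure margin for a given number of working PSUs is:

```python
INSTALLED_PSUS = 4  # four PSUs in the chassis
REQUIRED_PSUS = 2   # minimum PSU configuration stated in this section


def psu_margin(working: int) -> int:
    """Return how many more PSUs can fail before the count drops below the minimum.

    A negative result means the server is already below the minimum configuration.
    """
    if not 0 <= working <= INSTALLED_PSUS:
        raise ValueError("working PSU count must be 0-4")
    return working - REQUIRED_PSUS


print(psu_margin(4))  # 2
print(psu_margin(2))  # 0
```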

Server Block Diagram

The following illustration shows a block diagram of the server interconnects between the CMODs, the midplane, and the SMOD. It also shows the interconnects between attached and integrated components:

Troubleshooting and Diagnostics

This section provides information about troubleshooting hardware component faults for the Oracle Server X5-8. It contains the following topics.
Description Link
Maintenance-related information and procedures used to troubleshoot and repair server hardware issues. “Troubleshooting Server Hardware Component Faults” on page 53
Information about software and firmware diagnostic tools used to isolate problems, monitor the server, and exercise the server subsystems. “Troubleshooting With Diagnostic Tools” on page 70
Information about attaching devices to the server to perform troubleshooting. “Attaching Devices to the Server” on page 72
Information about contacting Oracle support. “Getting Help” on page 75

Troubleshooting Server Hardware Component Faults

This section contains maintenance-related information and procedures used to troubleshoot and repair server hardware. It includes the following topics.
Description Section Links
Troubleshooting overview information and procedure “Troubleshooting Server Hardware Faults” on page 54
Source listing for troubleshooting and diagnostic information “Troubleshooting and Diagnostic Information” on page 58
Discerning the server state using the front panel indicators “Troubleshooting Indicators” on page 58
Explanation of indicator blink rates “Indicator Blink Rates” on page 62
Explanation of the CMOD Fault Remind Test Circuit “The CMOD Fault Remind Test Circuit” on page 68
Causes, actions, and preventative measures for problems related to the cooling subsystem “Troubleshooting System Cooling Issues” on page 68
Causes, actions, and preventative measures for problems related to the power subsystem “Troubleshooting Power Issues” on page 69

Troubleshooting Server Hardware Faults

When a server hardware fault event occurs, the system lights the Service Action Required LED and captures the event in the system event log (SEL). If you have set up notifications through Oracle ILOM, you also receive an alert through the notification method you chose.
When you become aware of a hardware fault, you should address it immediately.
To investigate a hardware fault, see the following:
“Basic Troubleshooting Steps” on page 54
“Troubleshoot Hardware Faults” on page 55
“Troubleshooting System Cooling Issues” on page 68
“Troubleshooting Power Issues” on page 69
Basic Troubleshooting Steps
Use the following process to address a hardware fault (for the step-by-step procedure, see “Troubleshoot Hardware Faults” on page 55):
1. Identify the server subsystem containing the fault.
You can use Oracle ILOM to identify the failed component.
2. Review the Product Notes.
Once you have identified the hardware issue, review the Oracle Server X5-8 Product Notes. This document contains up-to-date information about the server, including hardware-related issues.
3. Prepare the server for service using Oracle ILOM.
If you have determined that the hardware fault requires service (physical access to the server), use Oracle ILOM to power off the server, activate the Locate LED, and take the server offline.
4. Prepare the service work space.
Before servicing the server, prepare the work space to ensure ESD protection for the server and components.
5. Service components.
To service the components, see the removal, installation, and replacement procedures in this document.
Note - A component designated as a FRU must be replaced by Oracle Service personnel. Contact Oracle Service.
6. Clear the fault in Oracle ILOM.
Depending on the component, you might need to clear the fault in Oracle ILOM. Generally, components that have a FRU ID clear the fault automatically.
See Also:
“Troubleshoot Hardware Faults” on page 55
Troubleshoot Hardware Faults
Note - The screens and information in this procedure might differ from those for your server.
This procedure uses the basic troubleshooting steps described in “Basic Troubleshooting Steps” on page 54.
Use this procedure to troubleshoot hardware faults with the Oracle ILOM web interface and, if necessary, prepare the server for service.
Note - This procedure provides one basic approach to troubleshooting hardware faults. It uses a combination of the Oracle ILOM web and CLI interfaces. However, the procedure can be performed using only the Oracle ILOM CLI interface. For more information about the Oracle ILOM web interface, refer to the Oracle ILOM documentation.
Before You Begin
Obtain the latest version of the Oracle Server X5-8 Product Notes.
1. Log in to the server SP Oracle ILOM web interface.
Open a browser and type in the IP address of the server SP. Enter a user name (with administrator privileges) and password at the log-in screen. The Summary screen appears.
The Status section of the Summary screen provides information about the server subsystems, such as:
Processors
Memory
Power
Cooling
Storage
Networking
I/O Modules
2.
In the Status section of the summary screen, identify the server subsystem that requires service.
In the above example, the Status screen shows that the Memory subsystem requires service. This indicates that a hardware component within the subsystem is in a fault state.
3.
To identify the component, click on the subsystem name.
The subsystem screen appears.
The above example shows the Memory subsystem screen and indicates that DIMM 8 on CPU 0 has an uncorrectable ECC fault.
4.
To get more information, click one of the Open Problems links.
The Open Problems screen provides detailed information, such as the time the event occurred, the component and subsystem name, and a description of the issue. It also includes a link to a KnowledgeBase article.
Tip - The System Log provides a chronological list of all the system events and faults that have
occurred since the log was last reset and includes additional information, such as severity levels and error counts. To access it, click the System Log link.
In this example, the hardware fault with DIMM 8 of CPU 0 requires local (physical) access to the server.
5.
Before going to the server, review the server Product Notes document for information related to the issue or the component.
The Oracle Server X5-8 Product Notes contains up-to-date information about the server, including hardware-related issues.
6.
To prepare the server for service, see “Preparing for Service” on page 95.
Note - After servicing the component, you might need to clear the fault in Oracle ILOM. Refer to the service procedure for the component for more information.

Troubleshooting and Diagnostic Information

The following table lists diagnostic- and troubleshooting-related procedures and references that can assist you with resolving server issues.
Description | Link
Diagnostic information for the x86 servers, including procedures for performing runtime and firmware-based tests, using Oracle ILOM, and running U-Boot and UEFIdiag to exercise the system and isolate subtle and intermittent hardware-related problems. | Oracle x86 Server Diagnostics, Applications, and Utilities Guide
Administrative information for the Oracle Server X5 series servers, including information about how to use Oracle System Assistant and using the Oracle ILOM system event log (SEL) to identify a problem's possible source. | Oracle X5 Series Servers Administration Guide

Troubleshooting Indicators

The eight indicators on the server front panel show the state of the server. For more information about indicator locations, see “Front Indicator Module (FIM) Panel” on page 35 and “Back
Indicator Panel” on page 39.
The following sections describe the status of the front panel indicators for various server states:
Note - For the error state scenarios described below, the state of the Power OK indicator depends on the presence of redundant components and the severity of the fault.
“Server Boot Process and Normal Operating State Indicators” on page 59
“Locator Indicator On” on page 60
“Over Temperature Condition” on page 60
“PSU Failure” on page 61
“Memory Failure” on page 61
“CPU Failure” on page 61
“Fan Module Failure” on page 61
“SP Failure” on page 61
“Front Panel Lamp Test” on page 62
Server Boot Process and Normal Operating State Indicators
A normal server boot process involves the service processor (SP) indicator and the Power OK indicator. In the illustration below, call out 1 shows the Power OK indicator and call out 3 shows the SP indicator. Call out 2 shows the power button.
The following table describes the indicator activity during a normal boot sequence.
System Condition | SP Indicator | Power OK Indicator
AC power applied to server. SP is booting. | Blinks | Off
SP is booted and ready to use. Host is off. | Steady On | Blinks at single blink rate (quick flash every 3 seconds)
SP is running. Host is booting. | Steady On | Blinks at fast rate
SP and host are running. This is the normal operating state of the system. | Steady On | Steady On
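The boot sequence table above can be read as a lookup from the two observed indicators to a system condition. The following Python sketch is only an illustration of that reading. The function name and the short indicator labels ("blinks", "single blink", and so on) are assumptions made for this example and are not part of Oracle ILOM or any Oracle tool.

```python
# Boot-sequence states from the table above, keyed by
# (SP indicator, Power OK indicator). The labels are illustrative only.
BOOT_STATES = {
    ("blinks", "off"): "AC power applied to server. SP is booting.",
    ("steady on", "single blink"): "SP is booted and ready to use. Host is off.",
    ("steady on", "fast blink"): "SP is running. Host is booting.",
    ("steady on", "steady on"): "SP and host are running (normal operating state).",
}


def boot_state(sp_indicator: str, power_ok_indicator: str) -> str:
    """Return the system condition for an observed indicator pair."""
    key = (sp_indicator.strip().lower(), power_ok_indicator.strip().lower())
    # Any other combination is outside the normal boot sequence.
    return BOOT_STATES.get(key, "Not a normal boot-sequence state; check for faults.")
```

For example, `boot_state("Steady On", "Single Blink")` returns the standby condition (SP ready, host off). Any pair that is not in the table falls through to the fault-check message.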
Locator Indicator On
The Locator indicator is a white combination button/indicator located on both the front and back panels. When it is on, it blinks at the fast blink rate:
Turn it on remotely from Oracle ILOM to locate the server in a rack.
Typically, a server readied for service is placed in standby power mode and the Locator indicator is turned on.
Press the button to prove physical presence. Some service procedures require you to prove physical presence by pressing the Locator indicator button.
You can turn the Locator indicator off remotely from Oracle ILOM, or by pressing the button.
The following figure shows the Locator indicator on the front panel:
For indicator blink rate information, see “Indicator Blink Rates” on page 62.
Over Temperature Condition
For a server in an over-temperature state, the server amber over-temperature indicator and the amber Service Action Required indicators (front and back) are on steady. The state of the front and back green Power OK indicator and the green SP indicator depends on the severity of the condition.
For indicator blink rate information, see “Indicator Blink Rates” on page 62.
PSU Failure
For a server with a PSU in a failed state, the server amber Service Action Required indicators (front and back) and the amber Service Action Required indicator on the PSU are on steady. The front and back green Power OK indicator and the green SP indicator are on steady.
For indicator blink rate information, see “Indicator Blink Rates” on page 62.
Memory Failure
For a server with a failure in the memory subsystem, the server amber Service Action Required indicators (front and back) and an amber CMOD Service Action Required indicator are on steady. The front and back green Power OK indicator and the green SP indicator are on steady.
For indicator blink rate information, see “Indicator Blink Rates” on page 62.
CPU Failure
For a server with a fault in the processor subsystem, the server amber Service Action Required indicators (front and back) and an amber CMOD Service Action Required indicator are on steady. The activity of the front and back green Power OK indicator and the green SP indicator varies depending on whether the server can boot successfully. The server might not be able to boot out of standby power mode.
For indicator blink rate information, see “Indicator Blink Rates” on page 62.
Fan Module Failure
For a server with a fan module fault, the server amber Service Action Required indicators (front and back) and an amber Service Action Required indicator on a fan module are on steady. The front and back green server OK indicator and the green SP indicator are on steady.
For indicator blink rate information, see “Indicator Blink Rates” on page 62.
SP Failure
For a server with an SP fault, the server amber Service Action Required indicators (front and back) are on steady. The front and back System OK indicators and the SP OK indicator are off.
For indicator blink rate information, see “Indicator Blink Rates” on page 62.
Front Panel Lamp Test
To perform a lamp test of all front panel indicators, press the Locate button three times within a five-second period. All the front and back indicators light up and remain on steady for 15 seconds (see “Unison Steady On” on page 66).
For indicator blink rate information, see “Indicator Blink Rates” on page 62.

Indicator Blink Rates

This section describes the following indicator blink rates:
“Steady On” on page 62
“Steady Off” on page 63
“Slow Blink Rate” on page 63
“Fast Blink Rate” on page 64
“Single (Standby) Blink Rate” on page 64
“Slow Unison Blink Rate” on page 65
“Insertion Blink” on page 65
“Unison Steady On” on page 66
“Alternating (Invalid FRU) Blink Rate” on page 66
“Feedback Flash” on page 67
“Data Blink Rate” on page 67
“Sequential (Diagnostic) Blink Rate” on page 67
Steady On
For the steady on state, an indicator is continually on (lit) and does not blink. This indicates a continuing condition, for example, an operational state (green) or a Service Action Required fault state (amber).
Steady Off
For the steady off state, an indicator is continually off (not lit) and does not blink. This indicates that a system is not operational, for example, no AC power (unlit green Power OK indicator) or a subsystem not in a fault state (unlit amber Service Action Required indicator).
Slow Blink Rate
For the slow blink rate, the indicator (typically green) repeatedly lights for half a second and turns off for half a second during a one-second interval (1 Hz). The slow blink rate indicates an ongoing activity, for example, a device that is rebuilding, booting, or in transition from one mode to another.
Fast Blink Rate
For the fast blink rate, the indicator repeatedly blinks twice (on, off, on) during a one second interval (2 Hz). The fast blink rate indicates activity or data transfer.
Single (Standby) Blink Rate
For the single blink rate, the indicator repeatedly flashes once at the beginning of a three-second interval. This indicates a system or component in standby mode, for example, a server in standby power mode or a hot spare device waiting to be used. The single blink rate is also used with amber indicators to indicate a predicted fault.
Slow Unison Blink Rate
For the slow unison blink rate, the indicators on the component blink in unison for half a second during a one second interval (1 Hz). Typically, this is limited to three successive blinks. This confirms the successful insertion of a removable device (for example, a storage drive) into a powered system (confirming the power connection).
Insertion Blink
The insertion blink is three successive blinks of a hot-swap component's primary status indicator (for example, the green Power OK indicator). The insertion blink occurs immediately after three successive unison blinks (see “Slow Unison Blink Rate” on page 65) of all the component indicators.
Unison Steady On
For the unison steady on, all indicators are simultaneously on steady (see “Steady On” on page 62). This occurs during the front panel lamp test (see “Front Panel Lamp Test” on page 62). This is the only time that the Locator indicator is on steady.
Alternating (Invalid FRU) Blink Rate
The alternating (invalid FRU) blink rate is a repeating sequence of lit green and amber indicators at 1 Hz. This indicates that a component has an incorrect version or mismatch (for example, a power supply with a lower rating than the one specified). The blink rate is also used for an unsupported component, or a component in an unsupported slot.
Feedback Flash
The indicator flashes on and off during periods of activity, commensurate with the activity, but the flashing does not exceed the 2 Hz fast blink rate (see “Fast Blink Rate” on page 64). For example, this blink rate occurs during disk drive read and write activity and communication port transmit and receive activity.
Data Blink Rate
For this blink rate, a normally on indicator repeatedly turns off twice during a one-second interval (2 Hz; see also “Fast Blink Rate” on page 64) while data activity is taking place.
Sequential (Diagnostic) Blink Rate
This blink rate is a repeating sequence in which each indicator successively lights for 0.5 seconds to indicate that diagnostics are running. This blink rate is used only on systems or components capable of running diagnostics.
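The periodic blink rates described in this section can be modeled as a cycle length plus the time the indicator is lit at the start of each cycle. The Python sketch below is a minimal illustration of that model, not a description of the indicator firmware. The class and constant names are assumptions for this example. The 50 percent duty cycle for the fast blink and the 0.1-second flash length for the single blink are also assumptions, because this section gives only the frequencies for those rates.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BlinkPattern:
    period_s: float  # length of one repeating cycle, in seconds
    on_s: float      # time lit at the start of each cycle, in seconds

    def is_lit(self, t: float) -> bool:
        """Return True if the indicator is lit t seconds after a cycle starts."""
        return (t % self.period_s) < self.on_s


STEADY_ON = BlinkPattern(period_s=1.0, on_s=1.0)     # always lit
STEADY_OFF = BlinkPattern(period_s=1.0, on_s=0.0)    # never lit
SLOW_BLINK = BlinkPattern(period_s=1.0, on_s=0.5)    # 1 Hz, half a second on
FAST_BLINK = BlinkPattern(period_s=0.5, on_s=0.25)   # 2 Hz; 50% duty cycle is assumed
SINGLE_BLINK = BlinkPattern(period_s=3.0, on_s=0.1)  # 3 s cycle; 0.1 s flash is assumed
```

For example, `SLOW_BLINK.is_lit(0.25)` is true and `SLOW_BLINK.is_lit(0.75)` is false, which matches the half-second on, half-second off description of the slow blink rate.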

The CMOD Fault Remind Test Circuit

The CMODs have an internal test circuit that you can use to locate failed DIMMs and verify a failed CPU after removing the CMOD from the server. The DIMM and CPU Fault Remind circuits hold an electrical charge for 10 minutes after power is removed from the server, allowing enough time to remove the CMOD and use the circuit.
For more information, see “Replace a Failed DIMM” on page 146 and “Remove a Heatsink
and Processor (FRU)” on page 161.

Troubleshooting System Cooling Issues

Maintaining the proper internal operating temperature of the server is crucial to the health of the server. To prevent server shutdown and damage to components, address over-temperature and hardware-related issues as soon as they occur. If your server has a temperature fault, the cause of the problem might be:
“External Ambient Temperature Too High” on page 68
“Airflow Blockage” on page 68
“Hardware Component Failure” on page 69
External Ambient Temperature Too High
If the ambient temperature in the server space is too high, the cool air that is pulled into the server cannot cool the server sufficiently to prevent the internal temperature from rising. This can cause poor performance or component failure.
Action: Check the ambient temperature of the server space against the environmental specifications for the server. If the temperature is not within the required operating range, remedy the situation immediately.
Prevention: Periodically check the ambient temperature of the server space to ensure that it is within the required range, especially if you have made any changes to the server space (for example, added additional servers). The temperature must be consistent and stable.
Airflow Blockage
The server cooling system uses fans to pull cool air in from the server front intake vents and exhaust warm air out the server back panel vents. If the front or back vents are blocked, the airflow through the server is disrupted and the cooling system fails to function properly, causing the server internal temperature to rise.
Action: Inspect the server front and back panel vents for blockage from dust or debris. Additionally, inspect the server interior for improperly installed components or cables that can block the flow of air through the server.
Prevention: Periodically inspect and clean the server vents using a vacuum cleaner. Ensure that all components, such as cards, cables, fans, air baffles and dividers are properly installed.
Hardware Component Failure
Fan modules and power supply fans drive the server cooling system. When one of these components fails, the server internal temperature can rise. This rise in temperature can cause other components to enter into an over-temperature state. Additionally, some components, such as processors, might overheat when they are failing, which can also generate an over-temperature event.
To reduce the risk related to component failure, power supplies and fan modules are installed in pairs to provide redundancy. Redundancy ensures that if one component in the pair fails, the remaining component can continue to maintain the subsystem. For example, power supplies serve a dual function; they provide both power and airflow. If one power supply fails, the other functioning power supply is able to maintain both the power and the cooling subsystems.
Action: Investigate the cause of the over-temperature event, and replace failed components immediately. For hardware troubleshooting information, see “Troubleshooting Server Hardware
Faults” on page 54.
Prevention: Maintain redundant systems and replace failed components immediately.

Troubleshooting Power Issues

If your server does not power on, the cause of the problem might be:
“AC Power Connection” on page 69
“Power Supplies (PSUs)” on page 70
AC Power Connection
The AC power cords are the direct connection between the server power supplies and the power sources. The server power supplies need separate stable AC circuits operating at specific voltage levels. Insufficient voltage levels or voltage fluctuations can cause server power problems.
Action: Check that the AC power cords are connected to the server. Check that the correct power is present at the outlets and monitor the power to verify that it is within the acceptable range.
AC OK indicators next to the AC inlets on the back of the server are green when the power is connected, and off when it is not.
The AC OK and DC OK indicators on the PSU indicator panels on the front of the system are green when the PSU is functioning properly.
Prevention: Use the AC power cord retaining clips and position the cords to minimize the risk of accidental disconnection. Ensure that the AC circuits that supply power to the server are stable and not overburdened.
Power Supplies (PSUs)
The server power supplies (PSUs) provide the necessary server voltages from the AC power outlets. If the PSUs are inoperable, unplugged, or disengaged from the internal connectors, the server cannot power on.
Action: Check that the AC cables are connected to both PSUs. Check that the PSUs are operational (the PSU indicator panel should have a lit green AC OK indicator). Ensure that the PSU is properly installed. A PSU that is not fully engaged with its internal connector does not have power applied and does not have a lit green AC OK indicator.
Prevention: When a power supply fails, replace it immediately. When installing a power supply, ensure that it is fully seated and engaged with its connector inside the drive bay. A properly installed PSU has a lit green AC OK indicator.
Troubleshooting With Diagnostic Tools
The server and its accompanying software and firmware contain diagnostic tools and features that can help you isolate component problems, monitor the status of a functioning system, and exercise one or more subsystems to disclose more subtle or intermittent hardware-related problems.
Each diagnostic tool has its own specific strength and application. Review the tools listed in this section and determine which tool might be best to use for your situation. Once you've determined the tool to use, you can access it locally, while at the server, or remotely.
“Diagnostic Tools” on page 71
“Diagnostic Tool Documentation” on page 72
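As a rough illustration of matching a tool to a situation, the following Python sketch filters tools by whether the host operating system is running. The availability values are taken from the Diagnostic Tools table in this section. The names `Tool`, `TOOLS`, and `candidates` and the `os_state` labels are assumptions made for this example. UEFI Diagnostics is left out because the table does not state its operating system dependence.

```python
from typing import NamedTuple


class Tool(NamedTuple):
    name: str
    os_state: str  # "any", "required", or "not_running" (labels are illustrative)


# Availability as summarized in the Diagnostic Tools table.
TOOLS = [
    Tool("Oracle ILOM", "any"),                        # not OS dependent
    Tool("Preboot Menu", "any"),                       # not OS dependent
    Tool("Hardware-based LED indicators", "any"),      # needs only system power
    Tool("Power-on Self-Test (POST)", "not_running"),  # runs on startup
    Tool("U-Boot", "any"),                             # not OS dependent
    Tool("Solaris commands", "required"),
    Tool("Oracle VTS", "required"),
]


def candidates(os_running: bool) -> list[str]:
    """Return the names of tools usable in the given operating system state."""
    allowed = {"any", "required"} if os_running else {"any", "not_running"}
    return [tool.name for tool in TOOLS if tool.os_state in allowed]
```

With the operating system down, `candidates(False)` excludes Solaris commands and Oracle VTS, leaving the firmware-based and hardware-based tools.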

Diagnostic Tools

The diagnostic tools range in complexity from a comprehensive validation test suite (Oracle VTS) to a chronological event log (Oracle ILOM System Log). They include standalone software packages, firmware-based tests, and hardware-based LED indicators.
The following table summarizes the diagnostic tools.
Diagnostic Tool | Type | What It Does | Availability | Remote Capability
Oracle ILOM | SP firmware | Monitors environmental condition and component functionality sensors, generates alerts, performs fault isolation, and provides remote access. | Available in either standby power mode or full power mode. It is not OS dependent. | Designed for remote and local access.
Preboot Menu | SP firmware | Enables you to restore some of Oracle ILOM defaults (including firmware) when Oracle ILOM is not accessible. | Available in either standby power mode or full power mode. It is not OS dependent. | Local, but remote serial access is possible if the SP serial port is connected to a network-accessible terminal server.
Hardware-based LED indicators | Hardware and SP firmware | Indicate status of overall system and particular components. | Available when system power is available. | Local, but sensor and indicators are accessible from Oracle ILOM web interface or command-line interface (CLI).
Power-on Self-Test (POST) | Host firmware | Tests core components of system: CPUs, memory, and motherboard I/O bridge integrated circuits. | Runs on startup. Available when the operating system is not running. | Local, but can be accessed through Oracle ILOM Remote Console.
U-Boot | SP firmware | Initializes and tests aspects of the service processor (SP) prior to booting the Oracle ILOM SP and operating system. Tests SP memory, SP, network devices and I/O devices. | Available in either standby power mode or full power mode. It is not OS dependent. | Local, but remote serial access is possible if the SP serial port is connected to a network-accessible terminal server.
Solaris commands | Operating system software | Displays various kinds of system information. | Requires operating system. | Local, and over network.
Oracle VTS | Diagnostic tool standalone software | Exercises and stresses the system, running tests in parallel. | Requires operating system. Install Oracle VTS software separately. | View and control over network.
UEFI Diagnostics | A suite of diagnostic tests | Run diagnostic tests from Oracle ILOM. | Run tests manually or automatically. Read the results on screen or in log files. | Remote access using Oracle ILOM.

Diagnostic Tool Documentation

The following table identifies where you can find more information about diagnostic tools.
Diagnostic Tool | Information | Location
Oracle ILOM | Oracle Integrated Lights Out Manager Documentation Library | http://www.oracle.com/goto/ILOM/docs
Preboot Menu | Using the Preboot Menu Utility | Oracle X5 Series Servers Administration Guide
U-Boot | Oracle x86 Servers Diagnostics Guide | Oracle x86 Server Diagnostics, Applications, and Utilities Guide
System indicators and sensors | Oracle Server X5-8 Service Manual | “Troubleshooting Indicators” on page 58
UEFI diagnostics | Oracle x86 Servers Diagnostics Guide | Oracle x86 Server Diagnostics, Applications, and Utilities Guide
Oracle VTS | Oracle VTS software and documentation | Oracle x86 Server Diagnostics, Applications, and Utilities Guide

Attaching Devices to the Server
The following sections contain procedures for attaching devices to the server. These allow you to access diagnostic tools when troubleshooting and servicing the server:
“Attach Devices to the Server” on page 73
“Configuring Serial Port Sharing” on page 73
“Ethernet Port Device Naming” on page 75
Attach Devices to the Server
This section provides instructions for connecting remote and local devices to the server so you can interact with the service processor (SP) and the server console.
For port and connector information, see “Back Panel Ports and Connectors” on page 45.
1.
Connect an Ethernet cable to the Gigabit Ethernet (NET) connectors as needed for OS support.
2.
To connect to Oracle ILOM over the network, connect an Ethernet cable to the Ethernet port labeled NET MGT.
3.
To access the Oracle ILOM command-line interface (CLI) locally using the management port, connect a serial null modem cable to the RJ-45 serial port labeled SER MGT.
4.
To connect to the system console locally, connect a mouse and keyboard to the server front panel USB connectors and a monitor to the server front panel DB-15 video connector.

Configuring Serial Port Sharing

By default, the SER MGT serial port connects to the SP console. Using Oracle ILOM, you can configure it to connect to the host console instead. This feature is useful for Windows kernel debugging, as it enables you to view non-ASCII character traffic from the host console.
Do not configure the SER MGT port to connect to the host console until after you have configured the Oracle ILOM network connection. Otherwise, you cannot connect to Oracle ILOM to switch it back.
For more details about restoring access to the server port on your server, see the Oracle Integrated Lights Out Manager (ILOM) 3.2 Documentation Library at: http://www.oracle.com/goto/ILOM/docs.
You can assign serial port output using either the Oracle ILOM web interface or the command-line interface (CLI). For instructions, see the following sections:
“Assign Serial Port Output Using the CLI” on page 74
“Assign Serial Port Output Using the Web Interface” on page 74
Assign Serial Port Output Using the CLI
1.
Open an SSH session and at the command line log in to the SP Oracle ILOM CLI.
Log in as a user with root or administrator privileges. For example:
ssh root@ipaddress
where ipaddress is the IP address of the server SP.
For more information, see Oracle X5 Series Servers Administration Guide.
The Oracle ILOM CLI prompt appears:
->
2.
To set the serial port owner, type:
-> set /SP/serial/portsharing owner=host
Note - The serial port sharing value by default is owner=SP.
3.
Connect a serial host to the server.
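The two commands in this procedure can be composed programmatically, for example when documenting or scripting a service checklist. The following Python sketch only builds the command strings shown in steps 1 and 2. It does not connect to the SP or run anything. The function names are hypothetical, and the set of accepted owner values is limited to the two values that appear in this procedure (`host` and the default `SP`).

```python
VALID_OWNERS = ("SP", "host")  # the two owner values shown in this procedure


def ssh_login_command(sp_address: str, user: str = "root") -> str:
    """Build the SSH command from step 1 for reaching the Oracle ILOM CLI."""
    return f"ssh {user}@{sp_address}"


def portsharing_command(owner: str) -> str:
    """Build the Oracle ILOM CLI command from step 2 for the requested owner."""
    if owner not in VALID_OWNERS:
        raise ValueError(f"owner must be one of {VALID_OWNERS}, got {owner!r}")
    return f"set /SP/serial/portsharing owner={owner}"
```

For example, `portsharing_command("host")` produces the command typed at the `->` prompt in step 2, and `portsharing_command("SP")` produces the command that restores the default owner.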
Assign Serial Port Output Using the Web Interface
1.
Log in to the service processor Oracle ILOM web interface.
To log in, open a web browser and direct it to the IP address of the server SP.
Log in as root or a user with administrator privileges. For more information, see Oracle X5
Series Servers Administration Guide.
The Summary screen appears.
2.
In the ILOM web interface, select ILOM Administration --> Connectivity from the navigation menu on the left side of the screen.
3.
Select the Serial Port tab.
The Serial Port Settings page appears.
Note - The serial port sharing setting by default is Service Processor.
4.
In the Serial Port Settings page, select Host Server as the serial port owner.
5.
Click Save for the changes to take effect.
6.
Connect a serial host to the server.

Ethernet Port Device Naming

This section contains information about the device naming for the Ethernet ports on the back panel of the server (see “Back Panel Ports and Connectors” on page 45).
Note - Naming used by the interfaces might vary from that listed below depending on which
devices are installed in the system.

The device naming for the Ethernet interfaces is reported differently by different interfaces and operating systems. The following illustration explains the logical (operating system) and physical (BIOS) naming conventions used for each interface. These naming conventions might vary depending on conventions of your operating system and which devices are installed in the server.
Port | BIOS | Oracle Solaris | Linux | Windows
Net 1 | 0701 | igb 1 | eth 1 | net2
Net 0 | 0700 | igb 0 | eth 0 | net
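The naming table above can be held as a small lookup keyed by the back panel port label. The Python sketch below copies the table values as listed. The dictionary and function names are assumptions for this example, and actual names might differ on your system, as noted earlier in this section.

```python
# Rows copied from the table above; keys are the back panel port labels.
ETHERNET_NAMES = {
    "Net 0": {"BIOS": "0700", "Oracle Solaris": "igb 0", "Linux": "eth 0", "Windows": "net"},
    "Net 1": {"BIOS": "0701", "Oracle Solaris": "igb 1", "Linux": "eth 1", "Windows": "net2"},
}


def device_name(port: str, interface: str) -> str:
    """Return the name that the given interface reports for a back panel port."""
    return ETHERNET_NAMES[port][interface]
```

For example, `device_name("Net 1", "Linux")` returns the Linux name listed for the Net 1 port.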
Getting Help
The following sections describe how to get additional help to resolve server-related problems.
“Contacting Support” on page 76
“Locating the Chassis Serial Number” on page 76

Contacting Support

If the troubleshooting procedures in this chapter fail to solve your problem, use the following table to collect information that you might need to communicate to support personnel.
System Configuration Information Needed Your Information
Service contract number
System model
Operating environment
System serial number
Peripherals attached to the system
Email address and phone number for you and a secondary contact
Street address where the system is located
Superuser password
Summary of the problem and the work being done when the problem occurred
IP address
Server name (system host name)
Network or internet domain name
Proxy server configuration
See Also
“Locating the Chassis Serial Number” on page 76

Locating the Chassis Serial Number

You might need to have your server's serial number when you ask for service on your system. Record this number for future use. Use one of the following methods to locate your server's serial number:
On the front panel of the server, look at the middle left of the bezel to locate the server's serial number.
Locate the yellow Customer Information Sheet (CIS) attached to your server packaging. This sheet includes the serial number.
From Oracle System Assistant, see the Summary screen.
From Oracle ILOM, enter the show /SYS command or go to the System Information tab in the Oracle ILOM browser interface.

Servicing the Server

This section provides general information about servicing the server. It includes the following:
Section Description | Link
Component serviceability requirements, locations and designations | “Component Locations, and Designations” on page 79
Procedures for creating an ESD-safe work space | “Performing Electrostatic Discharge and Static Prevention Measures” on page 90
Required tools | “Tools and Equipment” on page 91
Information about component filler panels | “Component Filler Panels and Non-Powered Components” on page 91
Procedure for clearing hardware faults in the system | “Clear Hardware Fault Messages” on page 92

Component Locations, and Designations

This section provides information about servicing components, including component locations and designations.
“Component Serviceability Requirements” on page 79
“Component Locations” on page 80
“Component Designations” on page 82
“Component Network Access Control (NAC) Names” on page 89

Component Serviceability Requirements

The following table lists the system components and identifies whether they are hot, warm, or cold service components, and whether they are a customer-replaceable unit (CRU) or a field-replaceable unit (FRU).
Hot service components can be serviced while the server is powered on and running in full-power mode.
Warm service components can be serviced while the server is in standby power mode. These include CMODs, DIMMs, and processors and heatsinks.
Cold service components must be serviced when the server is completely powered off and disconnected from the power source.
A CRU or FRU designation determines who is qualified to service a component.
CRUs can be serviced by customers.
FRUs must be serviced by qualified Oracle Service personnel.
Component Service Designation Serviceability
Front indicator module (FIM) CRU Cold
Power supply unit (PSU) CRU Hot
Fan modules (FM) CRU Hot
Fan frame CRU Cold
CPU module (CMOD) CRU Warm
Memory (DIMMs) CRU Warm
Processor and heatsink FRU Warm
Storage drive (HDD, SSD) CRU Hot
Dual PCIe card carrier (DPCC) CRU Hot
PCIe card CRU Hot/Cold
System module (SMOD) CRU Cold
Internal USB flash drive CRU Cold
External USB flash drive CRU Hot
Host bus adapter (HBA) card CRU Cold
HBA cable CRU Cold
Energy Storage Module CRU Cold
Energy Storage Module Cable CRU Cold
System Clock Battery CRU Cold
For a PCIe card: hot service as part of the DPCC, which must be removed first.
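The serviceability table above combines two questions: who may service a component, and what power state the server must be in. The Python sketch below illustrates one way to look up both for a few of the listed components. The dictionary covers only a subset of the table, and all names (`SERVICEABILITY`, `POWER_STATE`, `service_plan`) are assumptions made for this example.

```python
# Subset of the serviceability table: component -> (designation, service type).
SERVICEABILITY = {
    "Power supply unit (PSU)": ("CRU", "Hot"),
    "Fan modules (FM)": ("CRU", "Hot"),
    "CPU module (CMOD)": ("CRU", "Warm"),
    "Memory (DIMMs)": ("CRU", "Warm"),
    "Processor and heatsink": ("FRU", "Warm"),
    "PCIe card": ("CRU", "Hot/Cold"),
    "System module (SMOD)": ("CRU", "Cold"),
}

# Required server power state for each service type, as defined in this section.
POWER_STATE = {
    "Hot": "full-power mode",
    "Warm": "standby power mode",
    "Cold": "powered off and disconnected from power",
}


def service_plan(component: str) -> str:
    """Summarize who services a component and the required power state."""
    designation, service_type = SERVICEABILITY[component]
    who = "customer" if designation == "CRU" else "Oracle Service personnel"
    # A combined type such as "Hot/Cold" lists both permitted states.
    states = " or ".join(POWER_STATE[t] for t in service_type.split("/"))
    return f"{component}: serviced by {who}; {states}"
```

For example, `service_plan("Processor and heatsink")` reports that the part is serviced by Oracle Service personnel with the server in standby power mode.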

Component Locations

The following illustration shows the locations of the server components.
Call Out Description
1 AC power block
2 Dual PCIe card carrier (DPCC) with PCIe card (8)
3 Host bus adapter (HBA) card
4 Energy storage module
5 Storage drive (8)
6 System module (SMOD)
7 Top cover
8 Midplane
9 Server chassis
10 CPU module (CMOD) (4 or 8)
11 Fan frame (2)
12 Fan module (8)
13 Power supply (PSU) (4)
14 Front indicator module (FIM)
The AC power block is not a removable component.
The chassis is not a removable component.

Component Designations

These sections show the location and designation of CRU and FRU components:
“Fan Module Designations” on page 82
“Power Supply Slot Designations” on page 83
“CMOD Slot Designations” on page 84
“Memory Slot Designations” on page 85
“Storage Drive Slot Designations” on page 86
“DPCC and PCIe Card Slot Designations” on page 87
“AC Input Power Block” on page 88
Fan Module Designations
The eight fan modules (FMs) are directly accessible at the front of the server and are arranged in two stacked rows of four FMs.
Bottom row from left to right: FM 0, FM 2, FM 4, and FM 6.
Top row from left to right: FM 1, FM 3, FM 5, and FM 7.
Call Out Description
0 FM 0
1 FM 1
2 FM 2
3 FM 3
4 FM 4
5 FM 5
6 FM 6
7 FM 7
The eight fan modules are installed in two fan frames. The left frame contains FM 0, FM 1, FM 2, and FM 3. The right frame contains FM 4, FM 5, FM 6, and FM 7.
Each vertical pair of FMs provides cooling for the corresponding CPU modules (CMODs), which are located directly behind the FMs. For example, FMs 0 and 1 provide cooling for CMODs 0 and 1, and FMs 6 and 7 provide cooling for CMODs 6 and 7.
For CMOD designations, see “CMOD Slot Designations” on page 84.
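The fan numbering described above follows a regular pattern, which the following sketch captures. It is illustrative only: the function names are hypothetical, and the mapping of FMs 2 through 5 to CMODs 2 through 5 is extrapolated from the two examples given in the text (FMs 0 and 1, FMs 6 and 7) rather than stated outright.

```python
def fan_position(fm):
    """Return (row, column) for fan module number fm (0-7).

    Columns are counted 0-3 from the left, as seen from the front of the server.
    """
    if not 0 <= fm <= 7:
        raise ValueError("FM number must be in the range 0-7")
    row = "bottom" if fm % 2 == 0 else "top"
    return row, fm // 2


def fan_frame(fm):
    """Return which fan frame holds the fan module: 'left' (FM 0-3) or 'right' (FM 4-7)."""
    return "left" if fan_position(fm)[1] < 2 else "right"


def cooled_cmods(fm):
    """Return the CMOD pair behind the fan module's column (assumed mapping)."""
    column = fan_position(fm)[1]
    return (2 * column, 2 * column + 1)
```

For example, FM 5 sits in the top row, third column from the left, in the right-hand fan frame, and under the assumed mapping it cools CMODs 4 and 5.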
Power Supply Slot Designations
The four slots for the power supply units (PSUs) are directly accessible at the front of the server and are arranged in a single stacked row. They are designated from bottom to top as PSU 0, PSU 1, PSU 2, and PSU 3. The following illustration shows the arrangement of the PSUs.
Call Out Description
0 PSU 0
1 PSU 1
2 PSU 2
3 PSU 3
CMOD Slot Designations
CPU module slots are arranged in a single row and are designated from left to right as CMOD 0–CMOD 7. The CMOD slots are accessible from the front of the server by removing the FMs and fan frames.
The server is available with four CMODs or eight CMODs. Four-CMOD systems have CMODs in CMOD 0–CMOD 3, and filler panels in CMOD 4–CMOD 7.
For more information, see “CPU Module (CMOD) Overview” on page 23.
Call Out Description
0 CMOD 0
1 CMOD 1
2 CMOD 2
3 CMOD 3
4 CMOD 4
5 CMOD 5
6 CMOD 6
7 CMOD 7
Memory Slot Designations
Each CMOD contains 24 DIMM slots arranged in four groups of six slots. The following illustration shows the groups and their slot designations.
Call Out Description
1 Slots D12–D17
2 Slots D18–D23
3 Slots D6–D11
4 Slots D0–D5
See Also
“Memory and DIMM Reference” on page 158
Storage Drive Slot Designations
The eight storage drive slots are in the system module (SMOD) and directly accessible at the back of the server. Slots are arranged in two stacked rows of four slots and designated from right to left.
The top row contains slots 0, 2, 4, and 6.
The bottom row contains slots 1, 3, 5, and 7.
Call Out Description
0 Slot 0
1 Slot 1
2 Slot 2
3 Slot 3
4 Slot 4
5 Slot 5
6 Slot 6
7 Slot 7
DPCC and PCIe Card Slot Designations
The eight dual PCIe card carrier (DPCC) slots are arranged in a single row at the back of the server. The slots are designated from right to left as DPCC 0–DPCC 7.
Each DPCC supports two PCIe slots, for a total of 16. The PCIe slots are designated from right to left as PCIe 1–PCIe 16.
DPCC 0 contains PCIe slots 1 and 2
DPCC 1 contains PCIe slots 3 and 4
DPCC 2 contains PCIe slots 5 and 6
DPCC 3 contains PCIe slots 7 and 8
DPCC 4 contains PCIe slots 9 and 10
DPCC 5 contains PCIe slots 11 and 12
DPCC 6 contains PCIe slots 13 and 14
DPCC 7 contains PCIe slots 15 and 16
The following illustration shows the location and designations of the PCIe slots.
Call Out Description Call Out Description
1 PCIe Slot 1 in DPCC 0 9 PCIe Slot 9 in DPCC 4
2 PCIe Slot 2 in DPCC 0 10 PCIe Slot 10 in DPCC 4
3 PCIe Slot 3 in DPCC 1 11 PCIe Slot 11 in DPCC 5
4 PCIe Slot 4 in DPCC 1 12 PCIe Slot 12 in DPCC 5
5 PCIe Slot 5 in DPCC 2 13 PCIe Slot 13 in DPCC 6
6 PCIe Slot 6 in DPCC 2 14 PCIe Slot 14 in DPCC 6
7 PCIe Slot 7 in DPCC 3 15 PCIe Slot 15 in DPCC 7
8 PCIe Slot 8 in DPCC 3 16 PCIe Slot 16 in DPCC 7
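The DPCC-to-PCIe slot assignments listed above follow a fixed arithmetic rule: DPCC n holds PCIe slots 2n+1 and 2n+2. The following sketch expresses that rule in both directions. The function names are hypothetical and are not part of any Oracle tool.

```python
def pcie_slots_in_dpcc(dpcc):
    """Return the two PCIe slot numbers carried by DPCC number dpcc (0-7)."""
    if not 0 <= dpcc <= 7:
        raise ValueError("DPCC number must be in the range 0-7")
    return (2 * dpcc + 1, 2 * dpcc + 2)


def dpcc_for_pcie_slot(slot):
    """Return the DPCC number that carries PCIe slot number slot (1-16)."""
    if not 1 <= slot <= 16:
        raise ValueError("PCIe slot number must be in the range 1-16")
    return (slot - 1) // 2
```

For example, PCIe slot 9 is in DPCC 4, which matches the call-out table.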
AC Input Power Block
The four AC power inputs at the back of the server are arranged in a stack. Starting at the bottom, they are designated AC 0, AC 1, AC 2, and AC 3. The designations match the corresponding PSUs.
The AC power block is not a removable component.
The following illustration shows the location and designation of the inlets on the AC power block.
Call Out Description
0 AC 0
1 AC 1
2 AC 2
3 AC 3

Component Network Access Control (NAC) Names

Name Description
/SYS System
/SYS/UUID Unique system ID
/SYS/PS[0-3] Power supplies (static FRUID)
/SYS/SMOD/DBP[0/1] Disk backplanes (dynamic FRUID)
/SYS/FM[0-7] Fan modules (No FRUID)
/SYS/HDD[0-7] Hard disk drives
/SYS/SMOD/MB System module (SMOD) (dynamic FRUID)
/SYS/SMOD/MB/NET[0/1] System host network interfaces (static FRUID)
/SYS/SMOD/MB/CPLD CPLD on SMOD
/SYS/SMOD/MB/SP Service processor (SP) module (dynamic FRUID)
/SYS/SPNET[0/1] SP network interfaces
/SYS/CMOD[0-7] CPU modules (dynamic FRUID)
/SYS/CMOD[0-7]/P[0-7] Processors (CPUs) on CMOD (static FRUID)
/SYS/CMOD[0-7]/P[0-7]/P[0-23] DIMMs on CMOD MB (dynamic FRUID)
/SYS/CMOD[0-7]/CPLD CPLDs on CMODs
/SYS/BIOS System BIOS
/SYS/DPCC[0-7] Dual PCIe card carriers (DPCCs)
/SYS/DPCC[0-7]/PCIE[1-16] PCIe cards
/SYS/FIM Front indicator module
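The bracket shorthand in the table can be expanded mechanically. The sketch below assumes that [0-3] denotes a range of indexes, that [0/1] denotes the two listed alternatives, and that each index is appended directly to the preceding name (for example, /SYS/PS0). The function name expand_nac is hypothetical. Entries with more than one bracket are expanded as a plain cross product, which can produce names that do not exist on the server (for example, not every DPCC carries every PCIe slot number), so treat the result for those entries as an upper bound.

```python
import itertools
import re

# Matches bracket shorthand such as [0-3] (range) or [0/1] (alternatives).
_BRACKET = re.compile(r"\[(\d+)([-/])(\d+)\]")


def expand_nac(pattern):
    """Expand bracket shorthand such as /SYS/PS[0-3] into concrete names."""
    parts = []  # alternating literal text and lists of index strings
    pos = 0
    for match in _BRACKET.finditer(pattern):
        parts.append([pattern[pos:match.start()]])
        first, sep, last = int(match.group(1)), match.group(2), int(match.group(3))
        values = range(first, last + 1) if sep == "-" else (first, last)
        parts.append([str(v) for v in values])
        pos = match.end()
    parts.append([pattern[pos:]])
    return ["".join(combo) for combo in itertools.product(*parts)]
```

For example, expand_nac("/SYS/SMOD/MB/NET[0/1]") yields the two host network interface names, and a name without brackets, such as /SYS/FIM, is returned unchanged.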
Performing Electrostatic Discharge and Static Prevention Measures
Electrostatic discharge (ESD) sensitive devices, such as the PCIe cards, hard drives, CPUs, and memory cards, require special handling.

Using an Anti-static Wrist Strap

Wear an anti-static wrist strap when handling components such as disk drive assemblies, circuit boards, or PCIe cards. When servicing or removing server components, attach an anti-static strap to your wrist and then to a metal area on the server chassis. If your wrist strap is equipped with a banana connector, insert it into the grounding socket on the right-hand side of the chassis front panel.
Following this practice equalizes the electrical potentials between you and the server.
Note - An anti-static wrist strap is not shipped with the servers. However, anti-static wrist straps are included with customer-replaceable units (CRUs), field-replaceable units (FRUs), and optional components.

Using an Anti-static Mat

In addition to wearing an anti-static wrist strap when handling components, create an ESD-free work area by using an anti-static mat as a work surface and as a place to set ESD-sensitive components such as printed circuit boards, DIMMs, and CPUs. You can use the following items as anti-static mats:
Anti-static bag used to wrap a replacement part
ESD mat (orderable from Oracle)
A disposable ESD mat (shipped with some optional system components)

Tools and Equipment

Most server component removal and installation procedures can be performed without tools. However, to service the system, you need the following:
ESD mat and grounding strap
Anti-static wrist strap
You might also need:
No. 2 Phillips screwdriver
A system console device, such as one of the following:
PC or workstation with RS-232 serial port
ASCII terminal
Terminal server
Patch panel connected to a terminal server

Component Filler Panels and Non-Powered Components

A filler panel is a metal or plastic enclosure that does not contain any functioning system hardware or cable connectors. Filler panels occupy vacant component slots to help control noise, EMI, and airflow. They are installed at the factory and must remain in the server until you replace them with a component. If you remove a filler panel and continue to operate your system with an empty slot, the server might overheat due to improper airflow. Additionally, some components are installed but are not powered (for example, DPCCs and fan modules). As with filler panels, these components must remain installed in a fully powered-on server.

Clear Hardware Fault Messages

After servicing the following components, you must clear the fault event in Oracle ILOM:
PCIe card
HBA
Front Indicator Module (FIM)
Processor (CPU)
Use the Oracle ILOM CLI to access the Fault Management shell, fmadm. For details, see http://www.oracle.com/goto/ILOM/docs.
Before You Begin
This procedure uses the Oracle ILOM CLI interface.
1.
Open an SSH session and, at the command line, log in to the SP Oracle ILOM CLI.
Log in as a user with root or administrator privileges. For example:
ssh root@ipaddress
where ipaddress is the IP address of the server SP.
For more information, see “Accessing Oracle ILOM” in Oracle X5 Series Servers Administration Guide.
The Oracle ILOM CLI prompt appears:
->
2.
To access fmadm, type:
start /SP/faultmgmt/shell
The fmadm prompt appears:
faultmgmtsp>
3.
To get a listing of command options for displaying or clearing a fault with fmadm, type:
help fmadm
The following output appears:
where <subcommand> is one of the following:
faulty [-asv] [-u <uuid>] : display list of faulty resources
faulty -f [-a] : display faulty FRUs
faulty -r [-a] : display faulty FRUs (summary)
acquit <FRU> : acquit faults on a FRU
acquit <UUID> : acquit faults associated with UUID
acquit <FRU> <UUID> : acquit faults specified by (FRU, UUID) combination
replaced <FRU> : replaced faults on a FRU
repaired <FRU> : repaired faults on a FRU
repair <FRU> : repair faults on a FRU
rotate errlog : rotate error log
rotate fltlog : rotate fault log
4.
Use fmadm faulty and the following options to display active faulty components:
-a – Show active faulty components.
-f – Show active faulty FRUs.
-r – Show active faulty FRUs and their fault management states.
-s – Show a one-line fault summary for each fault event.
-u uuid – Show fault diagnosis events that match a specific universal unique identifier (uuid).
For command specifics, see the Oracle ILOM documentation at: http://www.oracle.com/goto/ILOM/docs
5.
Use fmadm to clear the fault.
Clear the fault with the fmadm subcommand that fits the service action: acquit, repair, replaced, or repaired.
6.
Close the Oracle ILOM session.
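The subcommand forms shown in the help fmadm output can be checked before they are typed at the faultmgmtsp> prompt. The following sketch only builds the command string; it does not connect to Oracle ILOM. The names CLEAR_SUBCOMMANDS and fmadm_clear_command are hypothetical, and passing a NAC name such as /SYS/FIM as the FRU argument is an assumption made for illustration.

```python
# Clearing subcommands from the `help fmadm` output, with the argument
# kinds each one accepts according to that output.
CLEAR_SUBCOMMANDS = {
    "acquit": {"fru", "uuid"},
    "replaced": {"fru"},
    "repaired": {"fru"},
    "repair": {"fru"},
}


def fmadm_clear_command(subcommand, fru=None, uuid=None):
    """Build an fmadm command line for clearing a fault, or raise ValueError."""
    allowed = CLEAR_SUBCOMMANDS.get(subcommand)
    if allowed is None:
        raise ValueError("unknown fmadm clearing subcommand: " + subcommand)
    if fru is None and uuid is None:
        raise ValueError("a FRU name or a UUID is required")
    if fru is None and "uuid" in allowed:
        args = [uuid]                      # acquit <UUID>
    elif uuid is not None and "uuid" not in allowed:
        raise ValueError(subcommand + " accepts a FRU name only")
    else:
        # <FRU> alone, or the (FRU, UUID) combination for acquit.
        args = [a for a in (fru, uuid) if a is not None]
    return " ".join(["fmadm", subcommand] + args)
```

For example, fmadm_clear_command("acquit", fru="/SYS/FIM") returns the string fmadm acquit /SYS/FIM, while requesting repaired with only a UUID is rejected because the help output lists that subcommand with a FRU argument only.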

Preparing for Service

This section includes preliminary information and procedures that assist you with preparing to service the server. The following table describes the contents of this section.
Section Description Link
Setting up for hot service. “Prepare the Server for Hot Service” on page 95
Setting up for warm service. “Prepare the Server for Warm Service” on page 96
Setting up for cold service. “Prepare the Server for Cold Service” on page 98
Server power-off options. “Powering Off the Server” on page 100
Methods for activating and deactivating the server Locator indicator. “Managing the Locator Indicator” on page 108

Prepare the Server for Hot Service

Note - The steps in this remote procedure use the Oracle ILOM web interface. However, the procedure can also be performed remotely using the Oracle ILOM CLI interface. For more information, refer to the Oracle ILOM documentation.
Before You Begin
A hot-service component can be serviced while the server is operating at full-power mode. For more information about component serviceability, see “Component Serviceability Requirements” on page 79.
This procedure describes how to prepare the server to remove, replace, or install the following hot-service components:
Fan modules
Power supplies
Storage drives
Dual PCIe Card Carriers (DPCCs)
Important: Review the Oracle Server X5-8 Product Notes for hardware-related information before performing removal and installation procedures.
1.
Log in to the service processor Oracle ILOM web interface.
Direct a web browser to Oracle ILOM using the IP address of the server SP and log in as root or as a user with administrator privileges. See Oracle X5 Series Servers Administration Guide.
The Summary screen appears.
2.
In the Actions section of the Summary screen, click the Locator Indicator Turn On button.
This action activates the Locator indicator on the server front panel. For other options, see “Managing the Locator Indicator” on page 108.
3.
Once at the service location, press the Locator button to deactivate the indicator. For more information, see “Control the Locator Indicator Locally” on page 110.
4.
Set up an ESD-safe space at the service location.
Set up a space where you can set components. The space needs to be ESD safe. See “Performing Electrostatic Discharge and Static Prevention Measures” on page 90.
Next Steps
“Servicing Fan Modules and Fan Frames” on page 117
“Servicing Power Supply Units (PSUs)” on page 126
“Servicing Storage Drives” on page 180
“Servicing PCIe Cards and the Dual PCIe Card Carriers (DPCCs)” on page 185
Prepare the Server for Warm Service
This procedure describes how to prepare the server for warm service, so you can remove and replace CMODs, DIMMs, and processors and heatsinks without disconnecting the power cords or shutting down Oracle ILOM.
When Oracle ILOM detects that two fan modules in a single cooling zone (a vertical column) have been removed, it removes power from the CMODs, allowing you to service CMODs and their subcomponents without removing the power cords. Oracle ILOM remains available in warm service mode.
This procedure uses a combination of the Oracle ILOM web and CLI interfaces. However, the procedure can be performed using only the Oracle ILOM CLI interface (for more information, refer to the Oracle ILOM documentation).
For more information about component serviceability, see “Component Serviceability Requirements” on page 79.
Caution - Loss of service or component damage. Do not replace any components except for CMODs and their subcomponents while the server is in warm service mode.
Caution - Data Loss. Do not remove more than one fan module from a column while the system is in full power mode. This action removes power from the CMODs and causes an immediate shutdown. On an eight-CMOD system, this applies to all fan modules. On a four-CMOD system, this applies to the fan modules in the left-hand fan frame.
Before You Begin
Important: Review the Oracle Server X5-8 Product Notes for hardware-related information before performing removal and installation procedures.
1.
To power down the host and activate the front panel Locator indicator, do the following:
a.
Log in to the Oracle ILOM web interface.
Direct a web browser to Oracle ILOM using the IP address of the server SP and log in as root or as a user with administrator privileges. See “Accessing Oracle ILOM” in Oracle X5 Series Servers Administration Guide.
b.
In the Actions section of the Summary screen, click the Power State Turn Off button.
This action powers off the server to standby power mode. For more power off options, see “Powering Off the Server” on page 100.
c.
In the Actions section of the Summary screen, click the Locator Indicator Turn On button.
This action activates the Locator indicator on the server front and back panel. For other options, see “Managing the Locator Indicator” on page 108.
2.
When at the server, set up an ESD-safe service space where you can place components.
See “Performing Electrostatic Discharge and Static Prevention Measures” on page 90.
3.
Press the Locator indicator button to deactivate the indicator. For more information, see “Control the Locator Indicator Locally” on page 110.
4.
Begin the CMOD removal procedures. For details, see “Servicing the CPU Module (CMOD) Components” on page 136.
The server transitions to warm service mode by removing power from the CMODs when it senses one of the following events:
On an eight-CMOD system, when both fans in a single column are removed.
On a four-CMOD system, when both fans in a single column are removed from the left-hand fan frame (CMODs 0 through 3), or when a CMOD is inserted into an unoccupied CMOD slot (4 through 7).
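The fan-removal conditions above can be sketched as a check on which fan modules are out of the chassis. This is an illustration of the stated rule, not Oracle ILOM logic. The function name removes_cmod_power is hypothetical, the sketch assumes that column c holds FM 2c and FM 2c+1 (as described in “Fan Module Designations”), and it does not model the CMOD-insertion case on a four-CMOD system.

```python
def removes_cmod_power(removed_fms, cmod_count):
    """Return True if the removed fan modules would trigger warm service mode.

    removed_fms: iterable of fan module numbers (0-7) that have been removed.
    cmod_count:  4 or 8, the number of CMODs installed in the server.
    """
    if cmod_count not in (4, 8):
        raise ValueError("the server is available with four or eight CMODs")
    removed = set(removed_fms)
    # On a four-CMOD system only the left-hand fan frame (FM 0-3) counts,
    # that is, columns 0 and 1. On an eight-CMOD system all four columns count.
    columns = range(2) if cmod_count == 4 else range(4)
    return any({2 * c, 2 * c + 1} <= removed for c in columns)
```

For example, removing FM 0 and FM 2 (one fan from each of two columns) does not trigger the transition, while removing FM 0 and FM 1 (both fans in one column) does.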
Next Steps
“Servicing the Server” on page 79
“Servicing Components” on page 113
Prepare the Server for Cold Service
Note - This procedure uses a combination of the Oracle ILOM web and CLI interfaces. However, the procedure can be performed using only the Oracle ILOM CLI interface (for more information, refer to the Oracle ILOM documentation).
A cold-service component must be serviced when the server is completely powered off. For more information about component serviceability, see “Component Serviceability Requirements” on page 79.
This procedure describes how to prepare the server for service, so you can:
Remove, replace, or install cold-serviceable components.
Use the motherboard processor and DIMM fault remind circuitry.
Access internal components, such as the internal USB drives.
Before You Begin
Important: Review the Oracle Server X5-8 Product Notes for hardware-related information before performing removal and installation procedures.
1.
To power down the server and activate the front panel Locator indicator, do the following:
a.
Log in to the Oracle ILOM web interface.
Direct a web browser to Oracle ILOM using the IP address of the server SP and log in as root or as a user with administrator privileges. See “Accessing Oracle ILOM” in Oracle X5 Series Servers Administration Guide.
b.
In the Actions section of the Summary screen, click the Power State Turn Off button.
This action powers off the server to standby power mode. For more power off options, see “Powering Off the Server” on page 100.
c.
In the Actions section of the Summary screen, click the Locator Indicator Turn On button.
This action activates the Locator indicator on the server front and back panel. For other options, see “Managing the Locator Indicator” on page 108.
2.
When at the server, set up an ESD-safe service space.
Set up a space where you can place components. The space needs to be ESD safe. See “Performing Electrostatic Discharge and Static Prevention Measures” on page 90.
3.
Disconnect the server power cords.
Caution - Data loss. Removing the power cords when the server is in full power mode results in an immediate shutdown of the server. Do not remove the power cords if the server is in full power mode. Power off the server to standby power mode first.
4.
If necessary, label and disconnect any other cables attached to the server back panel.
If you plan to remove a component that has cables attached to it (SMOD, DPCC), label the port or slot to which the cable is attached and remove the cable.
Next Steps
“Servicing the Server” on page 79
“Servicing Components” on page 113
Powering Off the Server
This section contains information and procedures related to power modes and power off options, including complete power removal:
“Power Off the Server Using the Server OS” on page 101