This software and related documentation are provided under a license agreement containing restrictions on use and disclosure and are protected by intellectual property laws. Except
as expressly permitted in your license agreement or allowed by law, you may not use, copy, reproduce, translate, broadcast, modify, license, transmit, distribute, exhibit, perform,
publish, or display any part, in any form, or by any means. Reverse engineering, disassembly, or decompilation of this software, unless required by law for interoperability, is
prohibited.
The information contained herein is subject to change without notice and is not warranted to be error-free. If you find any errors, please report them to us in writing.
If this is software or related documentation that is delivered to the U.S. Government or anyone licensing it on behalf of the U.S. Government, then the following notice is applicable:
U.S. GOVERNMENT END USERS: Oracle programs, including any operating system, integrated software, any programs installed on the hardware, and/or documentation,
delivered to U.S. Government end users are "commercial computer software" pursuant to the applicable Federal Acquisition Regulation and agency-specific supplemental
regulations. As such, use, duplication, disclosure, modification, and adaptation of the programs, including any operating system, integrated software, any programs installed on the
hardware, and/or documentation, shall be subject to license terms and license restrictions applicable to the programs. No other rights are granted to the U.S. Government.
This software or hardware is developed for general use in a variety of information management applications. It is not developed or intended for use in any inherently dangerous
applications, including applications that may create a risk of personal injury. If you use this software or hardware in dangerous applications, then you shall be responsible to take all
appropriate fail-safe, backup, redundancy, and other measures to ensure its safe use. Oracle Corporation and its affiliates disclaim any liability for any damages caused by use of this
software or hardware in dangerous applications.
Oracle and Java are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of their respective owners.
Intel and Intel Xeon are trademarks or registered trademarks of Intel Corporation. All SPARC trademarks are used under license and are trademarks or registered trademarks of
SPARC International, Inc. AMD, Opteron, the AMD logo, and the AMD Opteron logo are trademarks or registered trademarks of Advanced Micro Devices. UNIX is a registered
trademark of The Open Group.
This software or hardware and documentation may provide access to or information about content, products, and services from third parties. Oracle Corporation and its affiliates are
not responsible for and expressly disclaim all warranties of any kind with respect to third-party content, products, and services unless otherwise set forth in an applicable agreement
between you and Oracle. Oracle Corporation and its affiliates will not be responsible for any loss, costs, or damages incurred due to your access to or use of third-party content,
products, or services, except as set forth in an applicable agreement between you and Oracle.
Access to Oracle Support
Oracle customers that have purchased support have access to electronic support through My Oracle Support. For information, visit
http://www.oracle.com/pls/topic/lookup?ctx=acc&id=info or visit http://www.oracle.com/pls/topic/lookup?ctx=acc&id=trs if you are hearing impaired.
Contents
Using This Documentation .......... 11
About This Document .......... 13
Service Notes .......... 15
Boot Screen in Legacy Boot Mode .......... 272
Index .......... 275
Oracle Server X5-8 Service Manual • December 2015
Using This Documentation
This section describes how to get the latest firmware, software, and documentation for the
Oracle Server X5-8. It also provides feedback links and a document change history.
■ “Oracle Server X5-8 Model Naming Convention” on page 11
■ “Getting the Latest Firmware and Software” on page 11
■ “Documentation and Feedback” on page 12
■ “Contributors” on page 12
■ “Change History” on page 12
The information in this documentation set is presented in topic-based format (similar to online
help) and therefore does not include chapters, appendixes, or section numbering.
Oracle Server X5-8 Model Naming Convention
The Oracle Server X5-8 name identifies the following:
■ X identifies an x86 product.
■ The first number, 5, identifies the generation of the server.
■ The second number, 8, identifies the maximum number of processors.
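The naming convention above can be sketched as a small parser. This is only an illustration: the function name parse_model_name, the ModelName type, and the assumed "letter, number, hyphen, number" pattern are hypothetical and are not part of any Oracle tool. Only the letter X is documented here, so any other letter is reported as unknown.

```python
import re
from typing import NamedTuple


class ModelName(NamedTuple):
    """Fields encoded in a model name such as "X5-8" (hypothetical type)."""
    platform: str
    generation: int
    max_processors: int


# Assumed pattern: one capital letter, generation digits, hyphen, processor digits.
_MODEL_PATTERN = re.compile(r"^([A-Z])(\d+)-(\d+)$")


def parse_model_name(name: str) -> ModelName:
    """Split a model name into platform, generation, and maximum processors."""
    match = _MODEL_PATTERN.match(name.strip())
    if match is None:
        raise ValueError(f"unrecognized model name: {name!r}")
    letter, generation, processors = match.groups()
    # Only "X" (x86) is described in this document; other letters are unknown.
    platform = "x86" if letter == "X" else "unknown"
    return ModelName(platform, int(generation), int(processors))


if __name__ == "__main__":
    print(parse_model_name("X5-8"))
```

For "X5-8", the parser reports an x86 platform, generation 5, and a maximum of 8 processors.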
Getting the Latest Firmware and Software
Firmware, drivers, and other hardware-related software for each Oracle x86 server are updated
periodically.
You can obtain the latest version in the following ways:
■ Oracle System Assistant: This is a factory-installed option for Oracle x86 servers. It has all
the tools and drivers you need and resides on a USB drive installed in most servers.
■ My Oracle Support: You can download updates from https://support.oracle.com
Documentation and Feedback
Documentation | Link
All Oracle products | http://docs.oracle.com
Oracle Server X5-8 | http://www.oracle.com/goto/X5-8/docs-videos
Oracle Integrated Lights Out Manager (ILOM). Refer to the Oracle ILOM documentation. | http://www.oracle.com/goto/ILOM/docs
Oracle Hardware Management Pack. Refer to the documentation for your supported version of Oracle HMP as listed in the Product Notes. | http://www.oracle.com/goto/ohmp/docs
Provide feedback on this documentation at: http://www.oracle.com/goto/docfeedback
Contributors
Primary Authors: Michael Bechler, Cynthia Chin-Lee, Mark McGothigan.
Contributors: William Schweickert, Anthony Villamor, Mick Tabor, Richard Masoner, Ray
Angelo, Tamra Smith-Wasel, Denise Silverman.
Change History
The following lists the release history of this documentation set:
■ December 2015. Technical updates.
■ September 2015. Editorial improvements.
■ July 2015. Initial publication.
About This Document
This document provides service, maintenance, and component replacement procedures for the
Oracle Server X5-8.
The following table describes the major sections of this document.
Section Description | Link
Important service information | “Service Notes” on page 15
Server component and subsystem overviews | “Server and Components Overview” on page 17
Troubleshooting procedures and information | “Troubleshooting and Diagnostics” on page 53
General information and procedures for servicing the server | “Servicing the Server” on page 79
Information and procedures for preparing the server for service | “Preparing for Service” on page 95
Procedures and information for removing and installing components | “Servicing Components” on page 113
Procedures and information for returning the server to operation after performing service procedures | “Returning the Server to Operation” on page 235
Server BIOS Setup Utility information and screen captures | “BIOS Setup Utility” on page 237
Service Notes
This section contains preliminary service information:
Intended Audience
This guide is intended for trained technicians and authorized service personnel who have been
instructed on the hazards within the equipment and are qualified to replace and install hardware.
Warning Label
The following warning label is visible from the front of the server when you remove a fan
module. It warns you to not insert your hands or any object into the space left vacant by the
removal of the fan module. Fan modules are hot-swap components. Removing a fan module
from a fully-powered server exposes open and active power connectors that can cause electric
shock.
Server and Components Overview
This section describes the server and its subsystems. It includes:
Section Description | Link
List of server features | “Server Overview” on page 18
Chassis front, internal, and back components | “Chassis Overview” on page 19
Features and components of the CPU module (CMOD) | “CPU Module (CMOD) Overview” on page 23
Features and components of the system module (SMOD) | “System Module (SMOD) Overview” on page 28
Server subsystems, their functions, and related components | “Server Subsystems” on page 34
Schematic-type block diagram of the server interconnects | “Server Block Diagram” on page 50
Server Overview
The Oracle Server X5-8 is a 5 rack-unit (RU) server with the following features:
■ Four- and eight-socket configurations that use Intel EX Xeon® E7-8895 v3 processors for a
total of 72 or 144 cores.
■ Maximum memory: 3 TB (four-socket) and 6 TB (eight-socket) of DDR3 1333 memory.
■ Eight backside-accessible SAS3 or SATA storage drive bays.
■ Expandable IO: eight 16-lane and eight 8-lane PCIe Gen3 slots and one 4-lane PCIe Gen2
HBA slot.
■ One Emulex Pilot 3 service processor (SP) with 256 MB DDR3 memory, 256 MB of flash
memory, and Oracle ILOM.
■ Four (N+N) hot-swap power supplies (PSUs).
■ Eight hot-swap redundant 100 watt cooling fan modules.
Note - For server specification information, see “Server Specifications” in Oracle Server X5-8
Installation Guide.
The following sections provide overviews of the main server components:
Component | Link
Chassis | “Chassis Overview” on page 19
CMODs | “CPU Module (CMOD) Overview” on page 23
SMOD | “System Module (SMOD) Overview” on page 28
Chassis Overview
The chassis consists of the front accessible components, internal components, and components
accessible from the back of the server:
■ “Chassis Front Side Components” on page 19
■ “Chassis Internal Components” on page 20
■ “Chassis Backside Components” on page 22
Chassis Front Side Components
The following figure shows the front side components:
The front-side components include:
Call Out | Component | Link
1 | Front indicator module (FIM) | “Controls and Indicators” on page 34
2 | Four power supplies | “Power Subsystem” on page 50
3 and 4 | Eight fan modules (FMs) in two fan frames | “Chassis Cooling Subsystem” on page 47
5 | Two internal CMOD bays | “CPU Module (CMOD) Overview” on page 23
Chassis Internal Components
The following figure shows the chassis internal components:
The chassis internal components include:
Call Out | Component | Description
1 | CPU module (CMOD) bays | CMOD bays can support either four or eight CMODs. Servicing CMODs requires warm or cold service. For information about the CMODs, see “CPU Module (CMOD) Overview” on page 23.
2 | Midplane/busbar | The mid-plane assembly provides an interconnect between the backside components and the front-side components. This component requires cold service.
Chassis Backside Components
The following figure shows the chassis backside components:
The chassis backside components include:
Call Out | Component | Description
1 | System Module (SMOD) | The SMOD has internal components that can only be accessed by removing it from the backside of the server. For more information, see “System Module (SMOD) Overview” on page 28.
2 | Dual PCIe card carrier (DPCC) bay | The DPCC bay contains eight DPCCs and up to 16 PCIe cards. For more information, see “Storage and IO Subsystem” on page 45.
3 | AC power block | The AC power block has four AC power inlet connectors. The power block is not a removable component. For more information, see “Power Subsystem” on page 50.
CPU Module (CMOD) Overview
CPU modules (CMODs) contain the processors (CPUs) and the system memory, and supply
power to the fan modules and the DPCCs.
CMODs are internal warm or cold-service components. To access the CMODs, you must
remove the fan modules and the fan frames.
The following sections describe the CMOD configuration options and the internal layout of
components:
■ “Processor and Memory Overview” on page 23
■ “CMOD Configuration Options” on page 23
■ “CMOD Layout” on page 27
Processor and Memory Overview
Each CMOD contains one Intel Xeon® E7-8895 v3 (18-core 2.6 GHz) processor.
The maximum system memory with DDR3 1333 32 GB DIMMs is:
■ Four CMODs: 3 TB
■ Eight CMODs: 6 TB
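The memory and core totals follow from the per-CMOD figures given in this document: one 18-core processor and 24 DIMM slots per CMOD, populated with 32 GB DIMMs. The following sketch reproduces the arithmetic. The function names are hypothetical, and the conversion assumes 1 TB = 1024 GB.

```python
# Per-CMOD figures taken from this document.
DIMM_SLOTS_PER_CMOD = 24   # "24 DIMM slots arranged in four groups of six"
DIMM_SIZE_GB = 32          # DDR3 1333 32 GB DIMMs
CORES_PER_PROCESSOR = 18   # one 18-core processor per CMOD

GB_PER_TB = 1024           # assumption: binary terabytes


def max_memory_tb(cmod_count: int) -> float:
    """Maximum system memory in TB for a given number of CMODs."""
    total_gb = cmod_count * DIMM_SLOTS_PER_CMOD * DIMM_SIZE_GB
    return total_gb / GB_PER_TB


def total_cores(cmod_count: int) -> int:
    """Total processor cores for a given number of CMODs."""
    return cmod_count * CORES_PER_PROCESSOR


if __name__ == "__main__":
    for cmods in (4, 8):
        print(cmods, "CMODs:", max_memory_tb(cmods), "TB,", total_cores(cmods), "cores")
```

Each CMOD contributes 24 x 32 GB = 768 GB, so four CMODs give 3 TB and 72 cores, and eight CMODs give 6 TB and 144 cores.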
CMOD Configuration Options
The server supports four- and eight-CMOD configurations. In the four-CMOD configuration,
the first four slots on the left (slots 0-3) are occupied and the four slots on the right (4-7) are
unoccupied.
The following illustration shows a server with a four-CMOD configuration. In the illustration,
the left-side fan modules and fan frame have been removed to show the four CMODs. Call out
1 identifies the group of four CMODs.
The following illustration shows a server with a four-CMOD configuration with all eight fan
modules and both fan frames removed, exposing the empty CMOD bay on the right. In a
four-CMOD configuration, the four right-side fan modules are not powered, but they must
still be installed. Call out 1 identifies the group of four CMODs installed on the left
side of the server. Call out 2 identifies the empty CMOD bay on the right side of the server.
The following illustration shows a server with an eight (full) CMOD configuration. Call out
1 identifies the group of four CMODs installed on the left side of the server, and call out 2
identifies the second group of four CMODs installed on the right side of the server.
In both CMOD configurations, the system includes four power supplies, eight fan modules,
and eight DPCCs. However, fan modules and DPCCs receive power from the CMODs, so in a
four-CMOD configuration, only fan modules 0-3 and DPCCs 0-3 are active. Fan modules 4-7 and
DPCCs 4-7 are not powered and not active.
CMOD Population Rules
The Oracle Server X5-8 supports four and eight CMOD configurations. Each CMOD supports a
single socket containing a single Intel EX Xeon E7-8895 v3 processor.
For the four-socket server configuration:
■ CPU modules (CMODs) must be installed in slots 0-3.
■ DPCC slots 0-3 are active; however, DPCCs 4-7 must also be installed.
■ Both fan frames must be installed.
■ All eight fan modules (FMs) must be installed, but only FMs 0-3 are active.
For the eight-socket server configuration:
■ CMODs must be installed in slots 0-7.
■ DPCC slots 0-7 are active.
■ Both fan frames must be installed.
■ All eight fan modules (FMs) must be installed and all FMs are active.
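The population rules above can be expressed as a simple check. The following is a minimal sketch for illustration only: the function names check_population and active_slots and their parameters are hypothetical, and the code does not query real hardware or Oracle ILOM.

```python
from typing import FrozenSet, Iterable, List

FOUR_SOCKET_SLOTS = frozenset(range(4))   # slots 0-3
ALL_SLOTS = frozenset(range(8))           # slots 0-7


def active_slots(cmod_slots: Iterable[int]) -> FrozenSet[int]:
    """Return the DPCC and fan module slot numbers that are active.

    Raises ValueError if the CMODs do not occupy slots 0-3 or 0-7.
    """
    occupied = frozenset(cmod_slots)
    if occupied in (FOUR_SOCKET_SLOTS, ALL_SLOTS):
        # Active DPCCs and FMs have the same slot numbers as the CMODs.
        return occupied
    raise ValueError("CMODs must occupy slots 0-3 or slots 0-7")


def check_population(
    cmod_slots: Iterable[int],
    dpcc_slots: Iterable[int],
    fan_modules: Iterable[int],
    fan_frames: int,
) -> List[str]:
    """Return a list of rule violations (empty if the population is valid)."""
    problems = []
    if frozenset(cmod_slots) not in (FOUR_SOCKET_SLOTS, ALL_SLOTS):
        problems.append("CMODs must occupy slots 0-3 or slots 0-7")
    if frozenset(dpcc_slots) != ALL_SLOTS:
        problems.append("all eight DPCCs must be installed")
    if frozenset(fan_modules) != ALL_SLOTS:
        problems.append("all eight fan modules must be installed")
    if fan_frames != 2:
        problems.append("both fan frames must be installed")
    return problems


if __name__ == "__main__":
    # A valid four-socket population: no violations are reported.
    print(check_population(range(4), range(8), range(8), 2))
    print(sorted(active_slots(range(4))))
```

The check treats the installed and active sets separately because, in the four-socket configuration, DPCCs 4-7 and FMs 4-7 must be present even though they are not powered.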
CMOD Layout
Each CMOD contains the following components:
■ Heatsink and processor assembly
■ 24 DIMM slots arranged in four groups of six
■ DIMM test circuit, which helps you locate failed DIMMs and verify a failed CPU
■ Fault Remind button
■ Circuit Charge Status indicator
■ 24 DIMM slot fault indicators
■ CPU fault indicator
The following illustration shows the location of the CMOD components.
Call Out | Description
1 | Fault Remind button
2 | Circuit Charge Status indicator
3 | DIMM slots (24, four banks of six each)
4 | DIMM slot fault indicators (24, one for each slot)
5 | Heatsink and CPU assembly
6 | CPU fault indicator
For component serviceability, locations, and designations, see “Component Locations, and
Designations” on page 79.
CMOD and Fan Module Power
Fan modules (FMs) get power from CMODs. However, only CMODs in even-numbered slots
supply power to fan modules. The following table shows which CMOD slots provide FM
power.
Power Slots | Fan Modules Powered
CMOD 0 | FMs 0 and 1
CMOD 2 | FMs 2 and 3
CMOD 4 | FMs 4 and 5
CMOD 6 | FMs 6 and 7
CMODs in slots 1, 3, 5, and 7 do not supply FM power.
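The table above amounts to a simple rule: each fan module is powered by the even-numbered CMOD slot at or just below its own number. The following sketch illustrates that mapping. The function names are hypothetical and the code only models the table; it does not read any hardware state.

```python
from typing import Iterable, Set

FM_COUNT = 8  # fan modules 0-7


def powering_cmod(fm: int) -> int:
    """Return the CMOD slot that supplies power to fan module `fm`."""
    if not 0 <= fm < FM_COUNT:
        raise ValueError(f"fan module number out of range: {fm}")
    # FMs 0-1 -> CMOD 0, FMs 2-3 -> CMOD 2, FMs 4-5 -> CMOD 4, FMs 6-7 -> CMOD 6.
    return (fm // 2) * 2


def powered_fan_modules(installed_cmods: Iterable[int]) -> Set[int]:
    """Return the fan modules that receive power from the installed CMODs."""
    installed = set(installed_cmods)
    return {fm for fm in range(FM_COUNT) if powering_cmod(fm) in installed}


if __name__ == "__main__":
    # Four-CMOD configuration: CMODs in slots 0-3.
    print(sorted(powered_fan_modules(range(4))))
```

With CMODs in slots 0-3, only FMs 0-3 are powered, which matches the four-CMOD behavior described earlier in this section.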
System Module (SMOD) Overview
This section provides information about the server system module (SMOD) and its components.
It includes:
■ “SMOD Overview” on page 29
■ “Storage Drives” on page 30
■ “SMOD Motherboard” on page 31
■ “Service Processor (SP)” on page 31
■ “Storage Drive Backplanes” on page 32
■ “SAS Host Bus Adapter (HBA) Card, Riser, and Cables” on page 32
■ “Internal USB Ports” on page 34
■ “Energy Storage Module and Cable” on page 34
SMOD Overview
The SMOD components include:
■ Externally accessible:
  ■ Server storage drives (HDD/SSD)
  ■ IO ports and two external USB ports
■ Internally accessible:
  ■ SMOD motherboard
  ■ Service processor (SP)
  ■ Storage drive backplane
  ■ SAS host bus adapter (HBA)
  ■ Internal USB ports (2)
  ■ Energy storage module (ESM)
  ■ Real time clock battery
The SMOD is located at the back of the server. It includes two removal and installation levers
with green lock release tabs.
The following illustration shows the SMOD and the two release levers:
Call Out | Description
1 | Removal and installation levers (2)
2 | SMOD
Storage Drives
In the following illustration, call out 1 shows the location of the eight storage drive slots, which
are arranged in two rows of four each.
For component serviceability, locations, and designations, see “Component Locations, and
Designations” on page 79.
SMOD Motherboard
The SMOD motherboard hosts the service processor (SP), two disk backplanes (for the
externally accessible server storage drives), the system real time clock battery, and an energy
storage module for the HBA. It also has a PCIe riser for the server storage HBA, and two
internal USB ports. The PCIe riser and the internal USB ports are located on the bottom of the
SMOD.
For component serviceability, locations, and designations, see “Component Locations, and
Designations” on page 79.
Service Processor (SP)
The system Emulex Pilot 3 service processor (SP) is located on the SMOD motherboard and is
accessible locally and remotely through management ports on the front of the SMOD. The SP
contains Oracle ILOM, an embedded server management tool. The SP is not removable.
Storage Drive Backplanes
The externally accessible server storage drives on the SMOD connect to two backplanes
mounted on the SMOD motherboard. A SAS cable also connects the backplane to the HBA
card that is installed in a riser slot on the bottom of the SMOD. The backplanes are not
removable or replaceable.
For component serviceability, locations, and designations, see “Component Locations, and
Designations” on page 79.
SAS Host Bus Adapter (HBA) Card, Riser, and Cables
The server requires one internal HBA (Oracle Storage 12 Gb/s RAID HBA, Internal) for the
externally-accessible SAS (or SATA) SMOD server storage drives. The HBA is installed in a
riser slot on the underside of the SMOD motherboard and is connected to the backplanes by two
mini-SAS4I connector cables.
The following illustration shows the HBA card installed on the underside (bottom) of the
SMOD, the two SAS cables that connect the HBA to the server storage backplanes, and the
cable that connects the energy storage module to the HBA.
Call Out | Description
1 | SMOD motherboard
2 | Cable from HBA to ESM
3 | HBA
4 | SAS cables
5 | Backplanes
For component serviceability, locations, and designations, see “Component Locations, and
Designations” on page 79.
Internal USB Ports
The SMOD has two internal USB ports on the underside of the SMOD motherboard next to the
PCIe card riser slot. The ports are designated as P0 and P1.
Unless you have opted out, port P0 has a factory-installed flash drive that contains Oracle
System Assistant, a bootable server set-up, provisioning, and update tool. Port P0 can only be
used to support Oracle System Assistant. It cannot be used to boot an OS or store files unrelated
to Oracle System Assistant.
Energy Storage Module and Cable
The Energy Storage Module (ESM) provides backup power for the HBA. It sits in a holder in
the top center of the SMOD, and has a cable that connects it to the HBA.
Server Subsystems
This section contains overviews of the server subsystems:
■ “Controls and Indicators” on page 34
■ “Server Management Software” on page 43
■ “Storage and IO Subsystem” on page 45
■ “Chassis Cooling Subsystem” on page 47
■ “Power Subsystem” on page 50
Controls and Indicators
The system management subsystem includes the buttons, switches, and indicators on the
front and back of the server, and the embedded server management software, Oracle System
Assistant and Oracle ILOM:
■ “Front Indicator Module (FIM) Panel” on page 35
■ “Power Supply Unit (PSU) Indicators” on page 36
■ “Fan Module (FM) Indicators” on page 37
■ “Storage Drive Unit Indicators” on page 38
■ “Back Indicator Panel” on page 39
■ “Dual PCIe Card Carrier (DPCC) Indicators” on page 40
■ “AC Power Inlet Indicators” on page 41
■ “Switches and Buttons” on page 42
Front Indicator Module (FIM) Panel
The front indicator module (FIM) panel is located at the top left corner of the server (as viewed
from the front of the server). It contains indicators and buttons that allow you to manage the
server and determine its status.
The following illustration shows the buttons and indicators on the FIM.
Call Out | Description | Details
1 | Locator indicator and button | When activated remotely, it helps you find the server in a rack or room of servers. For more information about managing the Locator indicator remotely and locally, see “Managing the Locator Indicator” on page 108.
2 | Service Action Required indicator | When lit, it indicates that a system fault has occurred. Other amber (fault) indicators might also be lit, which can help you isolate the fault to a particular subsystem. For more information about using the Service Action Required and subsystem fault indicators, see “Troubleshooting Indicators” on page 58.
3 | Power OK indicator | Along with the SP indicator (below), it provides the status of the system power. For more information about using the Power and SP indicators to determine power state, see “Troubleshooting Indicators” on page 58.
4 | Power on and off button | Use it to manage power locally, when at the server. The duration of the button press determines the type of power off (graceful or immediate). For more information about using the Power button, see “Powering Off the Server” on page 100 and “Power On the Server” on page 236.
5 | SP OK indicator | Along with the Power OK indicator (above), it provides the status of the system power. For more information about using the Power and SP indicators to determine power state, see “Troubleshooting Indicators” on page 58.
6 | Server over-temperature indicator | When lit, it indicates that a fault has occurred in the cooling subsystem. The system Service Action Required indicator might also be lit. For more information about using the subsystem fault indicators and the system Service Action Required indicator, see “Troubleshooting Indicators” on page 58.
7 | Rear (Back) Service Action Required indicator | When lit, it indicates that a fault has occurred to one of the components on the server backside (SMOD, DPCC, PCIe card, or HBA). Other indicators (status and fault) might also be lit or in non-normal operating condition state (for example, if the backside SP Service Action Required indicator is lit, the system might not be able to boot and the Power OK indicator might not turn on). For more information about using the subsystem indicators and the system Service Action Required indicator, see “Troubleshooting Indicators” on page 58.
8 | CMOD service action required indicators (0-7) | These light if the corresponding CMOD is in a fault state.
Power Supply Unit (PSU) Indicators
Each power supply unit (PSU) has three indicators arranged in a single row from left to right.
Call Out | Description | Function
1 | Service Action Required/Locate (amber) | Lights steady on when the power supply is in a fault state.
2 | Status OK indicator (green) | Lights steady on when the PSU is powered on and in a normal functioning state (in this state, the AC indicator is also lit).
3 | AC OK indicator (green) | Lights steady on when the PSU is connected to a properly rated AC power source.
4 | Release lever | Used to release the power supply from the chassis.
Fan Module (FM) Indicators
Each fan module (FM) has two indicators arranged in a single row from left to right.
The following illustration shows the front of the FM.
Call Out | Description | Function
1 | Service Action Required indicator (amber) | Lights steady on when the FM is in a fault state.
2 | Status OK indicator (green) | Lights steady on when the FM is powered on and functioning properly.
3 | Release button | Press to release the fan module so you can remove it.
Storage Drive Unit Indicators
Storage drives are installed in carriers. Each storage drive carrier has three indicators arranged
in a single stack, from bottom to top.
The following illustration shows the front of the storage drive carrier and the storage drive
indicators.
Call Out | Description | Function
1 | Ready to Remove indicator (blue) | Lights when the storage drive is ready to be removed from the server in response to an action initiated from the server OS.
2 | Service Action Required indicator (amber) | Lights steady on when the drive is in a fault state.
3 | OK indicator (green) | Lights when the storage drive is functioning normally and blinks to show activity.
Note - The storage drive indicators blink at various rates
depending on the activity. For more information on blink
rates, see “Indicator Blink Rates” on page 62.
Back Indicator Panel
The back indicator panel, located on the SMOD, allows you to manage the server and determine its
status. It includes some indicators and buttons not found on the front indicator module (FIM),
including reset switches and indicators for SMOD components.
The following figure shows the back indicator panel:
Call Out | Description | Details
1 | Non-Maskable Interrupt (NMI) button | Service personnel only. Do not press. This button requires a stylus.
2 | Host reset button (recessed) | This button performs an immediate host reboot. This button requires a stylus.
3 | Locator indicator and button | When activated remotely, it helps you find the server. Locally it can be pressed to prove physical presence. For more information about managing the Locator indicator remotely and locally, see “Managing the Locator Indicator” on page 108.
4 | System Service Action Required indicator | When lit, it indicates that a system fault has occurred. Other amber (fault) indicators might also be lit, which can help you isolate the fault to a particular subsystem. For more information about using the Service Action Required and subsystem fault indicators, see “Troubleshooting Indicators” on page 58.
5 | Power OK indicator | Along with the SP indicator (below), it provides the status of the system power. For more information about using the Power and SP indicators to determine power state, see “Troubleshooting Indicators” on page 58.
6 | SP OK indicator | Along with the Power OK indicator (above), it provides the status of the system power. For more information about using the Power and SP indicators to determine power state, see “Troubleshooting Indicators” on page 58.
7 | SP reset button (recessed) | Press to manually reset the service processor if it becomes unresponsive, requires a reset, or fails to boot to standby power. This button requires a stylus.
8 | SMOD Service Action Required indicator | Lights when the SMOD requires service.
9 | HBA Service Action Required indicator | Lights when the HBA requires service.
Dual PCIe Card Carrier (DPCC) Indicators
Each DPCC has two indicator panels, one for each PCIe slot inside the server. Each panel
contains a green OK indicator, an amber Service Action Required indicator, and a recessed
pinhole Attention (ATTN) button.
Call Out | Description
1 | Recessed pinhole button
2 | Service Action Required/Locator indicator
3 | OK indicator
AC Power Inlet Indicators
Each power inlet on the AC power block at the back of the server has a single green OK
indicator that turns on steady only when the power at the connector is sufficient for the power
supply unit. In the following illustration, call out 1 shows the OK indicator for inlet AC 0.
Switches and Buttons
When you are at the server, the following switches and buttons are accessible:
■
Front panel Power button
Allows you to control server power while local to (at) the server. For power off information,
see “Powering Off the Server” on page 100. For power on information, see “Power On
the Server” on page 236.
■
Two Locator indicator buttons (one on the front of the server and one on the back)
The buttons allow you to manage the Locator indicator locally. To deactivate (or
activate) the Locator indicator, press and release the button (see “Managing the Locator
Indicator” on page 108).
■
Service processor (SP) pinhole reset button on the back of the server
The SP reset button allows you to manually reset the SP. Use the reset button if the SP
becomes unresponsive, requires a reset, or fails to boot into standby power mode (activating
the button requires the use of a stylus).
For location information, see “Back Indicator Panel” on page 39.
■
Host pinhole reset button on the back of the server
The Host Reset button allows you to perform an immediate reboot of the server (activating
the button requires the use of a stylus).
For location information, see “Back Indicator Panel” on page 39.
■
NMI pinhole button on the back of the server
The NMI button is used by Service personnel only. Do not press.
For location information, see “Back Indicator Panel” on page 39.
■
CMOD Fault Remind button
Each CMOD has a motherboard-mounted Fault Remind button. The button is part of the
CMOD Fault Remind circuit. The circuit is charged and allows you to identify a failed
DIMM or CPU after the CMOD has been removed from the server.
For button location information, see “CMOD Layout” on page 27.
■
Sixteen (16) recessed ATTN (attention) buttons (two on each DPCC)
The buttons are used to initiate DPCC removal and installation. Before removing a DPCC, use a
stylus to press both ATTN buttons. After installing a DPCC that contains a PCIe card, press
the button again.
For button location information, see “Front Indicator Module (FIM) Panel” on page 35
and “Back Indicator Panel” on page 39.
Server Management Software
The system management software includes:
■
“Service Processor (SP) Oracle ILOM” on page 43
■
“Oracle System Assistant” on page 44
■
“Oracle Hardware Management Pack” on page 44
Service Processor (SP) Oracle ILOM
The server System Module (SMOD) includes an Emulex Pilot 3 service processor (SP) that
runs Oracle ILOM. Oracle ILOM allows you to manage and monitor the server locally or
remotely in full power or standby power modes. Local and remote interface and control
connections to the SP are on the back of the server and include an RJ45 10/100/1000
GigabitEthernet port (remote access) and an RJ45 serial connector and DB15 VGA connector
(local access). For information about Oracle ILOM, including initial release version and update
information, see http://www.oracle.com/goto/ILOM/docs.
Oracle System Assistant
Your server might also come equipped with Oracle System Assistant, a server provisioning
and update tool that assists in initial server setup and OS installation and allows you to easily
manage server updates. A server-specific version of Oracle System Assistant is installed on the
internal SMOD USB slot P0 at the factory.
You can start Oracle System Assistant from the server boot screen or from Oracle ILOM.
With Oracle System Assistant, you can:
■
Get a single server-specific bundle of the latest available BIOS, Oracle ILOM, and
hardware firmware and the latest tools and OS drivers from the Oracle support site.
■
Update OS drivers and component firmware and configure RAID.
■
Install supported operating systems with the latest drivers and supported tools.
■
Configure a subset of Oracle ILOM settings.
■
Save and restore customized BIOS settings or revert the BIOS to the factory defaults.
■
Display system overview and detailed hardware inventory information.
For more information, refer to the Oracle X5 Series Servers Administration Guide at:
http://www.oracle.com/goto/x86AdminDiag/docs
Oracle Hardware Management Pack
Oracle Hardware Management Pack (HMP) provides a family of command-line interface (CLI)
tools for managing your servers, and an SNMP monitoring agent.
■
You can use the Oracle Server CLI tools to configure Oracle servers. The CLI tools work
with Oracle Solaris, Oracle Linux, Oracle VM, other variants of Linux, and Windows
operating systems. They can be scripted to support multiple servers, as long as the servers
are of the same type.
■
With the Hardware Management Agent SNMP Plugins, you can use SNMP to monitor
Oracle servers from the operating system using a single host IP address. This prevents you
from having to connect to two management points (Oracle ILOM and the host).
The Hardware Management Agent fetches and pushes information to and from Oracle
ILOM. The SNMP Plugins provide an industry-standard SNMP user interface.
Oracle Linux Fault Management Architecture (FMA) allows you to manage faults at
the operating system level using commands similar to those in the Oracle ILOM Fault
Management shell on systems with Oracle Linux 6.5 or newer. Oracle Linux FMA is
available on Hardware Management Pack 2.3.
For more details on Oracle Hardware Management Pack, refer to:
http://www.oracle.com/goto/OHMP/docs
Storage and IO Subsystem
The server storage and input/output subsystem consists of the following:
■
8 or 16 PCIe Gen3 IO slots (up to eight 16-lane + eight 8-lane)
■
8 SAS2/SATA3 HDD or SSD SFF drives
■
Two 1G/100/10 Ethernet ports
■
4 USB 2.0 ports (2 external, 2 internal)
Back Panel Ports and Connectors
The following figure shows the back panel ports and connectors.
Call Out | Description
1 | Video DB-15
2 | USB 2.0 port
3 | USB 2.0 port
4 | 1 RJ45 10/100/1000 Ethernet service processor (SP) port (NET MGT)
5 | 1 RJ45 RS-232 serial console port (SER MGT)
6 | RJ45 Host GigabitEthernet port (NET 0)
7 | RJ45 Host GigabitEthernet port (NET 1)
8 | AC power inlets
Dual PCIe Card Carrier (DPCC)
In the following illustration, call out 1 shows the location of the dual PCIe card carriers
(DPCCs). The eight DPCCs are directly accessible from the back of the server and are located
below the SMOD. Each DPCC holds two PCIe cards.
■
For component serviceability, locations, and designations, see “Component Locations, and
Designations” on page 79.
Chassis Cooling Subsystem
System cooling air flows from front to back. Primary cooling is provided by eight redundant
front-side accessible 100 watt hot-swappable cooling fan modules. To maintain the integrity of
the cooling system, ensure that:
■
All CMOD processors have a heat sink.
■
Each drive bay contains a storage device or a drive slot filler.
■
Every DPCC is installed regardless of whether it contains a card or not.
■
Both fan frames are populated with fan modules.
Cooling Zones
The server has five cooling zones. The cooling zones are designated from left to right (from the
front of the server) as zone 0 - zone 4. The airflow cooling in zone 0 is concentrated through the
power supplies (PSUs) and is provided by the internal PSU fan modules.
The fan modules (FMs) provide the airflow cooling for zones 1-4. Each zone has a pair of
dedicated FMs:
■
Zone 1 airflow cooling is concentrated on the CPU modules (CMODs) 0 and 1 and is
provided by FMs 0 and 1.
■
Zone 2 airflow cooling is concentrated on CMODs 2 and 3 and is provided by FMs 2 and 3.
■
Zone 3 airflow cooling is concentrated on CMODs 4 and 5 and is provided by FMs 4 and 5.
■
Zone 4 airflow is concentrated on CMODs 6 and 7 and is provided by FMs 6 and 7.
Note - In a four-CMOD configured server, the fan modules for cooling zones 3 and 4 are
not powered. However, to maintain the integrity of the cooling subsystem, FMs 4-7 must be
installed in the server.
Call Out | Description
0 | Zone 0: Power supplies. Cooling provided by power supply fans.
1 | Zone 1: CMODs 0 and 1. Cooling provided by FMs 0 and 1.
2 | Zone 2: CMODs 2 and 3. Cooling provided by FMs 2 and 3.
3 | Zone 3: CMODs 4 and 5. Cooling provided by FMs 4 and 5.
4 | Zone 4: CMODs 6 and 7. Cooling provided by FMs 6 and 7.
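The zone assignments listed above can be captured as a small lookup table, which might be useful in a monitoring or inventory script. The following Python sketch is illustrative only. The names COOLING_ZONES, zone_for_cmod, and fans_cooling_cmod are hypothetical and are not part of any Oracle tool.

```python
# Zone layout transcribed from the cooling zone descriptions in this section.
# Zone 0 is cooled by the fans inside the power supplies, so it lists no CMODs or FMs.
COOLING_ZONES = {
    0: {"cmods": (), "fms": ()},
    1: {"cmods": (0, 1), "fms": (0, 1)},
    2: {"cmods": (2, 3), "fms": (2, 3)},
    3: {"cmods": (4, 5), "fms": (4, 5)},
    4: {"cmods": (6, 7), "fms": (6, 7)},
}


def zone_for_cmod(cmod):
    """Return the cooling zone that contains the given CMOD slot (0-7)."""
    for zone, members in COOLING_ZONES.items():
        if cmod in members["cmods"]:
            return zone
    raise ValueError(f"CMOD slot must be 0-7, got {cmod}")


def fans_cooling_cmod(cmod):
    """Return the pair of fan modules dedicated to the CMOD's cooling zone."""
    return COOLING_ZONES[zone_for_cmod(cmod)]["fms"]
```

For example, fans_cooling_cmod(5) returns (4, 5), because CMOD 5 is in zone 3.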
Cooling Fan Power
Power for the internal PSU cooling fans (zone 0) is provided by the PSUs. Power for the fan
modules (zones 1-4) is supplied by CMODs 0, 2, 4, and 6.
■
The chassis cooling fans operate only when the chassis is in full power mode (see “Full
Power Mode” on page 107).
■
The PSU fans operate when the system is in full power or standby power mode.
The following table shows the CMODs and the fan modules to which they supply power:
CMOD | Fan Modules Powered
CMOD 0 | FMs 0 and 1
CMOD 2 | FMs 2 and 3
CMOD 4 | FMs 4 and 5
CMOD 6 | FMs 6 and 7
Note - The fan power connectors for CMODs in slots 1, 3, 5, and 7 are not used.
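As an illustration of the power relationships above, the following Python sketch lists which fan modules receive power for a given set of installed CMODs. It is a sketch only. FAN_POWER_SOURCE and powered_fan_modules are hypothetical names, and the example assumes that a four-CMOD server has CMODs in slots 0 through 3, which this section implies but does not state.

```python
# CMOD slot -> fan modules it powers, transcribed from the table in this section.
FAN_POWER_SOURCE = {0: (0, 1), 2: (2, 3), 4: (4, 5), 6: (6, 7)}


def powered_fan_modules(installed_cmods):
    """Return the sorted list of FMs that receive power from the installed CMODs."""
    powered = []
    for cmod in sorted(set(installed_cmods)):
        # CMODs in slots 1, 3, 5, and 7 have unused fan power connectors,
        # so they contribute nothing here.
        powered.extend(FAN_POWER_SOURCE.get(cmod, ()))
    return powered
```

With CMODs in slots 0 through 3 (the assumed four-CMOD layout), the function returns FMs 0 through 3, which matches the note that the fan modules for zones 3 and 4 are not powered in a four-CMOD server.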
Fan Module Redundancy
The eight fan modules (FMs) provide airflow for chassis cooling zones 1-4. For redundancy,
each zone has two dedicated FMs. If an FM fails, replace it immediately. The FMs are hot-serviceable.
Caution - Data Loss. Do not remove more than one fan module from a column while the system
is in full power mode. This action removes power from the CMODs and causes an immediate
shutdown. On an eight-CMOD system, this applies to all fan modules. On a four-CMOD
system, this applies to the fan modules in the left-hand fan frame.
For FM reference and servicing information, see “Servicing Fan Modules and Fan
Frames” on page 117.
Power Subsystem
Chassis power is provided by four hot-serviceable front-side accessible power supply units
(PSUs). The four PSUs provide dual (2+2) redundancy. Therefore, the minimum PSU
configuration is two. To ensure redundancy, power for the server should come from at least two
separate circuits.
When the AC power cords are connected to AC inputs at the back of the chassis, the PSUs
supply power to the Ethernet ports, the system sensors and inventory circuits, and the service
processor (SP). When power is supplied to the SP, it boots, and the server enters the low-power
standby power mode.
Once the SP boots into standby power mode, full power mode is initiated by pressing and
releasing the chassis front-panel Power button or by powering on the server remotely from
Oracle ILOM.
For more information about power modes, see “Power Modes, Shutdowns, and
Resets” on page 107.
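The power-up sequence described above can be pictured as a small state machine. The following Python sketch models only the transitions named in this section. The state and event names are hypothetical labels chosen for illustration, and shutdown paths are intentionally left out.

```python
# (current state, event) -> next state. Only the power-up transitions described
# in this section are modeled; the state and event names are illustrative.
TRANSITIONS = {
    ("no_ac", "ac_connected"): "sp_booting",
    ("sp_booting", "sp_boot_complete"): "standby",
    ("standby", "power_button_pressed"): "full_power",
    ("standby", "ilom_power_on"): "full_power",
}


def next_power_state(state, event):
    """Return the next power state, or the unchanged state if the event does not apply."""
    return TRANSITIONS.get((state, event), state)


def run_events(events, state="no_ac"):
    """Apply a sequence of events in order and return the final power state."""
    for event in events:
        state = next_power_state(state, event)
    return state
```

Note that in this model a Power button press before the SP has finished booting leaves the state unchanged, which reflects the statement that full power mode is initiated once the SP boots into standby power mode.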
Server Block Diagram
The following illustration shows a block diagram of the server interconnects between the
CMODs, the midplane, and the SMOD. It also shows the interconnects between attached and
integrated components:
Troubleshooting and Diagnostics
This section provides information about troubleshooting hardware component faults for the
Oracle Server X5-8. It contains the following topics.
Description | Link
Maintenance-related information and procedures used to troubleshoot and repair server hardware issues. | “Troubleshooting Server Hardware Component Faults” on page 53
Information about software and firmware diagnostic tools used to isolate problems, monitor the server, and exercise the server subsystems. | “Troubleshooting With Diagnostic Tools” on page 70
Information about attaching devices to the server to perform troubleshooting. | “Attaching Devices to the Server” on page 72
Information about contacting Oracle support. | “Getting Help” on page 75
Troubleshooting Server Hardware Component Faults
This section contains maintenance-related information and procedures used to troubleshoot and
repair server hardware. It includes the following topics.
Description | Section Links
Troubleshooting overview information and procedure | “Troubleshooting Server Hardware Faults” on page 54
Source listing for troubleshooting and diagnostic information | “Troubleshooting and Diagnostic Information” on page 58
Discerning the server state using the front panel indicators | “Troubleshooting Indicators” on page 58
Explanation of indicator blink rates | “Indicator Blink Rates” on page 62
Explanation of the CMOD Fault Remind Test Circuit | “The CMOD Fault Remind Test Circuit” on page 68
Causes, actions, and preventative measures for problems related to the cooling subsystem | “Troubleshooting System Cooling Issues” on page 68
Causes, actions, and preventative measures for problems related to the power subsystem | “Troubleshooting Power Issues” on page 69
Troubleshooting Server Hardware Faults
When a server hardware fault event occurs the system lights the Service Action Required LED
and captures the event in the system event log (SEL). If you have set up notifications through
Oracle ILOM, you also receive an alert through the notification method you chose.
When you become aware of a hardware fault, you should address it immediately.
To investigate a hardware fault, see the following:
■
“Basic Troubleshooting Steps” on page 54
■
“Troubleshoot Hardware Faults” on page 55
Basic Troubleshooting Steps
Use the following process to address a hardware fault (for the step-by-step procedure, see
“Troubleshoot Hardware Faults” on page 55):
1. Identify the server subsystem containing the fault.
You can use Oracle ILOM to identify the failed component.
2. Review the Product Notes.
Once you have identified the hardware issue, review the Oracle Server X5-8 Product Notes.
This document contains up-to-date information about the server, including hardware-related
issues.
3. Prepare the server for service using Oracle ILOM.
If you have determined that the hardware fault requires service (physical access to the
server), use Oracle ILOM to power off the server, activate the Locate LED, and take the
server offline.
4. Prepare the service work space.
Before servicing the server, prepare the work space to ensure ESD protection for the server
and components.
5. Service components.
To service the components, see the removal, installation, and replacement procedures in this
document.
Note - A component designated as a FRU must be replaced by Oracle Service personnel.
Contact Oracle Service.
6. Clear the fault in Oracle ILOM.
Depending on the component, you might need to clear the fault in Oracle ILOM. Generally,
components that have a FRU ID clear the fault automatically.
See Also:
■
“Troubleshoot Hardware Faults” on page 55
Troubleshoot Hardware Faults
Note - The screens and information in this procedure might differ from those for your server.
This procedure uses the basic troubleshooting steps described in “Basic Troubleshooting
Steps” on page 54.
Use this procedure to troubleshoot hardware faults with the Oracle ILOM web interface and, if
necessary, prepare the server for service.
Note - This procedure provides one basic approach to troubleshooting hardware faults. It uses
a combination of the Oracle ILOM web and CLI interfaces. However, the procedure can be
performed using only the Oracle ILOM CLI interface. For more information about the Oracle
ILOM web interface, refer to the Oracle ILOM documentation.
Before You Begin
■
Obtain the latest version of the Oracle Server X5-8 Product Notes.
1.
Log in to the server SP Oracle ILOM web interface.
Open a browser and type in the IP address of the server SP. Enter a user name (with
administrator privileges) and password at the log-in screen. The Summary screen appears.
The Status section of the Summary screen provides information about the server subsystems,
such as:
■
Processors
■
Memory
■
Power
■
Cooling
■
Storage
■
Networking
■
I/O Modules
2.
In the Status section of the summary screen, identify the server subsystem that
requires service.
In the above example, the Status screen shows that the Memory subsystem requires service.
This indicates that a hardware component within the subsystem is in a fault state.
3.
To identify the component, click on the subsystem name.
The subsystem screen appears.
The above example shows the Memory subsystem screen and indicates that DIMM 8 on CPU 0
has an uncorrectable ECC fault.
4.
To get more information, click one of the Open Problems links.
The Open Problems screen provides detailed information, such as the time the event occurred,
the component and subsystem name, and a description of the issue. It also includes a link to a
KnowledgeBase article.
Tip - The System Log provides a chronological list of all the system events and faults that have
occurred since the log was last reset and includes additional information, such as severity levels
and error counts. To access it, click the System Log link.
In this example, the hardware fault with DIMM 8 of CPU 0 requires local (physical) access to
the server.
5.
Before going to the server, review the server Product Notes document for
information related to the issue or the component.
The Oracle Server X5-8 Product Notes contains up-to-date information about the server,
including hardware-related issues.
6.
To prepare the server for service, see “Preparing for Service” on page 95.
Note - After servicing the component, you might need to clear the fault in Oracle ILOM. Refer
to the service procedure for the component for more information.
Troubleshooting and Diagnostic Information
The following table lists diagnostic- and troubleshooting-related procedures and references that
can assist you with resolving server issues.
Description | Link
Diagnostic information for the x86 servers, including procedures for performing runtime and firmware-based tests, using Oracle ILOM, and running U-Boot and UEFIdiag to exercise the system and isolate subtle and intermittent hardware-related problems. | Oracle x86 Server Diagnostics, Applications, and Utilities Guide
Administrative information for the Oracle Server X5 series servers, including information about how to use Oracle System Assistant and using the Oracle ILOM system event log (SEL) to identify a problem's possible source. | Oracle X5 Series Servers Administration Guide
Troubleshooting Indicators
The eight indicators on the server front panel show the state of the server. For more information
about indicator locations, see “Front Indicator Module (FIM) Panel” on page 35 and “Back
Indicator Panel” on page 39.
Note - For the error state scenarios described below, the state of the Power OK indicator
depends on the presence of redundant components and the severity of the fault.
The following sections describe the status of the front panel indicators for various server states:
■
“Server Boot Process and Normal Operating State Indicators” on page 59
■
“Locator Indicator On” on page 60
■
“Over Temperature Condition” on page 60
■
“PSU Failure” on page 61
■
“Memory Failure” on page 61
■
“CPU Failure” on page 61
■
“Fan Module Failure” on page 61
■
“SP Failure” on page 61
■
“Front Panel Lamp Test” on page 62
Server Boot Process and Normal Operating State Indicators
A normal server boot process involves the service processor (SP) indicator and the Power OK
indicator. In the illustration below, call out 1 shows the Power OK indicator and call out 3
shows the SP indicator. Call out 2 shows the power button.
The following table describes the indicator activity during a normal boot sequence.
System Condition | SP Indicator | Power OK Indicator
AC power applied to server. SP is booting. | Blinks | Off
SP is booted and ready to use. Host is off. | Steady On | Blinks at single blink rate (quick flash every 3 seconds)
SP is running. Host is booting. | Steady On | Blinks at fast rate
SP and host are running. This is the normal operating state of the system. | Steady On | Steady On
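The boot sequence table above can also be read in reverse: given what the two indicators are doing, look up the system condition. The following Python sketch shows that lookup. The indicator state labels ("blink", "steady", "single_blink", "fast_blink", "off") and the function name are hypothetical and chosen only for illustration.

```python
# (SP indicator, Power OK indicator) -> system condition, transcribed from the
# normal boot sequence table. The state labels are illustrative.
BOOT_STATES = {
    ("blink", "off"): "AC power applied; SP is booting",
    ("steady", "single_blink"): "SP is booted and ready to use; host is off",
    ("steady", "fast_blink"): "SP is running; host is booting",
    ("steady", "steady"): "SP and host are running (normal operating state)",
}


def describe_boot_state(sp_indicator, power_ok_indicator):
    """Return the boot-sequence condition for the observed indicator states."""
    return BOOT_STATES.get(
        (sp_indicator, power_ok_indicator),
        "not part of the normal boot sequence; check for faults",
    )
```

Any combination that is not in the table, such as both indicators off, falls through to the default message rather than being guessed at.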
Locator Indicator On
The Locator indicator is a white combination button/indicator located on both the front and
back panels. When it is on, it blinks at the fast blink rate:
■
Turn it on remotely from Oracle ILOM to locate the server in a rack.
Typically, a server readied for service is placed in standby power mode and the Locator
indicator is turned on.
■
Press the button to prove physical presence. Some service procedures require you to prove
physical presence by pressing the Locator indicator button.
■
You can turn the Locator indicator off remotely from Oracle ILOM, or by pressing the
button.
The following figure shows the Locator indicator on the front panel:
For indicator blink rate information, see “Indicator Blink Rates” on page 62.
Over Temperature Condition
For a server in an over-temperature state, the server amber over-temperature indicator and the
amber Service Action Required indicators (front and back) are on steady. The state of the front
and back green Power OK indicator and the green SP indicator depends on the severity of the
condition.
For indicator blink rate information, see “Indicator Blink Rates” on page 62.
PSU Failure
For a server with a PSU in a failed state, the server amber Service Action Required indicators
(front and back) and the amber Service Action Required indicator on the PSU are on steady. The
front and back green Power OK indicator and the green SP indicator are on steady.
For indicator blink rate information, see “Indicator Blink Rates” on page 62.
Memory Failure
For a server with a failure in the memory subsystem, the server amber Service Action Required
indicators (front and back) and an amber CMOD Service Action Required indicator are on
steady. The front and back green Power OK indicator and the green SP indicator are on steady.
For indicator blink rate information, see “Indicator Blink Rates” on page 62.
CPU Failure
For a server with a fault in the processor subsystem, the server amber Service Action Required
indicators (front and back) and an amber CMOD Service Action Required indicator are on
steady. The activity of the front and back green Power OK indicators and the green SP indicator varies
depending on whether the server can boot successfully. The server might not be able to boot out
of standby power mode.
For indicator blink rate information, see “Indicator Blink Rates” on page 62.
Fan Module Failure
For a server with a fan module fault, the server amber Service Action Required indicators (front
and back) and an amber Service Action Required indicator on a fan module are on steady. The
front and back green server OK indicator and the green SP indicator are on steady.
For indicator blink rate information, see “Indicator Blink Rates” on page 62.
SP Failure
For a server with an SP fault, the server amber Service Action Required indicators (front and
back) are on steady. The front and back System OK indicators and the SP OK indicator are off.
For indicator blink rate information, see “Indicator Blink Rates” on page 62.
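The failure scenarios above differ mainly in which component-level amber indicator is lit and in the state of the green Power OK and SP indicators. As a rough aid, the following Python sketch collects those signatures and returns the scenarios consistent with an observation. It assumes the server Service Action Required indicators are already lit. The names FAULT_SIGNATURES and candidate_faults and the state labels are hypothetical, and a value of None means the source describes the state as varying.

```python
# scenario -> (component-level amber indicator, Power OK state, SP indicator state).
# None for a green indicator means "depends on severity or boot success".
# None for the amber indicator means no component-level indicator is described.
FAULT_SIGNATURES = {
    "over-temperature": ("server over-temperature", None, None),
    "PSU failure": ("PSU", "steady", "steady"),
    "memory failure": ("CMOD", "steady", "steady"),
    "CPU failure": ("CMOD", None, None),
    "fan module failure": ("fan module", "steady", "steady"),
    "SP failure": (None, "off", "off"),
}


def candidate_faults(component_amber, power_ok, sp_ok):
    """Return the scenarios consistent with the observed indicator states."""
    matches = []
    for name, (amber, power, sp) in FAULT_SIGNATURES.items():
        if amber != component_amber:
            continue
        if power is not None and power != power_ok:
            continue
        if sp is not None and sp != sp_ok:
            continue
        matches.append(name)
    return matches
```

A lit CMOD indicator with both green indicators on steady matches both the memory and CPU scenarios, so the indicators alone narrow the fault to the CMOD. Oracle ILOM is still needed to identify the failed component.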
Front Panel Lamp Test
To perform a lamp test of all front panel indicators, press the Locate button three times within
a five second period. All the front and back indicators light up and remain on steady for 15
seconds (see “Unison Steady On” on page 66).
For indicator blink rate information, see “Indicator Blink Rates” on page 62.
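The trigger condition for the lamp test (three presses within a five second period) can be expressed as a sliding-window check over press timestamps. The following Python sketch is illustrative only and does not describe the server firmware. The function name is hypothetical, and treating a span of exactly five seconds as valid is an assumption.

```python
def lamp_test_triggered(press_times, presses_needed=3, window_s=5.0):
    """Return True if any `presses_needed` consecutive presses fall within `window_s` seconds.

    press_times is an iterable of button-press timestamps in seconds.
    """
    times = sorted(press_times)
    for i in range(len(times) - presses_needed + 1):
        # Compare the first and last press of each run of consecutive presses.
        if times[i + presses_needed - 1] - times[i] <= window_s:
            return True
    return False
```

For example, presses at 0, 1, and 2 seconds satisfy the condition, while presses at 0, 3, and 6 seconds do not, because the first and third press are six seconds apart.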
Indicator Blink Rates
This section describes the following indicator blink rates:
■
“Steady On” on page 62
■
“Steady Off” on page 63
■
“Slow Blink Rate” on page 63
■
“Fast Blink Rate” on page 64
■
“Single (Standby) Blink Rate” on page 64
■
“Slow Unison Blink Rate” on page 65
■
“Insertion Blink” on page 65
■
“Unison Steady On” on page 66
■
“Alternating (Invalid FRU) Blink Rate” on page 66
■
“Feedback Flash” on page 67
■
“Data Blink Rate” on page 67
■
“Sequential (Diagnostic) Blink Rate” on page 67
Steady On
For the steady on state, an indicator is continually on (lit) and does not blink. This indicates a
continuing condition, for example, an operational state (green) or a Service Action Required
fault state (amber).
Steady Off
For the steady off state, an indicator is continually off (not lit) and does not blink. This indicates
that a system is not operational, for example, no AC power (unlit green Power OK indicator) or
a subsystem not in a fault state (unlit amber Service Action Required indicator).
Slow Blink Rate
For the slow blink rate, the indicator (typically green) repeatedly lights for half a second during
a one second interval (1 Hz) and turns off for half a second. The slow blink rate indicates an
on-going activity, for example, a device that is rebuilding, booting, or in transition from one
mode to another.
Fast Blink Rate
For the fast blink rate, the indicator repeatedly blinks twice (on, off, on) during a one second
interval (2 Hz). The fast blink rate indicates activity or data transfer.
Single (Standby) Blink Rate
For the single blink rate, the indicator repeatedly flashes once at the beginning of a three second
interval. This indicates a system or component in standby mode. For example, a server in
standby power mode or a hot spare device waiting to be used (also used with amber indicators
to indicate a predicted fault).
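The slow, fast, and single (standby) blink rates described above are all periodic patterns, so each can be modeled as a period plus an on-time at the start of the period. The following Python sketch does that. The periods come from this section. The 50 percent duty cycle for the fast rate and the 0.1 second flash length for the standby rate are assumptions, because the section does not state them, and the names are hypothetical.

```python
# pattern name -> (period in seconds, on-time in seconds at the start of each period)
BLINK_PATTERNS = {
    "slow": (1.0, 0.5),     # 1 Hz: lit for half a second, off for half a second
    "fast": (0.5, 0.25),    # 2 Hz: the 50 percent duty cycle is an assumption
    "standby": (3.0, 0.1),  # one flash every 3 seconds; the flash length is an assumption
}


def is_lit(pattern, t):
    """Return True if an indicator using the given pattern is lit at time t (seconds)."""
    period, on_time = BLINK_PATTERNS[pattern]
    return (t % period) < on_time


def blinks_per_second(pattern):
    """Return the blink frequency in Hz for the given pattern."""
    period, _ = BLINK_PATTERNS[pattern]
    return 1.0 / period
```

Sampling is_lit over time reproduces the waveforms: the slow pattern is lit at 0.25 seconds and off at 0.75 seconds, while the standby pattern is lit only briefly at the start of each three second interval.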
Slow Unison Blink Rate
For the slow unison blink rate, the indicators on the component blink in unison for half a second
during a one second interval (1 Hz). Typically, this is limited to three successive blinks. This
confirms the successful insertion of a removable device (for example, a storage drive) into a
powered system (confirming the power connection).
Insertion Blink
The insertion blink is three successive blinks of a hot-swap component's primary status
indicator (for example, the green Power OK indicator). The insertion blink occurs immediately
after three successive unison blinks (see “Slow Unison Blink Rate” on page 65) of all the
component indicators.
Unison Steady On
For the unison steady on, all indicators are simultaneously on steady (see “Steady
On” on page 62). This occurs during the front panel lamp test (see “Front Panel Lamp
Test” on page 62). This is the only time that the Locator indicator is on steady.
Alternating (Invalid FRU) Blink Rate
The alternating (invalid FRU) blink rate is a repeating sequence of lit green and amber
indicators at 1 Hz. This indicates that a component has an incorrect version or mismatch (for
example, a power supply with a lower rating than the one specified). The blink rate is also used
for an unsupported component, or a component in an unsupported slot.
Feedback Flash
The indicator flashes on and off during periods of activity, commensurate with the activity, but
the flashing does not exceed the 2 Hz fast blink rate (see, “Fast Blink Rate” on page 64).
For example, this blink rate occurs during disk drive read and write activity and communication
port transmit and receive activity.
Data Blink Rate
For this blink rate, a normally on indicator repeatedly turns off twice during a one-second
interval (2 Hz—see also, “Fast Blink Rate” on page 64) while data activity is taking place.
Sequential (Diagnostic) Blink Rate
This blink rate is a repeating sequence in which each indicator successively lights for 0.5 sec
to indicate that diagnostics are running. This blink rate is used only on systems or components
capable of running diagnostics.
The CMOD Fault Remind Test Circuit
The CMODs have an internal test circuit that you can use to locate failed DIMMs and verify
a failed CPU after removing the CMOD from the server. The DIMM and CPU Fault Remind
circuits hold an electrical charge for 10 minutes after power is removed from the server,
allowing enough time to remove the CMOD and use the circuit.
For more information, see “Replace a Failed DIMM” on page 146 and “Remove a Heatsink
and Processor (FRU)” on page 161.
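Because the Fault Remind circuits hold their charge for only 10 minutes, it can help to track how much of that window remains once power is removed. The following Python sketch shows the arithmetic. The constant and function names are hypothetical, and the sketch assumes the 10 minute window starts at the moment power is removed from the server.

```python
# The Fault Remind circuits hold a charge for 10 minutes after power is removed.
FAULT_REMIND_HOLD_S = 10 * 60


def fault_remind_seconds_left(seconds_since_power_removed):
    """Return how many seconds of Fault Remind charge remain (0 if the window has passed)."""
    return max(0, FAULT_REMIND_HOLD_S - seconds_since_power_removed)
```

For example, if removing the CMOD takes four minutes (240 seconds), 360 seconds remain in which to press the Fault Remind button.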
Troubleshooting System Cooling Issues
Maintaining the proper internal operating temperature of the server is crucial to the health of
the server. To prevent server shutdown and damage to components, address over-temperature
and hardware-related issues as soon as they occur. If your server has a temperature fault, the
cause of the problem might be:
■
“External Ambient Temperature Too High” on page 68
■
“Airflow Blockage” on page 68
■
“Hardware Component Failure” on page 69
External Ambient Temperature Too High
If the ambient temperature in the server space is too high, the cool air that is pulled into the
server cannot cool the server sufficiently to prevent the internal temperature from rising. This
can cause poor performance or component failure.
Action: Check the ambient temperature of the server space against the environmental
specifications for the server. If the temperature is not within the required operating range,
remedy the situation immediately.
Prevention: Periodically check the ambient temperature of the server space to ensure that it
is within the required range, especially if you have made any changes to the server space (for
example, added additional servers). The temperature must be consistent and stable.
Airflow Blockage
The server cooling system uses fans to pull cool air in from the server front intake vents and
exhaust warm air out the server back panel vents. If the front or back vents are blocked, the
airflow through the server is disrupted and the cooling system fails to function properly, causing
the server internal temperature to rise.
Action: Inspect the server front and back panel vents for blockage from dust or debris.
Additionally, inspect the server interior for improperly installed components or cables that can
block the flow of air through the server.
Prevention: Periodically inspect and clean the server vents using a vacuum cleaner. Ensure that
all components, such as cards, cables, fans, air baffles, and dividers, are properly installed.
Hardware Component Failure
Fan modules and power supply fans drive the server cooling system. When one of these
components fails, the server internal temperature can rise. This rise in temperature can cause
other components to enter into an over-temperature state. Additionally, some components,
such as processors, might overheat when they are failing, which can also generate an
over-temperature event.
To reduce the risk related to component failure, power supplies and fan modules are installed
in pairs to provide redundancy. Redundancy ensures that if one component in the pair fails, the
remaining component can continue to maintain the subsystem. For example, power supplies
serve a dual function; they provide both power and airflow. If one power supply fails, the other
functioning power supply is able to maintain both the power and the cooling subsystems.
Action: Investigate the cause of the over-temperature event, and replace failed components
immediately. For hardware troubleshooting information, see “Troubleshooting Server Hardware
Faults” on page 54.
Prevention: Maintain redundant systems and replace failed components immediately.
Troubleshooting Power Issues
If your server does not power on, the cause of the problem might be:
■
“AC Power Connection” on page 69
■
“Power Supplies (PSUs)” on page 70
AC Power Connection
The AC power cords are the direct connection between the server power supplies and the
power sources. The server power supplies need separate stable AC circuits operating at specific
voltage levels. Insufficient voltage levels or voltage fluctuations can cause server power
problems.
Action: Check that the AC power cords are connected to the server. Check that the correct
power is present at the outlets and monitor the power to verify that it is within the acceptable
range.
■ AC OK indicators next to the AC inlets on the back of the server are green when the power
  is connected, and off when it is not.
■ The AC OK and DC OK indicators on the PSU indicator panels on the front of the system
  are green when the PSU is functioning properly.
Prevention: Use the AC power cord retaining clips and position the cords to minimize the risk
of accidental disconnection. Ensure that the AC circuits that supply power to the server are
stable and not overburdened.
Power Supplies (PSUs)
The server power supplies (PSUs) provide the necessary server voltages from the AC power
outlets. If the PSUs are inoperable, unplugged, or disengaged from the internal connectors, the
server cannot power on.
Action: Check that the AC cables are connected to both PSUs. Check that the PSUs are
operational (the PSU indicator panel should have a lit green AC OK indicator). Ensure that the
PSU is properly installed. A PSU that is not fully engaged with its internal connector does not
have power applied and does not have a lit green AC OK indicator.
Prevention: When a power supply fails, replace it immediately. When installing a power
supply, ensure that it is fully seated and engaged with its connector inside the drive bay. A
properly installed PSU has a lit green AC OK indicator.
Troubleshooting With Diagnostic Tools
The server and its accompanying software and firmware contain diagnostic tools and features
that can help you isolate component problems, monitor the status of a functioning system,
and exercise one or more subsystems to disclose more subtle or intermittent hardware-related
problems.
Each diagnostic tool has its own specific strength and application. Review the tools listed in
this section and determine which tool might be best to use for your situation. Once you've
determined the tool to use, you can access it locally, while at the server, or remotely.
■ “Diagnostic Tools” on page 71
■ “Diagnostic Tool Documentation” on page 72
Diagnostic Tools
The diagnostic tools range in complexity from a comprehensive validation test suite (Oracle
VTS) to a chronological event log (Oracle ILOM System Log). They include standalone
software packages, firmware-based tests, and hardware-based LED indicators.
The following table summarizes the diagnostic tools.
| Diagnostic Tool | Type | What It Does | Availability | Remote Capability |
|---|---|---|---|---|
| Oracle ILOM | SP firmware | Monitors environmental condition and component functionality sensors, generates alerts, performs fault isolation, and provides remote access. | Available in either standby power mode or full power mode. It is not OS dependent. | Designed for remote and local access. |
| Preboot Menu | SP firmware | Enables you to restore some of Oracle ILOM defaults (including firmware) when Oracle ILOM is not accessible. | Available in either standby power mode or full power mode. It is not OS dependent. | Local, but remote serial access is possible if the SP serial port is connected to a network-accessible terminal server. |
| Hardware-based LED indicators | Hardware and SP firmware | Indicate status of overall system and particular components. | Available when system power is available. | Local, but sensor and indicators are accessible from Oracle ILOM web interface or command-line interface (CLI). |
| Power-on Self-Test (POST) | Host firmware | Tests core components of system: CPUs, memory, and motherboard I/O bridge integrated circuits. | Runs on startup. Available when the operating system is not running. | Local, but can be accessed through Oracle ILOM Remote Console. |
| U-Boot | SP firmware | Initializes and tests aspects of the service processor (SP) prior to booting the Oracle ILOM SP and operating system. Tests SP memory, SP, network devices and I/O devices. | Available in either standby power mode or full power mode. | Local, but remote serial access is possible if the SP serial port is connected to a network-accessible terminal server. |
| Solaris commands | Operating system software | Displays various kinds of system information. | | Local, and over network. |
| Oracle VTS | Diagnostic tool standalone software | Exercises and stresses the system, running tests in parallel. | | View and control over network. |
| UEFI Diagnostics | A suite of diagnostic tests | Run tests manually or automatically. Read the results on screen or in log files. | | Remote access using Oracle ILOM. |

Diagnostic Tool Documentation
The following table identifies where you can find more information about diagnostic tools.
| Diagnostic Tool | Information | Location |
|---|---|---|
| Oracle ILOM | Oracle Integrated Lights Out Manager Documentation Library | http://www.oracle.com/goto/ILOM/docs |
| Preboot Menu | Using the Preboot Menu Utility | Oracle X5 Series Servers Administration Guide |
| U-Boot | Oracle x86 Servers Diagnostics Guide | Oracle x86 Server Diagnostics, Applications, and Utilities Guide |
| System indicators and sensors | Oracle Server X5-8 Service Manual | “Troubleshooting Indicators” on page 58 |
| UEFI diagnostics | Oracle x86 Servers Diagnostics Guide | Oracle x86 Server Diagnostics, Applications, and Utilities Guide |
| Oracle VTS | Oracle VTS software and documentation | Oracle x86 Server Diagnostics, Applications, and Utilities Guide |

Attaching Devices to the Server
The following sections contain procedures for attaching devices to the server. These allow you
to access diagnostic tools when troubleshooting and servicing the server:
■ “Attach Devices to the Server” on page 73
■ “Configuring Serial Port Sharing” on page 73
■ “Ethernet Port Device Naming” on page 75
Attach Devices to the Server
This section provides instructions for connecting remote and local devices to the server so you can
interact with the service processor (SP) and the server console.
For port and connector information, see “Back Panel Ports and Connectors” on page 45.
1. Connect an Ethernet cable to the Gigabit Ethernet (NET) connectors as needed
   for OS support.
2. To connect to Oracle ILOM over the network, connect an Ethernet cable to the
   Ethernet port labeled NET MGT.
3. To access the Oracle ILOM command-line interface (CLI) locally using the
   management port, connect a serial null modem cable to the RJ-45 serial port
   labeled SER MGT.
4. To connect to the system console locally, connect a mouse and keyboard to the
   server front panel USB connectors and a monitor to the server front panel DB-15
   video connector.
Configuring Serial Port Sharing
By default, the SER MGT serial port connects to the SP console. Using Oracle ILOM, you can
configure it to connect to the host console instead. This feature is useful for Windows kernel
debugging, as it enables you to view non-ASCII character traffic from the host console.
Do not configure the SER MGT port to connect to the host console until after you have
configured the Oracle ILOM network connection. Otherwise, you cannot connect to Oracle
ILOM to switch it back.
For more details about restoring access to the server port on your server, see the Oracle
Integrated Lights Out Manager (ILOM) 3.2 Documentation Library at:
http://www.oracle.com/goto/ILOM/docs
You can assign serial port output using either the Oracle ILOM web interface or the
command-line interface (CLI). For instructions, see the following sections:
■ “Assign Serial Port Output Using the CLI” on page 74
■ “Assign Serial Port Output Using the Web Interface” on page 74
Assign Serial Port Output Using the CLI
1. Open an SSH session and at the command line log in to the SP Oracle ILOM CLI.
   Log in as a user with root or administrator privileges. For example:
   ssh root@ipaddress
   where ipaddress is the IP address of the server SP.
   For more information, see Oracle X5 Series Servers Administration Guide.
   The Oracle ILOM CLI prompt appears:
   ->
2. To set the serial port owner, type:
   -> set /SP/serial/portsharing owner=host
   Note - The serial port sharing value by default is owner=SP.
3. Connect a serial host to the server.
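The commands in this procedure can be assembled programmatically, for example when preparing a runbook. The following Python sketch only builds the command strings shown in steps 1 and 2; it does not contact a server. The helper names (ssh_login_command, portsharing_command) and the example address are illustrative assumptions, not part of Oracle ILOM.

```python
# Owner values named in this procedure: "SP" is the default, "host" hands
# the serial port to the host console.
VALID_OWNERS = ("SP", "host")


def ssh_login_command(sp_address: str, user: str = "root") -> str:
    """Return the ssh command from step 1 for the given SP address."""
    return f"ssh {user}@{sp_address}"


def portsharing_command(owner: str) -> str:
    """Return the Oracle ILOM CLI command from step 2 for the given owner."""
    if owner not in VALID_OWNERS:
        raise ValueError(f"owner must be one of {VALID_OWNERS}, got {owner!r}")
    return f"set /SP/serial/portsharing owner={owner}"


if __name__ == "__main__":
    # 192.0.2.10 is a documentation-only example address (hypothetical SP).
    print(ssh_login_command("192.0.2.10"))
    print(portsharing_command("host"))  # assign the serial port to the host
    print(portsharing_command("SP"))    # restore the default owner
```

Validating the owner value before building the command avoids sending a mistyped setting to the SP.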
Assign Serial Port Output Using the Web Interface
1. Log in to the service processor Oracle ILOM web interface.
   To log in, open a web browser and direct it to the IP address of the server SP.
   Log in as root or a user with administrator privileges. For more information, see Oracle X5
   Series Servers Administration Guide.
   The Summary screen appears.
2. In the ILOM web interface, select ILOM Administration --> Connectivity from the
   navigation menu on the left side of the screen.
3. Select the Serial Port tab.
   The Serial Port Settings page appears.
   Note - The serial port sharing setting by default is Service Processor.
4. In the Serial Port Settings page, select Host Server as the serial port owner.
5. Click Save for the changes to take effect.
6. Connect a serial host to the server.
Ethernet Port Device Naming
This section contains information about the device naming for the Ethernet ports on the back
panel of the server (see “Back Panel Ports and Connectors” on page 45).
Note - Naming used by the interfaces might vary from that listed below depending on which
devices are installed in the system.
The device naming for the Ethernet interfaces is reported differently by different interfaces
and operating systems. The following illustration explains the logical (operating system) and
physical (BIOS) naming conventions used for each interface. These naming conventions might
vary depending on conventions of your operating system and which devices are installed in the
server.
| Port | BIOS | Oracle Solaris | Linux | Windows |
|---|---|---|---|---|
| Net 1 | 0701 | igb 1 | eth 1 | net2 |
| Net 0 | 0700 | igb 0 | eth 0 | net |
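When scripting against several operating systems, the naming table above can be held as a small lookup structure. The following Python sketch is one way to do that; the values are copied from the table, the dictionary and function names are illustrative assumptions, and the actual names on a given system might differ as noted above.

```python
# Device names copied from the table above. They might vary depending on
# the operating system conventions and the devices installed in the server.
ETHERNET_NAMES = {
    "Net 0": {"BIOS": "0700", "Oracle Solaris": "igb 0",
              "Linux": "eth 0", "Windows": "net"},
    "Net 1": {"BIOS": "0701", "Oracle Solaris": "igb 1",
              "Linux": "eth 1", "Windows": "net2"},
}


def device_name(port: str, interface: str) -> str:
    """Return the name that the given interface reports for a back panel port."""
    try:
        return ETHERNET_NAMES[port][interface]
    except KeyError:
        raise ValueError(f"unknown port/interface: {port!r}, {interface!r}") from None


if __name__ == "__main__":
    for port in sorted(ETHERNET_NAMES):
        print(port, "->", device_name(port, "Linux"))
```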
Getting Help
The following sections describe how to get additional help to resolve server-related problems.
■ “Contacting Support” on page 76
■ “Locating the Chassis Serial Number” on page 76
Contacting Support
If the troubleshooting procedures in this chapter fail to solve your problem, use the following
table to collect information that you might need to communicate to support personnel.
| System Configuration Information Needed | Your Information |
|---|---|
| Service contract number | |
| System model | |
| Operating environment | |
| System serial number | |
| Peripherals attached to the system | |
| Email address and phone number for you and a secondary contact | |
| Street address where the system is located | |
| Superuser password | |
| Summary of the problem and the work being done when the problem occurred | |
| IP address | |
| Server name (system host name) | |
| Network or internet domain name | |
| Proxy server configuration | |
See Also
■ “Locating the Chassis Serial Number” on page 76
Locating the Chassis Serial Number
You might need to have your server's serial number when you ask for service on your system.
Record this number for future use. Use one of the following methods to locate your server's
serial number:
■ On the front panel of the server, look at the middle left of the bezel to locate the server's
  serial number.
■ Locate the yellow Customer Information Sheet (CIS) attached to your server packaging.
  This sheet includes the serial number.
■ From Oracle System Assistant, see the Summary screen.
■ From Oracle ILOM, enter the show /SYS command or go to the System Information tab in
  the Oracle ILOM browser interface.
Servicing the Server
This section provides generalized information about servicing the server. It includes
the following sections:

| Section Description | Link |
|---|---|
| Component serviceability requirements, locations and designations | “Component Locations, and Designations” on page 79 |
| Procedures for creating an ESD-safe work space | “Performing Electrostatic Discharge and Static Prevention Measures” on page 90 |
| Required tools | “Tools and Equipment” on page 91 |
| Information about component filler panels | “Component Filler Panels and Non-Powered Components” on page 91 |
| Procedure for clearing hardware faults in the system | “Clear Hardware Fault Messages” on page 92 |

Component Locations, and Designations
This section provides information about servicing components, including component locations
and designations.
■ “Component Serviceability Requirements” on page 79
■ “Component Locations” on page 80
■ “Component Designations” on page 82
■ “Component Network Access Control (NAC) Names” on page 89
Component Serviceability Requirements
The following table lists the system components and identifies whether they are hot, warm, or
cold service components, and whether they are a customer-replaceable unit (CRU) or a
field-replaceable unit (FRU).
■ Hot service components can be serviced while the server is powered on and running in
  full-power mode.
■ Warm service components can be serviced while the server is in standby power mode. These
  include CMODs, DIMMs, and processors and heatsinks.
■ Cold service components must be serviced when the server is completely powered off and
  disconnected from the power source.
A CRU or FRU designation determines who is qualified to service a component.
■ CRUs can be serviced by customers.
■ FRUs must be serviced by qualified Oracle Service personnel.
| Component | Service Designation | Serviceability |
|---|---|---|
| Front indicator module (FIM) | CRU | Cold |
| Power supply unit (PSU) | CRU | Hot |
| Fan modules (FM) | CRU | Hot |
| Fan frame | CRU | Cold |
| CPU module (CMOD) | CRU | Warm |
| Memory (DIMMs) | CRU | Warm |
| Processor and heatsink | FRU | Warm |
| Storage drive (HDD, SSD) | CRU | Hot |
| Dual PCIe card carrier (DPCC) | CRU | Hot |
| PCIe card | CRU | Hot/Cold† |
| System module (SMOD) | CRU | Cold |
| Internal USB flash drive | CRU | Cold |
| External USB flash drive | CRU | Hot |
| Host bus adapter (HBA) card | CRU | Cold |
| HBA cable | CRU | Cold |
| Energy Storage Module | CRU | Cold |
| Energy Storage Module Cable | CRU | Cold |
| System Clock Battery | CRU | Cold |

† Hot service as part of DPCC, which must be removed first.
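The serviceability table lends itself to a simple lookup when planning a service action. The following Python sketch encodes the table rows and the hot, warm, and cold definitions from this section; the dictionary layout and the function name service_requirements are illustrative assumptions.

```python
# (service designation, serviceability) pairs copied from the table above.
COMPONENTS = {
    "Front indicator module (FIM)": ("CRU", "Cold"),
    "Power supply unit (PSU)": ("CRU", "Hot"),
    "Fan modules (FM)": ("CRU", "Hot"),
    "Fan frame": ("CRU", "Cold"),
    "CPU module (CMOD)": ("CRU", "Warm"),
    "Memory (DIMMs)": ("CRU", "Warm"),
    "Processor and heatsink": ("FRU", "Warm"),
    "Storage drive": ("CRU", "Hot"),
    "Dual PCIe card carrier (DPCC)": ("CRU", "Hot"),
    "PCIe card": ("CRU", "Hot/Cold"),  # hot only as part of a removed DPCC
    "System module (SMOD)": ("CRU", "Cold"),
    "Internal USB flash drive": ("CRU", "Cold"),
    "External USB flash drive": ("CRU", "Hot"),
    "Host bus adapter (HBA) card": ("CRU", "Cold"),
    "HBA cable": ("CRU", "Cold"),
    "Energy Storage Module": ("CRU", "Cold"),
    "Energy Storage Module Cable": ("CRU", "Cold"),
    "System Clock Battery": ("CRU", "Cold"),
}

# Server state required for each serviceability class, as defined above.
SERVER_STATE = {
    "Hot": "full-power mode",
    "Warm": "standby power mode",
    "Cold": "powered off and disconnected from the power source",
}

WHO = {"CRU": "customer", "FRU": "qualified Oracle Service personnel"}


def service_requirements(component: str) -> str:
    """Summarize who can service a component and in which server state."""
    designation, serviceability = COMPONENTS[component]
    states = " or ".join(SERVER_STATE[s] for s in serviceability.split("/"))
    return f"{component}: serviced by {WHO[designation]}; server state: {states}"


if __name__ == "__main__":
    print(service_requirements("Processor and heatsink"))
    print(service_requirements("PCIe card"))
```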
Component Locations
The following illustration shows the locations of the server components.
| Call Out | Description |
|---|---|
| 1 | AC power block† |
| 2 | Dual PCIe card carrier (DPCC) with PCIe card (8) |
| 3 | Host bus adapter (HBA) card |
| 4 | Energy storage module |
| 5 | Storage drive (8) |
| 6 | System module (SMOD) |
| 7 | Top cover |
| 8 | Midplane |
| 9 | Server chassis‡ |
| 10 | CPU module (CMOD) (4 or 8) |
| 11 | Fan frame (2) |
| 12 | Fan module (8) |
| 13 | Power supply (PSU) (4) |
| 14 | Front indicator module (FIM) |

† The AC power block is not a removable component.
‡ The chassis is not a removable component.
Component Designations
These sections show the location and designation of CRU and FRU components:
■ “Fan Module Designations” on page 82
■ “Power Supply Slot Designations” on page 83
■ “CMOD Slot Designations” on page 84
■ “Memory Slot Designations” on page 85
■ “Storage Drive Slot Designations” on page 86
■ “DPCC and PCIe Card Slot Designations” on page 87
■ “AC Input Power Block” on page 88
Fan Module Designations
The eight fan modules (FMs) are directly accessible at the front of the server and are arranged
in two stacked rows of four FMs.
■ Bottom row from left to right: FM 0, FM 2, FM 4, and FM 6.
■ Top row from left to right: FM 1, FM 3, FM 5, and FM 7.
| Call Out | Description |
|---|---|
| 0 | FM 0 |
| 1 | FM 1 |
| 2 | FM 2 |
| 3 | FM 3 |
| 4 | FM 4 |
| 5 | FM 5 |
| 6 | FM 6 |
| 7 | FM 7 |
The eight fan modules are installed in two fan frames. The left frame contains FM 0, FM 1, FM 2,
and FM 3. The right frame contains FM 4, FM 5, FM 6, and FM 7.
Each vertical pair of FMs provides cooling for the corresponding CPU modules (CMODs),
which are located directly behind the FMs. For example, FMs 0 and 1 provide cooling for
CMODs 0 and 1, and FMs 6 and 7 provide cooling for CMODs 6 and 7.
For CMOD designations, see “CMOD Slot Designations” on page 84.
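The relationship between fan modules, fan frames, and CMODs follows a regular pattern that can be written down as arithmetic. The following Python sketch assumes that the pairing given in the examples above (FMs 0 and 1 with CMODs 0 and 1, FMs 6 and 7 with CMODs 6 and 7) extends to the two middle columns in the same way; the function names are illustrative.

```python
def cooling_column(fm: int) -> int:
    """Return the vertical column (0 = leftmost) that holds fan module fm."""
    if not 0 <= fm <= 7:
        raise ValueError("fan modules are numbered FM 0 through FM 7")
    # Bottom row is FM 0, 2, 4, 6 and top row is FM 1, 3, 5, 7, so each
    # column holds the pair (2 * column, 2 * column + 1).
    return fm // 2


def fms_for_cmod(cmod: int) -> list:
    """Return the pair of fan modules assumed to cool the given CMOD."""
    if not 0 <= cmod <= 7:
        raise ValueError("CMOD slots are numbered CMOD 0 through CMOD 7")
    column = cmod // 2
    return [2 * column, 2 * column + 1]


def fan_frame(fm: int) -> str:
    """Return which fan frame holds the fan module: FM 0-3 left, FM 4-7 right."""
    return "left" if cooling_column(fm) < 2 else "right"


if __name__ == "__main__":
    for cmod in range(8):
        print(f"CMOD {cmod}: cooled by FMs {fms_for_cmod(cmod)}")
```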
Power Supply Slot Designations
The four slots for the power supply units (PSUs) are directly accessible at the front of the server
and are arranged in a single stacked row. They are designated from the bottom to the top as
PSU 0, PSU 1, PSU 2, and PSU 3. The following illustration shows the arrangement of the
PSUs.
| Call Out | Description |
|---|---|
| 0 | PSU 0 |
| 1 | PSU 1 |
| 2 | PSU 2 |
| 3 | PSU 3 |
CMOD Slot Designations
CPU module slots are arranged in a single row and are designated from left to right as CMOD
0–CMOD 7. The CMOD slots are accessible from the front of the server by removing the FMs
and frames.
The server is available with four CMODs or eight CMODs. Four-CMOD systems have CMODs
in CMOD 0–CMOD 3, and filler panels in CMOD 4–CMOD 7.
For more information, see “CPU Module (CMOD) Overview” on page 23.
| Call Out | Description |
|---|---|
| 0 | CMOD 0 |
| 1 | CMOD 1 |
| 2 | CMOD 2 |
| 3 | CMOD 3 |
| 4 | CMOD 4 |
| 5 | CMOD 5 |
| 6 | CMOD 6 |
| 7 | CMOD 7 |
Memory Slot Designations
Each CMOD contains 24 DIMM slots arranged in four groups of six slots. The following
illustration shows the groups and their slot designations.
| Call Out | Description |
|---|---|
| 1 | Slots D12–D17 |
| 2 | Slots D18–D23 |
| 3 | Slots D6–D11 |
| 4 | Slots D0–D5 |
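To find which illustrated group a given DIMM slot belongs to, the call-out ranges above can be expressed as a lookup. This Python sketch is an illustration only; the names DIMM_GROUPS and callout_for_slot are assumptions.

```python
# Call-out number -> DIMM slot numbers, copied from the table above.
DIMM_GROUPS = {
    1: range(12, 18),  # D12-D17
    2: range(18, 24),  # D18-D23
    3: range(6, 12),   # D6-D11
    4: range(0, 6),    # D0-D5
}


def callout_for_slot(slot: int) -> int:
    """Return the call-out number of the group that contains DIMM slot D<slot>."""
    for callout, slots in DIMM_GROUPS.items():
        if slot in slots:
            return callout
    raise ValueError("each CMOD has DIMM slots D0 through D23")


if __name__ == "__main__":
    for slot in (0, 6, 12, 23):
        print(f"D{slot} is in group {callout_for_slot(slot)}")
```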
See Also
■ “Memory and DIMM Reference” on page 158
Storage Drive Slot Designations
The eight storage drive slots are in the system module (SMOD) and directly accessible at the
back of the server. Slots are arranged in two stacked rows of four slots and designated from
right to left.
■ The top row contains slots 0, 2, 4, and 6.
■ The bottom row contains slots 1, 3, 5, and 7.
| Call Out | Description |
|---|---|
| 0 | Slot 0 |
| 1 | Slot 1 |
| 2 | Slot 2 |
| 3 | Slot 3 |
| 4 | Slot 4 |
| 5 | Slot 5 |
| 6 | Slot 6 |
| 7 | Slot 7 |
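The storage drive slot layout described in this section (even slots in the top row, odd slots in the bottom row, numbered from right to left) can be sketched as a small function. The function name and the zero-based column index are illustrative assumptions.

```python
def slot_position(slot: int) -> tuple:
    """Return (row, column) for a storage drive slot, viewed from the back.

    The column index counts from the right-hand side, starting at 0.
    """
    if not 0 <= slot <= 7:
        raise ValueError("storage drive slots are numbered 0 through 7")
    row = "top" if slot % 2 == 0 else "bottom"
    return (row, slot // 2)


if __name__ == "__main__":
    for slot in range(8):
        row, column = slot_position(slot)
        print(f"Slot {slot}: {row} row, column {column} from the right")
```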
DPCC and PCIe Card Slot Designations
The eight dual PCIe card carrier (DPCC) slots are arranged in a single row at the back of the
server. The slots are designated from right to left as DPCC 0–DPCC 7.
Each DPCC supports two PCIe slots, for a total of 16. The PCIe slots are designated from right
to left as PCIe 1–PCIe 16.
■ DPCC 0 contains PCIe slots 1 and 2
■ DPCC 1 contains PCIe slots 3 and 4
■ DPCC 2 contains PCIe slots 5 and 6
■ DPCC 3 contains PCIe slots 7 and 8
■ DPCC 4 contains PCIe slots 9 and 10
■ DPCC 5 contains PCIe slots 11 and 12
■ DPCC 6 contains PCIe slots 13 and 14
■ DPCC 7 contains PCIe slots 15 and 16
The following illustration shows the location and designations of the PCIe slots.
| Call Out | Description | Call Out | Description |
|---|---|---|---|
| 1 | PCIe Slot 1 in DPCC 0 | 9 | PCIe Slot 9 in DPCC 4 |
| 2 | PCIe Slot 2 in DPCC 0 | 10 | PCIe Slot 10 in DPCC 4 |
| 3 | PCIe Slot 3 in DPCC 1 | 11 | PCIe Slot 11 in DPCC 5 |
| 4 | PCIe Slot 4 in DPCC 1 | 12 | PCIe Slot 12 in DPCC 5 |
| 5 | PCIe Slot 5 in DPCC 2 | 13 | PCIe Slot 13 in DPCC 6 |
| 6 | PCIe Slot 6 in DPCC 2 | 14 | PCIe Slot 14 in DPCC 6 |
| 7 | PCIe Slot 7 in DPCC 3 | 15 | PCIe Slot 15 in DPCC 7 |
| 8 | PCIe Slot 8 in DPCC 3 | 16 | PCIe Slot 16 in DPCC 7 |
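The DPCC-to-PCIe slot assignments listed above follow a fixed rule: DPCC n holds PCIe slots 2n + 1 and 2n + 2. The following Python sketch expresses that rule in both directions; the function names are illustrative assumptions.

```python
def pcie_slots_in_dpcc(dpcc: int) -> tuple:
    """Return the two PCIe slot numbers carried by the given DPCC (0-7)."""
    if not 0 <= dpcc <= 7:
        raise ValueError("DPCC slots are numbered DPCC 0 through DPCC 7")
    return (2 * dpcc + 1, 2 * dpcc + 2)


def dpcc_for_pcie_slot(slot: int) -> int:
    """Return the DPCC number that carries the given PCIe slot (1-16)."""
    if not 1 <= slot <= 16:
        raise ValueError("PCIe slots are numbered PCIe 1 through PCIe 16")
    return (slot - 1) // 2


if __name__ == "__main__":
    for dpcc in range(8):
        first, second = pcie_slots_in_dpcc(dpcc)
        print(f"DPCC {dpcc} contains PCIe slots {first} and {second}")
```

Because a PCIe card is serviced as part of its DPCC, the reverse lookup identifies which carrier to remove for a given slot.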
AC Input Power Block
The four AC power inputs at the back of the server are arranged in a stack. Starting at the
bottom, they are designated AC 0, AC 1, AC 2, and AC 3. The designations match the
corresponding PSUs.
The AC power block is not a removable component.
The following illustration shows the location and designation of the inlets on the AC power
block.
Component Network Access Control (NAC) Names

| Name | Description |
|---|---|
| /SYS/CMOD[0-7] | CPU modules (dynamic FRUID) |
| /SYS/CMOD[0-7]/P[0-7] | Processors (CPUs) on CMOD (static FRUID) |
| /SYS/CMOD[0-7]/P[0-7]/P[0-23] | DIMMs on CMOD MB (dynamic FRUID) |
| /SYS/CMOD[0-7]/CPLD | CPLDs on CMODs |
| /SYS/BIOS | System BIOS |
| /SYS/DPCC[0-7] | Dual PCIe card carriers (DPCCs) |
| /SYS/DPCC[0-7]/PCIE[1-16] | PCIe cards |
| /SYS/FIM | Front indicator module |
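The bracketed ranges in the names above, such as [0-7], are shorthand for a series of numbered targets. The following Python sketch expands such a pattern into individual names. It is a purely textual expansion written for illustration (expand_nac is a hypothetical helper, not an Oracle ILOM function), so it produces every combination of the ranges, including combinations that might not exist on a given server.

```python
import itertools
import re

# Matches a bracketed numeric range such as "[0-7]" or "[1-16]".
_RANGE = re.compile(r"\[(\d+)-(\d+)\]")


def expand_nac(pattern: str) -> list:
    """Expand every [lo-hi] range in a NAC name pattern into concrete names."""
    # With two capture groups, split() yields: text, lo, hi, text, lo, hi, ...
    parts = _RANGE.split(pattern)
    literals = parts[0::3]
    ranges = [range(int(lo), int(hi) + 1)
              for lo, hi in zip(parts[1::3], parts[2::3])]
    names = []
    for combo in itertools.product(*ranges):
        name = literals[0]
        for value, literal in zip(combo, literals[1:]):
            name += str(value) + literal
        names.append(name)
    return names


if __name__ == "__main__":
    print(expand_nac("/SYS/FIM"))
    print(expand_nac("/SYS/CMOD[0-7]/CPLD"))
```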
Performing Electrostatic Discharge and Static Prevention Measures
Electrostatic discharge (ESD) sensitive devices, such as the PCIe cards, hard drives, CPUs, and
memory cards, require special handling.
Using an Anti-static Wrist Strap
Wear an anti-static wrist strap when handling components such as disk drive assemblies, circuit
boards, or PCIe cards. When servicing or removing server components, attach an anti-static
strap to your wrist and then to a metal area on the server chassis. If your wrist strap is equipped
with a banana connector, insert it into the grounding socket on the right-hand side of the chassis
front panel.
Following this practice equalizes the electrical potentials between you and the server.
Note - An anti-static wrist strap is not shipped with the servers. However, anti-static wrist straps
are included with customer-replaceable units (CRUs), field-replaceable units (FRUs), and
optional components.
Using an Anti-static Mat
In addition to wearing an anti-static wrist strap when handling components, create an ESD-free
work place by using an anti-static mat as a work surface and as a place to set ESD-sensitive
components such as printed circuit boards, DIMMs, and CPUs. You can use the following items
as anti-static mats:
■ Anti-static bag used to wrap a replacement part
■ ESD mat (orderable from Oracle)
■ A disposable ESD mat (shipped with some optional system components)
Tools and Equipment
Most server component removal and installation procedures can be performed without tools.
However, to service the system, you need the following:
■ ESD mat and grounding strap
■ Anti-static wrist strap
You might also need:
■ No. 2 Phillips screwdriver
■ A system console device, such as one of the following:
  ■ PC or workstation with RS-232 serial port
  ■ ASCII terminal
  ■ Terminal server
  ■ Patch panel connected to a terminal server
Component Filler Panels and Non-Powered Components
A filler panel is a metal or plastic enclosure that does not contain any functioning system
hardware or cable connectors. Filler panels occupy vacant component slots to help control
noise, EMI, and airflow. They are installed at the factory and must remain in the server until
you replace them with a component. If you remove a filler panel and continue to operate your
system with an empty slot, the server might overheat due to improper airflow. Additionally,
some components are installed but are not powered (for example, DPCCs and fan modules).
As with filler panels, these components must remain installed in a fully powered-on server.
Clear Hardware Fault Messages
After servicing the following components, you must clear the fault event in Oracle ILOM:
■ PCIe card
■ HBA
■ Front Indicator Module (FIM)
■ Processor (CPU)
Use the Oracle ILOM CLI to access the Fault Management shell, fmadm. For details, see
http://www.oracle.com/goto/ILOM/docs.
Before You Begin
■ This procedure uses the Oracle ILOM CLI interface.
1. Open an SSH session and at the command line log in to the SP Oracle ILOM CLI.
   Log in as a user with root or administrator privileges. For example:
   ssh root@ipaddress
   where ipaddress is the IP address of the server SP.
   For more information, see “Accessing Oracle ILOM” in Oracle X5 Series Servers
   Administration Guide.
   The Oracle ILOM CLI prompt appears:
   ->
2. To access fmadm, type:
   start /SP/faultmgmt/shell
   The fmadm prompt appears:
   faultmgmtsp>
3. To get a listing of command options for displaying or clearing a fault with
   fmadm, type:
   help fmadm
   The following output appears:
   where <subcommand> is one of the following:
   faulty [-asv] [-u <uuid>] : display list of faulty resources
   faulty -f [-a] : display faulty FRUs
   faulty -r [-a] : display faulty FRUs (summary)
   acquit <FRU> : acquit faults on a FRU
   acquit <UUID> : acquit faults associated with UUID
   acquit <FRU> <UUID> : acquit faults specified by (FRU, UUID) combination
   replaced <FRU> : replaced faults on a FRU
   repaired <FRU> : repaired faults on a FRU
   repair <FRU> : repair faults on a FRU
   rotate errlog : rotate error log
   rotate fltlog : rotate fault log
4. Use fmadm faulty and the following options to display active faulty components:
   ■ -a – Show active faulty components.
   ■ -f – Show active faulty FRUs.
   ■ -r – Show active faulty FRUs and their fault management states.
   ■ -s – Show a one-line fault summary for each fault event.
   ■ -u uuid – Show fault diagnosis events that match a specific universal unique identifier
     (uuid).
   For command specifics, see the Oracle ILOM documentation at:
   http://www.oracle.com/goto/ILOM/docs
5. Use fmadm to clear the fault.
   Clear the fault using the acquit, repair, replaced, or repaired subcommand, as
   appropriate.
6. Close the Oracle ILOM session.
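The fmadm command lines used in steps 4 and 5 can be composed from the subcommands and options listed in the help output. The following Python sketch only builds the command strings; it does not connect to the SP or run anything. The function names and the example target /SYS/FIM are illustrative assumptions, and the sketch does not check which option combinations the fault management shell accepts.

```python
# Options and subcommands taken from the "help fmadm" listing in step 3.
FAULTY_FLAGS = ("-a", "-f", "-r", "-s", "-v")
CLEAR_SUBCOMMANDS = ("acquit", "repair", "repaired", "replaced")


def faulty_command(*flags: str, uuid: str = None) -> str:
    """Build an 'fmadm faulty' command line for displaying faults."""
    for flag in flags:
        if flag not in FAULTY_FLAGS:
            raise ValueError(f"unsupported option for fmadm faulty: {flag!r}")
    parts = ["fmadm", "faulty", *flags]
    if uuid is not None:
        parts += ["-u", uuid]
    return " ".join(parts)


def clear_command(subcommand: str, target: str) -> str:
    """Build an fmadm command line that clears a fault on a FRU or UUID."""
    if subcommand not in CLEAR_SUBCOMMANDS:
        raise ValueError(f"unsupported subcommand: {subcommand!r}")
    return f"fmadm {subcommand} {target}"


if __name__ == "__main__":
    print(faulty_command("-a"))
    # /SYS/FIM is used here only as a hypothetical FRU target.
    print(clear_command("acquit", "/SYS/FIM"))
```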
Preparing for Service
This section includes preliminary information and procedures that assist you with preparing to
service the server. The following table describes the contents of this section.
| Section Description | Link |
|---|---|
| Setting up for hot service. | “Prepare the Server for Hot Service” on page 95 |
| Setting up for warm service. | “Prepare the Server for Warm Service” on page 96 |
| Setting up for cold service. | “Prepare the Server for Cold Service” on page 98 |
| Server power-off options. | “Powering Off the Server” on page 100 |
| Methods for activating and deactivating the server Locator indicator. | “Managing the Locator Indicator” on page 108 |

Prepare the Server for Hot Service
Note - The steps in this remote procedure use the Oracle ILOM web interface. However, the
procedure can also be performed remotely using the Oracle ILOM CLI interface. For more
information, refer to the Oracle ILOM documentation.
A hot-service component can be serviced while the server is operating at full-power mode.
For more information about component serviceability, see “Component Serviceability
Requirements” on page 79.
This procedure describes how to prepare the server to remove, replace, or install the following
hot-service components:
■ Fan modules
■ Power supplies
■ Storage drives
■ Dual PCIe Card Carriers (DPCCs)
Before You Begin
■ Important: Review the Oracle Server X5-8 Product Notes for hardware-related information
  before performing removal and installation procedures.
1. Log in to the service processor Oracle ILOM web interface.
   Direct a web browser to Oracle ILOM using the IP address of the server SP and log in as root or
   as a user with administrator privileges. See Oracle X5 Series Servers Administration Guide.
   The Summary screen appears.
2. In the Actions section of the Summary screen, click the Locator Indicator Turn
   On button.
   This action activates the Locator indicator on the server front panel. For other options, see
   “Managing the Locator Indicator” on page 108.
3. Once at the service location, press the Locator button to deactivate the indicator.
   For more information, see “Control the Locator Indicator Locally” on page 110.
4. Set up an ESD-safe space at the service location.
   Set up a space where you can set components. The space needs to be ESD safe. See
   “Performing Electrostatic Discharge and Static Prevention Measures” on page 90.
Next Steps
■ “Servicing Fan Modules and Fan Frames” on page 117
■ “Servicing Power Supply Units (PSUs)” on page 126
■ “Servicing Storage Drives” on page 180
■ “Servicing PCIe Cards and the Dual PCIe Card Carriers (DPCCs)” on page 185
Prepare the Server for Warm Service
This procedure describes how to prepare the server for warm service, so you can remove and
replace CMODs, DIMMs, and processors and heatsinks without disconnecting the power cords
or shutting down Oracle ILOM.
When Oracle ILOM detects that two fan modules in a single cooling zone (a vertical column)
have been removed, it removes power from the CMODs, allowing you to service CMODs and
their subcomponents without removing the power cords. Oracle ILOM remains available in
warm service mode.
This procedure uses a combination of the Oracle ILOM web and CLI interfaces. However, the
procedure can be performed using only the Oracle ILOM CLI interface (for more information,
refer to the Oracle ILOM documentation).
For more information about component serviceability, see “Component Serviceability
Requirements” on page 79.
Caution - Loss of service or component damage. Do not replace any components except for
CMODs and their subcomponents while the server is in warm service mode.
Caution - Data Loss. Do not remove more than one fan module from a column while the system
is in full power mode. This action removes power from the CMODs and causes an immediate
shutdown. On an eight-CMOD system, this applies to all fan modules. On a four-CMOD
system, this applies to the fan modules in the left-hand fan frame.
Before You Begin
■
Important: Review the Oracle Server X5-8 Product Notes for hardware-related information
before performing removal and installation procedures.
1.
To power down the host and activate the front panel Locator indicator, do the
following:
a.
Log in to the Oracle ILOM web interface.
Direct a web browser to Oracle ILOM using the IP address of the server SP and log in as
root or as a user with administrator privileges. See “Accessing Oracle ILOM” in Oracle X5
Series Servers Administration Guide.
b.
In the Actions section of the Summary screen, click the Power State Turn Off
button.
This action powers off the server to standby power mode. For more power-off options, see
“Powering Off the Server” on page 100.
c.
In the Actions section of the Summary screen, click the Locator Indicator
Turn On button.
This action activates the Locator indicator on the server front and back panel. For other
options, see “Managing the Locator Indicator” on page 108.
2.
When at the server, set up an ESD-safe service space where you can place
components.
See “Performing Electrostatic Discharge and Static Prevention Measures” on page 90.
3.
Press the Locator indicator button to deactivate the indicator. For more
information, see “Control the Locator Indicator Locally” on page 110.
4.
Begin the CMOD removal procedures. For details, see “Servicing the CPU
Module (CMOD) Components” on page 136.
The server transitions to warm service mode by removing power from the CMODs when it
senses one of the following events:
■
On an eight-CMOD system, when both fans in a single column are removed.
■
On a four-CMOD system, when both fans in a single column are removed from the left-hand
fan frame (CMODs 0 through 3), or when a CMOD is inserted into an unoccupied
CMOD slot (4 through 7).
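The transition rule above can be sketched as a small decision function. This is an illustrative model only, not Oracle ILOM code: the names FanColumn and enters_warm_service, the "left"/"right" frame labels, and the way fan presence is represented are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class FanColumn:
    """One cooling zone (a vertical column of two fan modules). Hypothetical model."""
    frame: str           # "left" or "right" fan frame (assumed labels)
    installed_fans: int  # fan modules currently present in the column: 0, 1, or 2


def enters_warm_service(cmod_count: int,
                        columns: Iterable[FanColumn],
                        cmod_inserted_in_slot: Optional[int] = None) -> bool:
    """Return True if the described conditions would remove power from the CMODs."""
    if cmod_count not in (4, 8):
        raise ValueError("this sketch only covers four- and eight-CMOD systems")

    columns = list(columns)
    if cmod_count == 8:
        # Eight-CMOD system: any column counts.
        monitored = columns
    else:
        # Four-CMOD system: only columns in the left-hand fan frame count.
        monitored = [c for c in columns if c.frame == "left"]

    # Both fans removed from a single monitored column.
    if any(c.installed_fans == 0 for c in monitored):
        return True

    # Four-CMOD system: a CMOD inserted into a previously unoccupied slot 4 through 7.
    if cmod_count == 4 and cmod_inserted_in_slot is not None:
        return 4 <= cmod_inserted_in_slot <= 7

    return False
```

For example, under these assumptions a four-CMOD system with both fans pulled from a right-hand column stays powered, while the same removal on an eight-CMOD system triggers the transition.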
Next Steps
■
“Servicing the Server” on page 79
■
“Servicing Components” on page 113
Prepare the Server for Cold Service
Note - This procedure uses a combination of the Oracle ILOM web and CLI interfaces.
However, the procedure can be performed using only the Oracle ILOM CLI interface (for more
information, refer to the Oracle ILOM documentation).
A cold-service component must be serviced when the server is completely powered off.
For more information about component serviceability, see “Component Serviceability
Requirements” on page 79.
This procedure describes how to prepare the server for service, so you can:
■
Remove, replace, or install cold-serviceable components.
■
Use the motherboard processor and DIMM fault remind circuitry.
■
Access internal components, such as the internal USB drives.
Before You Begin
■
Important: Review the Oracle Server X5-8 Product Notes for hardware-related information
before performing removal and installation procedures.
1.
To power down the server and activate the front panel Locator indicator, do the
following:
a.
Log in to the Oracle ILOM web interface.
Direct a web browser to Oracle ILOM using the IP address of the server SP and log in as
root or as a user with administrator privileges. See “Accessing Oracle ILOM” in Oracle X5
Series Servers Administration Guide.
b.
In the Actions section of the Summary screen, click the Power State Turn Off
button.
This action powers off the server to standby power mode. For more power-off options, see
“Powering Off the Server” on page 100.
c.
In the Actions section of the Summary screen, click the Locator Indicator
Turn On button.
This action activates the Locator indicator on the server front and back panel. For other
options, see “Managing the Locator Indicator” on page 108.
2.
When at the server, set up an ESD-safe service space.
Set up a space where you can place components. The space needs to be ESD safe. See
“Performing Electrostatic Discharge and Static Prevention Measures” on page 90.
3.
Disconnect the server power cords.
Caution - Data loss. Removing the power cords while the server is in full power mode causes
an immediate shutdown of the server. Do not remove the power cords while the server is in full
power mode. Power off the server to standby power mode first.
4.
If necessary, label and disconnect any other cables attached to the server back
panel.
If you plan to remove a component that has cables attached to it (SMOD, DPCC), label the port
or slot to which the cable is attached and remove the cable.
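The preconditions for disconnecting the power cords in this procedure can be expressed as a short pre-flight check. This is a minimal sketch for illustration, not part of any Oracle tool: PowerMode, cold_service_blockers, and the esd_space_ready flag are hypothetical names, and how the power mode is actually read from the server is left out.

```python
from enum import Enum
from typing import List


class PowerMode(Enum):
    """Power modes named in this procedure (hypothetical representation)."""
    FULL = "full power"
    STANDBY = "standby power"


def cold_service_blockers(power_mode: PowerMode, esd_space_ready: bool) -> List[str]:
    """List what must still be done before the power cords are disconnected."""
    blockers = []
    if power_mode is PowerMode.FULL:
        # Pulling the cords in full power mode causes an immediate shutdown.
        blockers.append("power off the server to standby power mode")
    if not esd_space_ready:
        blockers.append("set up an ESD-safe service space")
    return blockers
```

An empty list means the checks modeled here have passed; it does not replace the cautions in this procedure.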
Next Steps
■
“Servicing the Server” on page 79
■
“Servicing Components” on page 113
Powering Off the Server
This section contains information and procedures related to power modes and power off
options, including complete power removal:
■
“Power Off the Server Using the Server OS” on page 101