This software and related documentation are provided under a license agreement containing restrictions on use and disclosure and are protected by intellectual property laws. Except
as expressly permitted in your license agreement or allowed by law, you may not use, copy, reproduce, translate, broadcast, modify, license, transmit, distribute, exhibit, perform,
publish, or display any part, in any form, or by any means. Reverse engineering, disassembly, or decompilation of this software, unless required by law for interoperability, is
prohibited.
The information contained herein is subject to change without notice and is not warranted to be error-free. If you find any errors, please report them to us in writing.
If this is software or related documentation that is delivered to the U.S. Government or anyone licensing it on behalf of the U.S. Government, then the following notice is applicable:
U.S. GOVERNMENT END USERS: Oracle programs, including any operating system, integrated software, any programs installed on the hardware, and/or documentation,
delivered to U.S. Government end users are "commercial computer software" pursuant to the applicable Federal Acquisition Regulation and agency-specific supplemental
regulations. As such, use, duplication, disclosure, modification, and adaptation of the programs, including any operating system, integrated software, any programs installed on the
hardware, and/or documentation, shall be subject to license terms and license restrictions applicable to the programs. No other rights are granted to the U.S. Government.
This software or hardware is developed for general use in a variety of information management applications. It is not developed or intended for use in any inherently dangerous
applications, including applications that may create a risk of personal injury. If you use this software or hardware in dangerous applications, then you shall be responsible to take all
appropriate fail-safe, backup, redundancy, and other measures to ensure its safe use. Oracle Corporation and its affiliates disclaim any liability for any damages caused by use of this
software or hardware in dangerous applications.
Oracle and Java are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of their respective owners.
Intel and Intel Xeon are trademarks or registered trademarks of Intel Corporation. All SPARC trademarks are used under license and are trademarks or registered trademarks of
SPARC International, Inc. AMD, Opteron, the AMD logo, and the AMD Opteron logo are trademarks or registered trademarks of Advanced Micro Devices. UNIX is a registered
trademark of The Open Group.
This software or hardware and documentation may provide access to or information about content, products, and services from third parties. Oracle Corporation and its affiliates are
not responsible for and expressly disclaim all warranties of any kind with respect to third-party content, products, and services unless otherwise set forth in an applicable agreement
between you and Oracle. Oracle Corporation and its affiliates will not be responsible for any loss, costs, or damages incurred due to your access to or use of third-party content,
products, or services, except as set forth in an applicable agreement between you and Oracle.
Access to Oracle Support
Oracle customers that have purchased support have access to electronic support through My Oracle Support. For information, visit
http://www.oracle.com/pls/topic/lookup?ctx=acc&id=info or visit http://www.oracle.com/pls/topic/lookup?ctx=acc&id=trs if you are hearing impaired.
Oracle Server X7-8 Service Manual • April 2018
Contents
Using This Documentation ..... 13
BIOS IO Menu IO Virtualization Options ..... 328
BIOS IO Menu IOAT Configuration Options ..... 329
BIOS IO Menu Internal Devices Options ..... 329
BIOS IO Menu Add-in Cards Options ..... 330
BIOS Boot Menu Selections ..... 330
BIOS Save and Exit Menu Selections ..... 332
Monitoring and Identifying Server Components ..... 335
Monitoring Component Health and Faults Using Oracle ILOM ..... 335
Monitoring System Components ..... 336
System Components (FRUs) Network Access Control (NAC) Names ..... 337
System Indicators Network Access Control (NAC) Names ..... 338
System Sensors Network Access Control (NAC) Names ..... 341
Index ..... 345
Using This Documentation
■ Overview – Describes how to troubleshoot and maintain the Oracle Server X7-8.
■ Audience – Technicians, system administrators, authorized service providers, and trained
hardware service personnel who have been instructed on the hazards within the equipment
and are qualified to remove and replace hardware.
■ Required knowledge – Advanced experience troubleshooting and replacing hardware.
Product Documentation Library
Documentation and resources for this product and related products are available at
https://www.oracle.com/goto/x7-8/docs.
Feedback
Provide feedback about this documentation at https://www.oracle.com/goto/docfeedback.
About the Oracle Server X7-8
These sections list the server's replaceable components and describe the controls, connectors,
and status indicator LEDs.
■ “Product Description” on page 15
■ “About Controls and Connectors” on page 17
■ “Replaceable Components” on page 22
■ “CPU Module (CMOD) Overview” on page 26
■ “System Module (SMOD) Overview” on page 32
■ “Server Chassis Overview” on page 37
Note - Always update the server with the latest firmware, drivers, and other hardware-related
software by downloading the latest software release package when you first receive the server,
and for every new software release. For information about the software release packages and
how to download the software, refer to “Getting Firmware and Software Updates” in Oracle
Server X7-8 Installation Guide.
Product Description
Oracle Server X7-8 is an enterprise-class 5 rack unit (5U) server. You can configure the server
as one 4-socket server, two independent 4-socket servers, or one 8-socket server.
The system supports the following components.
Feature | System Component
Processors (CPU module/CMOD) | Up to eight Intel Xeon vx processors (shelf 4) are supported.
■ Oracle Integrated Lights Out Management (ILOM) 4.0 with command-line access using a serial
connection (SER MGT port)
■ Support for Ethernet access to the SP through a dedicated management port (NET MGT port) and
optionally through one of the host Ethernet ports (sideband management).
Management Software
■ Oracle Integrated Lights Out Management (ILOM). Refer to the Oracle ILOM 4.0
Documentation Library at https://www.oracle.com/goto/ilom/docs
■ Oracle Hardware Management Pack, available with the Oracle Solaris OS or as a standalone
product with other OS. Refer to the support matrix for specific information:
https://www.oracle.com/goto/ohmp/docs
■ Oracle Enterprise Manager Ops Center, available software to manage multiple systems in a data
center. Refer to the product information page at:
http://www.oracle.com/technetwork/oem/ops-center/index.html
Operating Systems and Virtualization Software
■ Oracle Solaris 11.3 SRU 23
■ Oracle Linux 7.3 and 6.9
■ Oracle VM 3.4.4
■ Microsoft Windows Server 2016 and 2012 R2
■ VMware ESXi 6.5
For updated operating system information, refer to Oracle Server X7-8 Product Notes at:
https://www.oracle.com/goto/x7-8/docs
For server specification information, see “Server Features and Components” in Oracle Server
X7-8 Installation Guide.
For component serviceability, locations, and designations, see “Replaceable
Components” on page 22.
About Controls and Connectors
The following sections describe the controls, indicators, connectors, and drives located on the
Oracle Server X7-8 front and back panels.
■ “Front Panel Components” on page 17
■ “Back Panel Components” on page 19
Front Panel Components
The following figure shows the Oracle Server X7-8 chassis front panel components:
Callout | Component | Link
1 | Front indicator module (FIM). The FIM provides separate controls and system status
LED indicators for System A and System B while in 4-socket mode.
The SMOD1 System B serial management port service processor labeled
SER MGT port uses an RJ-45 cable and terminal (or emulator) to provide
access to the Oracle ILOM command-line interface (CLI).
4 | SMOD1 System B USB. The SMOD1 System B USB 3.0 port supports hot-plugging of devices.
“System Module (SMOD) Overview” on page 32
“System Module (SMOD) Indicators” on page 63
“Servicing System Module (SMOD) Components (FRU)” on page 198
“System Module (SMOD) Overview” on page 32
“Networking Subsystem” on page 88
“Servicing System Module (SMOD) Components (FRU)” on page 198
“System Module (SMOD) Overview” on page 32
“Servicing System Module (SMOD) Components (FRU)” on page 198
“System Module (SMOD) Overview” on page 32
5 | SMOD1 System B Status indicators:
■ Fault-Service Required: Amber
■ SMOD1 System B OK: Green
■ HBA Host Bus Adapter Fault: Amber
6 | SMOD1 System B NET0-3. Four 10 GbE network ports labeled NET0, NET1, NET2, and NET3.
7 | Four server storage drives labeled 0-3 (SAS HDD/SSD)
“Servicing System Module (SMOD) Components (FRU)” on page 198
“System Module (SMOD) Overview” on page 32
“System Module (SMOD) Indicators” on page 63
“Servicing System Module (SMOD) Components (FRU)” on page 198
“System Module (SMOD) Overview” on page 32
“Servicing System Module (SMOD) Components (FRU)” on page 198
“Server Storage Drives” on page 36
■ SMOD0 System A: Storage drives 0 through 3: Top row: 3, 1; Bottom row: 2, 0
■ SMOD1 System B: Storage drives 0 through 3: Top row: 3, 1; Bottom row: 2, 0
Serial management port labeled SER MGT uses an RJ-45 cable and terminal
(or emulator) to provide access to the Oracle ILOM command-line interface
(CLI).
11 | SMOD0 System A USB. USB 3.0 port supports hot-plugging of devices.
12 | SMOD0 System A Status indicators:
■ SMOD0 System A Fault-Service Required: Amber
■ SMOD0 System A OK: Green
■ SMOD0 System A HBA Host Bus Adapter Fault: Amber
“Storage Drive Indicators” on page 67
“Servicing Storage Drives (CRU)” on page 123
“System Module (SMOD) Overview” on page 32
“System Module (SMOD) Indicators” on page 63
“Servicing System Module (SMOD) Components (FRU)” on page 198
“System Module (SMOD) Overview” on page 32
“Networking Subsystem” on page 88
“Servicing System Module (SMOD) Components (FRU)” on page 198
“System Module (SMOD) Overview” on page 32
“Servicing System Module (SMOD) Components (FRU)” on page 198
“System Module (SMOD) Overview” on page 32
“System Module (SMOD) Overview” on page 32
“System Module (SMOD) Indicators” on page 63
“Servicing System Module (SMOD) Components (FRU)” on page 198
13 | SMOD0 System A NET0-3. Four 10 GbE network ports labeled NET0, NET1, NET2, and NET3. | “System Module (SMOD) Overview” on page 32; “Networking Subsystem” on page 88; “Servicing System Module (SMOD) Components (FRU)” on page 198
14 | PCIe card slots 1 (right) through 16 (left). Dual PCIe card carriers (DPCC) contain one or two PCIe cards. One DPCC populates two PCIe card slots. | “PCIe Card and DPCC Overview” on page 147; “PCI Devices Subsystem” on page 89; “Dual PCIe Card Carrier (DPCC) Indicators” on page 68
SMOD0 System A: AC 3 (top), AC 2; SMOD1 System B: AC 1, AC 0 (bottom) | “Chassis Back Panel Components” on page 41; “Power Subsystem” on page 82; “AC Power Block Inlet Indicators” on page 69
Note - The server does not provide video display ports on the SMODs. Use Oracle ILOM
RKVMS to display video.
Replaceable Components
These sections describe the components of the server and provide information about identifying
and servicing replaceable components.
■ “Illustrated Parts Breakdown” on page 22
■ “Component Serviceability Requirements” on page 24
■ “Customer-Replaceable Units” on page 24
■ “Field-Replaceable Units” on page 25
Illustrated Parts Breakdown
The following illustration identifies the major components of the server.
Callout | Description | Links
1 | Storage drives (8), back panel accessible | “Servicing Storage Drives (CRU)” on page 123
2 | System Module 0 (SMOD0), back panel accessible | “Servicing System Module (SMOD) Components (FRU)” on page 198
3 | System Module 1 (SMOD1), back panel accessible | “Servicing System Module (SMOD) Components (FRU)” on page 198
4 | Dual PCIe carrier card (DPCC) with PCIe card (8) | “Servicing PCIe Cards and Carriers (CRU)” on page 146; “PCI Devices Subsystem” on page 89
5 | PCIe cards (PCIe slots 5, 6, 7, and 8 are nonfunctional in 4-socket systems.) | “Servicing PCIe Cards and Carriers (CRU)” on page 146
6 | AC power block. The AC power block is not a removable component. | “Servicing Power Supplies (CRU)” on page 139; “Power Subsystem” on page 82
7 | Top cover | “Preparing for Service” on page 97
8 | Midplane | “Servicing the Midplane Assembly (FRU)” on page 230
9 | Processor, front panel accessible | “Servicing CPU Module (CMOD) Components (FRU)” on page 159
10 | Memory, front panel accessible | “Servicing CPU Module (CMOD) Components (FRU)” on page 159
11 | Fan frame (2), front panel accessible | “Servicing Fan Modules (CRU) and Fan Frames (CRU)” on page 129
12 | Fan module (FM) (8), front panel accessible | “Servicing Fan Modules (CRU) and Fan Frames (CRU)” on page 129
13 | CPU module (CMOD) (4 or 8), front panel accessible | “Servicing CPU Module (CMOD) Components (FRU)” on page 159
14 | Server chassis. The server chassis is not a removable component. | “Server Chassis Overview” on page 37
15 | Front indicator module (FIM), front panel accessible | “Servicing the Front Indicator Module (FRU)” on page 226
16 | Power supply (PSU) (4), front panel accessible | “Servicing Power Supplies (CRU)” on page 139
Component Serviceability Requirements
System components are hot-serviceable, warm-serviceable, or cold-serviceable components,
and are customer-replaceable units (CRUs) or field-replaceable units (FRUs). See “Preparing
the Server for Component Replacement” on page 100.
■ Hot-service components can be serviced while the server chassis is powered on and running
in Main power mode.
Hot-serviceable components are also warm-serviceable and cold-serviceable.
■ Warm-service components can be serviced while the server chassis is in Standby power
mode. You can remove and replace CMODs, and CMOD internal components such as
DIMMs, processors, and heatsinks, without disconnecting the back panel power cords or
shutting down Oracle ILOM.
Warm-serviceable components are also cold-serviceable.
■ Cold-service components must be serviced when the server chassis is completely powered
off and all four AC power cords are disconnected from the server back panel AC power
block.
A CRU or FRU designation determines who is qualified to service a component.
■ Customer-replaceable units (CRUs) can be serviced by customers.
■ Field-replaceable units (FRUs) must be serviced by qualified Oracle Service personnel.
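The way the three service levels nest can be expressed as a simple ordered comparison. The following Python sketch is illustrative only: the names PowerState, ServiceLevel, and can_service are hypothetical and do not correspond to any Oracle ILOM command or Oracle tool.

```python
from enum import IntEnum


class PowerState(IntEnum):
    """Chassis power states, ordered from least to most powered."""
    OFF = 0      # completely powered off, all four AC power cords disconnected
    STANDBY = 1  # Standby power mode
    MAIN = 2     # powered on and running in Main power mode


class ServiceLevel(IntEnum):
    """Component service levels, ordered to line up with PowerState."""
    COLD = 0  # serviceable only when the chassis is OFF
    WARM = 1  # serviceable in STANDBY (and therefore also when OFF)
    HOT = 2   # serviceable in MAIN (and therefore also in STANDBY or OFF)


def can_service(level: ServiceLevel, state: PowerState) -> bool:
    """Return True if a component with this service level may be serviced
    while the chassis is in the given power state."""
    return int(level) >= int(state)


if __name__ == "__main__":
    for level in ServiceLevel:
        allowed = [s.name for s in PowerState if can_service(level, s)]
        print(f"{level.name}-serviceable: {', '.join(allowed)}")
```

The comparison captures only the power-state rule. Whether a customer or Oracle Service personnel may perform the work is a separate question answered by the CRU or FRU designation.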
Customer-Replaceable Units
The following table lists the customer-replaceable units (CRUs) in the server and directs you to
the replacement instructions.
CRU | Description | Replacement Instructions
Battery | Cold-serviceable. 3V lithium coin cell battery that powers the CMOS BIOS and real-time clock, located in SMODs. | “Servicing System Module (SMOD) Components (FRU)” on page 198
Memory (DIMMs) | Warm-serviceable. Add or replace RDIMM or LRDIMM memory modules located in CMODs. | “Servicing CPU Module (CMOD) Components (FRU)” on page 159
Storage drives | Hot-serviceable. Storage drive configurations can comprise either hard disk drives (HDDs) or solid state drives (SSDs). | “Servicing Storage Drives (CRU)” on page 123
Fan modules (FM0-7) | Hot-serviceable. Eight fan modules for cooling the server components. | “Servicing Fan Modules (CRU) and Fan Frames (CRU)” on page 129
PCIe cards | Hot-serviceable. Optional add-in cards that can expand the functionality of the server. | “Servicing PCIe Cards and Carriers (CRU)” on page 146
Note - The Oracle Storage 12 Gb/s SAS PCIe RAID HBA card is a field-replaceable unit (FRU) and should only be serviced by
authorized Oracle Service personnel.
Dual PCIe card carriers (DPCC) | Hot-serviceable. Eight dual PCIe card carriers (DPCCs), each with two PCIe card slots. | “Servicing PCIe Cards and Carriers (CRU)” on page 146
Power supplies (PS0-3) | Hot-serviceable. Four fully redundant AC-powered power supplies. | “Servicing Power Supplies (CRU)” on page 139
Energy Storage Module and ESM extension cable | Cold-serviceable. Provides backup power between the Energy Storage Module (ESM) and the Oracle Storage 12Gb SAS PCIe RAID HBA, Internal 16-Port located in SMODs. | “Servicing the Energy Storage Module and Cables (CRU)” on page 211
Related Information
■ “Field-Replaceable Units” on page 25
■ “Illustrated Parts Breakdown” on page 22
Field-Replaceable Units
The following table lists the field-replaceable units (FRUs) in the server and directs you to the
replacement instructions.
FRU | Description | Replacement Instructions
CMOD | Warm-serviceable. Add or replace CPU module assemblies (CMOD0-7) including a processor and memory. | “Servicing CPU Module (CMOD) Components (FRU)” on page 159
Processor and heatsink | Warm-serviceable. Add or replace the Processor Heatsink Module (PHM) CPU that carries out system instructions located in CMODs. | “Servicing Processors (FRU)” on page 170
SMOD | Cold-serviceable. Replace system module assemblies SMOD0 or SMOD1. | “Servicing System Module (SMOD) Components (FRU)” on page 198
Storage drive backplanes | Cold-serviceable. Provide power and communications connectors for storage drives located in SMODs. | “Servicing Storage Drives (CRU)” on page 123
SAS cable | Cold-serviceable. Provide signals between the SMOD front disk backplane and the Oracle Storage 12Gb SAS PCIe RAID HBA, Internal 16-Port located in SMODs. | “Servicing the SAS Cable (FRU)” on page 216
Oracle Storage 12 Gb SAS PCIe RAID HBA, Internal card | Cold-serviceable. Located in SMODs, the Oracle Storage 12Gb SAS PCIe RAID HBA, Internal 16-Port manages SAS storage drives. | “Servicing the Host Bus Adapter (HBA) Card (FRU)” on page 204
Front indicator module (FIM) | Cold-serviceable. Contains push-button circuitry and LED indicators that are displayed on the chassis bezel. | “Servicing the Front Indicator Module (FRU)” on page 226
Midplane assembly | Cold-serviceable. Internal midplane/busbar assembly in the chassis. | “Servicing the Midplane Assembly (FRU)” on page 230
Related Information
■ “Customer-Replaceable Units” on page 24
■ “Illustrated Parts Breakdown” on page 22
CPU Module (CMOD) Overview
Each CPU module (CMOD) contains one processor (CPU) and system memory. CMODs supply
power to the fan modules and the DPCCs.
These topics provide information about server CMODs, CMOD0-7 configuration options, and
the internal layout of CMOD components:
■ “CMOD Components” on page 27
■ “CMOD Processor” on page 28
■ “CMOD Memory” on page 29
■ “CMOD and Fan Module Power” on page 31
CMODs are internal warm-service or cold-service components. CMODs are accessible from the front
panel. To access CMODs, you must remove the fan modules and the fan frames.
Related Information
■ “Product Description” on page 15
■ “Servicing CPU Module (CMOD) Components (FRU)” on page 159
■ “CMOD Population Rules” on page 161
■ “System Module (SMOD) Overview” on page 32
CMOD Components
CMODs include the following processor (CPU) and DIMM memory components:
■ Processor Heatsink Module (PHM) assembly includes heatsink, processor, thermal interface
material (TIM), and carrier.
“CMOD Processor” on page 28
■ 12 DIMM slots arranged in two groups of six
“CMOD Memory” on page 29
■ Fault Remind Test Circuit, which helps you locate failed DIMMs and verify a failed CPU
■ Fault Remind button
■ 12 DIMM slot fault indicators
■ 1 CPU fault indicator
The following illustration shows CMOD component locations.
Callout | Description
1 | CMOD Fault Remind button
2 | Circuit Charge Status indicator
3 | 12 DIMM slots arranged in two groups of six
4 | DIMM slot fault indicators (12, one for each slot)
5 | Processor Heatsink Module (PHM)
6 | CPU fault indicator
For component serviceability, locations, and designations, see “Replaceable
Components” on page 22.
CMOD Processor
The Processor Heatsink Module (PHM) is a three-part module that is installed above the socket
assembly.
Callout | Description
1 | Heatsink with thermal interface material (TIM), part of PHM
2 | Carrier, part of PHM
3 | Processor, part of PHM
4 | Socket assembly
See “Servicing Processors (FRU)” on page 170.
CMOD Memory
The maximum system memory with DDR4-2666 64 GB LRDIMMs is:
■ Four CMODs: 3 TB with 48 x 64 GB LRDIMMs installed
■ Eight CMODs: 6 TB with 96 x 64 GB LRDIMMs installed
The maximum system memory with DDR4-2666 32 GB RDIMMs is:
■ Four CMODs: 1.5 TB with 48 x 32 GB RDIMMs installed
■ Eight CMODs: 3 TB with 96 x 32 GB RDIMMs installed
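These totals follow from the 12 DIMM slots in each CMOD: the number of CMODs times 12 slots times the DIMM capacity. The following Python sketch reproduces the arithmetic; the function name max_memory_tb is hypothetical, and the sketch assumes every slot is populated with DIMMs of one size and that 1 TB is counted as 1024 GB.

```python
DIMM_SLOTS_PER_CMOD = 12  # each CMOD has 12 DIMM slots (two groups of six)


def max_memory_tb(cmods: int, dimm_gb: int) -> float:
    """Return total memory in TB for fully populated CMODs.

    Assumes every DIMM slot holds a DIMM of the same capacity and
    that 1 TB = 1024 GB.
    """
    dimm_count = cmods * DIMM_SLOTS_PER_CMOD
    return dimm_count * dimm_gb / 1024


if __name__ == "__main__":
    for dimm_gb in (64, 32):
        for cmods in (4, 8):
            dimms = cmods * DIMM_SLOTS_PER_CMOD
            total = max_memory_tb(cmods, dimm_gb)
            print(f"{cmods} CMODs, {dimms} x {dimm_gb} GB DIMMs: {total} TB")
```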
Callout | Description
1 | DIMM Banks A, B, C
2 | DIMM Banks D, E, F
3 | Channel F, Slot 0, DIMM 0
4 | Channel F, Slot 1, DIMM 1
5 | Channel E, Slot 0, DIMM 2
6 | Channel E, Slot 1, DIMM 3
7 | Channel D, Slot 0, DIMM 4
8 | Channel D, Slot 1, DIMM 5
9 | Channel A, Slot 0, DIMM 6
10 | Channel A, Slot 1, DIMM 7
11 | Channel B, Slot 0, DIMM 8
12 | Channel B, Slot 1, DIMM 9
13 | Channel C, Slot 0, DIMM 10
14 | Channel C, Slot 1, DIMM 11
See “Servicing DIMMs (CRU)” on page 182.
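The DIMM numbering in the callout table follows a regular pattern: channels appear in the order F, E, D, A, B, C, and each channel contributes slot 0 and then slot 1. The following Python sketch encodes that pattern as a lookup in both directions. The function names are hypothetical and are shown only to illustrate the numbering, not any Oracle-provided interface.

```python
from typing import Tuple

# Channel order as the DIMM numbers ascend in the callout table above.
CHANNEL_ORDER = "FEDABC"


def dimm_number(channel: str, slot: int) -> int:
    """Return the DIMM number (0-11) for a memory channel and slot."""
    if len(channel) != 1 or channel not in CHANNEL_ORDER:
        raise ValueError(f"unknown channel: {channel!r}")
    if slot not in (0, 1):
        raise ValueError(f"slot must be 0 or 1, got {slot}")
    return 2 * CHANNEL_ORDER.index(channel) + slot


def dimm_location(dimm: int) -> Tuple[str, int]:
    """Return (channel, slot) for a DIMM number (0-11)."""
    if not 0 <= dimm <= 11:
        raise ValueError(f"DIMM number must be 0-11, got {dimm}")
    channel_index, slot = divmod(dimm, 2)
    return CHANNEL_ORDER[channel_index], slot


if __name__ == "__main__":
    for dimm in range(12):
        channel, slot = dimm_location(dimm)
        print(f"DIMM {dimm}: Channel {channel}, Slot {slot}")
```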
CMOD and Fan Module Power
Only CMODs in even-numbered slots supply power to fan modules (FMs). The following table
shows which CMOD slots provide FM power. Four fan modules per SMOD are front panel
accessible.
CMOD0-7 Power Slots | FM0-7 Fan Modules Powered
CMOD 0 | FM0 and FM1
CMOD 2 | FM2 and FM3
CMOD 4 | FM4 and FM5
CMOD 6 | FM6 and FM7
CMOD1, CMOD3, CMOD5, and CMOD7 in chassis slots 1, 3, 5, and 7 do not supply fan
module power.
See “Servicing Fan Modules (CRU) and Fan Frames (CRU)” on page 129.
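The power relationship in the table is regular: an even-numbered CMOD slot n powers FMn and FM(n+1), and odd-numbered slots power no fan modules. The following Python sketch expresses that rule; the function names are hypothetical and the sketch is meant only as a quick reference when planning CMOD removal.

```python
from typing import Tuple


def fan_modules_powered_by(cmod_slot: int) -> Tuple[str, ...]:
    """Return the fan modules powered by the CMOD in the given slot (0-7).

    Only even-numbered slots supply fan module power.
    """
    if not 0 <= cmod_slot <= 7:
        raise ValueError(f"CMOD slot must be 0-7, got {cmod_slot}")
    if cmod_slot % 2 == 1:
        return ()
    return (f"FM{cmod_slot}", f"FM{cmod_slot + 1}")


def cmod_slot_powering(fan_module: int) -> int:
    """Return the CMOD slot that powers the given fan module (0-7)."""
    if not 0 <= fan_module <= 7:
        raise ValueError(f"fan module must be 0-7, got {fan_module}")
    return fan_module - (fan_module % 2)


if __name__ == "__main__":
    for slot in range(8):
        powered = fan_modules_powered_by(slot)
        print(f"CMOD{slot}: {' and '.join(powered) if powered else 'none'}")
```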
System Module (SMOD) Overview
These topics provide information about Oracle Server X7-8 system modules (SMODs) and
components.
■ “SMOD Components” on page 34
■ “SMOD Motherboard” on page 35
■ “Server Storage Drives” on page 36
Two SMODs are installed in the chassis: SMOD0 and SMOD1. SMODs are internal cold-service
components. SMODs are accessible from the back panel.
Each SMOD includes:
■ One Intel Xeon C624 Chipset Platform Controller Hub (PCH).
■ One internal PCIe slot for RAID storage HBA connectivity to drive bays.
This PCIe slot is populated by one PCIe Gen3 x8 HBA per SMOD (Oracle Storage
12Gb SAS PCIe RAID HBA, Internal 16-Port). Four ports in the card are used for SAS
connectivity.
■ One service processor (on SMOD): Emulex Pilot 4 service processor (SP).
SP characteristics:
  ■ Remote keyboard, video, mouse (RKVM) redirection
  ■ Full remote management through command-line, IPMI, and browser interfaces
  ■ Remote media capability (DVD, CD, ISO image)
  ■ Advanced power management and monitoring
  ■ Active Directory, LDAP, RADIUS support
  ■ Oracle ILOM
  ■ Dual Oracle ILOM image
  ■ Signed Oracle ILOM image
  ■ 512 MB DDR4 memory
  ■ 128 MB of flash memory
  ■ 1 GB of NAND memory
  ■ Baseboard management controller (BMC)
■ One RJ-45 serial management port (SER MGT).
■ One network management port, 10/100/1000BASE-T Ethernet port (NET MGT).
■ One external USB 3.0 port on each SMOD.
■ Four 10GBase-T ports per SMOD (NET0-3).
■ Four hot-swap-capable SAS3 storage drive bays (HDD0-3).
■ Four drives in each SMOD (SMOD0 drives 0-3 and SMOD1 drives 0-3), for a total of eight
2.5-inch, hot-swappable SAS3 hard disk drives (HDDs) or eight 2.5-inch SAS3 solid state
drives (SSDs).
Both SMODs are accessible from the back panel of the server. The following illustration shows
SMOD0 and SMOD1 removal and installation levers with green lock release tabs.
Callout | Description
1 | SMOD1 ejector levers (2)
2 | SMOD1 installed in chassis
3 | SMOD0 ejector levers (2)
4 | SMOD0 installed in chassis
Related Information
■ “System Module (SMOD) Indicators” on page 63
■ “CPU Module (CMOD) Overview” on page 26
■ “Product Description” on page 15
SMOD Components
The following illustration shows an SMOD (system module) and associated components,
including an Oracle Storage 12Gb SAS PCIe RAID HBA, Internal 16-Port card installed below
the SMOD, one SAS cable that connects the HBA card to the server storage backplane, and an
ESM extension cable that connects the energy storage module to the HBA card.
Callout | Description
1 | SAS host bus adapter (HBA): Oracle Storage 12Gb SAS PCIe RAID HBA, Internal 16-Port
2 | Energy storage module (ESM)
3 | Cable from HBA to ESM
4 | SMOD motherboard
5 | Storage drive SMOD midplane connectors (4)
6 | Four server storage drives 0-3 (HDD/SSD), back panel accessible
7 | External USB port, back panel accessible
8 | NET MGT ILOM service processor (SP) network management 10/100/1000BASE-T Network port, back panel accessible
9 | Serial management port labeled SER MGT port. IO port, back panel accessible
10 | NET 0-3 10 GbE network ports labeled NET0, NET1, NET2, and NET3. Network ports, back panel accessible
11 | Service processor (SP) - concealed location
12 | 3V lithium coin cell battery
13 | Real time clock battery socket
14 | SAS storage drive backplane cable
For component serviceability, locations, and designations, see “Replaceable
Components” on page 22.
For replacement procedures, see “Servicing System Module (SMOD) Components
(FRU)” on page 198.
SMOD Motherboard
Each SMOD motherboard contains:
■
Two Storage Drive Backplanes (for externally accessible server storage drives)
Externally accessible server storage drives on the SMOD connect to two storage drive
backplanes mounted on the SMOD motherboard. One SAS cable connects each backplane
to the HBA card that is installed in a riser slot on the underside of the SMOD. Storage drive
backplanes are not removable or replaceable.
■
Service processor (SP)
The system Emulex Pilot 4 service processor (SP) is located on the SMOD motherboard
and is accessible locally and remotely through management ports on the front of the SMOD.
The SP contains Oracle ILOM, an embedded server management tool. The SP is not
removable.
■
System real time clock battery, 3V lithium coin cell battery
■
Energy storage module (ESM) for the server storage HBA
The ESM provides backup power for the Oracle Storage 12Gb SAS PCIe RAID HBA,
Internal 16-Port card. The ESM is located in a holder in the top center of the SMOD. A
cable connects the HBA to the ESM.
■
PCIe riser for the server storage HBA, located on the bottom of the SMOD
The server requires one internal HBA (Oracle Storage 12Gb SAS PCIe RAID HBA,
Internal 16-Port) for the externally accessible SAS (or SATA) SMOD server storage drives.
The HBA is installed in a riser slot on the underside of the SMOD motherboard and is
connected to the backplanes by one mini-SAS4I connector cable.
■
One internal USB 3.0 port, located on the bottom of the SMOD.
Each SMOD has one unused internal USB 3.0 port that is designated as P0. The port is
located on the underside of the SMOD motherboard next to the PCIe card riser slot.
For component serviceability, locations, and designations, see “Replaceable
Components” on page 22.
For replacement procedures, see “Servicing System Module (SMOD) Components
(FRU)” on page 198.
Server Storage Drives
The following illustration shows the eight externally accessible storage drive bays on the
back panel, four per SMOD.
Callout | Description
1 | SMOD0 System A storage drive 0 (HDD/SSD)
2 | SMOD0 System A storage drive 1 (HDD/SSD)
3 | SMOD0 System A storage drive 2 (HDD/SSD)
4 | SMOD0 System A storage drive 3 (HDD/SSD)
5 | SMOD1 System B storage drive 0 (HDD/SSD)
6 | SMOD1 System B storage drive 1 (HDD/SSD)
7 | SMOD1 System B storage drive 2 (HDD/SSD)
8 | SMOD1 System B storage drive 3 (HDD/SSD)
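The callout numbering above follows a regular pattern: callouts 1 through 4 belong to SMOD0 (System A) and callouts 5 through 8 to SMOD1 (System B), each counting drives 0 through 3. The following is a minimal sketch of that mapping. The function name drive_for_callout is hypothetical and is not part of any Oracle tool.

```python
def drive_for_callout(callout: int) -> tuple:
    """Map a callout number (1-8) to (SMOD name, system label, drive number)."""
    if not 1 <= callout <= 8:
        raise ValueError(f"callout must be 1-8, got {callout}")
    index = callout - 1
    smod = index // 4        # callouts 1-4 -> SMOD0, callouts 5-8 -> SMOD1
    system = "AB"[smod]      # SMOD0 is System A, SMOD1 is System B
    return (f"SMOD{smod}", f"System {system}", index % 4)


for n in (1, 4, 5, 8):
    print(n, drive_for_callout(n))
```

This kind of lookup can help when translating a callout in the illustration into the SMOD and drive number used in service procedures.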
Server Chassis Overview
These sections describe the server chassis and internal components.
■
“Chassis Features” on page 38
■
“Chassis Internal Components” on page 39
■
“Chassis Front Panel Components” on page 40
■
“Chassis Back Panel Components” on page 41
■
“Switches and Buttons Reference” on page 42
The server chassis assembly contains front panel, back panel, and internally accessible
components. The following figure shows the server chassis front panel (1).
Related Information
■
“CPU Module (CMOD) Overview” on page 26
■
“System Module (SMOD) Overview” on page 32
Chassis Features
The Oracle Server X7-8 chassis features include:
■
One 5U chassis with central mid-plane.
■
One Front Indicator Module (FIM) with controls and indicators.
■
Eight front-loaded CMOD assemblies (behind fan module assemblies FM0-7). Four
CMODs are supported per SMOD.
■
Two rear-loaded system module assemblies (SMODs).
■
Four front-loaded hot-swap power supply units (PSUs), two per SMOD in a 1+1
configuration.
■
Expandable IO: Eight 16-lane and eight 8-lane PCIe Gen3 chassis slots. One x8 PCIe Gen 3
HBA SMOD slot.
Eight rear-loaded DPCC slots (each DPCC provides one 16 lane PCIe expansion slot and
one eight lane PCIe expansion slot)
■
Eight hot-swap redundant 100 watt cooling fan modules.
■
Cooling is front to back, using eight variable-speed, dual counter-rotating fan assemblies
mounted in the front of the server.
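The slot counts in the feature list follow from the carrier layout: eight DPCCs, each with one 16-lane and one 8-lane slot. The following sketch only reproduces that arithmetic. The class and function names are hypothetical, and the separate x8 HBA slot on each SMOD is deliberately left out of the tally.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dpcc:
    """One dual PCIe card carrier: one x16 slot and one x8 slot, per the feature list."""
    x16_slots: int = 1
    x8_slots: int = 1


def chassis_pcie_slots(dpcc_count: int = 8) -> dict:
    """Total the chassis expansion slots provided by the DPCCs."""
    carriers = [Dpcc() for _ in range(dpcc_count)]
    x16 = sum(c.x16_slots for c in carriers)
    x8 = sum(c.x8_slots for c in carriers)
    return {"x16": x16, "x8": x8, "total": x16 + x8}


print(chassis_pcie_slots())
```

With the default of eight carriers, the totals match the eight 16-lane and eight 8-lane chassis slots listed above.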
Chassis Internal Components
The following figure shows the chassis internal components.
The chassis internal components are described in the following table.
Callout | Component | Description | Links
1 | Top Cover | This component is part of the chassis and is non-replaceable. | “Remove the Midplane Assembly” on page 231.
2 | Midplane/busbar | The midplane assembly provides an interconnect between the back panel components and the front panel components. This component requires cold service. | “Remove the Midplane Assembly” on page 231.
3 | CPU module (CMOD) bays | Chassis CMOD bays can support either four or eight CMODs. Servicing CMODs requires warm or cold service. | “CPU Module (CMOD) Overview” on page 26. “Servicing CPU Module (CMOD) Components (FRU)” on page 159
Chassis Front Panel Components
The following figure shows the server chassis front panel accessible replaceable components.
Callout | Component | Links
1 | Front indicator module (FIM) | “Servicing the Front Indicator Module (FRU)” on page 226. “Front Indicator Module (FIM) Panel” on page 58
2 | Power supply (PS0-3) (4) | “Servicing Power Supplies (CRU)” on page 139. “Power Supply (PS) Indicators” on page 62
3 | Fan module (FM0-7) (8) | “Servicing Fan Modules (CRU) and Fan Frames (CRU)” on page 129. “Fan Module (FM) Indicators” on page 61
4 | Fan Frame (2) | “Servicing Fan Modules (CRU) and Fan Frames (CRU)” on page 129
5 | Server chassis. The server chassis is not a removable component. | “Server Chassis Overview” on page 37
Chassis Back Panel Components
The following illustration shows the chassis back panel.
Callout | Component | Description | Links
1 | Storage Drives | Left facing back of chassis: SMOD1 HDD/SSD slots 0 through 3. Right facing back of chassis: SMOD0 HDD/SSD slots 0 through 3. Top row: 3, 1. Bottom row: 2, 0. | “Storage Drive Locations and Numbering” on page 124. “Storage Drive Indicators” on page 67
2 | System Module A (SMOD0) | SMOD0 System A internal components can only be accessed by removing the SMOD from the server back panel. | “System Module (SMOD) Overview” on page 32. “System Module (SMOD) Indicators” on page 63
3 | System Module B (SMOD1) | SMOD1 System B internal components can only be accessed by removing the SMOD from the server back panel. | “System Module (SMOD) Overview” on page 32. “System Module (SMOD) Indicators” on page 63
4 | Dual PCIe card carrier (DPCC) bay | The DPCC bay contains eight DPCCs and up to 16 PCIe cards. | “PCI Devices Subsystem” on page 89. “Dual PCIe Card Carrier (DPCC) Indicators” on page 68
5 | AC power block | The AC power block has four AC power inlet connectors, two for each SMOD. The power block is not a removable component. | “Power Subsystem” on page 82. “AC Power Block Inlet Indicators” on page 69
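The drive slot arrangement in the Storage Drives row can be written out as a small lookup. This sketch assumes that the slot numbers in "Top row: 3, 1" and "Bottom row: 2, 0" are listed left to right as seen from the back of the chassis. That ordering is an assumption, and the names below are hypothetical.

```python
# Slot numbers per row, as listed in the back panel table.
# Assumption: each tuple reads left to right when facing the back of the chassis.
SMOD_DRIVE_ROWS = {"top": (3, 1), "bottom": (2, 0)}


def slot_position(slot: int) -> tuple:
    """Return (row name, column index within the SMOD) for a drive slot 0-3."""
    for row, slots in SMOD_DRIVE_ROWS.items():
        if slot in slots:
            return (row, slots.index(slot))
    raise ValueError(f"slot must be 0-3, got {slot}")


def back_panel_rows() -> dict:
    """Lay out both SMODs: SMOD1 on the left, SMOD0 on the right, facing the back."""
    return {
        row: [f"SMOD1:{s}" for s in slots] + [f"SMOD0:{s}" for s in slots]
        for row, slots in SMOD_DRIVE_ROWS.items()
    }


print(back_panel_rows())
```

Confirm the physical layout against “Storage Drive Locations and Numbering” on page 124 before relying on it to pull a drive.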
Switches and Buttons Reference
The server front panel and back panels provide access to the following switches and buttons.
Button | Description | Links
Two front panel On/Standby buttons | Press On/Standby button A to control server power on System A SMOD0 while local to (at) the server FIM. Press On/Standby button B to control server power on System B SMOD1 while local to (at) the server FIM. Caution - Exercise caution on dual systems. | “Power On the Server” on page 246. “Powering Down the Server” on page 110
Two back panel On/Standby buttons | Press On/Standby button A to control server power on System A SMOD0 while local to (at) the server back panel. Press On/Standby button B to control server power on System B SMOD1 while local to (at) the server back panel. Caution - Exercise caution on dual systems. | “Power On the Server” on page 246. “Powering Down the Server” on page 110
Two front panel Locate Buttons/LEDs | Press the buttons on the server front panel FIM to manage the SMOD0 or SMOD1 Locate Button/LED indicator locally. To deactivate (or activate) the Locate Button/LED, press and release the button. | “Managing the Locate Button/LED” on page 119
Two back panel Locate Buttons/LEDs | Press the buttons on the server back panel SMODs to manage the SMOD0 or SMOD1 Locate Button/LED indicator locally. To deactivate (or activate) the Locate Button/LED, press and release the button. | “Managing the Locate Button/LED” on page 119
SP Reset | Service processor (SP) pinhole reset button located on the server back panel. The SP reset button allows you to manually reset the SP. Use the reset button if the SP becomes unresponsive, requires a reset, or fails to boot into standby power mode (activating the button requires the use of a stylus). | “Back Panel Pinhole Switches” on page 75
Host Warm Reset | Recessed pinhole button located on the server back panel. The Host Warm Reset button allows you to perform an immediate reboot of the server (activating the button requires the use of a stylus). | “Back Panel Pinhole Switches” on page 75
NMI pinhole button | Non-maskable Interrupt recessed pinhole button located on the server back panel. The NMI button is used by Service personnel only. Do not press. | “Back Panel Pinhole Switches” on page 75
CMOD internal Fault Remind button | Each CMOD has a motherboard-mounted Fault Remind button. The button is part of the CMOD Fault Remind circuit. The circuit is charged and allows you to identify a failed DIMM or CPU after the CMOD has been removed from the server. You must remove the CMOD from the front panel to access the button. | “Servicing CPU Module (CMOD) Components (FRU)” on page 159. “CPU Module (CMOD) Overview” on page 26. “Troubleshooting Using a CMOD Fault Remind Test Circuit” on page 53
Sixteen (16) recessed ATTN (attention) buttons (two on each DPCC) | The ATTN buttons are used to initiate DPCC removal and installation. Before removing a DPCC, use a stylus to press both ATTN buttons. After installing a DPCC that contains a PCIe card, press the buttons again. | “Servicing PCIe Cards and Carriers (CRU)” on page 146. “Dual PCIe Card Carrier (DPCC) Indicators” on page 68
Server Chassis Overview
“Back Panel Components” on page 19
Troubleshooting and Diagnostics
This section provides information about troubleshooting hardware component faults for Oracle
Server X7-8. The following topics are covered.
■
“Detecting and Managing Server Faults” on page 45
■
“Troubleshooting Using Diagnostic Tools” on page 56
■
“Troubleshooting Using Status Indicators” on page 57
■
“Troubleshooting Server Subsystems” on page 80
■
“Attaching Devices to the Server” on page 90
■
“Getting Help” on page 94
■
“Auto Service Requests” on page 95
For more information about server troubleshooting and diagnostics, refer to the Oracle x86 Servers Diagnostics and Troubleshooting Guide for Servers With Oracle ILOM 4.0.x at https://www.oracle.com/goto/x86admindiag/docs.
Detecting and Managing Server Faults
When a server encounters a fault, the fault is recorded in a common fault database. The fault
is then reported by the server in one of several ways, depending on the type and severity of the
fault.
This section contains maintenance-related information and procedures that you can use to
troubleshoot and repair server hardware issues.
These topics explain how to use diagnostic tools to monitor server status and troubleshoot faults
in the server.
Description | Section Links
Troubleshooting overview information and procedure | “Troubleshooting Server Hardware Faults” on page 46
Information about how to use the CMOD Fault Remind Test Circuit | “Troubleshooting Using a CMOD Fault Remind Test Circuit” on page 53
Information related to the cooling subsystem, including fault causes, actions, and preventative measures | “Troubleshooting System Cooling Issues” on page 53. “Cooling Subsystem” on page 84
Information related to the power subsystem, including fault causes, actions, and preventative measures | “Troubleshooting Power Issues” on page 55. “Power Subsystem” on page 82
Contact technical support if the problem persists. | “Getting Help” on page 94
Troubleshooting Server Hardware Faults
When a server hardware fault event occurs, the system lights the Fault-Service Required LED
and captures the event in the Oracle ILOM event log. If you set up notifications through Oracle
ILOM, you also receive an alert through the notification method you chose. When you become
aware of a hardware fault, address it immediately.
To investigate a hardware fault, see the following:
■
Basic Troubleshooting Process
“Basic Troubleshooting Steps” on page 46
■
Troubleshoot Hardware Faults Using the Oracle ILOM Web Interface
“Troubleshooting Server Hardware Faults” on page 46
“Identify Hardware Faults (Oracle ILOM)” on page 47
Basic Troubleshooting Steps
When a server encounters a fault, the fault is recorded in a common fault database. The fault
is then reported by the server in one of several ways, depending on the type and severity of the
fault.
Use the following process to address a suspected hardware fault:
1. Review the Oracle Server X7-8 Product Notes for late-breaking server information, and
hardware-related issues.
Refer to Oracle Server X7-8 Product Notes at: https://www.oracle.com/goto/x7-8/docs
2. Investigate the hardware fault and identify the hardware issue.
Select one of the following methods to identify the failed component and server subsystem
containing the fault.
■
Log in to Oracle ILOM. See “Identify Hardware Faults (Oracle
ILOM)” on page 47.
■
Log in to Oracle Hardware Management Pack. For information, refer to the Oracle
Hardware Management Pack documentation at: https://www.oracle.com/goto/ohmp/
docs.
■
Log in to the Oracle Solaris OS and issue the fmadm faulty command.
■
Log in to the Oracle ILOM service processor, start the Oracle ILOM Fault Management
Shell, and issue the fmadm faulty command.
For more information about how to use the Oracle ILOM Fault Management Shell
and supported commands, see the Oracle ILOM User's Guide for System Monitoring
and Diagnostics Firmware Release 4.0.x in the Oracle Integrated Lights Out Manager
(ILOM) 4.0 Documentation Library at https://www.oracle.com/goto/ilom/docs.
If you determine that the hardware fault requires service, continue.
3. Prepare the server for service.
See “Preparing for Service” on page 97. You can use Oracle ILOM to power off the
server, activate the Locate Button/LED, and take the server offline.
Obtain physical access to the server. Before servicing the server, prepare the work space to
ensure ESD protection for the server and components.
4. Service replaceable server components.
See “Servicing Components” on page 123 for FRU and CRU removal, installation, and
replacement procedures in this document.
Note - A component designated as a FRU must be replaced by Oracle Service personnel.
Contact Oracle Service.
5. Return the server to service.
See “Returning the Server to Operation” on page 245.
6. Clear the fault in Oracle ILOM (optional).
Most components include a FRU ID to clear the fault automatically. You might need to
clear the fault in Oracle ILOM, depending on the component requirements.
See “Clear Hardware Fault Messages (Oracle ILOM)” on page 51.
Identify Hardware Faults (Oracle ILOM)
Use this procedure to troubleshoot hardware faults with the Oracle ILOM web interface and, if
necessary, prepare the server for service.
Note - The screens and information in this procedure might differ from those for your server.
This example procedure provides one method to troubleshoot hardware faults using Oracle
ILOM web and CLI interfaces. However, the procedure can be performed using only the Oracle
ILOM CLI interface. For more information about Oracle ILOM interfaces, refer to the Oracle
ILOM documentation.
1.
Obtain the latest version of the Oracle Server X7-8 Product Notes at: https://www.
oracle.com/goto/x7-8/docs.
2.
Log in to the server SP Oracle ILOM web interface.
Open a browser and type in the IP address of the server SP. Enter a user name (with
administrator privileges) and password at the Login screen. The Summary Information page
appears.
3.
View the Status section of the Summary Information page to identify the server
subsystem that requires service.
The Status section of the Summary Information screen provides information about the server
subsystems, such as:
■
“Processor Subsystem” on page 80
■
“Memory Subsystem” on page 81
■
“Power Subsystem” on page 82
■
“Cooling Subsystem” on page 84
■
“Storage Subsystem” on page 87
■
“Networking Subsystem” on page 88
■
“PCI Devices Subsystem” on page 89
The Status table lists components that require service.
In the above example, the Status table shows that the Memory subsystem requires service. This
indicates that a hardware component within the subsystem is in a fault state.
4.
To identify the component, click on the subsystem name.
The subsystem information page appears.
The above example shows the Memory subsystem screen and indicates that DIMM 8 on CPU 0
has an uncorrectable ECC fault.
5.
To get more information, click one of the Open Problems links.
The Open Problems screen provides detailed information, such as the time the event occurred,
the component and subsystem name, and a description of the issue. It also includes a link to a
KnowledgeBase article.
Tip - The System Log provides a chronological list of all the system events and faults that have
occurred since the log was last reset and includes additional information, such as severity levels
and error counts. To access it, click the System Log link.
In this example, the hardware fault with DIMM 8 of CPU 0 requires local (physical) access to
the server.
6.
Before going to the server, review the Oracle Server X7-8 Product Notes for
information related to the issue or the component.
For up-to-date information about the server, including hardware-related issues, refer to Oracle
Server X7-8 Product Notes at: https://www.oracle.com/goto/x7-8/docs
7.
To prepare the server for service, see “Preparing for Service” on page 97.
Note - After servicing the component, you might need to clear the fault in Oracle ILOM. Refer
to the service procedure for the component for more information.
Managing Server Hardware Faults Using the
Oracle ILOM Fault Management Shell
The Oracle ILOM Fault Management Shell enables you to view and manage fault activity on
managed servers and other types of devices.
For more information about how to use the Oracle ILOM Fault Management Shell, see the
Oracle ILOM User's Guide for System Monitoring and Diagnostics Firmware Release 4.0.x in
the Oracle Integrated Lights Out Manager (ILOM) 4.0 Documentation Library at https://www.
oracle.com/goto/ilom/docs.
Clear Hardware Fault Messages (Oracle ILOM)
After servicing the following components, you must clear the fault event in Oracle ILOM:
■
Processor (CPU)
■
PCIe card
■
HBA
■
Front Indicator Module (FIM)
This procedure uses the Oracle ILOM CLI interface. Use the Oracle ILOM CLI to access the
Fault Management Shell, fmadm.
For more information about how to use the Oracle ILOM Fault Management Shell and
supported commands, see the Oracle ILOM User's Guide for System Monitoring and
Diagnostics Firmware Release 4.0.x in the Oracle Integrated Lights Out Manager (ILOM) 4.0
Documentation Library at https://www.oracle.com/goto/ilom/docs.
Caution - The purpose of the Oracle ILOM Fault Management Shell is to help Oracle Service
personnel diagnose system problems. Customers should not launch this shell or run fault
management commands in the shell unless requested to do so by Oracle Service personnel.
1.
Log in to the SP Oracle ILOM CLI.
Log in as a user with root or administrator privileges. For example, open an SSH session, and at
the command line type:
ssh root@ipaddress
Where ipaddress is the IP address of the server SP.
For more information, see “Using Oracle ILOM” in Oracle Server X7-8 Installation Guide.
The Oracle ILOM CLI prompt appears: ->
2.
To access fmadm, type:
->start /SP/faultmgmt/shell
The fmadm prompt appears: faultmgmtsp>
3.
To view a list of command options for displaying or clearing a fault with fmadm,
type:
faultmgmtsp>help fmadm
The following output appears:
where <subcommand> is one of the following:
faulty [-asv] [-u <uuid>] : display list of faulty resources
faulty -f [-a] : display faulty FRUs
faulty -r [-a] : display faulty FRUs (summary)
acquit <FRU> : acquit faults on a FRU
acquit <UUID> : acquit faults associated with UUID
acquit <FRU> <UUID> : acquit faults specified by (FRU, UUID) combination
replaced <FRU> : replaced faults on a FRU
repaired <FRU> : repaired faults on a FRU
repair <FRU> : repair faults on a FRU
rotate errlog : rotate error log
rotate fltlog : rotate fault log
4.
Use fmadm faulty and the following options to display active faulty components:
■
-a – Show active faulty components.
■
-f – Show active faulty FRUs.
■
-r – Show active faulty FRUs and their fault management states.
■
-s – Show a one-line fault summary for each fault event.
■
-u uuid – Show fault diagnosis events that match a specific universal unique identifier
(uuid).
For command specifics, see the Oracle ILOM documentation at: https://www.oracle.com/
goto/ilom/docs.
5.
Type the fmadm subcommand that clears the fault.
Use acquit, repair, replaced, or repaired, followed by the FRU or UUID arguments shown in
the help fmadm output.
6.
Close the Oracle ILOM session.
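The argument shapes in the help fmadm listing can be checked before a command is typed at the faultmgmtsp> prompt. The following is a minimal sketch, not an Oracle tool: the function name is hypothetical, the <FRU> and <UUID> strings are placeholders rather than real component names, and the sketch only builds the command text. It does not connect to the SP. The caution above still applies: run fault management commands only when requested by Oracle Service personnel.

```python
# Subcommands and argument shapes are taken from the `help fmadm` listing.
CLEAR_SUBCOMMANDS = ("acquit", "replaced", "repaired", "repair")


def fmadm_clear_command(subcommand, fru=None, uuid=None):
    """Return the fmadm command line for one fault-clearing subcommand."""
    if subcommand not in CLEAR_SUBCOMMANDS:
        raise ValueError(f"unsupported subcommand: {subcommand}")
    if subcommand == "acquit":
        # acquit is listed with <FRU>, <UUID>, or <FRU> <UUID>
        args = [value for value in (fru, uuid) if value]
        if not args:
            raise ValueError("acquit needs a FRU, a UUID, or both")
    else:
        # replaced, repaired, and repair are listed with a single <FRU> argument
        if not fru or uuid:
            raise ValueError(f"{subcommand} takes exactly one FRU argument")
        args = [fru]
    return " ".join(["fmadm", subcommand] + args)


# Placeholder arguments only; substitute the values reported by fmadm faulty.
print(fmadm_clear_command("replaced", fru="<FRU>"))
print(fmadm_clear_command("acquit", fru="<FRU>", uuid="<UUID>"))
```

Rejecting a UUID for replaced, repaired, and repair mirrors the help listing, which shows only a FRU argument for those subcommands.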
Troubleshooting Using a CMOD Fault Remind Test
Circuit
The CMODs have an internal test circuit with indicators that you can use to locate failed
DIMMs and verify a failed CPU after removing the CMOD from the server. The DIMM and
CPU Fault Remind circuits hold an electrical charge for 10 minutes after power is removed
from the server, allowing enough time to remove the CMOD and use the circuit LEDs to locate
faulty components.
Each CMOD has a motherboard-mounted Fault Remind button. The button is part of the
CMOD Fault Remind circuit. The circuit is charged and allows you to identify a failed DIMM
or CPU after the CMOD has been removed from the server. You must remove the CMOD from
the front panel to access the button.
For more information, see “Identify and Remove a Faulty DIMM” on page 189 and “Identify
and Remove a Faulty Processor” on page 171.
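Because the Fault Remind circuits hold their charge for 10 minutes after power is removed, it can help to track how much of that window is left while the CMOD is being removed. The following is a minimal sketch of the arithmetic only. The function name is hypothetical and the timestamps in the example are made up.

```python
from datetime import datetime, timedelta

# Hold time stated in the text: 10 minutes after power is removed from the server.
FAULT_REMIND_HOLD = timedelta(minutes=10)


def remind_time_left(power_removed_at: datetime, now: datetime) -> timedelta:
    """Return how long the Fault Remind circuit is still expected to hold its charge."""
    remaining = power_removed_at + FAULT_REMIND_HOLD - now
    return max(remaining, timedelta(0))


# Example with made-up timestamps: four minutes after power removal.
removed = datetime(2018, 4, 1, 12, 0)
print(remind_time_left(removed, removed + timedelta(minutes=4)))
```

Once the function returns a zero duration, the indicators can no longer be relied on to show the failed component.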
Troubleshooting System Cooling Issues
Maintaining the proper internal operating temperature of the server is crucial to the health of the
server. To prevent server shutdown and damage to components, address over-temperature and
hardware-related issues as soon as they occur. If your server has a temperature-related fault, use
the information in the following table to troubleshoot the issue.
Cooling Issue | Description | Action | Prevention
External Ambient Temperature Too High | The server fans pull cool air into the server from its external environment. If the ambient temperature is too high, the internal temperature of the server and its components increases. This can cause poor performance and component failure. | Verify the ambient temperature of the server space against the environmental specifications for the server. If the temperature is not within the required operating range, remedy the situation immediately. | Periodically verify the ambient temperature of the server space to ensure that it is within the required range, especially if you made any changes to the server space (for example, added additional servers). The temperature must be consistent and stable.
Airflow Blockage | The server cooling system uses fans to pull cool air in from the server front intake vents and exhaust warm air out the server back panel vents. If the front or back vents are blocked, the airflow through the server is disrupted and the cooling system fails to function properly, causing the server internal temperature to rise. | Inspect the server front and back panel vents for blockage from dust or debris. Additionally, inspect the server interior for improperly installed components or cables that can block the flow of air through the server. | Periodically inspect and clean the server vents using an ESD certified vacuum cleaner. Ensure that all components, such as cards, cables, fans, air baffles and dividers are properly installed. Never operate the server without the top cover installed.
Cooling Areas Compromised | The component filler panels and server top cover maintain and direct the flow of cool air through the server. These server components must be in place for the server to function as a sealed system. If these components are not installed correctly, the airflow inside the server can become chaotic and non-directional, which can cause server components to overheat and fail. | Inspect the server interior to ensure that the components are properly installed. Ensure that all external-facing slots (storage drive, PCIe) are occupied with either a component or a component filler panel. | When servicing the server, ensure that the components are installed correctly and that the server has no unoccupied external-facing slots.
Hardware Component Failure | Components, such as power supplies and fan modules, are an integral part of the server cooling system. When one of these components fails, the server internal temperature can rise. This rise in temperature can cause other components to enter into an over-temperature state. Additionally, some components, such as processors, might overheat when they are failing, which can also generate an over-temperature event. To reduce the risk related to component failure, power supplies and fan modules are installed in pairs to provide redundancy. Redundancy ensures that if one component in the pair fails, the other functioning component can continue to maintain the subsystem. For example, power supplies serve a dual function; they provide both power and airflow. If one power supply fails, the other functioning power supply can maintain both the power and the cooling subsystems. | Investigate the cause of the over-temperature event, and replace failed components immediately. For hardware troubleshooting information, see “Troubleshooting Server Hardware Faults” on page 46. | Component redundancy is provided to allow for component failure in critical subsystems, such as the cooling subsystem. However, once a component in a redundant system fails, the redundancy no longer exists, and the risk for server shutdown and component failures increases. Therefore, it is important to maintain redundant systems and replace failed components immediately.
Troubleshooting Power Issues
If your server does not power on, the cause of the problem might be server AC power
connections or power supplies (PS0-3).
In maximally configured systems, it is possible that the worst-case power consumption of
the system could exceed the capacity of a single PS. The PSs provide an over-subscription
mode, which allows the system to operate with fault-tolerance even with modest excursions
beyond the rated capacity of a single PS. This over-subscription support is accomplished using
hardware signaling between the PS and motherboard circuitry, which can force the system to
throttle processor (CPU) and memory power in the event that a PS is lost. The resulting power
savings will be enough to allow the system to continue to run (in a lower-performance state)
until the power problem is resolved.
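The over-subscription behavior described above can be pictured as a small decision table for the two power supplies that feed one SMOD. The following sketch is an illustration only. The real mechanism is hardware signaling between the PS and the motherboard, the function name is hypothetical, and the wattage values in the example are invented for the illustration and are not server specifications.

```python
def supply_pair_state(load_w: float, ps_capacity_w: float, working_supplies: int) -> str:
    """Classify the power state of one SMOD's 1+1 power supply pair."""
    if working_supplies not in (0, 1, 2):
        raise ValueError("a pair has 0, 1, or 2 working supplies")
    if working_supplies == 0:
        return "off"
    over_single_ps = load_w > ps_capacity_w
    if working_supplies == 2:
        # Both supplies share the load; the system may exceed one supply's rating.
        return "over-subscribed" if over_single_ps else "redundant"
    # One supply lost: CPU and memory power is throttled if the load exceeds it.
    return "throttled" if over_single_ps else "non-redundant"


# Invented example values: 1000 W supplies, 1100 W worst-case load.
print(supply_pair_state(1100, 1000, 2))
print(supply_pair_state(1100, 1000, 1))
```

In the "throttled" state the system keeps running at lower performance until the failed power supply is replaced.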
If your server does not power on, use the information in the following table to troubleshoot the
issue.
Power Issue | Description | Action | Prevention
AC Power Connection | The AC power cords are the direct connection between the server power supplies and the power sources. The server power supplies need separate stable AC circuits. Insufficient voltage levels or fluctuations in power can cause server power problems. The power supplies are designed to operate at a particular voltage and within an acceptable range of voltage fluctuations. AC OK indicators next to the AC inlets on the server back panel are green when the power is connected, and off when it is not connected. The AC OK and DC OK indicators on the PS indicator panels on the front panel are green when the PS is functioning properly. | Verify that both AC power cords are connected to the server. Verify that the correct power is present at the outlets and monitor the power to verify that it is within the acceptable range. You can verify proper connection and operation by checking the power supply (PS) indicator panels, which are located on the power supplies at the front of the server. Lit green AC OK indicators show a properly functioning power supply. An amber AC OK indicator indicates that the AC power to the power supply is insufficient. | Use the AC power cord Velcro retaining clips and position the cords to minimize the risk of accidental disconnection. Ensure that the AC circuits that supply power to the server are stable and not overburdened.
Power Supplies (PS0-3) | The server power supplies (PS) provide the necessary server voltages from the AC power outlets. If the power supplies are inoperable, unplugged, or disengaged from the internal connectors, the server cannot power on. Note - Use the Velcro straps on the back of the server to secure the power cord connectors to the back of the power supplies. The Velcro retaining straps minimize the risk of accidental disconnection. | Verify that the AC cables are connected to both power supplies. Verify that the power supplies are operational (the PS indicator panel must have a lit green AC OK indicator). Ensure that the power supply is properly installed. A power supply that is not fully engaged with its internal connector does not have power applied and does not have a lit green AC OK indicator. | When a power supply fails, replace it immediately. To ensure redundancy, the server SMOD has two power supplies. This redundant configuration prevents server downtime, or an unexpected shutdown, due to a failed power supply. The redundancy allows the server to continue to operate if one of the power supplies fails. However, when a server SMOD is being powered by a single power supply, the redundancy no longer exists, and the risk for downtime or an unexpected shutdown increases. When installing a power supply, ensure that it is fully seated and engaged with its connector inside the drive bay. A properly installed power supply has a lit green AC OK indicator.
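The AC OK indicator states mentioned in the table can be summarized as a lookup from indicator color to the condition it points to. The following is a minimal sketch based only on the states described above. The enum and function names are hypothetical, and the wording of the suggested checks is a paraphrase of the Action column.

```python
from enum import Enum


class AcOk(Enum):
    """AC OK indicator states described in the power troubleshooting table."""
    GREEN = "green"
    AMBER = "amber"
    OFF = "off"


_MEANING = {
    AcOk.GREEN: "Power supply is functioning properly.",
    AcOk.AMBER: "AC power to the power supply is insufficient; check the outlet and circuit.",
    AcOk.OFF: "No power applied; check the AC cord and that the supply is fully seated.",
}


def interpret_ac_ok(state) -> str:
    """Return the condition indicated by an AC OK state (enum member or color string)."""
    return _MEANING[AcOk(state)]


print(interpret_ac_ok("amber"))
```

For the full indicator definitions, see “Power Supply (PS) Indicators” on page 62.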
Troubleshooting Using Diagnostic Tools
The server and its accompanying software and firmware contain diagnostic tools and features
that can help you isolate component problems, monitor the status of a functioning system,
and exercise one or more subsystems to disclose more subtle or intermittent hardware-related
problems. Diagnostic tools range in complexity from a comprehensive validation test
suite (Oracle VTS) to a chronological event log (Oracle ILOM System Log). Tools include
standalone software packages, firmware-based tests, and hardware-based LED status indicators.
Each diagnostic tool has its own specific strength and application. Review the tools listed in
this section and determine which tool might be best to use for your situation. Once you have
determined the tool to use, you can access it locally, while at the server, or remotely.
Diagnostic Tools
The following table summarizes server diagnostic tools and identifies where you can find more
information about diagnostic tools.
Diagnostic Tool | Diagnostic Type | Function | Availability and Access | Links
Oracle ILOM | SP firmware | Monitors environmental condition and component functionality sensors, generates alerts, performs fault isolation, and provides remote access. | Access either in Standby power or Main power mode. OS independent. Local or remote access using CLI or web interface. | “Identify Hardware Faults (Oracle ILOM)” on page 47; https://www.oracle.com/goto/ilom/docs
Oracle Server X7-8 Service Manual • April 2018
View Oracle ILOM System
Log.
Oracle Hardware
Management Pack
Status IndicatorsHardware and SP
UEFI DiagnosticsSuite of diagnostic
Oracle VTSDiagnostic tool
Oracle Solaris
commands
Power-on Self-Test
(POST)
OSView System Log. Monitor
firmware
tests
standalone software
Operating system
software
Host firmwareTest core system
environmental conditions
and component functionality
sensors, view alerts, isolate
faults.
View status of overall
system and particular
components, system
indicators and sensors
Manually or automatically
run remote UEFI
Diagnostics tests from
Oracle ILOM to view
results onscreen or in log
files.
Exercise and stress the
system, run tests in parallel.
View system information.Requires Oracle Solaris
components including
CPUs, memory, and
motherboard I/O bridge
integrated circuits.
Access either in Standby power
or Main power mode. OS
independent.
Local or remote access using
CLI or web interface
View hardware-based LED
indicators when system power is
available.
Local or remote access. Sensor
and status indicators are
accessible from Oracle ILOM
web interface or CLI.
Oracle x86 Servers
Diagnostics and
Troubleshooting Guide
for Servers With Oracle
ILOM 4.0.x at https://
www.oracle.com/goto/
x86admindiag/docs
Troubleshooting Using Status Indicators
These sections describe the server front panel and back panel status indicators:
■ “Front Indicator Module (FIM) Panel” on page 58
■ “Power Supply (PS) Indicators” on page 62
■ “Fan Module (FM) Indicators” on page 61
■ “Storage Drive Indicators” on page 67
■ “System Module (SMOD) Indicators” on page 63
■ “Dual PCIe Card Carrier (DPCC) Indicators” on page 68
■ “AC Power Block Inlet Indicators” on page 69
Related Information
■ “Controls and Indicators” on page 70
■ “Replaceable Components” on page 22
Front Indicator Module (FIM) Panel
The front indicator module (FIM) panel is located at the top left corner of the server (as viewed
from the front of the server). Use buttons to control the server. Use indicators to determine
server status. The FIM provides controls and indicators for three system configurations.
■ Single 4-socket: The FIM provides controls and indicators for System A (SMOD0) only.
System B (SMOD1) buttons and indicators are not operational.
■ Dual 4-socket: The FIM provides separate controls and indicators for System A (SMOD0)
and System B (SMOD1).
■ 8-socket: The FIM provides controls and indicators for System A (SMOD0) and System B
CMODs 4-7. Other System B (SMOD1) buttons and indicators are not operational.
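The three configurations listed above can be summarized as a small lookup. The following Python sketch is illustrative only: the `Config` enum and the `operational_fim_groups` function are hypothetical names chosen for this example and are not part of Oracle ILOM or any other Oracle tool.

```python
from enum import Enum


class Config(Enum):
    """System configurations named in the FIM description (hypothetical enum)."""
    SINGLE_4_SOCKET = "single 4-socket"
    DUAL_4_SOCKET = "dual 4-socket"
    EIGHT_SOCKET = "8-socket"


def operational_fim_groups(config: Config) -> set:
    """Return the FIM control/indicator groups that are operational."""
    # System A (SMOD0) controls and indicators are used in every configuration.
    groups = {"System A (SMOD0)"}
    if config is Config.DUAL_4_SOCKET:
        # Dual 4-socket: System B has its own separate controls and indicators.
        groups.add("System B (SMOD1)")
    elif config is Config.EIGHT_SOCKET:
        # 8-socket: only the System B CMOD 4-7 indicators are used.
        groups.add("System B CMODs 4-7")
    return groups


print(sorted(operational_fim_groups(Config.EIGHT_SOCKET)))
```

A lookup like this can help when interpreting an unlit System B indicator: in a single 4-socket or 8-socket configuration it is expected to be non-operational rather than a sign of a fault.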
The following figure shows FIM buttons and indicators.
Callout | Status LED or Button | Icon and Color | Description
1 | Locate button/LED (chassis SMOD0 System A) | White | Indicates the location of SMOD0 System A in the server:
■ Off – Server is operating normally.
■ Fast blink – Use Oracle ILOM to activate this LED to enable you to locate a system quickly and easily.
On when the SMOD0 System A Locate button on the server is pressed.
See “Managing the Locate Button/LED” on page 119.
2 | Fault-Service Required (chassis SMOD0 System A) | Amber | Indicates a fault state in SMOD0 System A:
■ Off – Server is operating normally.
■ Steady On – A fault is present in chassis SMOD0 System A.
See “System Module (SMOD) Indicators” on page 63 and “Troubleshooting Using Status Indicators” on page 57.
3 | System OK (chassis SMOD0 System A) | Green | Indicates the operational state of SMOD0 System A:
■ Off – AC power is not present or the Oracle ILOM boot is not complete.
■ Flashing – SMOD0 System A is booting.
■ Steady On – OS has booted, power is on, and chassis SMOD0 System A is running.
See “System Module (SMOD) Indicators” on page 63 and “Troubleshooting Using Status Indicators” on page 57.
4 | On/Standby button (chassis SMOD0 System A) (recessed) | None | Use to locally control chassis SMOD0 System A system power:
■ Four seconds or less press – Initiates a graceful shutdown.
■ Five seconds or more press – Initiates an immediate shutdown.
Control chassis SMOD0 System A power locally, when physically present at the server. The duration of the button press determines the type of power off (graceful or immediate).
See “Powering Down the Server” on page 110 and “Power On the Server” on page 246.
5 | SP OK (chassis SMOD0 System A) | Green | Indicates when the SMOD0 System A service processor (SP) is booting:
■ Flashing – SP is booting.
■ Steady On – Oracle ILOM is operational.
See “System Module (SMOD) Indicators” on page 63 and “Troubleshooting Using Status Indicators” on page 57.
6 | System Overtemperature Warning (chassis SMOD0 System A) | Amber | Indicates that a fault might have occurred in the cooling subsystem. The system Fault-Service Required LED might also be lit.
See “System Module (SMOD) Indicators” on page 63 and “Troubleshooting Using Status Indicators” on page 57.
7 | System A/Chassis rear Fault-Service Required LED (chassis SMOD0 System A) | Amber | Indicates that a fault might have occurred in SMOD0 System A or the server chassis.
See “System Module (SMOD) Indicators” on page 63 and “Troubleshooting Using Status Indicators” on page 57.
8 | CMOD Fault-Service Required LEDs 0, 1, 2, 3 | Amber | Indicates that a fault might have occurred in the corresponding CMODs supporting chassis SMOD0 System A.
See “System Module (SMOD) Indicators” on page 63 and “Troubleshooting Using Status Indicators” on page 57.
9 | Locate Button/LED (SMOD1 System B) | White | Indicates the SMOD1 System B location in the server when pressed.
■ Off – Server is operating normally.
■ Fast blink – Use Oracle ILOM to activate this LED to enable you to locate a system quickly and easily.
See “Managing the Locate Button/LED” on page 119.
10 | Fault-Service Required (SMOD1 System B) | Amber | Indicates a fault state in SMOD1 System B:
■ Off – Server is operating normally.
■ Steady On – A fault is present in SMOD1 System B.
See “System Module (SMOD) Indicators” on page 63 and “Troubleshooting Using Status Indicators” on page 57.
11 | System OK (SMOD1 System B) | Green | Indicates the operational state of SMOD1 System B:
■ Off – SMOD1 System B AC power is not present or the Oracle ILOM boot is not complete.
■ Flashing – SMOD1 System B is booting.
■ Steady On – OS has booted, power is on, and SMOD1 System B is running.
See “System Module (SMOD) Indicators” on page 63 and “Troubleshooting Using Status Indicators” on page 57.
12 | On/Standby button (SMOD1 System B) (recessed) | None | Use to locally control SMOD1 System B system power:
■ Four seconds or less press – Initiates a graceful shutdown.
■ Five seconds or more press – Initiates an immediate shutdown.
See “Powering Down the Server” on page 110 and “Power On the Server” on page 246.
13 | SP OK (SMOD1 System B) | Green | Indicates when the SMOD1 System B service processor (SP) is booting:
■ Flashing – SP is booting.
■ Steady On – Oracle ILOM is operational.
See “System Module (SMOD) Indicators” on page 63 and “Troubleshooting Using Status Indicators” on page 57.
14 | System Overtemperature Warning (SMOD1 System B) | Amber | Indicates that a fault might have occurred in the SMOD1 System B cooling subsystem. The system Fault-Service Required LED might also be lit.
See “System Module (SMOD) Indicators” on page 63 and “Troubleshooting Using Status Indicators” on page 57.
15 | Chassis rear Fault-Service Required LED (SMOD1 System B) | Amber | Indicates that a fault might have occurred in SMOD1 System B.
See “System Module (SMOD) Indicators” on page 63 and “Troubleshooting Using Status Indicators” on page 57.
16 | CMOD Fault-Service Required LEDs 4, 5, 6, 7 | Amber | Indicates that a fault might have occurred in the corresponding CMODs supporting SMOD1 System B.
See “System Module (SMOD) Indicators” on page 63 and “Troubleshooting Using Status Indicators” on page 57.
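The On/Standby button rows in the FIM table distinguish the two power-off types by press duration. The following Python sketch restates that rule; `classify_press` is a hypothetical helper written for illustration, and because the table does not say what happens for a press between four and five seconds, the sketch reports that range as unspecified rather than guessing.

```python
def classify_press(duration_s: float) -> str:
    """Classify an On/Standby button press by how long it was held, in seconds."""
    if duration_s <= 0:
        raise ValueError("press duration must be positive")
    if duration_s <= 4.0:
        # Four seconds or less: graceful shutdown.
        return "graceful shutdown"
    if duration_s >= 5.0:
        # Five seconds or more: immediate shutdown.
        return "immediate shutdown"
    # The table gives no behavior for presses between four and five seconds.
    return "unspecified"


for seconds in (1.0, 4.5, 6.0):
    print(seconds, "->", classify_press(seconds))
```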
Fan Module (FM) Indicators
Each fan module (FM) has two indicators arranged in a single row from left to right, as
shown in the following figure of the front of the FM.
Callout | Status LED or Button | Icon and Color | Description
1 | Fault-Service Required | Amber | Indicates a fault state in a fan module:
■ Off – Fan module is operating normally.
■ Steady On – A fault is present in the fan module.
2 | OK | Green | Indicates the functional state of the fan module:
■ Off – Fan module is powered off or functioning abnormally.
■ Steady On – Fan module is powered on and functioning normally.
Power Supply (PS) Indicators
Each power supply (PS) has three indicators arranged in a single row from left to right. Power
supplies for System A are PS2 and PS3. Power supplies for System B are PS0 and PS1.
Callout | Status LED or Button | Icon and Color | Description
1 | Locate button/LED, Fault-Service Required | Amber | Indicates the location of the power supply in the server:
■ Off – Power supply is operating normally.
■ Fast blink – Use Oracle ILOM to activate this LED to enable you to locate a power supply quickly and easily.
■ Steady On – Lights steady on when the power supply is in a fault state.
2 | OK, Power Supply OK LED | Green | Indicates the functional state of the power supply:
■ Off – PS is disconnected.
■ Steady On – PS is powered on and functioning normally. When this LED is lit, the AC OK LED is also lit.
Note - Oracle ILOM signals a fault on any installed power supply that is not connected to an AC power source, since it might indicate a loss of redundancy.
3 | AC OK LED | Input ~AC, Green | Indicates the operational state of the power supply:
■ Off – PS is not connected to an AC power source.
■ Steady On – PS is connected to a properly rated AC power source.
System Module (SMOD) Indicators
The back panel indicators located on the SMOD allow you to manage the server and determine
server status. The SMOD back panel includes some indicators and buttons not
found on the front indicator module (FIM), including reset switches and indicators for SMOD
components.
The following figure shows the back panel SMOD indicators.
Callout | Status LED or Button | Icon and Color | Description
1 | Locate button/LED | White | When activated remotely, lighting the SMOD Locate LED helps you find the server and SMOD0 System A or SMOD1 System B. Press this button to prove physical presence at the server chassis, as required for some Oracle ILOM tasks.
■ Off – Server is operating normally.
■ Fast blink – Use Oracle ILOM to activate this LED to enable you to locate a system quickly and easily.
See “Managing the Locate Button/LED” on page 119.
2 | Fault-Service Required | Amber | Indicates a fault state in SMOD:
■ Off – SMOD is operating normally.
■ Steady On – A fault is present in SMOD.
Other amber (fault) indicators might also be lit, which can help you isolate the fault to a particular subsystem.
See “Troubleshooting Using Status Indicators” on page 57.
3 | System OK | Green | Indicates the operational state of the SMOD:
■ Off – AC power is not present or the Oracle ILOM boot is not complete.
■ Flashing – SMOD is booting.
■ Steady On – OS has booted, power is on, and chassis SMOD is running.
Along with the SP indicator (below), the System OK LED provides the SMOD power status.
See “Troubleshooting Using Status Indicators” on page 57.
4 | SP OK | Green | Indicates when the SMOD service processor (SP) is booting:
■ Flashing – SP is booting.
■ Steady On – Oracle ILOM is operational.
Along with the System OK indicator (above), the SP OK LED provides the status of the system power.
See “Troubleshooting Using Status Indicators” on page 57.
5 | NET MGT 10/100/1000 Ethernet port Activity and Speed LEDs | Activity: Top left, Green. Link speed: Top right, Bi-colored: Amber/Green | The service processor NET MGT port is the optional connection to the Oracle ILOM service processor. The NET MGT port is configured by default to use Dynamic Host Configuration Protocol (DHCP). The service processor NET MGT port uses an RJ-45 cable for a 10/100/1000BASE-T connection.
NET MGT Activity LED: Top left Green
Indicates when the Oracle ILOM service processor (SP) network management (NET MGT) RJ-45 10/100/1000BASE-T port is active.
■ Steady On – Link up. Lights when the Network (NET) 10/100/1000BASE-T RJ-45 Gigabit Ethernet (GbE) NET0 port is active. Indicates a live network.
■ Off – No activity. No link. Not operational.
■ Flashing – Packet activity. Blinks with network traffic.
NET MGT Link speed LED: Top right Bi-colored: Amber/Green
■ Off – 10BASE-T link (if link up) (10 GigabitEthernet 10GBASE-T)
■ Amber On – 100BASE-T link (Fast Ethernet 100BASE-TX)
■ Green On – 1000BASE-T link (GigabitEthernet 1000BASE-T)
6 | SMOD Fault-Service Required | Amber | Indicates a fault state in the SMOD:
■ Off – SMOD is operating normally.
■ Steady On – A fault is present in SMOD.
Lights when the SMOD requires service.
7 | SMOD System OK | Green | Indicates the operational state of the SMOD:
■ Off – AC power is not present or the Oracle ILOM boot is not complete.
■ Flashing – SMOD is booting.
■ Steady On – OS has booted, power is on, and chassis SMOD is running.
Along with the SP indicator (above), the System OK LED provides the SMOD power status.
See “Troubleshooting Using Status Indicators” on page 57.
8 | HBA Fault-Service Required | Amber | Indicates a fault state in the SMOD internal HBA:
■ Off – HBA is operating normally.
■ Steady On – A fault is present in SMOD HBA.
Lights when the internal HBA requires service.
9 | NET2 10 GbE Ethernet port Activity and Speed LEDs | Activity: Top left, Green. Link speed: Top right, Bi-colored: Amber/Green | NET2 Activity LED: Top left Green
■ Steady On – Link up. Lights when the Network (NET) 10 GbE Gigabit Ethernet (GbE) RJ-45 NET2 port is active. Indicates a live network.
■ Off – No activity. No link. Not operational.
■ Flashing – Packet activity. Blinks with network traffic.
NET2 Link speed LED: Top right Bi-colored: Amber/Green
■ Off – 10BASE-T link (if link up) (10 GigabitEthernet 10GBASE-T)
■ Amber On – 100BASE-T link (Fast Ethernet 100BASE-TX)
■ Green On – 1000BASE-T link (GigabitEthernet 1000BASE-T)
10 | NET3 10 GbE Ethernet port Activity and Speed LEDs | Activity: Top left, Green. Link speed: Top right, Bi-colored: Amber/Green | NET3 Activity LED: Top left Green
■ Steady On – Link up. Lights when the Network (NET) 10 GbE Gigabit Ethernet (GbE) RJ-45 NET3 port is active. Indicates a live network.
■ Off – No activity. No link. Not operational.
■ Flashing – Packet activity. Blinks with network traffic.
NET3 Link speed LED: Top right Bi-colored: Amber/Green
■ Off – 10BASE-T link (if link up) (10 GigabitEthernet 10GBASE-T)
■ Amber On – 100BASE-T link (Fast Ethernet 100BASE-TX)
■ Green On – 1000BASE-T link (GigabitEthernet 1000BASE-T)
11 | NET0 10 GbE Ethernet port Activity and Speed LEDs | Activity: Top left, Green. Link speed: Top right, Bi-colored: Amber/Green | NET0 Activity LED: Top left Green
■ Steady On – Link up. Lights when the Network (NET) 10/100/1000BASE-T RJ-45 Gigabit Ethernet (GbE) NET0 port is active. Indicates a live network.
■ Off – No activity. No link. Not operational.
■ Flashing – Packet activity. Blinks with network traffic.
NET0 Link speed LED: Top right Bi-colored: Amber/Green
■ Off – 10BASE-T link (if link up) (10 GigabitEthernet 10GBASE-T)
■ Amber On – 100BASE-T link (Fast Ethernet 100BASE-TX)
■ Green On – 1000BASE-T link (GigabitEthernet 1000BASE-T)
12 | NET1 10 GbE Ethernet port Activity and Speed LEDs | Activity: Top left, Green. Link speed: Top right, Bi-colored: Amber/Green | NET1 Activity LED: Top left Green
■ Steady On – Link up. Lights when the Network (NET) 10 GbE Gigabit Ethernet (GbE) RJ-45 NET1 port is active. Indicates a live network.
■ Off – No activity. No link. Not operational.
■ Flashing – Packet activity. Blinks with network traffic.
NET1 Link speed LED: Top right Bi-colored: Amber/Green
■ Off – 10BASE-T link (if link up) (10 GigabitEthernet 10GBASE-T)
■ Amber On – 100BASE-T link (Fast Ethernet 100BASE-TX)
■ Green On – 1000BASE-T link (GigabitEthernet 1000BASE-T)
Note - The server does not provide video ports on the SMODs. Video display is only available
using the Oracle ILOM Remote Console Plus interface.
Storage Drive Indicators
Storage drives are installed in carriers. Each storage drive carrier has three indicators arranged
in a single stacked row from bottom to top.
The following illustration shows the front of the storage drive carrier and the storage drive
indicators.
Callout | Status LED or Button | Icon and Color | Description
1 | Ready to Remove | Blue | Indicates the removal status of the storage drive:
■ Off – Server is operating normally.
■ On – Lights when the storage drive is ready to be removed from the server in response to an action initiated from the server OS.
2 | Fault-Service Required LED | Amber | Indicates a fault state has been detected in the storage drive:
■ Off – Storage drive is operating normally.
■ Steady On – A fault is present in the storage drive.
3 | OK/Activity LED | Green | Indicates the operational state of the storage drive:
■ Off – AC power is not present or the Oracle ILOM boot is not complete.
■ Flashing – Blinks to show storage drive activity. Storage drive indicator blink rates vary by activity. See “Status Indicator Blink Rates” on page 76.
■ Steady On – The storage drive is functioning normally.
Dual PCIe Card Carrier (DPCC) Indicators
Each DPCC has two indicator panels, one for each PCIe slot inside the server. Each panel
contains a green OK indicator, an amber Fault-Service Required LED, and a recessed pinhole
Attention (ATTN) button. The ATTN buttons are used to initiate DPCC removal and installation.
Before removing a DPCC, use a stylus to press both ATTN buttons. After installing a DPCC
that contains a PCIe card, press the ATTN buttons again.
Callout | Status LED | Icon and Color | Description
1 | ATTN | ATTN | Attention (ATTN) DPCC recessed pinhole button to initiate DPCC removal and install
2 | Fault-Service Required/Locate LED | Amber | Indicates a fault state in DPCC:
■ Off – Server is operating normally.
■ Steady On – A fault is present in chassis SMOD0 System A.
DPCC Locate LED:
■ Off – DPCC is operating normally.
■ Fast blink – Use Oracle ILOM to activate this LED to enable you to locate a DPCC quickly and easily.
3 | DPCC OK indicator | Green | Indicates the operational state of DPCC:
■ Off – DPCC power is not present.
■ Flashing – DPCC is booting.
■ Steady On – DPCC power is on and running.
AC Power Block Inlet Indicators
Each power inlet on the AC power block at the server back panel has a single green OK
indicator that turns steady on only when the power at the connector is sufficient for the power
supply unit. The following figure shows AC inlets 0-3.
The server back panel AC inlets have the following designations.
Callout | Status LED or Button | Icon and Color | Description
1 | AC 0 | ~AC | AC 0 (SMOD1) System B
2 | AC 1 | ~AC | AC 1 (SMOD1) System B
3 | AC 2 | ~AC | AC 2 (SMOD0) System A
4 | AC 3 | ~AC | AC 3 (SMOD0) System A
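The inlet designations above lend themselves to a simple mapping. The following Python sketch is a hypothetical illustration, not an Oracle utility: `AC_INLET_SYSTEM`, `inlets_for`, and `all_inlets_ok` are invented names, and treating "every inlet for a system shows a lit OK indicator" as the condition to check is an assumption made for this example.

```python
# Inlet number -> system served, as listed in the AC inlet designation table.
AC_INLET_SYSTEM = {
    0: "System B (SMOD1)",
    1: "System B (SMOD1)",
    2: "System A (SMOD0)",
    3: "System A (SMOD0)",
}


def inlets_for(system: str) -> list:
    """Return the AC inlet numbers designated for the given system."""
    return sorted(n for n, s in AC_INLET_SYSTEM.items() if s == system)


def all_inlets_ok(system: str, lit_inlets: set) -> bool:
    """True if every inlet for the system has a lit green OK indicator."""
    return all(n in lit_inlets for n in inlets_for(system))


# Example: inlet 3 is dark, so System A is running on a single powered inlet.
print(inlets_for("System A (SMOD0)"))
print(all_inlets_ok("System A (SMOD0)", {0, 1, 2}))
```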
Do not attach power cables to the power supplies until you finish connecting the data cables to
the server. The server goes into Standby power mode, and the Oracle ILOM service processor
initializes when the AC power cables are connected to the power source. System messages
might be lost after 60 seconds if the server is not connected to a terminal, PC, or workstation.
About Controls and Indicators
The following sections describe the controls, indicators, connectors, and drives located on the
front and back panels.
■ “Controls and Indicators” on page 70
■ “Back Panel Pinhole Switches” on page 75
■ “Status Indicator Blink Rates” on page 76
Controls and Indicators
Use the buttons, switches, and status indicators on the front and back of the server, server
management software, and Oracle ILOM to troubleshoot the server:
■ Server Boot Process and Normal Operating State Indicators
■ Locate Button/LED Indicator On
■ Over Temperature Condition
■ PS Fault
■ Memory Fault
■ CPU Fault
■ Fan Module Fault
■ Service Processor Fault
■ Front Panel Lamp Test
Note - For the error state scenarios described below, the OK indicator state depends on presence
of redundant components and the severity of the fault.
Server Boot Process and Normal Operating State Indicators
During a normal server boot process, the SP OK indicator and the System OK indicator for
System A (SMOD0) or System B (SMOD1) show the boot progress. The following illustration shows SMOD0
System A (callout 1) and SMOD1 System B (callout 2).
Callout | System | Activity
1 | System A (SMOD0) |
■ CMOD 0 - 3 indicates the status of CMODs 0 - 3. The remaining A-system (SMOD0) controls and indicators provide information and control for System A.
■ Single 4-socket: The FIM provides controls and indicators for System A (SMOD0) only.
■ Dual 4-socket: The FIM provides separate controls and indicators for System A (SMOD0) and System B (SMOD1).
■ 8-socket: The FIM provides controls and indicators for System A (SMOD0) and System B CMODs 4-7. Other System B (SMOD1) buttons and indicators are not operational.
2 | System B (SMOD1) |
■ CMOD 4 - 7 indicates the status of CMODs 4 - 7.
■ Single 4-socket: System B (SMOD1) buttons and indicators are not operational.
■ Dual 4-socket: The FIM provides separate controls and indicators for System A (SMOD0) and System B (SMOD1).
■ 8-socket: The FIM provides controls and indicators for System A (SMOD0) and System B CMODs 4-7. Other System B (SMOD1) buttons and indicators are not operational.
The following table describes the indicator activity during a normal boot sequence.
System Condition | SP Indicator | Power OK Indicator
AC power applied to server. SP is booting. | Blinks | Off
SP is booted and ready to use. Host is off. | Steady On | Blinks at single blink rate (quick flash every 3 seconds)
SP is running. Host is booting. | Steady On | Blinks at fast rate
SP and host are running. This is the normal operating state of the system. | Steady On | Steady On
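The boot sequence table above can be read as a lookup from the two observed indicator states to a system condition. The following Python sketch shows one way to encode it; `BOOT_STATES` and `boot_stage` are hypothetical names, and the short state labels ("blinks", "single blink", and so on) are shorthand chosen for this example rather than terms defined by Oracle ILOM.

```python
# (SP indicator, Power OK indicator) -> system condition, from the boot table.
BOOT_STATES = {
    ("blinks", "off"): "AC power applied to server. SP is booting.",
    ("steady on", "single blink"): "SP is booted and ready to use. Host is off.",
    ("steady on", "fast blink"): "SP is running. Host is booting.",
    ("steady on", "steady on"): "SP and host are running (normal operating state).",
}


def boot_stage(sp_indicator: str, power_ok_indicator: str) -> str:
    """Map an observed indicator pair to the matching normal boot condition."""
    key = (sp_indicator.strip().lower(), power_ok_indicator.strip().lower())
    # Any other combination is not part of the normal boot sequence.
    return BOOT_STATES.get(key, "not a normal boot-sequence state")


print(boot_stage("Steady On", "Fast blink"))
```

A combination that falls outside the table, such as a blinking SP indicator with a steady Power OK indicator, is reported as outside the normal sequence; the fault scenarios elsewhere in this section describe what to check in that case.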
Locate Button/LED Indicator On
Locate Button/LEDs are white combination button/indicators that are located on both the front
FIM and back panel at SMOD0 and SMOD1. To deactivate (or activate) the Locate Button/
LED, press and release the Locate button. When the Locate Button/LED is on, the LED blinks
at the fast blink rate. You can turn the Locate Button/LED off remotely from Oracle ILOM,
or by pressing a Locate button on the chassis. The buttons on the server front and back allow
you to manage System A (SMOD0) and System B (SMOD1) Locate Buttons/LED indicators
locally.
■ Turn a Locate Button/LED on remotely from Oracle ILOM to locate the server in a rack.
Typically, a server readied for service is placed in Standby power mode and the SMOD0 or
SMOD1 Locate indicator is lit.
■ Press the SMOD0 or SMOD1 Locate Button/LED button to prove physical presence. Some
service procedures require you to prove physical presence by pressing the Locate Button/
LED button.
The following figure shows two Locate Button/LEDs for System A [callout 1] and System B
[callout 2] on the server front panel FIM.
Callout | System | Activity
1 | System A (SMOD0) |
■ Single 4-socket: The FIM Locate Button/LED indicator provides controls and indicators for System A (SMOD0) only.
■ Dual 4-socket: The FIM Locate Button/LED indicator provides separate controls and indicators for System A (SMOD0) and System B (SMOD1).
■ 8-socket: The FIM provides controls and indicators for System A (SMOD0) and System B CMODs 4-7. Other System B (SMOD1) buttons and indicators are not operational.
2 | System B (SMOD1) |
■ Single 4-socket: System B (SMOD1) Locate Button/LED indicator is not operational.
■ Dual 4-socket: The FIM Locate Button/LED indicator provides separate controls and indicators for System A (SMOD0) and System B (SMOD1).
■ 8-socket: The FIM Locate Button/LED indicator System B (SMOD1) buttons and indicators are not operational.
Over Temperature Condition
For a server in an over-temperature state, the server amber over-temperature indicator and the
amber Fault-Service Required LEDs (front and back) are steady on. The states of the front and
back green On/Standby, System OK, and the green SP indicators depend on the severity of the
condition.
PS Fault
For a server with a power supply (PS) in a fault state, the server amber Fault-Service Required
LEDs (front and back) and the amber Fault-Service Required indicator on the PS0-3 are steady
on. The front and back green On/Standby, System OK, and the green SP indicators are steady
on.
Memory Fault
For a server with a fault in the memory subsystem, the server amber Fault-Service Required
LEDs (front and back) and an amber CMOD Fault-Service Required LED are steady on. The
front and back green On/Standby, System OK, and the green SP indicators are steady on.
CPU Fault
For a server with a fault in the processor subsystem, the server amber Fault-Service Required
LEDs (front and back) and an amber CMOD Fault-Service Required LED are steady on. The
activity of the front and back green On/Standby, System OK, and the green SP indicators varies
depending on whether the server can boot successfully. The server might not be able to boot out
of Standby power mode.
Fan Module Fault
For a server with a fan module fault, the server amber Fault-Service Required LEDs (front and
back) and an amber Fault-Service Required LED on a fan module are steady on. The front and
back green On/Standby, System OK indicator, and the green SP indicators are steady on.
Service Processor Fault
For a server with an SP (service processor) fault, the server amber Fault-Service Required
LEDs (front and back) are steady on. The front and back System OK indicators and the SP OK
indicator are off.
Front Panel Lamp Test
To perform a lamp test of all front panel indicators, press the Locate Button/LED three times
within a five-second period. All the front and back indicators light up and remain steady on for
15 seconds (see “Unison Steady On” on page 79).
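The "three presses within a five-second period" condition can be expressed as a sliding-window check over press timestamps. The following Python sketch only illustrates that rule; `lamp_test_triggered` is a hypothetical function, and how the server firmware actually detects the sequence is not described in this manual.

```python
def lamp_test_triggered(press_times, window_s=5.0, presses=3):
    """Return True if `presses` button presses fall within `window_s` seconds.

    press_times: timestamps of Locate button presses, in seconds.
    """
    times = sorted(press_times)
    # Compare each press with the press that is (presses - 1) positions later.
    return any(
        times[i + presses - 1] - times[i] <= window_s
        for i in range(len(times) - presses + 1)
    )


print(lamp_test_triggered([0.0, 2.0, 4.5]))  # three presses inside 5 seconds
print(lamp_test_triggered([0.0, 3.0, 6.0]))  # presses spread over 6 seconds
```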
Back Panel Pinhole Switches
This section shows the location of the back panel pinhole switches.
Callout | Button | Icon | Description
1 | Non-maskable Interrupt (NMI) button (recessed) SMOD0 | | Do not press. This button is used by Oracle Service personnel only and requires a stylus.
2 | SP Reset button (recessed) SMOD0 | | Performs an immediate System A (SMOD0) SP reboot and requires a stylus.
3 | Host Warm Reset button (recessed) SMOD0 | | Performs an immediate System A (SMOD0) host reboot and requires a stylus.
4 | Non-maskable Interrupt (NMI) button (recessed) SMOD1 | | Do not press. This button is used by Oracle Service personnel only and requires a stylus.
5 | SP Reset button (recessed) SMOD1 | | Performs an immediate System B (SMOD1) SP reboot and requires a stylus.
6 | Host Warm Reset button (recessed) SMOD1 | | Performs an immediate System B (SMOD1) host reboot and requires a stylus.
Status Indicator Blink Rates
This section describes the following indicator blink rates:
■ Steady On
■ Steady Off
■ Slow Blink Rate
■ Fast Blink Rate
■ Single (Standby) Blink Rate
■ Slow Unison Blink Rate
■ Insertion Blink
■ Unison Steady On
■ Alternating (Invalid FRU) Blink Rate
■ Feedback Flash
■ Data Blink Rate
■ Sequential (Diagnostic) Blink Rate
Steady On
For the steady on state, an indicator is continually on (lit) and does not blink. This indicates a
continuing condition, for example, an operational state (green) or a Fault-Service Required fault
state (amber).
Steady Off
For the steady off state, an indicator is continually off (not lit) and does not blink. This indicates
that a system is not operational, for example, no AC power (unlit green OK indicator) or a
subsystem not in a fault state (unlit amber Fault-Service Required LED).
Slow Blink Rate
For the slow blink rate, the indicator (typically green) repeatedly lights for half a second and
turns off for half a second during each one-second interval (1 Hz). The slow blink rate indicates an
ongoing activity, for example, a device that is rebuilding, booting, or in transition from one mode to
another.
Fast Blink Rate
For the fast blink rate, the indicator repeatedly blinks twice (on, off, on) during a one second
interval (2 Hz). The fast blink rate indicates activity or data transfer.
Single (Standby) Blink Rate
For the single blink rate, the indicator repeatedly flashes once at the beginning of a three-second
interval. This indicates a system or component in Standby mode, for example, a server in
Standby power mode or a hot spare device waiting to be used. The single blink rate is also used
with amber indicators to indicate a predicted fault.
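The slow, fast, and single blink rates described above can be modeled as periodic on/off functions of time. The following Python sketch is a rough model under stated assumptions: the slow rate uses the half-second on, half-second off timing given in the text; the fast rate is assumed to have an equal on and off time within each half-second cycle; and the 0.1-second flash length for the single blink rate is a made-up value, since the manual gives only the three-second interval. The function name `is_lit` is hypothetical.

```python
def is_lit(pattern: str, t: float) -> bool:
    """Return True if an indicator using `pattern` is lit at time t (seconds)."""
    if pattern == "slow":
        # 1 Hz: on for the first half second of every one-second interval.
        return (t % 1.0) < 0.5
    if pattern == "fast":
        # 2 Hz: two on/off cycles per second (equal on/off time is assumed).
        return (t % 0.5) < 0.25
    if pattern == "single":
        # One flash at the start of each three-second interval.
        # The 0.1 s flash length is an assumption for illustration.
        return (t % 3.0) < 0.1
    raise ValueError(f"unknown pattern: {pattern}")


# Sample each pattern every 0.1 s over the first second.
for name in ("slow", "fast", "single"):
    samples = "".join("#" if is_lit(name, i / 10) else "." for i in range(10))
    print(f"{name:6s} {samples}")
```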
Slow Unison Blink Rate
For the slow unison blink rate, the indicators on the component blink in unison for half a second
during a one second interval (1 Hz). Typically, this is limited to three successive blinks. This
confirms the successful insertion of a removable device (for example, a storage drive) into a
powered system (confirming the power connection).
Insertion Blink
The insertion blink is three successive blinks of a hot-swap component's primary status
indicator, for example, the green OK indicator. The insertion blink occurs immediately after
three successive unison blinks (see “Slow Unison Blink Rate” on page 78) of all the
component indicators.
Unison Steady On
For the unison steady on, all indicators are simultaneously steady on (see “Steady
On” on page 76). This occurs during the front panel lamp test (see “Front Panel Lamp
Test” on page 75). This is the only time that the Locate Button/LED indicator is steady on.
Alternating (Invalid FRU) Blink Rate
The alternating (invalid FRU) blink rate is a repeating sequence of lit green and amber
indicators at 1 Hz. This indicates that a component has an incorrect version or mismatch, for
example, a power supply with a lower rating than the one specified. The blink rate is also used
for an unsupported component, or a component in an unsupported slot.
Feedback Flash
The indicator flashes on and off during periods of activity, commensurate with the activity, but
the flashing does not exceed the 2 Hz fast blink rate (see “Fast Blink Rate” on page 77).
For example, this blink rate occurs during disk drive read and write activity and communication
port transmit and receive activity.
Data Blink Rate
For this blink rate, a normally on indicator repeatedly turns off twice during a one-second
interval (2 Hz) (see “Fast Blink Rate” on page 77) while data activity is taking place.
Sequential (Diagnostic) Blink Rate
This blink rate is a repeating sequence in which each indicator successively lights for 0.5 sec
to indicate that diagnostics are running. This blink rate is used only on systems or components
capable of running diagnostics.
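The blink rates described in this section can be collected into a small lookup table, for example for use in a service checklist script. The following Python sketch is illustrative only: the names BLINK_RATES and describe_blink are hypothetical and are not part of Oracle ILOM or any Oracle tool, and the meanings are paraphrased from the descriptions above.

```python
# Hypothetical lookup table. The keys and helper name are illustrative;
# the meanings paraphrase the indicator descriptions in this section.
BLINK_RATES = {
    "slow": "Ongoing activity, such as rebuilding, booting, or a mode transition",
    "fast": "Activity or data transfer",
    "single": "Standby mode (on amber indicators, a predicted fault)",
    "slow unison": "Removable device successfully inserted into a powered system",
    "alternating": "Invalid FRU: incorrect version, mismatch, or unsupported component or slot",
    "sequential": "Diagnostics are running",
}


def describe_blink(pattern: str) -> str:
    """Return the documented meaning of a blink pattern, ignoring case and padding."""
    key = pattern.strip().lower()
    return BLINK_RATES.get(key, "Unknown pattern; see the indicator descriptions")


print(describe_blink("Fast"))
```

The table deliberately omits the patterns that describe a state rather than a rate (for example, unison steady on during the lamp test).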
Troubleshooting Server Subsystems
These sections describe the server subsystems:
■ “Processor Subsystem” on page 80
■ “Memory Subsystem” on page 81
■ “Power Subsystem” on page 82
■ “Cooling Subsystem” on page 84
■ “Storage Subsystem” on page 87
■ “Networking Subsystem” on page 88
■ “PCI Devices Subsystem” on page 89
For component serviceability, locations, and designations, see “Replaceable
Components” on page 22.
Processor Subsystem
Use the Oracle Integrated Lights Out Manager (ILOM) Processors page to view the health of
the CPUs installed on the CMODs.
The server processor subsystem consists of the following:
■ On-demand configuration for one 4-socket server, two independent 4-socket servers, or one 8-socket server.
■ 8-socket: Up to 6.0 TB DRAM (with 64 GB DIMMs) of DDR4 interface memory
■ 4-socket: Up to 3.0 TB DRAM (with 64 GB DIMMs) of DDR4 interface memory
■ 12 DIMMs/6 channels per CMOD. See the memory population rules in “Servicing DIMMs (CRU)” on page 182
for the order of installation.
■ DDR4 interface (2666 MT/s)
■ 2666 MT/s 2DPC (DIMMs per channel)
Power Subsystem
Use the Oracle Integrated Lights Out Manager (ILOM) Power page to view the overall health
and power consumption of the power supplies installed in your system. Review the Power
Supplies table for details about the health and location of individual power supplies.
Chassis power is provided by four hot-serviceable front panel accessible power supply
units (PSUs). The four PSUs provide dual (1+1) redundancy. Therefore, the minimum PSU
configuration is two. To ensure redundancy, at least two separate circuits should supply server
power. The following figure shows the indicator panel on the front of the power supplies.
Callout  Status LED or Button    Icon and Color  Description
1        Fault-Service Required  Amber           Lights steady on when the power supply is in a fault
                                                 state.
         Locate LED              White           Indicates the location of the power supply in the
                                                 server:
                                                 ■ Off – Power supply is operating normally.
                                                 ■ Fast blink – Use Oracle ILOM to activate this
                                                   LED to enable you to locate a power supply
                                                   quickly and easily.
2        Power Supply OK LED     OK, Green       Indicates the functional state of the power supply:
                                                 ■ Off – Power supply is disconnected.
                                                 ■ Steady On – Power supply is powered on and
                                                   functioning normally. When this LED is lit, the
                                                   AC OK LED is also lit.
3        AC OK LED               ~AC, Green      Indicates the operational state of the power supply:
                                                 ■ Off – Power supply is not connected to an AC
                                                   power source.
                                                 ■ Steady On – Power supply is connected to a
                                                   properly rated AC power source.
Note - Oracle ILOM signals a fault on any installed power supply that is not connected to an AC
power source, since it might indicate a loss of redundancy.
Each power supply is rated for 3060 W continuous output. Input is 220 VAC only (50-60 Hz).
Main output is 12 V at 244 A. Standby output is 12 V at 5 A. The maximum input line current
(200-277 VAC input) is less than 16 Amps RMS. The minimum holdup is 12 ms for Main output and
40 ms for Standby.
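As a rough cross-check of these figures, the short Python sketch below multiplies out the listed voltages and currents and totals the rated output for the minimum (two PSU) and full (four PSU) configurations. The variable names are illustrative. The results are plain arithmetic on the numbers above, not additional specifications, and the nominal products need not equal the 3060 W rating, which is quoted separately.

```python
# Figures copied from the power supply ratings above.
MAIN_VOLTS, MAIN_AMPS = 12, 244
STANDBY_VOLTS, STANDBY_AMPS = 12, 5
RATED_WATTS = 3060
PSU_COUNT, MIN_PSUS = 4, 2

# Nominal products of the listed output voltages and currents.
main_watts = MAIN_VOLTS * MAIN_AMPS
standby_watts = STANDBY_VOLTS * STANDBY_AMPS

# Total rated output for the minimum and the full PSU configuration.
min_config_watts = MIN_PSUS * RATED_WATTS
full_config_watts = PSU_COUNT * RATED_WATTS

print(f"Main output (nominal):    {main_watts} W")
print(f"Standby output (nominal): {standby_watts} W")
print(f"Rated output, {MIN_PSUS} PSUs:     {min_config_watts} W")
print(f"Rated output, {PSU_COUNT} PSUs:     {full_config_watts} W")
```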
When the AC power cords are connected to AC inputs at the back of the chassis, the power
supplies supply power to the Ethernet ports, the system sensors and inventory circuits, and the
service processor (SP). When power is supplied to the SP, the SP boots, and the server enters
the low-power Standby power mode.
Once the SP boots into Standby power, Main power is initiated by pressing and releasing the
chassis front panel On/Standby button or by powering on the server remotely from Oracle
ILOM.
For more information about power control, see “Power Control, Shutdown, and Reset
States” on page 118.
In the following figure, callout 1 shows the AC OK indicator for inlet ~AC 0.
Callout  Status LED or Button  Icon and Color  Description
1        AC OK LED             ~AC, Green      Indicates the operational state of the power supply:
                                               ■ Off – PS is not connected to an AC power
                                                 source.
                                               ■ Steady On – PS is connected to a properly rated
                                                 AC power source.
Cooling Subsystem
Use the Oracle Integrated Lights Out Manager (ILOM) Cooling page to view the health and
number of fans installed in your system. Additionally, you can view the server inlet and exhaust
temperatures. Review the Fans table for details about the health and location of individual fans.
System cooling air flows from front to back. Primary cooling is provided by eight redundant
front panel accessible 100 watt hot-swappable cooling fan modules.
To maintain the integrity of the chassis cooling system, ensure that:
■ Empty slots have filler panels. All necessary fillers ship with the system.
■ Each drive bay contains a storage device or a drive slot filler.
■ All DPCCs are installed regardless of whether they contain a card or not.
■ Both fan frames are populated with fan modules.
■ Each fan frame and fan module is populated.
■ All CMOD processors have a heatsink.
■ Each SMOD bay has an SMOD.
Cooling Zones
The server has five front-to-back cooling zones. The cooling zones are numbered from left to
right (from the front of the server) as zone 0 to zone 4.
The airflow cooling in zone 0 is concentrated through the power supplies (PSs) and is
provided by the internal PS fan modules. In a 4-socket configuration, zones 1 and 2 operate
independently from zones 3 and 4.
The fan modules (FM0-FM7) provide the airflow for cooling zones 1-4. Each zone has a pair
of dedicated FMs:
■ Zone 1 airflow cooling is concentrated on the CPU modules (CMODs) CMOD0 and
CMOD1 and is provided by FM0 and FM1.
■ Zone 2 airflow cooling is concentrated on CMOD2 and CMOD3 and is provided by FM2
and FM3.
■ Zone 3 airflow cooling is concentrated on CMOD4 and CMOD5 and is provided by FM4
and FM5.
■ Zone 4 airflow cooling is concentrated on CMOD6 and CMOD7 and is provided by FM6 and FM7.
Note - In a four-CMOD server configuration, the fan modules for cooling zones 3 and 4 are
not powered. However, to maintain the integrity of the cooling subsystem, FMs 4-7 must be
installed in the server.
Callout  Description                Cooling provided by
0        Zone 0: Power supplies     Four power supply fans
1        Zone 1: CMOD0 and CMOD1    FM0 and FM1
2        Zone 2: CMOD2 and CMOD3    FM2 and FM3
3        Zone 3: CMOD4 and CMOD5    FM4 and FM5
4        Zone 4: CMOD6 and CMOD7    FM6 and FM7
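The zone table follows a regular pattern: each pair of fan modules serves one zone, and each zone covers one pair of CMODs. The Python sketch below expresses that pattern. The function names are hypothetical and exist only for illustration. Zone 0 (power supplies) is left out because it is cooled by the internal PSU fans, not by fan modules.

```python
from typing import Tuple


def zone_for_fan_module(fm: int) -> int:
    """Return the cooling zone (1-4) served by fan module FM<fm>."""
    if not 0 <= fm <= 7:
        raise ValueError("fan modules are numbered FM0 to FM7")
    return fm // 2 + 1


def cmods_in_zone(zone: int) -> Tuple[str, str]:
    """Return the two CMODs cooled by a fan-module cooling zone (1-4)."""
    if not 1 <= zone <= 4:
        raise ValueError("fan-module cooling zones are numbered 1 to 4")
    first = 2 * (zone - 1)
    return (f"CMOD{first}", f"CMOD{first + 1}")


zone = zone_for_fan_module(5)
print(zone, cmods_in_zone(zone))
```

A lookup like this can help identify which CMODs lose airflow redundancy when a given fan module faults.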
Cooling Fan Power
Power for the internal PSU cooling fans (zone 0) is provided by the PSUs. Power for the fan
modules (zones 1-4) is supplied by CMOD0, CMOD2, CMOD4, and CMOD6.
■ The chassis cooling fans operate only when the chassis is in Main power mode (see “Power
Control, Shutdown, and Reset States” on page 118).
■ The PSU fans operate when the system is in Main power or Standby power mode.
The following table lists the CMODs and the fan modules to which they supply power.
CMOD     Fan Modules Powered
CMOD0    FM0 and FM1
CMOD2    FM2 and FM3
CMOD4    FM4 and FM5
CMOD6    FM6 and FM7
Note - The fan power connectors for CMODs in slots 1, 3, 5, and 7 are not used.
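The fan power assignments can be sketched the same way. In the Python example below, the function names are hypothetical. The logic only restates the table and note above: even-numbered CMODs power a pair of fan modules, and the fan power connectors of odd-numbered CMODs are unused.

```python
from typing import List


def powering_cmod(fm: int) -> str:
    """Return the CMOD that supplies power to fan module FM<fm>."""
    if not 0 <= fm <= 7:
        raise ValueError("fan modules are numbered FM0 to FM7")
    return f"CMOD{2 * (fm // 2)}"


def fans_powered_by(cmod: int) -> List[str]:
    """Return the fan modules powered by CMOD<cmod> (empty for odd slots)."""
    if not 0 <= cmod <= 7:
        raise ValueError("CMODs are numbered CMOD0 to CMOD7")
    if cmod % 2 == 1:
        return []  # fan power connectors in slots 1, 3, 5, and 7 are not used
    return [f"FM{cmod}", f"FM{cmod + 1}"]


print(powering_cmod(5), fans_powered_by(6), fans_powered_by(3))
```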
Fan Module Redundancy
The eight fan modules (FMs) provide airflow for chassis cooling zones 1-4. For redundancy,
each zone has two dedicated FMs. Replace a failed fan module immediately. The FMs are hot-serviceable.
Caution - Data Loss. Do not remove more than one fan module from a column while the system
is in Main power mode. This action removes power from the CMODs and causes an immediate
shutdown. On an eight-CMOD system, this applies to all fan modules. On a four-CMOD
system, this applies to the fan modules in the left-hand fan frame.
For FM reference and servicing information, see “Servicing Fan Modules (CRU) and Fan
Frames (CRU)” on page 129.
Storage Subsystem
Use the Oracle Integrated Lights Out Manager (ILOM) Storage page to view tables listing
health and inventory information for storage devices detected on your server.
The server storage subsystem consists of the following:
■ Storage Drives: 8 hot-swappable SAS3 HDD or SSD SFF drives, four per SMOD.
SMOD0: bays 0-3; SMOD1: bays 0-3.
■ Controllers
■ Volumes
■ Expanders
When a fault occurs on a server drive, the amber Fault-Service Required LED lights on the
front of the drive. This amber LED enables you to locate the faulted drive in the system.
Additionally, the front and rear panel Fault-Service Required LEDs also light when the server
detects a hard drive fault.
Networking Subsystem
Use the Oracle Integrated Lights Out Manager (ILOM) Networking page to view networking
information, including the status of Ethernet Controllers and Infiniband Controllers.
The server networking subsystem consists of the following:
■ Ethernet Controllers for network ports:
   SMOD0 System A NET0-3 10 GbE Network ports labeled NET0, NET1, NET2, and NET3
   SMOD1 System B NET0-3 10 GbE Network ports labeled NET0, NET1, NET2, and NET3
   Two 10/100/1000 GbE Network Ethernet ports, one NET MGT port each in SMOD0 and
   SMOD1
Ethernet ports enable you to connect the system to the network. The Ethernet ports use RJ-45 cables for 10/100/1000BASE-T connections.
Ethernet Port Status Indicators are two status indicators (LEDs) that are visible from the back of
the server.
Status Indicator Name  Location and Color       State and Meaning
Activity               Top left                 ■ ON – Link up. Lights when the Network (NET) 10/100/1000BASE-T RJ-45
                       Green                      Gigabit Ethernet (GbE) NET0 port is active. Indicates a live network.
                                                ■ Off – No activity. No link. Not operational.
                                                ■ FLASHING – Packet activity. Blinks with network traffic.
Link speed             Top right                ■ Off – 10BASE-T link (if link up) (10 GigabitEthernet 10GBASE-T)
                       Bi-colored: Amber/Green  ■ Amber ON – 100BASE-T link (Fast Ethernet 100 BASE-TX)
                                                ■ Green ON – 1000BASE-T link (GigabitEthernet 1000BASE-T)
See “Back Panel Connector Locations” on page 91.
I/O Subsystem
The server input/output (I/O) subsystem consists of the following:
■ 8 or 16 PCIe Gen3 I/O slots (up to eight 16-lane and eight 8-lane)
■ Two 10/100/1000 GbE Network Ethernet ports, one SER MGT port each in SMOD0 and
SMOD1
■ 4 USB 3.0 ports (2 external, one each in SMOD0 and SMOD1; 2 internal, one each in
SMOD0 and SMOD1)
Note - Internal USB ports are not used.
PCI Devices Subsystem
Use the Oracle Integrated Lights Out Manager (ILOM) PCI Devices page to view inventory
properties for the PCIe add-in cards and the built-in devices that are detected on your server.
To view the inventory properties for the devices shown on the PCI Devices page, follow these
steps:
1. Click the link at the top of the page for the appropriate PCI device.
2. View the inventory properties appearing in the table. If applicable, mouse-over the Details
column to view additional device properties.
The server PCI devices subsystem consists of the following components:
■ Installed add-in cards and devices: PCI Card optional component
■ On-board devices: Ethernet Controller NET0-3 (Ethernet NIC 1-4)
■ On-board devices: Internal HBAs (SAS controllers) in SMODs
Dual PCIe Card Carrier (DPCC)
In the following figure, callout 1 shows the location of the dual PCIe card carrier (DPCC) bays.
The eight DPCCs are directly accessible from the server back panel and are located below the
SMOD. Each DPCC holds one or two PCIe cards.
Attaching Devices to the Server
The following sections contain procedures for attaching devices to the server. Attach devices to
access diagnostic tools when troubleshooting and servicing the server:
■ “Attach Devices to the Server” on page 90
■ “Back Panel Connector Locations” on page 91
■ “Configuring Serial Port and Network Port Sharing” on page 92
■ “Ethernet Device Naming” on page 94
Attach Devices to the Server
This section provides instructions for connecting remote and local devices to the server so you
can interact with the service processor (SP) and the server console.
For port and connector information, see “Back Panel Connector Locations” on page 91 and
“Back Panel Components” on page 19.
1. Connect four Ethernet cables to the Gigabit Ethernet (NET) connectors as
   needed for OS support.
2. To connect to Oracle ILOM over the network, connect an Ethernet cable to the
   Ethernet port labeled NET MGT.
3. To access the Oracle ILOM command-line interface (CLI) locally using the
   management port, connect a serial null modem cable to the RJ-45 serial port
   labeled SER MGT.
Back Panel Connector Locations
The following illustration shows and describes the locations of the back panel connectors. Use
this information to set up the server, so that you can access diagnostic tools and manage the
server during service.
The following figure shows the locations of the server back panel connectors and ports.
Callout  Description                            System/SMOD        Available On
1        Net management port (NET MGT)          System B – SMOD1   Dual 4-socket systems only
2        Serial management port (SER MGT)       System B – SMOD1   Dual 4-socket systems only
3        USB 3.0 port                           System B – SMOD1   Dual 4-socket systems only
4        ■ NET0, NET1, NET2, and NET3           System B – SMOD1   Dual 4-socket and single
           ports on dual 4-socket systems                          8-socket systems
         ■ NET0, NET1, NET2, and NET3
           ports on single 8-socket systems
         ■ Unused ports on single 4-socket
           systems
5        Net management port (NET MGT)          System A – SMOD0   All systems
6        Serial management port (SER MGT)       System A – SMOD0   All systems
7        USB 3.0 port                           System A – SMOD0   All systems
8        NET0, NET1, NET2, and NET3 ports on    System A – SMOD0   All systems
         all systems
9        Power connectors 2 and 3               System A – SMOD0   Always connect all four
                                                                   power supplies. Connect to
                                                                   200-240 VAC only.
10       Power connectors 0 and 1               System B – SMOD1   Always connect all four
                                                                   power supplies. Connect to
                                                                   200-240 VAC only.
Configuring Serial Port and Network Port Sharing
By default, the SER MGT port connects to the Oracle ILOM CLI. You can assign serial port
output using either the Oracle ILOM web interface or the command-line interface (CLI). For
instructions, see the following sections:
■ “Assign Serial Port Output (Oracle ILOM CLI)” on page 93
■ “Assign Serial Port Output (Oracle ILOM Web Interface)” on page 93
By default, the SER MGT port connects to the SP console. Using Oracle ILOM, you can
configure the SER MGT port to connect to the host console instead. This feature is useful
for Windows kernel debugging, as it enables you to view non-ASCII character traffic from the
host console.
Do not configure the SER MGT port to connect to the host console until after you have
configured the Oracle ILOM network connection. Otherwise you cannot connect to Oracle
ILOM to switch it back from the host console.
For more details about restoring access to the server port on your server, see the Oracle
Integrated Lights Out Manager (ILOM) 4.0 Documentation Library at:
https://www.oracle.com/goto/ilom/docs.
Assign Serial Port Output (Oracle ILOM CLI)
1. Log in to the System A or System B SP Oracle ILOM CLI.
   Log in as a user with root or administrator privileges. For example:
   ssh root@ipaddress
   Where ipaddress is the IP address of the server SP.
   The Oracle ILOM CLI prompt appears: ->
   For more information, see “Using Oracle ILOM” in Oracle Server X7-8 Installation Guide.
2. To set the serial port owner, type:
   -> set /SP/serial/portsharing owner=host
   Note - The serial port sharing value by default is owner=SP.
3. Connect a serial host to the server.
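If you script this change, it can help to validate the owner value before sending anything to the SP. The following Python sketch only builds the command string shown in step 2. The function name and the VALID_OWNERS set are hypothetical, and the two owner values are the ones named in this procedure (host, and the default SP). Running the command on the SP, for example over SSH, is left to whatever tooling you already use.

```python
# Owner values named in this procedure: "host", and the default "SP".
VALID_OWNERS = {"host", "SP"}


def portsharing_command(owner: str) -> str:
    """Build the Oracle ILOM CLI command that assigns the serial port owner."""
    if owner not in VALID_OWNERS:
        raise ValueError(f"owner must be one of {sorted(VALID_OWNERS)}, got {owner!r}")
    return f"set /SP/serial/portsharing owner={owner}"


print(portsharing_command("host"))
```

Passing "SP" produces the command that returns the port to its default owner.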
Assign Serial Port Output (Oracle ILOM Web Interface)
1. Log in to the SP Oracle ILOM web interface.
   To log in, open a web browser and direct it to the IP address of the server SP.
   Log in as root or a user with administrator privileges. For more information, see “Using Oracle
   ILOM” in Oracle Server X7-8 Installation Guide.
   The Summary Information page appears.
2. Select ILOM Administration → Connectivity from the navigation menu on the left
   side of the screen.
3. Select the Serial Port tab.
   The Serial Port Settings page appears.
   Note - The serial port sharing setting by default is Service Processor.
4. In the Serial Port Settings page, select Host Server as the serial port owner.
5. Click Save for the changes to take effect.
6. Connect a serial host to the server.
Ethernet Device Naming
This section contains information about the boot order and device naming for the four 10-Gigabit
Ethernet ports on the back panel of the server. For location information, see “Back
Panel Components” on page 19. From right to left, the ports are numbered NET0 to NET3.
Note - Naming used by the interfaces might vary from that listed below depending on which
devices are installed in the system.
The device naming for the Ethernet interfaces is reported differently by different interfaces and
operating systems. The following table lists the logical (operating system) and physical (BIOS)
naming conventions used for each interface. These naming conventions might vary depending
on conventions of your operating system and which devices are installed in the server.
Port   BIOS  Oracle Solaris  Linux  Windows
Net 3  0703  igb 3           eth 3  net4
Net 2  0702  igb 2           eth 2  net3
Net 1  0701  igb 1           eth 1  net2
Net 0  0700  igb 0           eth 0  net
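For scripted inventory checks, the naming table can be held as a simple mapping. The Python sketch below copies the values from the table as printed. The dictionary name, the key names (bios, solaris, linux, windows), and the helper function are hypothetical. As the note above says, the names actually reported on a given system might differ, so treat the result as a starting point only.

```python
# Values copied from the table above; keys and names are illustrative.
ETHERNET_NAMES = {
    "NET0": {"bios": "0700", "solaris": "igb 0", "linux": "eth 0", "windows": "net"},
    "NET1": {"bios": "0701", "solaris": "igb 1", "linux": "eth 1", "windows": "net2"},
    "NET2": {"bios": "0702", "solaris": "igb 2", "linux": "eth 2", "windows": "net3"},
    "NET3": {"bios": "0703", "solaris": "igb 3", "linux": "eth 3", "windows": "net4"},
}


def interface_name(port: str, naming: str) -> str:
    """Look up the documented name of a back panel port under one naming scheme."""
    return ETHERNET_NAMES[port.upper()][naming.lower()]


print(interface_name("NET2", "linux"))
```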
Getting Help
The following sections describe how to get additional help to resolve server-related problems.
■ “Contacting Support” on page 94
■ “Locating the Chassis Serial Number” on page 95
Contacting Support
If the troubleshooting procedures in this chapter fail to solve your problem, use the following
table to collect information that you might need to communicate to support personnel.
System Configuration Information Needed                             Your Information
Service contract number
System model
Operating environment
System serial number
Peripherals attached to the system
Email address and phone number for you and a secondary contact
Street address where the system is located
Superuser password
Summary of the problem and the work being done when the
problem occurred
IP address
Server name (system host name)
Network or internet domain name
Proxy server configuration
Locating the Chassis Serial Number
You might need to have your server's serial number when you ask for service on your system.
Record this number for future use. Use one of the following methods to locate your server's
serial number:
■ On the front panel of the server, look at the middle left of the bezel to locate the server's
serial number.
■ The serial number is recorded on the yellow Customer Information Sheet (CIS). Locate the
yellow Customer Information Sheet (CIS) attached to your server packaging.
■ Using Oracle ILOM:
   ■ From the command-line interface (CLI), type the command: show /SYS.
   ■ From the web interface, view the serial number on the System Information screen.
Auto Service Requests
Oracle Auto Service Requests (ASR) is a feature available to customers having Oracle Premier
Support and is provided to those customers at no additional cost. Oracle ASR is the fastest
way to restore system availability if a hardware fault occurs. Oracle ASR software is secure
and customer installable, with the software and documentation downloadable from My Oracle
Support at https://support.oracle.com. When you log in to My Oracle Support, refer to the
"Oracle Auto Service Request" Knowledge Article document (ID 1185493.1) for instructions on
downloading the Oracle ASR software.
When a hardware fault is detected, Oracle ASR opens a service request with Oracle and
transfers electronic fault telemetry data to help expedite the diagnostic process. Oracle
diagnostic capabilities then analyze the telemetry data for known issues and deliver immediate
corrective actions. For security, the electronic diagnostic data sent to Oracle includes only what
is needed to solve the problem. The software does not use any incoming Internet connections
and does not include any remote access mechanisms.
For more information about the Oracle Auto Service Request feature, go to: https://www.
This section describes how to prepare the server for servicing. The topics describe safety
considerations and provide prerequisite procedures and information about replacing
components within the server.
■ “Electrostatic Discharge and Static Prevention Measures” on page 97
■ “Required Tools and Equipment” on page 100
■ “Preparing the Server for Component Replacement” on page 100
■ “Prepare the Server for Hot Service (Oracle ILOM CLI)” on page 102
■ “Prepare the Server for Hot Service (Oracle ILOM Web Interface)” on page 103
■ “Prepare the Server for Warm Service (Oracle ILOM CLI)” on page 104
■ “Prepare the Server for Warm Service (Oracle ILOM Web Interface)” on page 106
■ “Prepare the Server for Cold Service (Oracle ILOM CLI)” on page 107
■ “Prepare the Server for Cold Service (Oracle ILOM Web Interface)” on page 109
■ “Powering Down the Server” on page 110
■ “Managing the Locate Button/LED” on page 119
Electrostatic Discharge and Static Prevention Measures
Electrostatic discharge (ESD) sensitive devices, such as the PCIe cards, storage drives,
processors (CPUs), and memory cards, require special handling.
Using an Antistatic Wrist Strap
Wear an antistatic wrist strap when handling components such as storage drive assemblies,
circuit boards, or PCIe cards. When servicing or removing server components, attach an
antistatic strap to your wrist and then to a metal area on the server chassis. If your wrist strap is
equipped with a banana connector, insert it into the grounding socket on the right-hand side of
the chassis front panel.
Following this practice equalizes the electrical potentials between you and the server.
Using an Antistatic Mat
In addition to wearing an antistatic wrist strap when handling components, create an ESD-free
work place by using an antistatic mat as a work surface and as a place to set ESD-sensitive
components such as printed circuit boards, DIMMs, and processors (CPUs). You can use the
following items as antistatic mats:
■ Antistatic bag used to wrap a replacement part
■ ESD mat (orderable from Oracle)
■ A disposable ESD mat (shipped with some optional system components)
Safety Symbols
The following symbols might appear in this document. Note their meanings.
Caution - Risk of personal injury or equipment damage. To avoid personal injury or
equipment damage, follow the instructions.
Caution - Hazardous voltages are present. To reduce the risk of electric shock and danger to
personal health, follow the instructions.
Caution - Hot surface. Avoid contact. Surfaces are hot and might cause personal injury if
touched.
Warning Label
The following warning label is visible from the front of the server when you remove a fan
module. It warns you to not insert your hands or any object into the space left vacant by the
removal of the fan module. Fan modules are hot-swap components. Removing a fan module
Oracle ILOM includes a key identity properties (KIP) auto-update feature that ensures product
information that is used for service entitlement and warranty coverage is accurately maintained
by the server at all times, including during hardware replacement activities.
KIPs include the server product name, product part number (PPN), and product serial number
(PSN). KIPs are stored in the FRUID (field-replaceable unit identifiers) container of the three
server FRUs that are designated quorum members.
The quorum members include:
■ Disk backplane (DBP), designated as a primary quorum member.
■ Motherboard (MB), designated as a backup quorum member.
■ Power supply (PS), designated as a backup quorum member.
When a server FRU that contains the KIP is removed and a replacement component is installed,
the KIP of the replacement component is programmed by Oracle ILOM to contain the same
KIP as the other two components.
Only one of the quorum members can be replaced at a time. Automated updates can only be
completed when two of the three quorum members contain matching key identity properties.
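The two-of-three rule can be pictured as a simple majority check. The Python sketch below models only the rule stated above. It is not Oracle ILOM's implementation, the function name is hypothetical, and the KIP values in the usage line are made-up placeholders.

```python
from collections import Counter
from typing import Hashable, Optional, Sequence


def quorum_kip(kips: Sequence[Hashable]) -> Optional[Hashable]:
    """Return the KIP held by at least two of the three quorum members, else None."""
    if len(kips) != 3:
        raise ValueError("expected the KIPs of exactly three quorum members")
    value, count = Counter(kips).most_common(1)[0]
    return value if count >= 2 else None


# Placeholder values: (product name, PPN, PSN) for DBP, MB, and a replaced PS.
existing = ("Example Server", "PPN-0000", "PSN-0000")
blank = ("", "", "")
print(quorum_kip([existing, existing, blank]))
```

When two members agree, the agreed value is what a replacement component would be programmed with. When all three differ, the function returns None, which corresponds to the case where an automated update cannot complete.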
Related Information
■ “Servicing Components” on page 123
Preparing for Service99
Required Tools and Equipment
The server can be serviced with the following tools:
■ ESD mat and grounding strap
■ Antistatic wrist strap
■ No. 2 Phillips screwdriver
■ Non-conducting stylus
■ Labels and a pen for labeling cables
■ Mechanical lift
■ Torx T30 screwdriver (Processor replacement)
■ 12.0 in-lbs (inch-pounds) torque driver with Torx T30 bit (Processor replacement)
You might also need a system console device, such as one of the following:
■ PC or workstation with RS-232 serial port
■ ASCII terminal
■ Terminal server
■ Patch panel connected to a terminal server
Related Information
■ “Electrostatic Discharge and Static Prevention Measures” on page 97
Preparing the Server for Component Replacement
This section provides procedures to set up the server for hot, warm, or cold serviceability so you
can safely remove, replace, or install components.
Before you can remove and install components that are inside the server, you must perform
certain procedures in the following sections:
■ “Prepare the Server for Hot Service (Oracle ILOM CLI)” on page 102
■ “Prepare the Server for Hot Service (Oracle ILOM Web Interface)” on page 103
■ “Prepare the Server for Warm Service (Oracle ILOM CLI)” on page 104
■ “Prepare the Server for Warm Service (Oracle ILOM Web Interface)” on page 106
■ “Prepare the Server for Cold Service (Oracle ILOM CLI)” on page 107
■ “Prepare the Server for Cold Service (Oracle ILOM Web Interface)” on page 109