This software and related documentation are provided under a license agreement containing restrictions on use and disclosure and are protected by intellectual property laws. Except
as expressly permitted in your license agreement or allowed by law, you may not use, copy, reproduce, translate, broadcast, modify, license, transmit, distribute, exhibit, perform,
publish, or display any part, in any form, or by any means. Reverse engineering, disassembly, or decompilation of this software, unless required by law for interoperability, is
prohibited.
The information contained herein is subject to change without notice and is not warranted to be error-free. If you find any errors, please report them to us in writing.
If this is software or related documentation that is delivered to the U.S. Government or anyone licensing it on behalf of the U.S. Government, then the following notice is applicable:
U.S. GOVERNMENT END USERS: Oracle programs, including any operating system, integrated software, any programs installed on the hardware, and/or documentation,
delivered to U.S. Government end users are "commercial computer software" pursuant to the applicable Federal Acquisition Regulation and agency-specific supplemental
regulations. As such, use, duplication, disclosure, modification, and adaptation of the programs, including any operating system, integrated software, any programs installed on the
hardware, and/or documentation, shall be subject to license terms and license restrictions applicable to the programs. No other rights are granted to the U.S. Government.
This software or hardware is developed for general use in a variety of information management applications. It is not developed or intended for use in any inherently dangerous
applications, including applications that may create a risk of personal injury. If you use this software or hardware in dangerous applications, then you shall be responsible to take all
appropriate fail-safe, backup, redundancy, and other measures to ensure its safe use. Oracle Corporation and its affiliates disclaim any liability for any damages caused by use of this
software or hardware in dangerous applications.
Oracle and Java are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of their respective owners.
Intel and Intel Xeon are trademarks or registered trademarks of Intel Corporation. All SPARC trademarks are used under license and are trademarks or registered trademarks of
SPARC International, Inc. AMD, Opteron, the AMD logo, and the AMD Opteron logo are trademarks or registered trademarks of Advanced Micro Devices. UNIX is a registered
trademark of The Open Group.
This software or hardware and documentation may provide access to or information about content, products, and services from third parties. Oracle Corporation and its affiliates are
not responsible for and expressly disclaim all warranties of any kind with respect to third-party content, products, and services unless otherwise set forth in an applicable agreement
between you and Oracle. Oracle Corporation and its affiliates will not be responsible for any loss, costs, or damages incurred due to your access to or use of third-party content,
products, or services, except as set forth in an applicable agreement between you and Oracle.
Access to Oracle Support
Oracle customers that have purchased support have access to electronic support through My Oracle Support. For information, visit http://www.oracle.com/pls/topic/lookup?ctx=acc&id=info or visit http://www.oracle.com/pls/topic/lookup?ctx=acc&id=trs if you are hearing impaired.
Contents
Using This Documentation .......... 11
Oracle Server X5-4 Service Manual Overview .......... 13
Oracle Server X5-4 Overview .......... 15
Server Overview .......... 16
External Components and Features .......... 17
Server Front Panel Features .......... 17
Server Back Panel Features .......... 18
Server Subsystems Overview .......... 19
System Block Diagrams .......... 20
Post Codes From hostdiags .......... 350
Index .......... 351
10Oracle Server X5-4 Service Manual • December 2015
Using This Documentation
This section describes how to get the latest firmware and software for the system,
documentation and feedback, and a document change history.
■ “Oracle Server X5-4 Model Naming Convention” on page 11
■ “Getting the Latest Firmware and Software” on page 11
■ “Documentation and Feedback” on page 12
■ “About This Documentation” on page 12
■ “Contributors” on page 12
■ “Change History” on page 12
Oracle Server X5-4 Model Naming Convention
The Oracle Server X5-4 name identifies the following:
■ X identifies an x86 product.
■ The first number, 5, identifies the generation of the server.
■ The second number, 4, identifies the number of processor sockets in the server.
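As a minimal sketch of the naming convention above, the following Python snippet splits a model name into its three parts. The `ModelName` type and `parse_model_name` function are hypothetical names introduced for illustration only; they are not part of any Oracle tool.

```python
import re
from typing import NamedTuple


class ModelName(NamedTuple):
    """Hypothetical container for the parts of a model name."""
    architecture: str  # "x86" when the name starts with X
    generation: int    # first number in the name
    sockets: int       # second number: processor sockets in the server


def parse_model_name(name: str) -> ModelName:
    """Split a name such as "X5-4" according to the convention in the text."""
    match = re.fullmatch(r"X(\d+)-(\d+)", name.strip())
    if match is None:
        raise ValueError(f"not an X<generation>-<sockets> name: {name!r}")
    return ModelName("x86", int(match.group(1)), int(match.group(2)))


if __name__ == "__main__":
    print(parse_model_name("X5-4"))
```

The pattern accepts only names that begin with X, because the text defines the convention only for x86 products.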
Getting the Latest Firmware and Software
Firmware, drivers, and other hardware-related software for each Oracle x86 server are updated
periodically.
You can obtain the latest version in the following ways:
■ Oracle System Assistant: A factory-installed option for Oracle x86 servers. It has all the
tools and drivers you need and resides on an internal USB flash stick.
■ My Oracle Support: The Oracle support web site located at https://support.oracle.com.
Documentation and Feedback
Documentation | Link
All Oracle products | https://docs.oracle.com/
Oracle Server X5-4 | http://www.oracle.com/goto/X5-4/docs-videos
Oracle Integrated Lights Out Manager (ILOM). Refer to the documentation for your supported version of Oracle ILOM as listed in the Product Notes. | http://www.oracle.com/goto/ILOM/docs
Oracle Hardware Management Pack. Refer to the documentation for your supported version as listed in the Product Notes. | http://www.oracle.com/goto/ohmp/docs
Provide feedback on this documentation at: http://www.oracle.com/goto/docfeedback.
About This Documentation
This documentation set is available in both PDF and HTML. The information is presented in
topic-based format (similar to online help) and therefore does not include chapters, appendixes,
or section numbering.
Contributors
Primary Authors: Ray Angelo, Mark McGothigan, Ralph Woodley, Michael Bechler
Contributors: Kenny Tung, Johnny Hui, Prafull Singhal, Barry Wright, Cynthia Chin-Lee,
David Savard, Tamra Smith-Wasel, Todd Creamer, William Schweickert
Change History
The following lists the release history of this documentation set:
■ December 2015: Technical updates.
■ August 2015: Minor revisions and updates to docs and library.
■ June 2015: Initial publication.
Oracle Server X5-4 Service Manual Overview
This document contains service information and maintenance procedures for the Oracle®
Server X5-4. The following table describes the major sections of this manual.
Description | Link
Server system overview. | “Oracle Server X5-4 Overview” on page 15
Troubleshooting and diagnostic procedures and information. | “Troubleshooting and Diagnostics” on page 37
Server service-related information and procedures. | “Servicing the Server” on page 69
Procedures for preparing to service the server. | “Preparing to Service the Server” on page 93
Procedures for servicing customer-replaceable units (CRUs). | “Servicing CRU Components” on page 113
Procedures for servicing field-replaceable units (FRUs). | “Servicing FRU Components” on page 179
Procedures for preparing the server for operation. | “Returning the Server to Operation” on page 259
Accessing the BIOS setup program. | “BIOS Setup Utility Menu Options” on page 273
Listing of Power On Self-Test (POST) error codes and their meaning. | “POST and Checkpoint Codes” on page 341
Oracle Server X5-4 Overview
This section describes the major features, components, and capabilities of the server.
Description | Link
Server overview statement | “Server Overview” on page 16
Components and features of the server front and back panels | “External Components and Features” on page 17
Server subsystem components | “Server Subsystems Overview” on page 19
Server Overview
The Oracle Server X5-4 is a 3RU rack-mount server system. The following table lists the
server-supported components.
Processors
Supported configurations:
■ Two processors installed in sockets 0 and 1
■ Four processors installed in sockets 0 through 3
Memory
Up to eight memory riser cards are supported (two risers per CPU) in the server chassis. Each
memory riser supports up to twelve DDR3-1600 ECC low-voltage registered or load-reduced
DIMMs, allowing up to twenty-four DIMMs per processor. Installed DIMMs must be the
same type and size.
■ In a two-CPU system, you can install up to a maximum of 1.5 TB of system memory.
■ In a four-CPU system, you can install up to a maximum of 3 TB of system memory.
For information on supported DIMM configurations, see “Supported DIMMs and DIMM
Population Rules” on page 157.
Storage devices
For internal storage, the server chassis provides:
■ Six 2.5-inch drive bays, accessible through the front panel.
All bays can be populated with SAS-3 HDDs or SSDs.
Four of the six drive bays (2 through 5) can also support NVMe SSD drives.
Note - NVMe drive support requires the purchase of an optional PCIe NVMe Switch card
during the initial factory order of the server. It cannot be added later.
■ An optional tray-load DVD+/-RW drive on the front of the server, below the drive bays.
■ Oracle Storage 12 Gb/s SAS RAID PCIe HBA, Internal.
This card supports RAID levels 0, 1, 5, 6, 10, 50, and 60, with a minimum of 1 Gb
data cache, and Battery Backed Write Cache (BBWC) using an ESM (Energy Storage
Module).
USB 2.0 ports
■ Four external high-speed USB ports: two in front and two in back.
■ Two internal high-speed USB ports on the motherboard.
One internal port holds the optional factory-installed Oracle System Assistant (OSA)
flash drive. A second internal port can hold a USB flash drive for system booting.
VGA ports
Two high-density DB-15 video ports: one in front and one in back.
The server includes an embedded VGA 2D graphics controller with 8 MB supporting
resolutions up to 1600 x 1200 x 16 bits @ 60 Hz (1024 x 768 when viewed remotely using
Oracle ILOM RKVMS).
Note - The back VGA port supports VESA Display Data Channel (DDC) for monitor identification.
Service processor
Emulex Pilot 3 baseboard management controller (BMC):
■ Supports the industry-standard IPMI feature set
■ Supports remote KVMS, DVD, and ISO image over IP
■ Includes serial port
■ Supports Ethernet access to SP through a dedicated 10/100/1000 RJ-45 Gigabit Ethernet
(GbE) management port and optionally through one of the host GbE ports (sideband
management)
Power supplies
Two hot-swappable power supplies, each with 1030/2060 watts (low line/high line) capacity,
auto-ranging, light load efficiency mode and redundant over-subscription
Cooling fans
■ Six hot-swappable, redundant fans at chassis front (top-loading)
■ Two redundant fans, one in each power supply
Management software
The following options are available:
■ Oracle Integrated Lights Out Manager (ILOM) on the Service Processor
■ Oracle System Assistant (OSA) on an optional internal USB flash drive
■ Oracle Hardware Management Pack
■ Oracle Enterprise Manager Ops Center, which can be downloaded from the Oracle
site
Service labels
The system comes with two service labels for quick reference: one on the
exterior of the server (describing exterior components), and one on the underside of the top
cover (describing interior components).
External Components and Features
The following sections call out the features of the server front and back panels.
■ “Server Front Panel Features” on page 17
■ “Server Back Panel Features” on page 18
Server Front Panel Features
The following illustration shows the server front panel and describes its features.
Callout | Description
1 | Locator LED/button: white
2 | Service Action Required indicator: amber
3 | System OK indicator: green
4 | Power button
5 | SP OK indicator: green
6 | Service Action Required indicators (3) for fan module (FAN), Processor (CPU) and Memory:
10 | Service processor (SP) RJ-45 network management (NET MGT) port
11 | Service processor (SP) RJ-45 serial management (SER MGT) port
12 | DB-15 video connector
Server Subsystems Overview
This section provides information about the server subsystems:
■ “System Block Diagrams” on page 20
■ “Processor Subsystem” on page 22
■ “Memory Subsystem” on page 25
■ “Cooling Subsystem” on page 25
■ “Power Subsystem” on page 28
■ “Storage Subsystem” on page 30
■ “Input/Output (I/O) Subsystem” on page 31
■ “System Management Subsystem” on page 34
System Block Diagrams
The server can be configured with two or four CPUs. This section shows the system block
diagrams for these two server configurations:
■ “Two-CPU Block Diagram” on page 21
■ “Four-CPU Block Diagram” on page 22
Two-CPU Block Diagram
Four-CPU Block Diagram
Processor Subsystem
The Oracle Server X5-4 uses the Intel Xeon E7-8895 v3 18-core 2.6 GHz processor
and supports two CPU-based configurations: a two-CPU configuration and a four-CPU
configuration.
Two-CPU Configuration
Servers with two CPUs have CPUs and heatsinks in sockets 0 and 1 and CPU cover plates
installed in sockets 2 and 3. This configuration requires four memory riser cards and an
air baffle to control airflow for maximum cooling. The following illustration shows the
components in a two-CPU server configuration.
Callout | Description
1 | Air baffle
2 | CPU P1
3 | CPU P0
4 | Memory riser card P1/MR1
5 | Memory riser card P1/MR0
6 | Memory riser card P0/MR1
7 | Memory riser card P0/MR0
For more information, see “Two-CPU Block Diagram” on page 21.
Four-CPU Configuration
In addition to four CPUs, this configuration requires eight memory riser cards. The following
illustration shows the components in a four-CPU server configuration.
The four-CPU configuration offers a greater level of resiliency with redundant QPI
interconnects that allow working CPUs to route around a disabled CPU as the system starts.
For more information, see “Four-CPU Block Diagram” on page 22.
Memory Subsystem
This section describes the server memory subsystem.
For more information about system memory (including DIMM population rules) and MR cards,
see “Memory Riser Card and DIMM Reference” on page 154.
Memory Slot Capacity
System memory resides on memory riser (MR) cards. Each card has 12 DIMM slots. The slot
capacity of the server depends on the number of MR cards in the server, which in turn depends
on the number of CPUs in the server. The server is available in a two-CPU configuration
and a four-CPU configuration. Each CPU requires two MR cards. Therefore, a two-CPU
configuration has 48 slots (four MR cards) and a four-CPU configuration has 96 slots (eight
MR cards).
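The slot arithmetic above can be sketched in a few lines of Python. The constants come from the text; the function names are hypothetical, and the 32 GB DIMM size in `max_memory_gb` is an assumption chosen only because it reproduces the 1.5 TB and 3 TB maximums listed in the server overview.

```python
DIMM_SLOTS_PER_RISER = 12  # from the text: each MR card has 12 DIMM slots
RISERS_PER_CPU = 2         # from the text: each CPU requires two MR cards


def dimm_slots(cpu_count: int) -> int:
    """Return the number of DIMM slots for a two-CPU or four-CPU server."""
    if cpu_count not in (2, 4):
        raise ValueError("only two-CPU and four-CPU configurations exist")
    return cpu_count * RISERS_PER_CPU * DIMM_SLOTS_PER_RISER


def max_memory_gb(cpu_count: int, dimm_size_gb: int = 32) -> int:
    """Capacity with every slot filled; the 32 GB default is an assumption."""
    return dimm_slots(cpu_count) * dimm_size_gb


if __name__ == "__main__":
    for cpus in (2, 4):
        print(cpus, "CPUs:", dimm_slots(cpus), "slots,",
              max_memory_gb(cpus) / 1024, "TB with 32 GB DIMMs")
```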
Memory Channels and Buffers
MR cards contain twelve DIMM slots, four DDR3 channels, and two memory buffer ASICs.
Each memory buffer has two channels (A and B) and links to three DIMM slots per channel.
Each memory buffer is connected to the processor's built-in memory controller by an SMI-2
link.
Memory Performance
For balanced performance, each channel for each memory buffer on the MR card must be
populated. The only exception supported is a minimum factory configuration of two 16 GB
DIMMs per MR card in DIMM slots D0 and D3.
Cooling Subsystem
The internal components in the system are cooled by air that is pulled in through the front of the
server and exhausted out the back of the server. Cooling occurs in two areas of the chassis, the
power supply area and the motherboard area.
Power Supply Cooling Area
The power supply area uses fans at the back of the power supplies to draw cool air in past the
drives, through power supplies, and out the back.
Motherboard Cooling Area
The motherboard area is divided into three zones where six 92-mm high-performance fans
pull cool air in from the front of the server, move it across the motherboard, memory risers,
processors, and I/O cards, and exhaust warm air out the back of the server.
The six fan modules are arranged in two rows allowing for a pair of redundant stacked fans for
each of the three motherboard zones. If one of the fan modules fails, the other fan module in the
pair has sufficient power to cool the zone until the failed fan can be replaced. However, if both
fan modules in a pair fail, Oracle ILOM will power off the system to prevent thermal damage.
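The redundancy rule described above (one failed fan in a pair is tolerated, two failed fans in the same pair force a power-off) can be expressed as a small check. This is only an illustrative sketch: the data layout and function names are hypothetical and do not reflect how Oracle ILOM is implemented.

```python
from typing import Dict, Tuple

# Hypothetical input: for each motherboard zone (0, 1, 2), whether each of
# the two stacked fan modules in the pair is working.
FanState = Dict[int, Tuple[bool, bool]]


def zone_status(fans: FanState) -> Dict[int, str]:
    """Classify each zone as redundant, degraded, or failed."""
    status = {}
    for zone, (first_ok, second_ok) in fans.items():
        if first_ok and second_ok:
            status[zone] = "redundant"
        elif first_ok or second_ok:
            status[zone] = "degraded"  # remaining fan cools the zone
        else:
            status[zone] = "failed"    # both fans in the pair are down
    return status


def must_power_off(fans: FanState) -> bool:
    """True when any zone has lost both fans of its pair."""
    return any(state == "failed" for state in zone_status(fans).values())


if __name__ == "__main__":
    fans = {0: (True, True), 1: (True, False), 2: (True, True)}
    print(zone_status(fans), must_power_off(fans))
```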
Pressure Areas
The power supply and motherboard cooling areas have separate air pressures. The pressures are
maintained by a plastic divider that, combined with the top cover, creates a seal between the two
areas. It is important to maintain this seal because separate pressurization of each area is
key to maintaining the integrity of the cooling system and the health of the server.
Cooling Zones and Temperature Sensors
The two cooling areas are divided into four zones, one zone for the power supply area and three
zones for the motherboard area. Dividing the cooling into zones allows for greater use of system
resources, since each zone can operate independently at its highest efficiency. The zones are
designated from left to right (from the front of the server) as: zone 0, zone 1, zone 2, and zone 3
(power supply area). Temperature monitoring of each zone is accomplished using motherboard-mounted temperature sensors.
The following illustration shows the cooling zones and the approximate location of the
temperature sensors. The accompanying legend table provides sensor NAC names and sensor
motherboard designations:
A two-CPU server configuration has fewer components than a fully loaded, four-CPU
configuration. To maximize cooling in a two-CPU configuration, an air baffle is installed
in the memory riser area. The air baffle directs the fan output across the four memory riser
cards and the two CPUs. For more information about the processor subsystem, see “Processor
Subsystem” on page 22.
Over-Temperature Issues
When the server cooling system is compromised by a hardware component failure or an airflow
blockage, the internal temperature of the server can increase and cause component failure.
To protect against over-temperature conditions, the server temperature and components are
monitored using sensors. If the reading from a sensor indicates a temperature outside of the
normal operating range of the component, or if a cooling subsystem-related component (such
as a fan module) fails, the server management software lights the server fault indicator for the
component and logs an event in the system event log (SEL). When a fault event occurs, address
the issue immediately.
For information about troubleshooting the server cooling subsystem, see “Troubleshooting
System Cooling Issues” on page 52.
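The monitoring behavior described in this section amounts to comparing each sensor reading with the normal operating range of its component. The following sketch illustrates that comparison only. The `Sensor` type, the sensor names, and the temperature limits are all hypothetical, and the real server management software is not implemented this way as far as this manual states.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Sensor:
    """Hypothetical temperature sensor with an assumed operating range."""
    name: str
    reading_c: float
    low_c: float
    high_c: float


def fault_events(sensors: List[Sensor]) -> List[str]:
    """Return one event string per sensor outside its operating range."""
    events = []
    for s in sensors:
        if not (s.low_c <= s.reading_c <= s.high_c):
            events.append(
                f"{s.name}: {s.reading_c} C outside {s.low_c}-{s.high_c} C"
            )
    return events


if __name__ == "__main__":
    readings = [
        Sensor("zone0_example", 41.0, 5.0, 70.0),  # made-up name and limits
        Sensor("zone1_example", 83.5, 5.0, 70.0),
    ]
    for event in fault_events(readings):
        print(event)
```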
Power Subsystem
The server is equipped with two 1030/2060 watt auto-ranging hot-swappable power supplies
that support a two-CPU configuration at 110–127 VAC, or a two-CPU or four-CPU configuration at
200–240 VAC. The dual power supply configuration provides N+N redundancy.
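The relationship between line voltage and supported CPU configuration can be sketched as a lookup. The function names are hypothetical. Pairing the 1030 watt figure with the 110–127 VAC range and the 2060 watt figure with the 200–240 VAC range is an assumption based on the "low line/high line" wording in the server overview.

```python
from typing import Optional, Tuple

LOW_LINE_VAC = (110, 127)   # range given in the text
HIGH_LINE_VAC = (200, 240)  # range given in the text


def supported_cpu_counts(line_vac: float) -> Tuple[int, ...]:
    """CPU configurations the power supplies support at a given voltage."""
    if LOW_LINE_VAC[0] <= line_vac <= LOW_LINE_VAC[1]:
        return (2,)
    if HIGH_LINE_VAC[0] <= line_vac <= HIGH_LINE_VAC[1]:
        return (2, 4)
    return ()  # voltage outside both documented ranges


def psu_capacity_watts(line_vac: float) -> Optional[int]:
    """Per-supply capacity; the range-to-wattage pairing is an assumption."""
    if LOW_LINE_VAC[0] <= line_vac <= LOW_LINE_VAC[1]:
        return 1030
    if HIGH_LINE_VAC[0] <= line_vac <= HIGH_LINE_VAC[1]:
        return 2060
    return None


if __name__ == "__main__":
    for volts in (120, 230):
        print(volts, supported_cpu_counts(volts), psu_capacity_watts(volts))
```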
The server supports the following power modes, server shutdowns, and resets.
Full Power Mode
When full power mode is applied, power is supplied to all the server components, the server
boots, and the operating system (OS) functions. Apply full power mode by pressing the Power
button on the server front panel when the server is in standby power mode. You can also apply
full power to the server from Oracle ILOM. Once the server is operating in full power mode,
the System OK and service processor (SP) indicators are on steady (see “Server Boot Process
and Normal Operating State Indicators” on page 41).
Note - During an initial power-up sequence, the front fans briefly run at full speed as part
of a power-on test.
Standby Power Mode
Standby power is a non-operating mode (OS does not boot), in which low-level power is
supplied only to the components that are required to run the SP. To enter standby power mode,
connect the AC power cables to the back of the server, but do not press the front panel Power
button. You can also enter standby power mode by powering off the server from full power
mode using one of the power-off methods (see below).
In standby power mode, the green SP indicator blinks while the SP is booting. Once the
SP has booted, this SP indicator is steady on, and the green System OK indicator goes to
standby blink (once every 3 seconds). See “Server Boot Process and Normal Operating State
Indicators” on page 41.
Graceful Shutdown
A graceful shutdown (also referred to as an orderly shutdown) is the safest method of shutting
down the server to standby power mode because it warns users, closes files, and prepares the
file system. To perform a graceful shutdown, use the server OS, Oracle ILOM, or the server
front panel Power button.
To perform a graceful shutdown with the Power button, press it once (momentarily). To perform
an immediate (emergency) shutdown with the Power button, press and hold it for more than
5 seconds.
Immediate Shutdown
An immediate shutdown of the server (also referred to as an emergency shutdown) should be
used only in situations when you know that no data will be lost or that the loss of data is acceptable. An
immediate shutdown does not warn users, does not properly close files, and does not gracefully
shut down the operating system. Full power is immediately removed and the server goes to
standby power mode.
Complete Power Removal
Shutting down the server from full power mode to standby power mode does not completely
remove power from the server. When it is in standby power mode, the server is in a low-power
state. This low-power state is enough to maintain the service processor (SP), which runs Oracle
ILOM. To completely remove power from the server, you need to remove the AC power cords.
Warm Reset or Reboot
A warm reset is a reboot or restart of the server that occurs when you cycle server power from
full power mode to standby power mode and back to full power mode. For example, a warm
reset might be required after a software or firmware update or when you want to launch Oracle
System Assistant or the BIOS Setup Utility.
Cold Reset
A cold reset occurs when you restart the server from a completely powered-off state. A cold
reset might be required to resolve a system issue. To perform a cold reset, place the server in
standby power mode, disconnect the server from its AC power source, wait 30-60 seconds, then
reconnect the server to its AC power source, allow the SP to boot, and then reapply full power.
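The power modes, shutdowns, and resets described in this section can be summarized as a small state machine. The sketch below models only the transitions that the text describes. The state and action names are hypothetical labels, and the 30-60 second wait and the SP boot time in a cold reset are not modeled.

```python
from enum import Enum
from typing import Iterable


class Power(Enum):
    OFF = "AC power removed"
    STANDBY = "standby power mode"
    FULL = "full power mode"


# Only the transitions described in the text are listed here.
TRANSITIONS = {
    (Power.OFF, "connect_ac"): Power.STANDBY,
    (Power.STANDBY, "power_on"): Power.FULL,
    (Power.FULL, "graceful_shutdown"): Power.STANDBY,
    (Power.FULL, "immediate_shutdown"): Power.STANDBY,
    (Power.STANDBY, "disconnect_ac"): Power.OFF,
}

WARM_RESET = ["graceful_shutdown", "power_on"]
COLD_RESET = ["graceful_shutdown", "disconnect_ac", "connect_ac", "power_on"]


def apply(state: Power, actions: Iterable[str]) -> Power:
    """Run a sequence of actions and return the resulting power state."""
    for action in actions:
        key = (state, action)
        if key not in TRANSITIONS:
            raise ValueError(f"{action!r} is not modeled in state {state.name}")
        state = TRANSITIONS[key]
    return state


if __name__ == "__main__":
    print(apply(Power.FULL, WARM_RESET))
    print(apply(Power.FULL, COLD_RESET))
```

Both reset sequences start and end in full power mode; the difference is that the cold reset passes through the state with AC power removed.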
See Also:
■ “Power On the Server” on page 271
■ “Powering Off the server” on page 101
■ “Resetting the Host or Service Processor” on page 63
Storage Subsystem
The server storage subsystem consists of the following:
■ “Six 2.5-inch Drive Bays” on page 30
■ “SATA DVD +/-RW Drive” on page 31
Six 2.5-inch Drive Bays
The six 2.5-inch storage drive bays are located at the front of the server. The supported drive
interfaces for each bay depend on the type of storage controller installed at the factory. SAS
drives require a SAS Host Bus Adapter (HBA) and NVMe drives require a PCIe NVMe Switch
card.
Callout | Description
1 | Slots for SAS or NVMe drives
2 | Slots for SAS drives only
■ When configured with SAS drives (mechanical or SSD), the system must have one Oracle
Storage 12 Gb/s SAS RAID PCIe HBA, Internal (7110117) installed in PCIe slot 2. The
PCIe Gen-3 internal HBA has eight internal SAS3 ports that connect from the card to the
system backplane through two bundled cables.
■ When configured with NVMe SSD drives (up to four), the system must have one Oracle
PCIe NVMe Switch card (7111393) installed in PCIe slot 1. The NVMe switch card has
four NVMe internal ports that connect from the card to the system disk backplane through
four bundled cables.
For slot designations, see “DVD, Storage Drive, and USB Designations” on page 78.
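The drive bay rules above can be sketched as a check that reports which controller cards a drive layout requires. The function and the controller labels are hypothetical. Numbering the six bays 0 through 5 is an assumption inferred from the statement that bays 2 through 5 accept NVMe drives.

```python
from typing import Dict, Set

ALL_BAYS = set(range(6))        # assumption: bays are numbered 0 through 5
NVME_CAPABLE_BAYS = {2, 3, 4, 5}  # from the text: bays 2 through 5


def required_controllers(drives: Dict[int, str]) -> Set[str]:
    """Map a {bay: "sas" | "nvme"} layout to the controller cards it needs."""
    needed = set()
    for bay, kind in drives.items():
        if bay not in ALL_BAYS:
            raise ValueError(f"bay {bay} does not exist")
        if kind == "sas":
            needed.add("sas_hba")      # SAS drives require the SAS HBA
        elif kind == "nvme":
            if bay not in NVME_CAPABLE_BAYS:
                raise ValueError(f"bay {bay} does not accept NVMe drives")
            needed.add("nvme_switch")  # NVMe drives require the switch card
        else:
            raise ValueError(f"unknown drive type {kind!r}")
    return needed


if __name__ == "__main__":
    layout = {0: "sas", 1: "sas", 2: "nvme", 3: "nvme"}
    print(sorted(required_controllers(layout)))
```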
SATA DVD +/-RW Drive
An optional DVD-RW SATA-Gen3 drive is located at the front of the server below the drive
bays. A SATA3 port on the motherboard connects to the disk backplane through a SATA cable
bundled with the HBA SAS1 cable.
For DVD designation, see “DVD, Storage Drive, and USB Designations” on page 78.
Input/Output (I/O) Subsystem
The server I/O subsystem consists of the following:
■ “Eleven PCIe Gen 3 Slots” on page 31
■ “Two Internal and Four External High-Speed USB Ports” on page 32
■ “SATA DVD +/-RW Drive” on page 31
Eleven PCIe Gen 3 Slots
The server contains 11 PCIe Gen 3 slots, of which nine are x8 slots and two are x16 slots. All
11 slots are available in a four-CPU configured server. Only the first six slots (1-6) are available
in a two-CPU configured server.
Slot 2 is reserved for the HBA that can support up to six SAS/SATA (mechanical or SSD)
drives in all six drive slots.
Slot 1 can be used for a factory option PCIe NVMe Switch Card that can support up to four
NVMe SSDs in drive slots 2 through 5.
Note - The NVMe Switch card is only supported in PCIe slot 1 and will not work in other PCIe
slots.
For slot designation information, see “PCIe Slot Designations” on page 77.
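The PCIe slot rules in this section can be sketched as a placement check. The card labels and function names are hypothetical. The sketch assumes that slot 1 remains usable by other cards when no NVMe switch card is installed, which the text implies but does not state outright.

```python
from typing import List


def available_pcie_slots(cpu_count: int) -> range:
    """Slots 1-11 with four CPUs, slots 1-6 with two CPUs (from the text)."""
    if cpu_count == 4:
        return range(1, 12)
    if cpu_count == 2:
        return range(1, 7)
    raise ValueError("only two-CPU and four-CPU configurations exist")


def placement_problems(card: str, slot: int, cpu_count: int) -> List[str]:
    """Return a list of rule violations; an empty list means no problem."""
    problems = []
    if slot not in available_pcie_slots(cpu_count):
        problems.append(f"slot {slot} is not available with {cpu_count} CPUs")
    if card == "nvme_switch" and slot != 1:
        problems.append("the NVMe switch card works only in slot 1")
    if card == "sas_hba" and slot != 2:
        problems.append("the internal SAS HBA belongs in slot 2")
    if card != "sas_hba" and slot == 2:
        problems.append("slot 2 is reserved for the internal SAS HBA")
    return problems


if __name__ == "__main__":
    print(placement_problems("nvme_switch", 1, 2))
    print(placement_problems("other_card", 7, 2))
```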
Two Internal and Four External High-Speed USB Ports
The two internal USB ports are located on the motherboard between the disk drive backplane
and the PSU backplane boards. These ports can take a standard USB flash device, which can
be used for system booting. Your system might be equipped with a preinstalled Oracle System
Assistant USB device.
The Oracle System Assistant provides a separately bootable device that aids in the installation
of the primary host OS, server hardware configuration, and the firmware upgrade process. Do
not use the Oracle System Assistant USB drive as the primary host boot device or as server
storage. If installed in your server, the Oracle System Assistant USB drive must be installed in
the port labeled "OSA USB."
For port designations, see “DVD, Storage Drive, and USB Designations” on page 78.
Additionally, the server has four external USB ports, two on the front panel and two on the back
panel. See “External Components and Features” on page 17.
Four Onboard 10GbE Ports
Four 10 GigabitEthernet ports are located on the back panel of the server (see “Back Panel
Connector Locations” on page 60). From left to right, the two bottom ports are NET 0 and
NET 1; the two top ports are NET 2 and NET 3, as shown in the following illustration.
32Oracle Server X5-4 Service Manual • December 2015
BIOS detects the Ethernet ports in the following order during server boot:
1. NET 0
2. NET 1
3. NET 2
4. NET 3
Note - You can change the boot priority using the Boot Device Priority screen available in the
Boot menu of the BIOS Setup Utility.
The device naming for the Ethernet interfaces is reported differently by different interfaces
and operating systems. The following illustration explains the logical (operating system) and
physical (BIOS) naming conventions used for each interface.
Note - Naming used by the interfaces might vary from that listed below, depending on which
devices are installed in the system.
Port  | BIOS | Solaris | Linux | Windows
Net 3 | 8101 | igb 3   | eth 3 | net 4
Net 2 | 8100 | igb 2   | eth 2 | net3
Net 1 | 0701 | igb 1   | eth 1 | net2
Net 0 | 0700 | igb 0   | eth 0 | net
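A script that must translate between these names could hold the table above as a lookup. The following is a minimal sketch; the dictionary and function names are hypothetical, the values are transcribed from the table as printed, and the actual names might differ depending on the installed devices.

```python
# Values transcribed from the naming table above, including its spacing.
PORT_NAMES = {
    "Net 0": {"BIOS": "0700", "Solaris": "igb 0", "Linux": "eth 0", "Windows": "net"},
    "Net 1": {"BIOS": "0701", "Solaris": "igb 1", "Linux": "eth 1", "Windows": "net2"},
    "Net 2": {"BIOS": "8100", "Solaris": "igb 2", "Linux": "eth 2", "Windows": "net3"},
    "Net 3": {"BIOS": "8101", "Solaris": "igb 3", "Linux": "eth 3", "Windows": "net 4"},
}

# Default order in which BIOS detects the ports during server boot.
BIOS_DETECTION_ORDER = ["Net 0", "Net 1", "Net 2", "Net 3"]


def name_for(port: str, system: str) -> str:
    """Return the name that `system` reports for a physical port."""
    return PORT_NAMES[port][system]
```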
System Management Subsystem
The server has two embedded system management tools (Oracle ILOM and Oracle System
Assistant) and a suite of command line tools that can be run from the host.
Service Processor (SP) Oracle ILOM
The server comes with a removable service processor (SP) daughter card that is mounted on
the server motherboard. The SP supports the industry-standard IPMI feature set and includes
Oracle Integrated Lights Out Manager (ILOM) 3.2.5 and remote redirection of keyboard, video,
mouse, and storage (KVMS).
The SP runs Oracle ILOM, a single-server management tool that allows you to monitor
and maintain your server by providing real-time status and detailed information about the
subsystems and components in your server. Oracle ILOM runs independently of the server
OS and is accessible in both full power and standby power modes. The SP Oracle ILOM is
accessible through the SP 100/1000/10000 Ethernet NET MGT port located on the server
back panel or through one of the host's four built-in 10 GigabitEthernet ports (using sideband
management).
Oracle System Assistant
Your server might also come equipped with Oracle System Assistant. Oracle System Assistant
is a server provisioning and update tool that assists in initial server setup and OS installation
and allows you to easily manage server updates. As an option, Oracle System Assistant is
delivered on a USB flash drive that is factory-installed in the internal USB slot labeled "OSA
USB." The drive is configured with a server-specific version of Oracle System Assistant. You
can start Oracle System Assistant from the server boot screen or from Oracle ILOM.
With Oracle System Assistant, you can:
■
Get a single server-specific bundle of the latest available BIOS, Oracle ILOM, and
hardware firmware and the latest tools and OS drivers from the Oracle support site.
■
Update OS drivers and component firmware and configure RAID.
■
Install supported operating systems with the latest drivers and supported tools.
■
Configure a subset of Oracle ILOM settings.
■
Save and restore customized BIOS settings or revert the BIOS to the factory defaults.
■
Access embedded product documentation.
■
Display system overview and detailed hardware inventory information.
Oracle Hardware Management Pack
Oracle Hardware Management Pack provides scriptable command-line tools that help you
manage and configure your Oracle servers from the host operating system.
Hardware Management Pack enables you to do the following using command-line tools:
■
Configure BIOS (Legacy and UEFI), RAID volumes, and Oracle Integrated Lights Out
Manager (ILOM).
■
Upgrade server component firmware.
■
Access the service processor and perform management tasks using IPMItool.
■
View hardware configuration information and the status of your Oracle servers.
■
Enable in-band monitoring of your Oracle hardware over Simple Network Management
Protocol (SNMP). You can use this information to integrate your Oracle servers into your
data center management infrastructure.
■
Set up an Oracle ILOM trap proxy that forwards SNMP traps from your Oracle ILOM
service processor to the host OS.
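As an illustration of scripted access through IPMItool, the following sketch builds an `ipmitool` command that lists the system event log and splits each output row into fields. The host name, user name, and password file are placeholders that you supply. The field layout assumed by `parse_sel_line` is the pipe-delimited form that `ipmitool sel elist` typically prints, and might need adjusting for your version.

```python
import subprocess


def build_sel_command(host: str, user: str, password_file: str) -> list[str]:
    """Build an ipmitool command line that lists the system event log (SEL)."""
    return [
        "ipmitool", "-I", "lanplus",
        "-H", host, "-U", user, "-f", password_file,
        "sel", "elist",
    ]


def parse_sel_line(line: str) -> dict[str, str]:
    """Split one pipe-delimited SEL row into named fields (assumed layout)."""
    keys = ["id", "date", "time", "sensor", "event", "direction"]
    fields = [field.strip() for field in line.split("|")]
    return dict(zip(keys, fields))


def read_sel(host: str, user: str, password_file: str) -> list[dict[str, str]]:
    """Run ipmitool against the SP and return the parsed SEL entries."""
    result = subprocess.run(
        build_sel_command(host, user, password_file),
        capture_output=True, text=True, check=True,
    )
    return [parse_sel_line(row) for row in result.stdout.splitlines() if row.strip()]
```

Keeping the command builder separate from the call that runs it makes the command easy to inspect before it is sent to a service processor.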
Troubleshooting and Diagnostics
This section describes troubleshooting information and provides procedures to help you
troubleshoot server hardware component faults.
Description | Link
Maintenance-related information and procedures that you can use to troubleshoot and repair server hardware issues. | “Troubleshooting Server Hardware Component Faults” on page 37
Information about software and firmware diagnostic tools that you can use to isolate problems, monitor the server, and exercise the server subsystems. | “Troubleshooting With Diagnostic Tools” on page 57
Information about attaching devices to the server to perform troubleshooting. | “Attaching Devices to the Server” on page 59
Information on resetting the host or service processor (including the Oracle ILOM root password and account). | “Resetting the Host or Service Processor” on page 63
Information about contacting Oracle support. | “Getting Help” on page 66
Troubleshooting Server Hardware Component Faults
This section describes maintenance-related information and provides procedures that you can
use to troubleshoot and repair server hardware issues.
Description | Section Links
Troubleshooting overview information and procedure. | “Troubleshooting Hardware Faults Using Oracle ILOM” on page 38
Discerning the server state using the front panel indicators. | “Troubleshooting Using the Front Panel Indicators” on page 40
Explanation of the system Fault Remind Test Circuit. | “Troubleshooting Using the Fault Remind Test Circuits” on page 51
Causes, actions, and preventative measures for problems related to the cooling subsystem. | “Troubleshooting System Cooling Issues” on page 52
Causes, actions, and preventative measures for problems related to the power subsystem. | “Troubleshooting Power Issues” on page 54
Troubleshooting Hardware Faults Using Oracle ILOM
This section provides a troubleshooting procedure that you can use to investigate server
hardware faults and, if necessary, prepare the server for service.
When a server hardware fault event occurs, the system lights the Service Action Required
indicator and captures the event in the system event log (SEL). If you have set up notification
through Oracle ILOM, you also receive an alert through the notification method you chose.
When you become aware of a hardware fault, you should address it immediately.
1.
Log in to the server SP Oracle ILOM web interface.
Open a browser and enter the IP address of the server SP. At the login screen, type a user name
(with administrator privileges) and password. The Summary screen appears.
The Status section of the Summary screen provides information about the server subsystems,
including:
■
Processors
■
Memory
■
Power
■
Cooling
■
Storage
■
Networking
■
I/O Modules
2.
In the Status section of the summary screen, identify the server subsystem that
requires service.
In the above example, the Status screen shows that the Memory subsystem requires service.
This indicates that a hardware component within the subsystem is in a fault state.
3.
To identify the component, click on the subsystem name.
The subsystem screen appears.
The above example shows the processor information screen and indicates that CPU 0 has a
problem.
4.
To get more information, click one of the Open Problems links.
The Open Problems screen provides detailed information, such as the time the event occurred,
the component and subsystem name, and a description of the issue. It also includes a link to a
KnowledgeBase article.
Tip - The System Log provides a chronological list of all the system events and faults that have
occurred since the log was last reset and includes additional information, such as severity levels
and error counts. The System Log also includes information on devices not reported in the
Subsystem Summary screen. To access it, click the System Log link.
In this example, the hardware fault with DIMM 8 of CPU 0 requires local/physical access to the
server.
5.
Before going to the server, review the server Product Notes document for
information related to the issue or the component.
The Product Notes document contains up-to-date information about the server, including
hardware-related issues.
6.
Prepare the server for service.
See “Preparing to Service the Server” on page 93.
7.
Service the component.
Note - After servicing the component, you might need to clear the fault in Oracle ILOM. For
more information, refer to the component service procedure.
Troubleshooting Using the Front Panel Indicators
This section describes the state of the server front panel indicators when the system components
are in a fault state.
The eight indicators on the server front panel show the state of the server. The following
sections describe the conditions of the front panel indicators for various server states:
Note - For more information about the server front panel, see “Server Front Panel
Features” on page 17.
■
“Server Boot Process and Normal Operating State Indicators” on page 41
■
“Locator Indicator On” on page 42
■
“Over Temperature Condition” on page 42
■
“PSU Failure” on page 43
■
“Memory Failure” on page 43
■
“CPU Failure” on page 44
■
“Fan Module Failure” on page 44
■
“SP Failure” on page 45
■
“Front Panel Lamp Test” on page 45
■
“Indicator Blink Rates” on page 46
Server Boot Process and Normal Operating State Indicators
A normal server boot process involves two indicators, the service processor (SP) indicator and
the System OK indicator. The process is described below:
1. When the AC power is applied to the server, the service processor (SP) boots. As the SP
boots, its indicator blinks at the slow blink rate and the System OK indicator is off. For
indicator blink rate information, see “Indicator Blink Rates” on page 46.
2. When the SP has successfully booted, the SP indicator is on steady and the System OK
indicator blinks at the single blink rate. This indicates that the server is in standby power
mode (see “Power Subsystem” on page 28).
3. When the server host is booting (full power applied), the System OK indicator blinks at the
fast blink rate and the SP indicator is on steady. Once the server has successfully booted, the
System OK indicator turns on steady.
When the server is in its normal operating state, the System OK indicator and the SP indicator
are on steady and green.
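The boot sequence above pairs each stage with one combination of the two indicators. The following sketch encodes those combinations as a lookup. The state labels and function name are illustrative, and any combination not described above is reported as undocumented rather than guessed.

```python
# (SP indicator, System OK indicator) -> server state, per the boot steps above.
BOOT_STATES = {
    ("slow blink", "off"): "SP booting",
    ("steady on", "single blink"): "standby power mode",
    ("steady on", "fast blink"): "host booting",
    ("steady on", "steady on"): "normal operating state",
}


def boot_state(sp_indicator: str, system_ok_indicator: str) -> str:
    """Map the two front panel indicators to a documented boot state."""
    key = (sp_indicator, system_ok_indicator)
    return BOOT_STATES.get(key, "not a documented boot state")
```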
Locator Indicator On
The Locator indicator helps identify a server in a rack of servers. It can be activated remotely
from Oracle ILOM or from the front panel (by pressing the Locator button). Once activated, the
indicator blinks at the fast blink rate.
For indicator blink rate information, see “Indicator Blink Rates” on page 46.
For information on remotely turning on the Locator indicator, see “Managing the Locator
Indicator” on page 107.
Over Temperature Condition
For a server in an over-temperature state, the amber Service Action Required and Temperature
indicators are on steady. The green OK indicator and the green SP indicator are on steady.
PSU Failure
For a server with one of its PSUs in a failed state, the amber Service Action Required and PS
REAR indicators are on steady. The green System OK indicator and the green SP indicator are
on steady. In addition, the Service Action Required indicator on the failed PSU, as seen from
the back of the system, also lights.
Memory Failure
For a server with a failure in the memory subsystem, the amber Service Action Required and
MEM TOP indicators are on steady. The green OK indicator and the green SP indicator are on
steady.
CPU Failure
For a server with a fault in the processor subsystem, the amber Service Action Required
and CPU TOP indicators are on steady. The activity of the green OK indicator and the green SP
indicator varies depending on whether the server can boot successfully. The server might not be
able to boot out of standby power mode.
For indicator blink rate information, see “Indicator Blink Rates” on page 46.
Fan Module Failure
For a server with a fan module fault, the amber Service Action Required and the FAN TOP
indicators are on steady. The green OK indicator and the green SP indicator are on steady.
For more information on fan indicators, see “Fan Module Reference” on page 135.
SP Failure
For a server with an SP fault, the amber Service Action Required indicator is on steady. The
green OK indicator and the green SP indicator are off.
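The fault descriptions above share one pattern: the amber Service Action Required indicator plus one subsystem indicator. The following sketch turns that pattern into a lookup. The function name and return strings are illustrative, and the SP failure rule (Service Action Required lit, no subsystem indicator lit, green SP indicator off) is a simplification of the description above.

```python
# Amber subsystem indicator -> fault, as described in the sections above.
FAULT_BY_AMBER_INDICATOR = {
    "Temperature": "over-temperature condition",
    "PS REAR": "PSU failure",
    "MEM TOP": "memory failure",
    "CPU TOP": "CPU failure",
    "FAN TOP": "fan module failure",
}


def diagnose(lit_amber: set[str], sp_green_on: bool) -> list[str]:
    """Return the faults suggested by the lit amber front panel indicators."""
    if "Service Action Required" not in lit_amber:
        return []
    faults = [
        fault
        for indicator, fault in FAULT_BY_AMBER_INDICATOR.items()
        if indicator in lit_amber
    ]
    # Service Action Required alone, with the SP indicator off, suggests an SP fault.
    if not faults and not sp_green_on:
        faults.append("SP failure")
    return faults
```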
Front Panel Lamp Test
To perform a lamp test of all front panel indicators, press and hold down the Locator button
for at least five seconds. All the indicators light up and remain on steady for 15 seconds (see
“Unison Steady On” on page 49).
Indicator Blink Rates
Note - The blink rate information described here might not apply to all server types (for
example, blade or rack mount).
This section describes the following indicator blink rates:
■
“Steady On” on page 46
■
“Steady Off” on page 47
■
“Slow Blink Rate” on page 47
■
“Fast Blink Rate” on page 48
■
“Single (Standby) Blink Rate” on page 48
■
“Slow Unison Blink Rate” on page 49
■
“Insertion Blink” on page 49
■
“Unison Steady On” on page 49
■
“Alternating (Invalid FRU) Blink Rate” on page 50
■
“Feedback Flash” on page 51
■
“Data Blink Rate” on page 51
■
“Sequential (Diagnostic) Blink Rate” on page 51
Steady On
For the steady on state, an indicator is continually on (lit) and does not blink. This indicates a
continuing condition, for example, an operational state (green) or a Service Action Required
fault state (amber).
Steady Off
For the steady off state, an indicator is continually off (not lit) and does not blink. This indicates
that a system is not operational, for example, no AC power (unlit green System OK indicator)
or a subsystem not in a fault state (unlit amber Service Action Required indicator).
Slow Blink Rate
For the slow blink rate, the indicator (typically green) repeatedly lights for half a second and
turns off for half a second during each one-second interval (1 Hz). The slow blink rate indicates
an ongoing activity. For example, the slow blink rate occurs when a device is rebuilding, booting,
or in transition from one mode to another.
Fast Blink Rate
For the fast blink rate, the indicator repeatedly blinks twice (on, off, on) during a one second
interval (2 Hz). The fast blink rate indicates activity or data transfer.
Single (Standby) Blink Rate
For the single blink rate, the indicator repeatedly flashes once at the beginning of a three-second
interval. This indicates a component or system in standby mode. For example, the single blink
rate occurs when a server is in standby power mode or when a hot spare device is waiting to be
used. The single blink rate is also used with amber indicators to indicate a predicted fault.
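The steady, slow, fast, and single blink patterns described so far can be expressed as a function of time. This is a sketch under two assumptions that the text does not state: the fast blink is treated as an even on/off split at 2 Hz, and the single blink flash is given a length of 0.1 second.

```python
def is_lit(pattern: str, t: float, flash_seconds: float = 0.1) -> bool:
    """Return True if an indicator using `pattern` is lit at time t (seconds)."""
    if pattern == "steady on":
        return True
    if pattern == "steady off":
        return False
    if pattern == "slow":    # 1 Hz: lit 0.5 s, off 0.5 s
        return (t % 1.0) < 0.5
    if pattern == "fast":    # 2 Hz: assumed lit 0.25 s, off 0.25 s
        return (t % 0.5) < 0.25
    if pattern == "single":  # one flash at the start of each 3 s interval
        return (t % 3.0) < flash_seconds
    raise ValueError(f"unknown blink pattern: {pattern}")
```

Sampling `is_lit` at a fixed interval produces a timeline that can be compared with an observed indicator.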
Slow Unison Blink Rate
For the slow unison blink rate, the indicators on the component blink in unison for half a second
during a one second interval (1 Hz). Typically, this is limited to three successive blinks. This
confirms the successful insertion of a removable device (for example, a storage drive or blade)
into a powered system (confirming the power connection).
Insertion Blink
The insertion blink is three successive blinks of a hot-swap component's primary status
indicator (for example, the green OK indicator). The insertion blink occurs immediately after
three successive unison blinks (see “Slow Unison Blink Rate” on page 49) of all the
component indicators.
Unison Steady On
For the unison steady on, all indicators are simultaneously on steady (see “Steady
On” on page 46). This occurs during the front panel lamp test (see “Front Panel Lamp
Test” on page 45).
Alternating (Invalid FRU) Blink Rate
A repeating sequence of lit green and amber indicators at 1 Hz. Indicates that a component has
an incorrect version or mismatch (for example, a power supply with a lower rating than the one
specified). Also used for an unsupported component, a component in an unsupported slot, or a
blade (server module) that causes a power supply to be oversubscribed for that system.
Feedback Flash
An indicator flashes on and off during periods of activity, commensurate with the activity,
but flashing does not exceed the 2 Hz fast blink rate (see “Fast Blink Rate” on page 48).
Examples include disk drive read and write activity and communication port transmit and
receive activity.
Data Blink Rate
An indicator that is normally on repeatedly turns off twice during a one-second interval (2 Hz;
see also “Fast Blink Rate” on page 48) while data activity is taking place.
Sequential (Diagnostic) Blink Rate
A repeating sequence in which each indicator successively lights for 0.5 sec each to indicate
that diagnostics are running. This blink rate is used only on systems or components capable of
running diagnostics (for example, blade servers).
Troubleshooting Using the Fault Remind Test Circuits
The server has two internal test circuits, the System Fault Remind Circuit and the DIMM Fault
Remind Circuit. The circuits help you locate failed components. Use the System Fault Remind
Circuit to locate a failed CPU or memory riser card, and use the DIMM Fault Remind Circuit
to locate a failed DIMM. Both circuits hold an electrical charge and have limited operational
capabilities once power is removed from the server. The DIMM Fault Remind Circuit is active
for 10 minutes, and the System Fault Remind Circuit is active for 30 to 60 minutes.
Once AC power is connected to the system (providing standby power), the System Fault
Remind circuit energy storage capacitor takes about 10 minutes to charge to 63% (which should
be sufficient to turn on the indicator) and about 20 minutes to be fully charged. When you push
down on the Fault Remind button, its power indicator should show green when the circuit has
enough power to identify faulted components.
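The 63% figure matches one time constant of a first-order capacitor charging curve (1 - 1/e is about 0.632). The following sketch assumes that the circuit follows that simple model with a time constant of 10 minutes. This is an assumption for illustration, not a statement about the actual circuit design.

```python
import math

TAU_MINUTES = 10.0  # assumed time constant, from "about 10 minutes to charge to 63%"


def charge_fraction(minutes: float, tau: float = TAU_MINUTES) -> float:
    """Fraction of full charge after `minutes` on standby power (simple RC model)."""
    return 1.0 - math.exp(-minutes / tau)
```

Under this model, `charge_fraction(10)` is about 0.63 and `charge_fraction(20)` is about 0.86, so the 20-minute figure for a full charge is best read as approximate.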
For information about how to use the circuits to identify failed components, see “Locate a
Failed Memory Riser Card, DIMM, or CPU” on page 86.
Troubleshooting System Cooling Issues
Maintaining the proper internal operating temperature of the server is crucial to the health of
the server. To prevent server shutdown and damage to components, address over-temperature
and hardware-related issues as soon as they occur. If your server has a temperature-related fault,
the cause of the problem might be one of the following:
■
“External Ambient Temperature Too High” on page 52
■
“Airflow Blockage” on page 53
■
“Internal Pressures Compromised” on page 53
■
“Hardware Component Failure” on page 53
External Ambient Temperature Too High
Server component cooling relies on the movement of cool ambient air pulled into the server
from its external environment. If the ambient temperature of the server's external environment
is too high, cooling does not occur, and the internal temperature of the server and its components
increases. This can cause poor server performance or a failure of one or more components.
Action: Check the ambient temperature of the server space against the environmental
specifications for the server (see the Installation Guide). If the temperature is not within the
required operating range, remedy the situation immediately.
Prevention: Periodically check the ambient temperature of the server space to ensure that it
is within the required range, especially if you have made any changes to the server space (for
example, added additional servers). The temperature must be consistent and stable.
Airflow Blockage
The server cooling system uses fans to pull cool air in from the server front intake vents and
exhaust warm air out the server back panel vents. If the front or back vents are blocked, the
airflow through the server is disrupted and the cooling system fails to function properly, causing
the server internal temperature to rise.
Action: Inspect the server front and back panel vents for blockage from dust or debris.
Additionally, inspect the server interior for improperly installed components or cables that can
block the flow of air through the server.
Prevention: Periodically inspect and clean the server vents using a vacuum cleaner. Ensure that
all components, such as cards, cables, fans, air baffles, and dividers, are properly installed. Never
operate the server without the top cover installed.
Internal Pressures Compromised
The server has two main cooling areas (see “Cooling Subsystem” on page 25). To function
properly, these areas have separate pressures that are maintained using dividers, baffles,
component filler panels, and the server top cover. These components must be in place for the
server to function as a sealed system. If the internal pressures are compromised, the server cooling
system, which relies on the movement of cool air through the server, cannot function properly,
and the airflow inside the server becomes chaotic and non-directional.
Action: Inspect the server interior to ensure that the air divider and air baffle (“Two-CPU
Configuration” on page 23) are properly installed. Ensure that all external-facing slots (storage
drive, DVD, PCIe) are occupied with either a component or a component filler panel. Ensure
that the server top cover is in place and sits flat and snug on top of the server.
Prevention: When servicing the server, ensure that the divider and baffle are installed correctly
and that the server has no unoccupied external-facing slots. Never operate the server without the
top cover installed.
Hardware Component Failure
Components, such as power supplies and fan modules, are an integral part of the server cooling
system. When one of these components fails, the server internal temperature can rise. This rise
in temperature can cause other components to enter an over-temperature state. Additionally,
some components, such as processors, might overheat when they are failing, which can also
generate an over-temperature event.
Action: Investigate the cause of the over-temperature event, and replace failed components
immediately. For hardware troubleshooting information, see “Troubleshooting Server Hardware
Component Faults” on page 37.
Prevention: Component redundancy is provided to allow for component failure in critical
subsystems, such as the cooling subsystem. However, once a component in a redundant
system fails, the redundancy no longer exists, and the risk of server shutdown and component
failures increases. Therefore, it is important to maintain redundant systems and replace failed
components immediately.
Troubleshooting Power Issues
If your server does not power on, the cause of the problem might be one of the following:
■
“AC Power Connection” on page 54
■
“Power Supplies (PSUs)” on page 55
■
“Top Cover” on page 56
AC Power Connection
The AC power cords are the direct connection between the server power supplies and the power
sources. The server power supplies need separate stable AC circuits. Insufficient voltage levels
or fluctuations in power can cause server power problems. The power supplies are designed to
operate at a particular voltage and within an acceptable range of voltage fluctuations (see the
Installation Guide). A four-CPU configured server needs to operate at 200-240 VAC, while
a two-CPU configured server can operate at either 100-127 VAC or 200-240 VAC. For more
information about the processor subsystem, see “Processor Subsystem” on page 22.
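The voltage requirement above depends on the CPU configuration, which a pre-installation check could capture as follows. This is a minimal sketch with hypothetical names. It covers only the nominal ranges quoted above and not the acceptable fluctuation limits, which are given in the Installation Guide.

```python
LOW_RANGE = (100, 127)   # VAC
HIGH_RANGE = (200, 240)  # VAC


def supported_ranges(cpu_count: int) -> list[tuple[int, int]]:
    """Return the nominal AC voltage ranges for a CPU configuration."""
    if cpu_count == 4:
        return [HIGH_RANGE]
    if cpu_count == 2:
        return [LOW_RANGE, HIGH_RANGE]
    raise ValueError("only two-CPU and four-CPU configurations are described")


def voltage_supported(cpu_count: int, volts: float) -> bool:
    """Check a measured outlet voltage against the nominal ranges."""
    return any(low <= volts <= high for low, high in supported_ranges(cpu_count))
```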
Action: Check that both AC power cords are connected to the server and check that the correct
power is present at the outlets. If necessary, monitor the power to verify that it is within the
acceptable range. You can verify proper connection and operation of a power supply by
checking its indicator panel.
A properly installed PSU has its green AC OK indicator lit. A properly functioning PSU has its
green power (DC) OK indicator lit and its amber Service Action Required indicator off.
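The two indicator rules above can be combined into one check. The following sketch uses hypothetical names and status strings. The middle case (AC OK lit but the PSU not meeting the "functioning" rule) is not described in the text, so it is reported only as needing attention.

```python
def psu_status(ac_ok_lit: bool, dc_ok_lit: bool, service_required_lit: bool) -> str:
    """Interpret the three indicators on a PSU indicator panel."""
    if not ac_ok_lit:
        # No AC OK indicator: the PSU is not fully engaged or has no AC power.
        return "not properly installed or no AC power"
    if dc_ok_lit and not service_required_lit:
        return "functioning"
    return "installed but requires attention"
```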
Prevention: Use the AC power cord retaining clips and position the cords to minimize the risk
of accidental disconnection. Ensure that the AC circuits that supply power to the server are
stable and not overburdened.
Power Supplies (PSUs)
The server power supplies (PSUs) provide the necessary server voltages from the AC power
outlets. If the PSUs are inoperable, unplugged, or disengaged from the internal connectors, the
server cannot power on.
Action: Check that the AC cables are connected to both PSUs and that the PSUs are operational
(the PSU indicator panel should have a lit green AC OK indicator). If not, ensure that the PSU
is properly installed. A PSU that is not fully engaged with its internal connector does not have
power applied and does not have a lit green AC OK indicator.
Prevention: When a power supply fails, replace it immediately. To ensure redundancy, the
server has two PSUs. This redundant configuration prevents server downtime, or an unexpected
shutdown, due to a failed PSU. The redundancy allows the server to continue to operate if one
of the PSUs fails. However, when a server is being powered by a single PSU, the redundancy
no longer exists, and the risk for downtime or an unexpected shutdown increases. When
installing a power supply, ensure that it is fully seated and engaged with its connector inside the
drive bay.
A properly installed PSU has its green AC OK indicator lit. A properly functioning PSU has its
green power (DC) OK indicator lit and its amber Service Action Required indicator off.
Top Cover
The server top cover helps the cooling subsystem maintain pressure areas within the server.
The top cover also protects against damage to internal components and accidental exposure to
hazardous voltages. For these reasons, the server top cover includes an interlock switch that
controls server power.
The interlock switch has two components. One component is mounted inside the server on
the housing for power supply PS1 and includes a wire that plugs into the motherboard. The
other component is mounted on the underside of the top cover. When the cover is installed, these
two components align, closing the switch and allowing power to the server. When the cover is
removed, the switch opens, removing power from the server. If the cover is removed while the
server is in full power mode, power to the server is immediately switched off.
Action: If the server does not power on, check that the switch is intact and properly aligned.
Ensure that the server top cover is in place and sits flat and snug on top of the server. Ensure that
the interlock switch components have not been damaged, removed, or misaligned.
Prevention: After removing the top cover, take care that the cover does not get bent and that
the component on its underside does not get damaged. When servicing the server, take care that
the internally mounted interlock switch component does not get damaged or misaligned. Never
operate the server without the top cover installed.
Troubleshooting With Diagnostic Tools
This section describes the available diagnostic tools and documentation that you can use to
troubleshoot server issues.
■
“Diagnostic Tools” on page 57
■
“Diagnostic Tool Documentation” on page 58
Diagnostic Tools
The server and its accompanying software and firmware contain diagnostic tools and features
that can help you isolate component problems, monitor the status of a functioning system,
and exercise one or more subsystems to disclose more subtle or intermittent hardware-related
problems.
Each diagnostic tool has its own specific strength and application. Review the tools listed in
this section and determine which tool might be best to use for your situation. Once you have
determined the tool to use, you can access it locally (while at the server) or remotely.
The selection of diagnostic tools available for your server ranges in complexity from a
comprehensive validation test suite (Oracle VTS) to a chronological event log (Oracle ILOM
System Log). The selection of diagnostic tools also includes standalone software packages,
firmware-based tests, and hardware-based LED indicators.
The following table summarizes the diagnostic tools that you can use when troubleshooting or
monitoring your server.
Diagnostic Tool | Type | What It Does | Accessibility | Remote Capability
Oracle ILOM | SP firmware | Monitors environmental conditions and component functionality sensors, generates alerts, performs fault isolation, and provides remote access. | Can function on either standby power mode or full power mode and is not OS dependent. | Designed for remote and local access.
Preboot Menu | SP firmware | Enables you to restore some Oracle ILOM default settings when Oracle ILOM is not accessible. | Can function on standby power and when operating system is not running. | Local, but remote serial access is possible if the SP serial port is connected to a network-accessible terminal server.
Hardware-based LED indicators | Hardware and SP firmware | Indicates status of overall system and particular components. | Available when system power is available. | Local, but sensor and indicators are accessible from Oracle ILOM web interface or command-line interface (CLI).
Power-on Self-Test (POST) | Host firmware | Tests core components of system: CPUs, memory, and motherboard I/O bridge integrated circuits. | Runs on startup. Available when the operating system is not running. | Local, but can be accessed through Oracle ILOM Remote Console.
U-Boot | SP firmware | Initializes and tests aspects of the service processor (SP) prior to booting Oracle ILOM and the operating system. Tests SP memory, SP, network devices and I/O devices. | Can function on standby power and when operating system is not running. | Local, but remote serial access is possible if the SP serial port is connected to a network-accessible terminal server.
UEFI Diagnostics | System BIOS | The UEFI diagnostics can test and detect problems on all CPU, memory, disk drives, and network ports. It is used on systems that support UEFI. | You can use either the Oracle ILOM web interface or the command-line interface (CLI) to run UEFI diagnostics. | Remote access through Oracle ILOM Remote Console.
Oracle Solaris commands | Operating system software | Displays various kinds of system information. | Requires operating system. | Local, and over network.
Oracle Linux commands | Operating system software | Displays various kinds of system information. | Requires operating system. | Local, and over network.
Oracle VTS | Diagnostic tool standalone software | Exercises and stresses the system, running tests in parallel. | | View and control over network.
Diagnostic Tool Documentation
The following table identifies where you can find more information about diagnostic tools.
Diagnostic Tool | Information | Location
Oracle ILOM | Oracle Integrated Lights Out Manager 3.2 Documentation Library | http://www.oracle.com/goto/ILOM/docs
Preboot Menu | Oracle x86 Servers Diagnostics Guide | http://www.oracle.com/goto/x86AdminDiag/docs
U-Boot diagnostics | Oracle x86 Servers Diagnostics Guide | http://www.oracle.com/goto/x86AdminDiag/docs
System indicators and sensors | Oracle Server X5-4 Service Manual | “Troubleshooting Using the Front Panel Indicators” on page 40
POST | BIOS Setup Utility information | “BIOS Setup Utility Menu Options” on page 273
POST | Power On Self-Test (POST) codes | “POST and Checkpoint Codes” on page 341
UEFI diagnostics | Oracle x86 Servers Diagnostics Guide | http://www.oracle.com/goto/x86AdminDiag/docs
Oracle VTS | Oracle VTS software and documentation | http://docs.oracle.com/cd/E19719-01/index.html

Attaching Devices to the Server
This section describes server port information and provides procedures for attaching devices to
the server.
■ “Attach Devices to the Server” on page 59
■ “Back Panel Connector Locations” on page 60
■ “Configuring Serial Management Port Ownership” on page 61
■ “Four Onboard 10GbE Ports” on page 32
Attach Devices to the Server
This section provides information about connecting devices to the server (remotely and locally), so
you can interact with the service processor (SP) and the server console.
1. Connect an Ethernet cable to the Gigabit Ethernet (NET) connectors.
See “Back Panel Connector Locations” on page 60.
2. To connect to Oracle ILOM over the network, connect an Ethernet cable to the
Ethernet port labeled NET MGT.
See “Back Panel Connector Locations” on page 60.
3. To access the Oracle ILOM command-line interface (CLI) locally using the
management port, connect a serial null modem cable to the RJ-45 serial port
labeled SER MGT.
See “Back Panel Connector Locations” on page 60.
4. To interact with the system console locally, connect a monitor [1], keyboard [2],
and mouse [3] to the server front panel connectors as shown in the illustration
below.
Back Panel Connector Locations
The following illustration shows and describes the locations of the back panel connectors. Use
this information to set up the server, so you can access diagnostic tools and manage the server
during service.
Callout | Description
1 | 10 Gigabit Ethernet ports NET 0, 1, 2, 3 †
2 | USB 2.0 ports
3 | DB-15 video connector
4 | Service processor RJ-45 serial management port (SER MGT)
5 | Service processor RJ-45 10/100/1000Base-T network management port (NET MGT)
† For OS port naming information, see “Four Onboard 10GbE Ports” on page 32.
Configuring Serial Management Port Ownership
By default, the service processor (SP) uses the serial management port (SER MGT) for serial
console output. Using Oracle ILOM, you can specify that the host be assigned as owner of
the serial management port (configured as COM1). This feature is useful for Windows kernel
debugging, as it enables you to view non-ASCII character traffic from the host console.
You can assign ownership of the serial management port using either the Oracle ILOM web
interface or command-line interface (CLI). For instructions, see the following sections:
■ “Assign Serial Port Ownership Using the CLI” on page 62
■ “Assign Serial Port Ownership Using the Web Interface” on page 63
Assign Serial Port Ownership Using the CLI
Before You Begin
Set up a network connection to the SP before attempting to change the serial port owner to the
host server. If the network is not set up, and you switch the serial port owner to the host server,
you will be unable to connect using the CLI or web interface to change the serial port owner
back to the SP.
To return the serial port owner setting back to the SP, use the network connection to Oracle
ILOM.
1. Open an SSH session and at the command line log in to the SP Oracle ILOM CLI.
Log in as a user with root or administrator privileges. For example:
ssh root@ipaddress
where ipaddress is the IP address of the server SP.
For more information, refer to the Oracle X5 Series Servers Administration Guide at
http://www.oracle.com/goto/x86AdminDiag/docs.
The Oracle ILOM CLI prompt appears:
->
2. To set the serial port owner to the host, type:
-> set /SP/serial/portsharing owner=host
Note - The serial port sharing value by default is owner=SP.
3. To set the serial port owner back to the SP, type:
-> set /SP/serial/portsharing owner=sp
Note - If you inadvertently changed ownership of the serial management port (SER MGT)
before setting up a network connection to Oracle ILOM, refer to the Oracle Integrated Lights
Out Manager (ILOM) 3.2 Documentation Library at: http://www.oracle.com/goto/ilom/docs
for details about restoring access to the serial management port on your server.
4. To log out of Oracle ILOM, type:
-> exit
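The ownership commands in this procedure can be sketched as a small helper that builds the command string and refuses to hand the port to the host unless a network path to the SP is known to exist. This is an illustrative sketch only: the function name and the network_configured flag are assumptions made for the example, and the code does not connect to Oracle ILOM. It only produces the text you would type at the -> prompt.

```python
# Illustrative sketch: builds the Oracle ILOM CLI command text shown in the
# procedure above. The helper name and the network_configured flag are
# hypothetical; nothing here talks to a real service processor.

# Owner values exactly as they appear in the commands of this procedure.
OWNER_VALUES = {"host": "host", "sp": "sp"}


def portsharing_command(owner: str, network_configured: bool) -> str:
    """Return the 'set' command that assigns serial port ownership."""
    key = owner.strip().lower()
    if key not in OWNER_VALUES:
        raise ValueError(f"owner must be 'host' or 'sp', got {owner!r}")
    # Switching to the host without a network connection to the SP leaves
    # no CLI or web path to switch ownership back.
    if key == "host" and not network_configured:
        raise RuntimeError(
            "set up a network connection to the SP before assigning "
            "the serial port to the host"
        )
    return f"set /SP/serial/portsharing owner={OWNER_VALUES[key]}"


if __name__ == "__main__":
    print(portsharing_command("host", network_configured=True))
    print(portsharing_command("sp", network_configured=True))
```

The guard mirrors the Before You Begin requirement: returning ownership to the SP is always allowed, while handing it to the host is blocked when no network connection has been set up.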
Assign Serial Port Ownership Using the Web Interface
Before You Begin
Set up a network connection to the SP before attempting to change the serial port owner to the
host server. If the network is not set up, and you switch the serial port owner to the host server,
you will be unable to connect using the CLI or web interface to change the serial port owner
back to the SP.
To return the serial port owner setting back to the SP, use the network connection to Oracle
ILOM.
1. Log in to the service processor Oracle ILOM web interface.
To log in, open a web browser and direct it to the IP address of the server SP. Log in as root
or a user with administrator privileges. Refer to the Oracle X5 Series Servers Administration
Guide at http://www.oracle.com/goto/x86AdminDiag/docs.
The Summary screen appears.
2. In the ILOM web interface, select ILOM Administration > Connectivity from the
navigation menu on the left side of the screen.
3. Select the Serial Port tab.
The Serial Port Settings page appears.
Note - The serial port sharing setting by default is Service Processor.
4. To set the host as the serial port owner, select Host Server on the Serial Port
page.
5. To set the SP as the serial port owner, select Service Processor on the Serial Port
page.
6. Click Save for the changes to take effect.
7. Log out of Oracle ILOM.
Resetting the Host or Service Processor
This section provides procedures for resetting the host and the SP.
■ “Reset the Host or SP Using Oracle ILOM” on page 64
■ “Reset the Host or SP Using Back Panel Pinhole Switches” on page 64
■ “Reset the SP Root Account Password or Recover the Root Account” on page 65
Reset the Host or SP Using Oracle ILOM
Before You Begin
The Host Control and Reset (r) role is required to reset a service processor.
1. Log in to Oracle ILOM (web interface or CLI) for the server.
2. Reset Oracle ILOM using one of the following methods:
■ From the Oracle ILOM CLI, enter the command:
reset /SP
■ From the Oracle ILOM web interface, click ILOM Administration >
Maintenance > Reset SP.
Note - Resetting the Oracle ILOM SP disconnects your current Oracle ILOM session. You must
log in again to continue working in Oracle ILOM.
3. Reset the host using one of the following methods:
■ From the Oracle ILOM CLI, enter the command:
reset /System
■ From the Oracle ILOM web interface, click Host Management > Power
Control, then select your reset method from the drop-down list.
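As a quick reference, the two CLI reset commands above can be captured in a lookup that also records whether the reset ends your current Oracle ILOM session. This is a minimal sketch; the function name and the returned tuple layout are assumptions made for illustration, and the code only returns command text.

```python
# Illustrative sketch: maps a reset target to the Oracle ILOM CLI command
# from the procedure above. Names and the tuple layout are hypothetical.

RESET_TARGETS = {
    # target: (CLI command, does the reset disconnect the ILOM session?)
    "sp": ("reset /SP", True),
    "host": ("reset /System", False),
}


def reset_command(target: str) -> tuple:
    """Return (command, session_disconnected) for 'sp' or 'host'."""
    key = target.strip().lower()
    if key not in RESET_TARGETS:
        raise ValueError(f"target must be 'sp' or 'host', got {target!r}")
    return RESET_TARGETS[key]


if __name__ == "__main__":
    for name in ("sp", "host"):
        command, disconnects = reset_command(name)
        note = " (log in again afterward)" if disconnects else ""
        print(f"{name}: {command}{note}")
```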
Reset the Host or SP Using Back Panel Pinhole Switches
This section shows the location of the back panel pinhole switches.
Callout | Description
1 | SP Reset
2 | Host Warm Reset
3 | NMI (Oracle Service use only)
Reset the SP Root Account Password or Recover the Root Account
If necessary, system administrators can recover the Oracle ILOM root account (if accidentally
deleted) or reset the password for the Oracle ILOM root account to the factory default
password.
To perform either action, you need a local serial management port (SER MGT) connection
to Oracle ILOM. In addition, if the Physical Presence State is enabled (the default) in Oracle
ILOM, you must prove that you are physically present at the server as described in the
following procedure.
To recover the root account or root account password, perform these steps:
1. Establish a local serial management connection to Oracle ILOM and log in to
Oracle ILOM using the default user account.
For example:
ORACLESP-000000000 login: default
Press and release the physical presence button
Press return when this is completed...
For additional information about logging in through the serial management port, see “Log In to
Oracle ILOM CLI Using a Local Serial Connection” in Oracle Server X5-4 Installation Guide.
2. To prove physical presence at the server, press the Locator button on the front
of the server.
For the location of the Locator button, see “Server Front Panel Features” on page 17.
3. Return to your serial console and press Enter.
You will be prompted for a password.
4. Enter the password for the default user account: defaultpassword
5. Reset the root account password or re-create the root account.
Refer to the Oracle ILOM documentation for details on creating user accounts at:
http://www.oracle.com/goto/ILOM/docs
Getting Help
This section describes how to get additional help to resolve server-related problems.
■ “Contacting Support” on page 66
■ “Locating the System Serial Number” on page 67
Contacting Support
If the troubleshooting procedures in this chapter fail to solve your problem, use the following
table to collect information that you might need to communicate to support personnel.
System Configuration Information Needed | Your Information
Service contract number |
System model |
Operating environment |
System serial number |
Peripherals attached to the system |
Email address and phone number for you and a secondary contact |
Street address where the system is located |
Superuser password |
Summary of the problem and the work being done when the problem occurred |

Other Useful Information | Your Information
IP address |
Server name (system host name) |
Network or internet domain name |
Proxy server configuration |

See Also:
■ “Locating the System Serial Number” on page 67
Locating the System Serial Number
You might need to have your system serial number when you ask for service on your system.
Record this number for future use. Use one of the following methods to locate your server serial
number:
■ On the front panel of the server, look at the bottom left of the bezel to locate the server
serial number.
■ Locate the yellow Customer Information Sheet (CIS) attached to your server packaging.
This sheet includes the serial number.
■ From Oracle ILOM:
  ■ From the Oracle ILOM command-line interface (CLI), type the command: show /SYS
  ■ From the Oracle ILOM web interface, view the serial number in the System Information
  tab.
■ From Oracle System Assistant, view the serial number in the System Overview (home
screen).
Servicing the Server
This section describes component serviceability requirements and provides common service
procedures.
Description | Link
Component information, including component location, serviceability, and system designations | “Component Serviceability, Locations, and Designations” on page 69
Procedures for setting up an ESD-safe work space. | “Performing Electrostatic Discharge and Static Prevention Measures” on page 79
Recommended and required tools for servicing the server. | “Tools and Equipment” on page 81
Information about component filler panels. | “Component Filler Panels” on page 81
Procedures for using the fault remind test circuits. | “Locating a Failed Memory Riser Card, DIMM, or CPU” on page 82
Procedures about clearing hardware faults in Oracle ILOM. | “Clear Hardware Fault Messages” on page 91
Component Serviceability, Locations, and Designations
This section describes component service designations, serviceability, and locations.
■ “Component Serviceability” on page 70
■ “Location of Replaceable Components” on page 70
■ “Component Designations” on page 73
Component Serviceability
The replaceable components in your server are designated as either a customer-replaceable unit
(CRU) or a field-replaceable unit (FRU).
■ A part designated as a FRU must be replaced by an Oracle-qualified service technician.
■ A part designated as a CRU can be replaced by a person who is not an Oracle-qualified
service technician.
If the component can be serviced while the server is powered on, it is called a hot-service
component. If the server has to be powered off before the component can be serviced, it is
called a cold-service component.
The following table lists the components, their service designations, and their serviceability.
Component | Service Designation | Serviceability
Storage drives | CRU | Hot
Fan modules | CRU | Hot
Power supplies | CRU | Hot
Memory risers and DIMMs | CRU | Cold
PCIe cards | CRU | Cold
DVD drive | CRU | Cold
System (or RTC) battery | CRU | Cold
CPUs and heatsinks | FRU | Cold
SAS 12 Gb/s HBA and cables (HBA-to-disk backplane) | FRU | Cold
PCIe NVMe Switch card and cables (card-to-disk backplane) | FRU | Cold
Energy Storage Module (ESM) and cable (HBA-to-ESM) | FRU | Cold
Fan board | FRU | Cold
Power supply backplane | FRU | Cold
Storage drive backplane | FRU | Cold
SP card | FRU | Cold
Motherboard | FRU | Cold
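The serviceability table lends itself to a simple lookup when planning a service action. The following sketch encodes the table rows as data and answers two questions for a component: whether the server must be powered off first, and whether an Oracle-qualified service technician is required. The dictionary and function names are assumptions made for illustration and are not part of any Oracle tool.

```python
# Illustrative sketch: the component serviceability table as a lookup.
# SERVICEABILITY and service_requirements are hypothetical names.

SERVICEABILITY = {
    "Storage drives": ("CRU", "Hot"),
    "Fan modules": ("CRU", "Hot"),
    "Power supplies": ("CRU", "Hot"),
    "Memory risers and DIMMs": ("CRU", "Cold"),
    "PCIe cards": ("CRU", "Cold"),
    "DVD drive": ("CRU", "Cold"),
    "System (or RTC) battery": ("CRU", "Cold"),
    "CPUs and heatsinks": ("FRU", "Cold"),
    "SAS 12 Gb/s HBA and cables": ("FRU", "Cold"),
    "PCIe NVMe Switch card and cables": ("FRU", "Cold"),
    "Energy Storage Module (ESM) and cable": ("FRU", "Cold"),
    "Fan board": ("FRU", "Cold"),
    "Power supply backplane": ("FRU", "Cold"),
    "Storage drive backplane": ("FRU", "Cold"),
    "SP card": ("FRU", "Cold"),
    "Motherboard": ("FRU", "Cold"),
}


def service_requirements(component: str) -> dict:
    """Summarize what servicing the named component requires."""
    designation, mode = SERVICEABILITY[component]  # KeyError if unknown
    return {
        "designation": designation,
        # Cold-service components require the server to be powered off.
        "power_off_required": mode == "Cold",
        # FRUs must be replaced by an Oracle-qualified service technician.
        "technician_required": designation == "FRU",
    }


if __name__ == "__main__":
    for name in ("Fan modules", "PCIe cards", "Motherboard"):
        print(name, service_requirements(name))
```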
Location of Replaceable Components
The following illustrations show the Oracle Server X5-4 components:
■ “Replaceable Components” on page 71
■ “Components (Exploded View)” on page 72
Replaceable Components
Callout | Description | Callout | Description
1 | Motherboard | 10 | HBA SAS cables (2)
2 | SP card | 11 | Storage drive backplane board
3 | HBA card | 12 | Heatsinks and CPUs (2 or 4)
4 | PCIe NVMe Switch card | 13 | Memory riser cards (4 or 8)
5 | Power supplies (2) | 14 | Fan modules (6)
6 | System battery | 15 | Fan board
7 | Power supply backplane | 16 | DVD Drive
Components (Exploded View)
Callout | Description | Callout | Description
2 | Power supply backplane board | 12 | Motherboard
3 | SP card | 13 | Storage drive
4 | System battery | 14 | DVD drive
5 | PCIe NVMe Switch card | 15 | Fan module
6 | HBA card | 16 | Fan board
7 | CPU | 17 | Storage drive backplane board
8 | Heatsink | 18 | Server chassis
9 | Cover | 19 | ESM (Energy Storage Module for HBA)
10 | Memory riser card | |
Component Designations
This section describes the naming designations for internal and external slots:
■ “Fan Module Slot Designations” on page 73
■ “CPUs and Memory Riser Card Slots Designations” on page 74
■ “DIMM Slot Designations” on page 75
■ “Power Supply Designations” on page 76
■ “PCIe Slot Designations” on page 77
■ “DVD, Storage Drive, and USB Designations” on page 78
Fan Module Slot Designations
The six fan module slots are at the front of the server and are set in two rows of three slots. The
slots are designated from left to right. As pictured in the illustration below, the three front row
slots are designated as: FM0, FM1, and FM2. The three back row slots are: FM3, FM4, and
FM5.
Callout | Description | Callout | Description
1 | Fan Module, FM 0 | 4 | Fan Module, FM 3
2 | Fan Module, FM 1 | 5 | Fan Module, FM 4
3 | Fan Module, FM 2 | 6 | Fan Module, FM 5
CPUs and Memory Riser Card Slots Designations
The four CPU sockets are located in the middle of the server and are designated consecutively
from right to left (from the front of the server). The rightmost socket is CPU-0 and is designated
as P0, and the leftmost socket is CPU-3, designated as P3.
The eight memory riser (MR) card slots are located between the fan module slots and the CPU
sockets. Consecutively from right to left, the rightmost slot is slot 0, and the leftmost slot is slot
7.
The slots are also designated by their association with the four CPU sockets (P0-P3). Two slots
are assigned to each CPU socket. For example, slots 0 and 1 are paired with CPU socket P0
and are designated as P0/MR0 and P0/MR1. Slots 2 and 3 are paired with CPU socket P1 and
are designated as P1/MR0 and P1/MR1. This numbering pattern continues for the remaining
slots.
Callout | Description | Callout | Description
1 | MR card slot P3/MR1 | 7 | MR card slot P0/MR1
2 | MR card slot P3/MR0 | 8 | MR card slot P0/MR0
3 | MR card slot P2/MR1 | 9 | CPU-3 (P3)
4 | MR card slot P2/MR0 | 10 | CPU-2 (P2)
5 | MR card slot P1/MR1 | 11 | CPU-1 (P1)
6 | MR card slot P1/MR0 | 12 | CPU-0 (P0)
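The numbering pattern for memory riser slots (two consecutive slots per CPU socket) can be written as a short calculation: the CPU socket is the slot number divided by two, and the riser index is the remainder. The following is a sketch under that reading of the text; the function name is an assumption made for illustration.

```python
# Illustrative sketch: converts a physical memory riser slot number (0-7,
# counted right to left from the front of the server) to its Pn/MRm
# designation. The function name is hypothetical.


def mr_slot_designation(slot: int) -> str:
    """Return the designation, for example 'P1/MR0' for slot 2."""
    if not 0 <= slot <= 7:
        raise ValueError(f"slot must be 0-7, got {slot}")
    # Two slots per CPU socket: quotient selects the socket,
    # remainder selects MR0 or MR1.
    cpu, riser = divmod(slot, 2)
    return f"P{cpu}/MR{riser}"


if __name__ == "__main__":
    for slot in range(8):
        print(slot, mr_slot_designation(slot))
```

Slots 0 through 3 reproduce the examples given in the text (P0/MR0, P0/MR1, P1/MR0, P1/MR1), and slot 7 yields P3/MR1, the highest designation in the callout table.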
DIMM Slot Designations
The DIMM slots are located on the memory riser cards. The DIMMs are arranged in two banks
of six slots for a total of 12 slots. The slots are designated numerically from top to bottom. The
left bank of slots is designated as D0–D5. The right bank of slots is designated as D6–D11.
Callout | Description | Callout | Description
1 | Slot D0 | 7 | Slot D6
2 | Slot D1 | 8 | Slot D7
3 | Slot D2 | 9 | Slot D8
4 | Slot D3 | 10 | Slot D9
5 | Slot D4 | 11 | Slot D10
6 | Slot D5 | 12 | Slot D11
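Given two banks of six slots numbered D0 through D11, the bank and the position within the bank follow from the slot number. The sketch below assumes an even split (D0 to D5 in the left bank, D6 to D11 in the right bank) and top-to-bottom numbering within each bank; the function name and the returned tuple are assumptions made for illustration.

```python
# Illustrative sketch: locates a DIMM slot on a memory riser card, assuming
# the 12 slots split evenly into a left bank (D0-D5) and a right bank
# (D6-D11). The function name and return layout are hypothetical.


def dimm_location(slot: int) -> tuple:
    """Return (bank, position_from_top) for DIMM slot D<slot>."""
    if not 0 <= slot <= 11:
        raise ValueError(f"slot must be 0-11, got {slot}")
    bank_index, offset = divmod(slot, 6)
    bank = "left" if bank_index == 0 else "right"
    # Positions are 1-based, counted from the top of the bank.
    return bank, offset + 1


if __name__ == "__main__":
    for slot in (0, 5, 6, 11):
        print(f"D{slot}", dimm_location(slot))
```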
Power Supply Designations
The two power supply slots are located on the right side of the server (from the front of the
server) and are designated from right to left. The slots are accessible from the back of the
server. From the back of the server, the left slot is designated as PS-0, and the right slot is PS-1.
Callout | Description
1 | PS 1
2 | PS 0
PCIe Slot Designations
The eleven PCIe slots are located inside the server at the back. As viewed from the front of the
server, the slots are divided into two groups, a group of six on the right of the SP card and a
group of five on the left of the SP card. The slots are designated from right to left. The six slots
on the right side are designated as PCI-1 to PCI-6. The five slots on the left are designated as
PCI-7 to PCI-11.
Callout | Description | Callout | Description
1 | PCIe 1 | 7 | PCIe 7
2 | PCIe 2 | 8 | PCIe 8
3 | PCIe 3 | 9 | PCIe 9
4 | PCIe 4 | 10 | PCIe 10
5 | PCIe 5 | 11 | PCIe 11
6 | PCIe 6 | |
DVD, Storage Drive, and USB Designations
The DVD drive is located at the lower right of the front of the server.
The six storage drive slots are on the right side of the server and are designated consecutively
from bottom to top. The bottommost slot is designated as HDD-0, and the topmost slot is HDD-5.
The two internal USB slots are located between the disk backplane board and the power supply
backplane board. An optional Oracle System Assistant flash drive is installed in the port marked
"OSA USB."
Callout | Description | Callout | Description
1 | HDD5/NVMe3 | 6 | HDD0
2 | HDD4/NVMe2 | 7 | DVD
3 | HDD3/NVMe1 | 8 | OSA USB port
4 | HDD2/NVMe0 | 9 | USB port
5 | HDD1 | |
Performing Electrostatic Discharge and Static Prevention Measures
Electrostatic discharge (ESD) sensitive devices, such as the PCIe cards, hard drives, CPUs, and
memory cards, require special handling.
Using an Anti-Static Wrist Strap
Wear an anti-static wrist strap when handling components such as disk drive assemblies,
circuit boards, or PCIe cards. When servicing or removing server components, attach an
anti-static strap to your wrist and then to a metal area on the server chassis. Following this practice
equalizes the electrical potentials between you and the server.
Note - An anti-static wrist strap is not shipped with the servers. However, anti-static wrist straps
are included with customer-replaceable units (CRUs), field-replaceable units (FRUs), and
optional components.
Using an Anti-Static Mat
In addition to wearing an anti-static wrist strap when handling components, create an ESD-free
work place by using an anti-static mat as a work surface and as a place to set ESD-sensitive
components such as printed circuit boards, DIMMs, and CPUs. You can use the following items
as anti-static mats:
■ Anti-static bag used to wrap a replacement part
■ ESD mat (orderable from Oracle)
■ A disposable ESD mat (shipped with some optional system components)
Tools and Equipment
To service the system, you need the following tools:
■ No. 2 Phillips screwdriver
■ Anti-static wrist strap
■ ESD mat and grounding strap
You might also need a system console device, such as one of the following:
■ PC or workstation with RS-232 serial port
■ ASCII terminal
■ Terminal server
■ Patch panel connected to a terminal server
Component Filler Panels
Your server might be shipped with module-replacement filler panels for CPUs, storage drives
(HDD or SSD), the DVD drive, and the PCIe cards. A filler panel is an empty metal or plastic
enclosure that does not contain any functioning system hardware or cable connectors.
The filler panels are installed at the factory and must remain in the server until you replace them
with components. This seals the system and provides noise, EMI, and airflow containment. If you
remove a filler panel and continue to operate your system with an empty module slot, the server
might overheat due to improper airflow. For instructions on removing or installing a filler panel
for a server component, refer to the section in this guide about servicing that component.
The following illustration shows the storage drives and storage drive filler panels installed in
the server.
Callout | Description
1 | Storage drive filler panels
2 | Storage drives
Locating a Failed Memory Riser Card, DIMM, or CPU
This section describes the system and DIMM Fault Remind test circuits and provides a
procedure for using the circuits to locate faulty components:
■ “Fault Remind Circuits and Internal Fault Indicator Locations” on page 82
■ “DIMM Fault Remind Circuit Components” on page 85
■ “Locate a Failed Memory Riser Card, DIMM, or CPU” on page 86
Fault Remind Circuits and Internal Fault Indicator Locations
This section describes the locations of the system Fault Remind circuit components:
■ “System Fault Remind Button and Charge Status Indicator” on page 83
■ “Memory Riser Card and CPU Fault Indicators” on page 83
■ “CPU Fault Indicators” on page 84
System Fault Remind Button and Charge Status Indicator
The Fault Remind Button is located on the divider between cooling zone 1 and cooling zone
2. The Charge Status Indicator is located next to the button.
Memory Riser Card and CPU Fault Indicators
The memory riser card Fault indicators are visible through the small hole on top of the card.
Callout | Description
1 | Memory riser card Fault indicators
CPU Fault Indicators
The CPU Fault indicators are located on the motherboard between the memory riser cards
and the CPU. To see a lit CPU Fault indicator, look down from the top of the server and
sight through the memory riser cards and the support bracket near the CPU. The following
illustration shows the location of the CPU Fault indicators.
Note - When a CPU fails, the Fault indicators of its two memory riser cards also light, making it
easier to identify the failed CPU.
Callout | Description
1 | CPU Fault indicators
DIMM Fault Remind Circuit Components
The DIMM Fault Remind Test Circuit is located on the memory riser card. The Fault Remind
Button and Charge Status Indicator are located near the right-side bank of DIMM slots at the
front edge of the card. The DIMM Fault indicators are located next to the DIMM slots.
Callout | Description | Callout | Description
1 | MR card Fault indicator | 3 | Fault Remind button
2 | DIMM Fault indicators | 4 | Charge Status indicator †
† The indicator lights when the circuit is charged.
Locate a Failed Memory Riser Card, DIMM, or CPU
Before You Begin
To locate a failed memory riser card, DIMM, or CPU, use the fault remind circuits inside the
server. The circuit uses board-mounted indicators to identify the failed component. If the failed
component is a memory riser card or a CPU, the indicators identify the component directly. If
the failed component is a DIMM, the indicators identify the memory riser card containing the
DIMM. Then to locate the failed DIMM, you need to remove the memory riser card and use the
card's DIMM Fault Remind circuitry.
For more information about the system and DIMM Fault Remind circuits, see “Troubleshooting
Using the Fault Remind Test Circuits” on page 51.
To troubleshoot faulty hardware components, see “Troubleshooting Server Hardware
Component Faults” on page 37.
Note - The test circuits are charged, time-limited circuits. Once power is removed from the
server, you have 10 minutes to use the DIMM Fault Remind circuit and 30-60 minutes to use the
System Fault Remind circuit.
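The time limits in the preceding note can be expressed as a simple planning check. The sketch below is only an aid for scheduling the work: it treats the lower bound of the 30-60 minute range as the safe limit for the system circuit, which is a conservative assumption, and the function and constant names are hypothetical. On the hardware itself, the indicator that lights when you press the Fault Remind button remains the actual test of whether a circuit is still usable.

```python
# Illustrative sketch: estimates whether the Fault Remind circuits are still
# within their stated time limits. Names are hypothetical, and the system
# limit uses the conservative lower bound of the stated 30-60 minute range.

DIMM_WINDOW_MIN = 10    # stated limit for the DIMM Fault Remind circuit
SYSTEM_WINDOW_MIN = 30  # lower bound of the stated 30-60 minute range


def remind_circuits_usable(minutes_since_power_removed: float) -> dict:
    """Return which circuits are still inside their time window."""
    if minutes_since_power_removed < 0:
        raise ValueError("elapsed time cannot be negative")
    return {
        "dimm": minutes_since_power_removed <= DIMM_WINDOW_MIN,
        "system": minutes_since_power_removed <= SYSTEM_WINDOW_MIN,
    }


if __name__ == "__main__":
    for elapsed in (5, 20, 45):
        print(elapsed, remind_circuits_usable(elapsed))
```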
1. Prepare for service.
See “Prepare the Server for Cold Service” on page 94.
2. Press and hold the system Fault Remind button.
The Fault Remind button is located on the divider between cooling zone 1 and cooling zone
2.
3. Verify that the system Fault Remind circuit is usable.
When the Fault Remind button is pressed, the Fault Remind power indicator illuminates (green)
to indicate that the remind circuitry is usable.
4. Look for the lit Fault indicators:
If the circuit is usable, identify the failed component by the lit Fault indicators. Use the
following information to help you find the component.
■ To locate a failed CPU, look for the lit MR card Fault indicators and the lit
CPU Fault indicator.
When a CPU is in a fault state, the Fault indicators for the CPU and both MR cards
associated with the CPU light when the system Fault Remind button is pressed.
The following illustration shows the lit indicators for a failed CPU, P0. In this example,
the Fault indicators for memory riser cards P0/MR0 and P0/MR1 are lit, as is the Fault
indicator for CPU P0.
For more information, see “CPU Fault Indicators” on page 84.
■ To locate a failed MR card, look for the MR card Fault indicator.
When an MR card is in a fault state, the Fault indicator for the card lights when the system
Fault Remind button is pressed. The indicator is visible through the small hole on top of
the card.
The following illustration shows a lit Fault indicator for memory riser card P1/MR1.
For more information, see “Memory Riser Card and CPU Fault
Indicators” on page 83.
■ To locate a failed DIMM, look for an MR card Fault indicator.
When a DIMM is in a fault state, the Fault indicator for the MR card containing the DIMM
lights when the system Fault Remind button is pressed.
The following illustration shows a lit Fault indicator for memory riser card P0/MR1. This
card contains the faulty DIMM. To locate the DIMM, remove the card and use the DIMM
Fault Remind circuit.
For more information, see “DIMM Fault Remind Circuit Components” on page 85.
5. Replace the failed component:
■ To replace a failed CPU, see “Replace a Faulty CPU (FRU)” on page 180.
■ To replace a failed memory riser card or a DIMM, see “Replace a Faulty
Memory Riser Card” on page 144.
Next Steps
■ “Replace a Faulty CPU (FRU)” on page 180
or
■ “Remove a Memory Riser Card” on page 145
or
■ “Replace a Faulty DIMM” on page 143
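The indicator rules in this procedure can be summarized as a small decision routine: a lit CPU Fault indicator together with its two memory riser card indicators points to the CPU, while a memory riser card indicator that is lit without its CPU indicator points to that card or to a DIMM installed on it. The following sketch encodes those rules. The function name, the input format (a set of designations such as "P0" and "P0/MR1"), and the wording of the findings are assumptions made for illustration.

```python
# Illustrative sketch: interprets which Fault indicators are lit after the
# system Fault Remind button is pressed. The function name, input format,
# and result strings are hypothetical.
import re

CPU_PATTERN = re.compile(r"P[0-3]")
RISER_PATTERN = re.compile(r"P[0-3]/MR[01]")


def interpret_fault_indicators(lit: set) -> list:
    """Return a sorted list of findings for the lit indicator designations."""
    cpus = {name for name in lit if CPU_PATTERN.fullmatch(name)}
    risers = {name for name in lit if RISER_PATTERN.fullmatch(name)}
    unknown = set(lit) - cpus - risers
    if unknown:
        raise ValueError(f"unrecognized designations: {sorted(unknown)}")

    findings = [f"{cpu}: failed CPU" for cpu in sorted(cpus)]
    for riser in sorted(risers):
        # A failed CPU also lights both of its riser card indicators, so
        # those risers are not reported as separate faults.
        if riser.split("/")[0] in cpus:
            continue
        findings.append(
            f"{riser}: failed memory riser card or DIMM "
            "(remove the card and use its DIMM Fault Remind circuit)"
        )
    return findings


if __name__ == "__main__":
    print(interpret_fault_indicators({"P0", "P0/MR0", "P0/MR1"}))
    print(interpret_fault_indicators({"P1/MR1"}))
```

The first call corresponds to the failed CPU P0 example in step 4, and the second to the lit indicator for memory riser card P1/MR1.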
Clear Hardware Fault Messages
This section provides instructions for clearing component faults in Oracle ILOM.
After servicing a component, you might need to manually clear the fault using Oracle ILOM.
Faults are captured by Oracle ILOM's fault manager and stored in the fault management
database. If a component fault needs to be manually cleared, use the fmadm command from the
Oracle ILOM Fault Management shell. The Fault Management shell is accessible by logging
in to the Oracle ILOM CLI. For events logged in the Oracle ILOM event log, use the Oracle
ILOM web interface.
For information about using fmadm, refer to the Oracle ILOM User Guide at
http://www.oracle.com/goto/ILOM/docs
Clear Hardware Fault Messages
Before You Begin
This procedure requires the use of the Oracle ILOM CLI.
1. Open an SSH session and at the command line log in to the SP Oracle ILOM CLI.
Log in as a user with root or administrator privileges. For example:
ssh root@ipaddress
where ipaddress is the IP address of the server SP.
For more information on accessing Oracle ILOM, refer to the Oracle X5 Series Servers
Administration Guide at http://www.oracle.com/goto/x86AdminDiag/docs.
The Oracle ILOM CLI prompt appears:
->
2. To access fmadm, type:
start /SP/faultmgmt/shell
The fmadm prompt appears:
faultmgmtsp>
3. To get a listing of command options for displaying or clearing a fault with
fmadm, type:
help fmadm
The following output appears:
Usage: fmadm <subcommand>
where <subcommand> is one of the following:
faulty [-asv] [-u <uuid>] : display list of faults
faulty -f [-a] : display faulty FRUs
faulty -r [-a] : display faulty ASRUs
acquit <FRU> : acquit faults on a FRU
acquit <UUID> : acquit faults associated with UUID
acquit <FRU> <UUID> : acquit faults specified by
(FRU, UUID) combination
replaced <FRU> : replaced faults on a FRU
repaired <FRU> : repaired faults on a FRU
repair <FRU> : repair faults on a FRU
rotate errlog : rotate error log
rotate fltlog : rotate fault log
4. Use fmadm faulty and the following options to display active faulty components:
■ -a – Show active faulty components.
■ -f – Show active faulty FRUs.
■ -r – Show active faulty FRUs and their fault management states.
■ -s – Show a one-line fault summary for each fault event.
■ -u uuid – Show fault diagnosis events that match a specific universally unique identifier (uuid).
For command specifics, see the Oracle ILOM User's Guide for System Monitoring and Diagnostics for your version of Oracle ILOM at http://www.oracle.com/goto/ILOM/docs.
5. Use fmadm to clear the fault.
When you clear a fault, you can specify acquit, repair, replaced, or repaired for the component in question.
6. Close the Oracle ILOM session.
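The fmadm command lines used in this procedure can be assembled programmatically, for example when preparing a service checklist. The following Python sketch only builds command strings from the forms shown in the help fmadm output; it does not connect to a service processor. The function names build_faulty_command and build_clear_command are hypothetical, as are the FRU path and UUID in the usage lines. The sketch covers only the faulty [-asv] [-u <uuid>] form, not the -f and -r forms.

```python
# Subcommands that clear a fault, as listed in the "help fmadm" output.
CLEAR_SUBCOMMANDS = {"acquit", "repair", "replaced", "repaired"}


def build_faulty_command(active=False, summary=False, verbose=False, uuid=None):
    """Assemble 'fmadm faulty [-asv] [-u <uuid>]' from the documented options."""
    # Collect the single-letter flags that were requested, in the order a, s, v.
    flags = "".join(
        letter
        for letter, enabled in (("a", active), ("s", summary), ("v", verbose))
        if enabled
    )
    parts = ["fmadm", "faulty"]
    if flags:
        parts.append("-" + flags)
    if uuid:
        parts += ["-u", uuid]
    return " ".join(parts)


def build_clear_command(subcommand, fru=None, uuid=None):
    """Assemble a fault-clearing command such as 'fmadm acquit <FRU> <UUID>'."""
    if subcommand not in CLEAR_SUBCOMMANDS:
        raise ValueError("unknown subcommand: " + subcommand)
    # The help output lists a UUID argument only for 'acquit'.
    if uuid is not None and subcommand != "acquit":
        raise ValueError("a UUID is documented only for 'acquit'")
    if fru is None and uuid is None:
        raise ValueError("specify a FRU, a UUID, or both")
    targets = [t for t in (fru, uuid) if t is not None]
    return " ".join(["fmadm", subcommand] + targets)


# Hypothetical FRU path and UUID, for illustration only.
print(build_faulty_command(active=True))
print(build_clear_command("acquit", fru="/SYS/MB/P0", uuid="example-uuid"))
```

Rejecting a UUID for repair, replaced, and repaired mirrors the help output, which shows the UUID form only under acquit.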
Oracle Server X5-4 Service Manual • December 2015
Preparing to Service the Server
This section provides procedures for preparing the server for service.
Description | Links
Procedural information for preparing to perform hot service on the server. | “Prepare the Server for Hot Service” on page 93
Procedural information for preparing to perform cold service on the server. | “Prepare the Server for Cold Service” on page 94
Procedural information for releasing the Cable Management Arm (CMA). | “Release the CMA” on page 98
Procedural information for server power-off options. | “Powering Off the server” on page 101
Procedural information for using the Locator indicator. | “Managing the Locator Indicator” on page 107
Procedural information for the server cover. | “Remove the Server Cover” on page 110
Prepare the Server for Hot Service
Before You Begin
The following hot-service components can be removed and replaced while the server is operating in full power mode:
■ Storage drives
■ Fan modules
■ Power supplies
For more information about component serviceability, see “Component Serviceability” on page 70.
■ Important: Review the server Product Notes document before performing removal and installation procedures.
■ For troubleshooting information, see “Troubleshooting and Diagnostics” on page 37.
■ This procedure uses the Oracle ILOM web interface. However, the procedure can also be performed using the Oracle ILOM CLI. For more information, refer to the Oracle ILOM documentation.
1. Log in to the Oracle ILOM web interface.
To log in, enter the IP address of the server SP in a web browser. Log in as root or as a user with administrator privileges. For more information, refer to the Oracle X5 Series Servers Administration Guide at http://www.oracle.com/goto/x86AdminDiag/docs.
The Summary screen appears.
2. In the Actions section of the Summary screen, click the Locator Indicator Turn On button.
This action activates the Locator indicator on the server front panel. For more Locator indicator management options, see “Managing the Locator Indicator” on page 107.
3. Once at the server, press the Locator indicator button to deactivate the indicator.
See “Manage the Locator Indicator Locally” on page 110.
4. Set up an ESD-safe space at the service location.
See “Performing Electrostatic Discharge and Static Prevention Measures” on page 79.
Next Steps
The following components can be hot-serviced:
■ “Servicing Storage Drives (CRU)” on page 113
■ “Servicing Fan Modules (CRU)” on page 130
■ “Servicing Power Supplies (CRU)” on page 137
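The split between hot-service and cold-service components can be expressed as a small lookup. The sketch below is illustrative only: the set contents come from the hot-service list in this procedure, and the names HOT_SERVICE_COMPONENTS and requires_power_off are hypothetical. Treating every component outside the set as cold service is an assumption; “Component Serviceability” remains the authoritative list.

```python
# Components this procedure lists as serviceable while the server is in
# full power mode.
HOT_SERVICE_COMPONENTS = {"storage drive", "fan module", "power supply"}


def requires_power_off(component):
    """Return True if the component is not in the hot-service list.

    Assumption: anything not listed as hot-service is treated as cold-service.
    """
    return component.strip().lower() not in HOT_SERVICE_COMPONENTS


print(requires_power_off("Fan module"))
print(requires_power_off("DIMM"))
```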
Prepare the Server for Cold Service
Note - This procedure uses a combination of the Oracle ILOM web and CLI interfaces.
However, the procedure can also be performed using only the Oracle ILOM CLI interface.
For more information about the Oracle ILOM CLI interface, refer to the Oracle ILOM
documentation.
A cold-service component must be serviced when the server is completely powered off. For
more information about component serviceability, see “Component Serviceability” on page 70.
This procedure describes how to prepare the server for service, so you can:
■ Remove, replace, or install cold-serviceable components
■ Remove, replace, or install internal components
■ Use the motherboard processor and DIMM fault remind circuitry
Before You Begin
■ Important: Review the server Product Notes document before performing removal and installation procedures.
■ For troubleshooting information, see “Troubleshooting Server Hardware Component Faults” on page 37.
1. To power down the server and activate the front panel Locator indicator, do the following:
a. Log in to the Oracle ILOM web interface.
Type the server SP IP address into a web browser and log in as a user with root or administrator privileges. For more information, refer to the Oracle X5 Series Servers Administration Guide at http://www.oracle.com/goto/x86AdminDiag/docs.
b. In the Actions section of the Summary screen, click the Power State Turn Off button.
This action powers off the server to standby power mode. For more power-off options, see “Powering Off the server” on page 101.
c. In the Actions section of the Summary screen, click the Locator Indicator Turn On button.
This action activates the Locator indicator on the server front panel. For more Locator indicator management options, see “Managing the Locator Indicator” on page 107.
2. Remove the power cord retainer clips by lifting them up to disengage them from the power cords.
3. Remove both server power cords.
Caution - Data loss. Removing the power cords when the server is in full power mode results in an immediate shutdown of the server. Do not remove the power cords if the server is in full power mode.
4. To slide the server out of the rack to the maintenance position, do the following:
Most component service procedures can be performed without removing the server entirely from the rack. Instead, the server can be slid out of the rack on its support rails to an extended and locked position called the maintenance position.
a. At the back of the server, verify that the cables have sufficient length and clearance to extend the server to the maintenance position without damaging or overextending the cables.
The cable management arm (CMA) that is supplied with the server is hinged to facilitate extending the server to the maintenance position. Ensure the sliding movement does not impede or damage the cables. If necessary, label and remove cables from the back of the server.
b. (Optional) Release and reposition the CMA to access the back of the server.
See “Release the CMA” on page 98.
c. At the front of the server, release the slide rails by pushing the two green latches inward.
d. Slowly pull the server forward until both slide rails lock at the fully extended maintenance position.
The locking action is accompanied by an audible click. The server is now in the maintenance position and ready for service.
5. (Optional) Remove the server entirely from the rack.
See “(Optional) Remove the Server from the Rack” on page 99.
6. Set up an ESD-safe service location.
See “Performing Electrostatic Discharge and Static Prevention Measures” on page 79.
7. Remove the top cover.
See “Remove the Server Cover” on page 110.
Caution - Component ESD damage. Circuit boards and hard drives contain electronic
components that are extremely sensitive to static electricity. Do not touch or handle components
unless you are wearing a properly grounded anti-static wrist strap.
Next Steps
For component cold-service procedures, see:
■ “Servicing CRU Components” on page 113
■ “Servicing FRU Components” on page 179
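The caution in step 3 amounts to an ordering constraint: the power cords are removed only after the server has been powered off to standby power mode. A minimal sketch of that check follows. PowerState and can_remove_power_cords are hypothetical names, and the model includes only the two power modes named in this procedure.

```python
from enum import Enum


class PowerState(Enum):
    """The two power modes named in the cold-service procedure."""
    FULL_POWER = "full power"
    STANDBY = "standby power"


def can_remove_power_cords(state):
    """Return True only when the server is in standby power mode.

    Removing the cords in full power mode causes an immediate shutdown
    and risks data loss, so that state is rejected.
    """
    return state is PowerState.STANDBY


print(can_remove_power_cords(PowerState.FULL_POWER))
print(can_remove_power_cords(PowerState.STANDBY))
```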
Release the CMA
If you are using a cable management arm (CMA), release and reposition it to gain additional access to the back of the server.
1. Press and hold the tab.
2. Swing the CMA away from the server.
(Optional) Remove the Server from the Rack
To perform some service procedures, you might find it necessary or more convenient to completely remove the server from the rack, rather than work on the server while it is in the maintenance position. These optional steps show you how to remove the server entirely from the rack.
Caution - Physical or component damage. The server is heavy and cannot be safely removed
from the rack by a single person. Use two or more personnel and a mechanical lift to remove
the server from the rack.
1. Prepare the server for service.
See “Prepare the Server for Cold Service” on page 94.
2. Ensure that the server is in the maintenance position.
3. Set up an ESD-safe service location.
Caution - Component ESD damage. Circuit boards and hard drives contain electronic components that are extremely sensitive to static electricity. Do not touch or handle components unless you are wearing a properly grounded anti-static wrist strap.
See “Performing Electrostatic Discharge and Static Prevention Measures” on page 79.
4. Pull the mounting release brackets [1] toward the front of the server.