This software and related documentation are provided under a license agreement containing restrictions on use and disclosure and are protected by intellectual property laws. Except
as expressly permitted in your license agreement or allowed by law, you may not use, copy, reproduce, translate, broadcast, modify, license, transmit, distribute, exhibit, perform,
publish, or display any part, in any form, or by any means. Reverse engineering, disassembly, or decompilation of this software, unless required by law for interoperability, is
prohibited.
The information contained herein is subject to change without notice and is not warranted to be error-free. If you find any errors, please report them to us in writing.
If this is software or related documentation that is delivered to the U.S. Government or anyone licensing it on behalf of the U.S. Government, then the following notice is applicable:
U.S. GOVERNMENT END USERS. Oracle programs, including any operating system, integrated software, any programs installed on the hardware, and/or documentation, delivered
to U.S. Government end users are "commercial computer software" pursuant to the applicable Federal Acquisition Regulation and agency-specific supplemental regulations. As
such, use, duplication, disclosure, modification, and adaptation of the programs, including any operating system, integrated software, any programs installed on the hardware, and/or
documentation, shall be subject to license terms and license restrictions applicable to the programs. No other rights are granted to the U.S. Government.
This software or hardware is developed for general use in a variety of information management applications. It is not developed or intended for use in any inherently dangerous
applications, including applications that may create a risk of personal injury. If you use this software or hardware in dangerous applications, then you shall be responsible to take all
appropriate fail-safe, backup, redundancy, and other measures to ensure its safe use. Oracle Corporation and its affiliates disclaim any liability for any damages caused by use of this
software or hardware in dangerous applications.
Oracle and Java are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of their respective owners.
Intel and Intel Xeon are trademarks or registered trademarks of Intel Corporation. All SPARC trademarks are used under license and are trademarks or registered trademarks of
SPARC International, Inc. AMD, Opteron, the AMD logo, and the AMD Opteron logo are trademarks or registered trademarks of Advanced Micro Devices. UNIX is a registered
trademark of The Open Group.
This software or hardware and documentation may provide access to or information about content, products, and services from third parties. Oracle Corporation and its affiliates are
not responsible for and expressly disclaim all warranties of any kind with respect to third-party content, products, and services unless otherwise set forth in an applicable agreement
between you and Oracle. Oracle Corporation and its affiliates will not be responsible for any loss, costs, or damages incurred due to your access to or use of third-party content,
products, or services, except as set forth in an applicable agreement between you and Oracle.
Documentation Accessibility
For information about Oracle's commitment to accessibility, visit the Oracle Accessibility Program website at http://www.oracle.com/pls/topic/lookup?ctx=acc&id=docacc.
Access to Oracle Support
Oracle customers that have purchased support have access to electronic support through My Oracle Support. For information, visit http://www.oracle.com/pls/topic/lookup?ctx=acc&id=info or visit http://www.oracle.com/pls/topic/lookup?ctx=acc&id=trs if you are hearing impaired.
SPARC T5-8 Server Service Manual • November 2015
Using This Documentation
■ Overview – Describes how to troubleshoot and maintain the server
■ Audience – Technicians, system administrators, and authorized service providers
■ Required knowledge – Advanced experience troubleshooting and replacing hardware
Product Documentation Library
Late-breaking information and known issues for this product are included in the documentation
library at http://www.oracle.com/goto/T5-8/docs.
Feedback
Provide feedback about this documentation at http://www.oracle.com/goto/docfeedback.
Identifying Components
These topics identify key components of the server, including major modules and
subassemblies, as well as front and rear panel features.
■ “Front Panel Components (Service)” on page 14
■ “Rear Panel Components (Service)” on page 15
■ “DIMM Locations” on page 16
■ “Main Module Internal Component Locations” on page 17
■ “Supported Storage Devices” on page 18
■ “Fan Module Locations” on page 20
■ “Rear I/O Module Port Locations” on page 21
■ “Chassis Subassembly Components” on page 22
■ “Component Service Task Reference” on page 23
Related Information
■ “Detecting and Managing Faults”
■ “Preparing for Service”
■ “Returning the Server to Operation”
Front Panel Components (Service)
No. | Description | Links
1 | Hard drives (8) | “Servicing Hard Drives”; “Supported Storage Devices” on page 18
2 | Processor modules (0 to 3, bottom to top) or processor filler modules (slot 1 and slot 2) | “Servicing Processor Modules”; “Server Upgrade Process” on page 59
3 | Main module | “Servicing the Main Module”
4 | Power supplies (0 to 3, left to right) | “Servicing Power Supplies”
Related Information
■ “Rear Panel Components (Service)” on page 15
■ “Chassis Subassembly Components” on page 22
Rear Panel Components (Service)
After you install the server into the rack, you must access these components from the rear of the
server. These components are not part of the rear chassis subassembly. You must remove these
components to access the rear chassis subassembly.
No. | Description | Links
1 | Fan modules (10) | “Servicing Fan Modules”; “Fan Module Locations” on page 20
2 | AC power connectors (0 to 3, right to left) | “Servicing the Rear Chassis Subassembly”
3 | Rear I/O module | “Servicing the Rear I/O Module”
4 | PCIe carriers (1 to 16, left to right) | “Servicing PCIe Cards”
This illustration shows the rear chassis subassembly removed from the server chassis. The rear
chassis subassembly is removed and serviced as a single unit.
No. | Description | Links
1 | Server chassis | “Servicing the Rear Chassis Subassembly”
2 | Midplane |
3 | Rear chassis subassembly |
Related Information
■ “Front Panel Components (Service)” on page 14
■ “Chassis Subassembly Components” on page 22
DIMM Locations
DIMMs are located in each of the processor modules.
This figure shows a processor module with all of the DIMM slots populated with DIMMs.
All of the DIMM slots must be populated with the same type of DIMMs. For guidelines, see
“DIMM Configuration” on page 71.
No. | Description | Link
1 | DIMMs | “Servicing DIMMs”
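As a rough illustration of the population rule above, the following Python sketch checks the slot contents of one processor module. The function name, the list representation, and the type strings are hypothetical. The rule is read here as two conditions (no empty slots, one DIMM type), which is an interpretation of the text and not an Oracle-defined procedure.

```python
def dimm_population_problems(slots):
    """Return a list of problems found for one processor module.

    slots: one entry per DIMM slot, either a type string such as "16 GB"
    or None for an empty slot (hypothetical representation).
    """
    problems = []

    # Condition 1 (as read from the text): every slot is populated.
    empty = sum(1 for dimm in slots if dimm is None)
    if empty:
        problems.append(f"{empty} empty DIMM slot(s)")

    # Condition 2: all installed DIMMs are of the same type.
    types = sorted({dimm for dimm in slots if dimm is not None})
    if len(types) > 1:
        problems.append("mixed DIMM types: " + ", ".join(types))

    return problems


if __name__ == "__main__":
    print(dimm_population_problems(["16 GB", "16 GB", None, "32 GB"]))
```

An empty result means the module passes both checks. The authoritative rules are in “DIMM Configuration” on page 71.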
Main Module Internal Component Locations
These components are accessible after you remove the main module from the front of the
server.
No. | Description | Links
1 | Hard drives (8) | “Servicing Hard Drives”
2 | Front I/O module | “Servicing the Front I/O Assembly”
3 | Storage backplanes (2) | “Servicing the Storage Backplanes”
4 | Motherboard | “Servicing the Main Module”
5 | SP | “Servicing the Service Processor Card”
6 | System configuration PROM | “Servicing the System Configuration PROM”
7 | Battery | “Servicing the Battery”
Supported Storage Devices
The server supports these storage devices:
■ Fibre channel arrays (SATA, FC, flash, and SAS-2)
■ SAS arrays (SAS-2)
■ ZFS appliances (SAS-2)
The server also supports these types of tape backup and restore devices:
■ TCP/IP
■ Fibre channel
■ SAS
■ LVD SCSI
You can install a mixture of storage devices, but the server requires at least one storage device
to be installed and operational.
No. | Description | Link
1 | Drive 1 | “Servicing Hard Drives”
2 | Drive 0 |
3 | Drive 3 |
4 | Drive 2 |
5 | Drive 5 |
6 | Drive 4 |
7 | Drive 7 |
8 | Drive 6 |
Fan Module Locations
No. | Description | Link
1 | Fan module 5 | “Servicing Fan Modules”
2 | Fan module 0 |
3 | Fan module 6 |
4 | Fan module 1 |
5 | Fan module 7 |
6 | Fan module 2 |
7 | Fan module 8 |
8 | Fan module 3 |
9 | Fan module 9 |
10 | Fan module 4 |
Rear I/O Module Port Locations
The rear I/O module is located on the rear panel above the AC power connectors. The rear I/O
module provides access to all of the internal PCIe ports for the server and the video and USB
ports for input devices.
No. | Port | Label | Description
1 | Network management | NET MGT | This Ethernet port enables you to connect the server to your local network so that you can manage the server from a remote location.
2 | Serial management | SER MGT | This serial port enables you to connect directly to the SP.
3 | Ethernet network | NETx | Where x is the number of the port. You can use these Ethernet ports to connect the server to local or wide area networks.
4 | USB | | These ports provide access for input devices.
5 | Video | | This port provides access to the console output on the SP.
Related Information
■ “Servicing the Rear I/O Module”
■ “Chassis Subassembly Components” on page 22
Chassis Subassembly Components
No. | Description | Links
1 | Hard drives (8) | “Servicing Hard Drives”
2 | Front I/O assembly | “Servicing the Front I/O Assembly”
3 | Main module | “Servicing the Main Module”
4 | System controls and indicators | “Front Panel Components (Service)” on page 14
5 | Processor modules (4 fully populated) or processor filler modules (slot 1 and slot 2) | “Servicing Processor Modules”
6 | Chassis | “Servicing the Rear Chassis Subassembly”
7 | Rear chassis subassembly | “Servicing the Rear Chassis Subassembly”
8 | Fan modules (4) | “Servicing Fan Modules”
9 | PCIe carriers (16) | “Servicing PCIe Cards”
10 | Rear I/O module | “Servicing the Rear I/O Module”
11 | Power supplies (4) | “Servicing Power Supplies”
Related Information
■ “Front Panel Components (Service)” on page 14
■ “Rear Panel Components (Service)” on page 15
■ “Component Service Task Reference” on page 23
Component Service Task Reference
This table lists the names of server components that you can service. It also lists the system
names and task locations for the components.
Component | Max. | NAC Name | SDM Name | Notes | Links
Battery | 1 | /SYS/MB/BAT | | | “Servicing the Battery”
Chassis | 1 | /SYS | /System | | Refer to the SPARC T5 Server Installation Guide.
DIMMs | 128 | /SYS/PMx/CMx/CMP/BOBx/CHx/D0 | /System/Memory/DIMMs/DIMM_x | 16 or 32 GB | “Servicing DIMMs”
Disk drives | 8 | /SYS/SASBPx/HDDx | /System/Storage/Disks/Disks_x | SAS (300 GB or 600 GB) or SSD (100 GB or 300 GB) | “Servicing Hard Drives”
Fan modules | 10 | /SYS/RCSA/FBDx/FMx | /System/Cooling/Fans/Fan_x | | “Servicing Fan Modules”
Front I/O assembly | 1 | /SYS/FIO | | These components are included: ■ FIO board with FRU PROM ■ VGA board (no FRU PROM) ■ FIO enclosure with cables | “Servicing the Front I/O Assembly”
Main module motherboard | 1 | /SYS/MB | | These internal components must be reused: ■ Front I/O module ■ Service processor ■ Disk backplanes (2) ■ SCC PROM ■ Battery ■ Disk drives (all) | “Servicing the Main Module”
PCIe carriers | 16 | /SYS/RCSA/PCIEx/CAR | | | “Servicing PCIe Cards”
PCIe cards | 16 | /SYS/RCSA/PCIEx/CAR/CARD | /System/PCI_Devices/Addon/Device_x | | “Servicing PCIe Cards”
Processor modules | 4 | /SYS/PMx | /System/CPU_Modules/CPU_Module_x | 4 for fully populated configuration. 2 for half populated configuration with 2 processor filler modules. | “Servicing Processor Modules”; “Server Upgrade Process” on page 59
Rear I/O module | 1 | /SYS/RIO | | | “Servicing the Rear I/O Module”
Power supplies | 4 | /SYS/PSx | /System/Power/Power_Supplies/Power_Supply_x | | “Servicing Power Supplies”
Rear chassis subassembly | 1 | /SYS/RSCA | | | “Servicing the Rear Chassis Subassembly”
SCC PROM | 1 | /SYS/MB/SCC | | | “Servicing the System Configuration PROM”
SP | 1 | /SP/SP | | | “Servicing the Service Processor Card”
Storage backplanes | 2 | /SYS/SASB/Px | | | “Servicing the Storage Backplanes”
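The Max. column lends itself to a simple inventory check. The following Python sketch copies the maximum counts from the table and reports any component whose planned count exceeds its maximum. The dictionary, the function name, and the inventory format are illustrative assumptions, not part of any Oracle tool.

```python
# Maximum component counts, transcribed from the Max. column of the table.
MAX_COUNT = {
    "Battery": 1,
    "Chassis": 1,
    "DIMMs": 128,
    "Disk drives": 8,
    "Fan modules": 10,
    "Front I/O assembly": 1,
    "Main module motherboard": 1,
    "PCIe carriers": 16,
    "PCIe cards": 16,
    "Processor modules": 4,
    "Rear I/O module": 1,
    "Power supplies": 4,
    "Rear chassis subassembly": 1,
    "SCC PROM": 1,
    "SP": 1,
    "Storage backplanes": 2,
}


def over_maximum(inventory):
    """Return {component: count} for every entry above its table maximum.

    inventory: hypothetical mapping of component name to planned count.
    Raises KeyError for a component that is not listed in the table.
    """
    return {
        name: count
        for name, count in inventory.items()
        if count > MAX_COUNT[name]
    }


if __name__ == "__main__":
    print(over_maximum({"Disk drives": 9, "Power supplies": 4}))
```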
Related Information
■ “Front Panel Components (Service)” on page 14
■ “Rear Panel Components (Service)” on page 15
■ “Chassis Subassembly Components” on page 22
■ “Server Upgrade Process” on page 59
■ SPARC and Netra SPARC T5 Series Servers Administration Guide
Detecting and Managing Faults
These topics explain how to use various diagnostic tools to monitor server status and
troubleshoot faults in the server. The examples use the PSH fmadm faulty command.
■ “Understanding Diagnostics” on page 25
■ “Interpreting LEDs” on page 29
■ “Configuring POST” on page 35
■ “Managing Faults” on page 41
■ “Interpreting Log Files and System Messages” on page 45
Related Information
■ “Identifying Components”
■ “Preparing for Service”
■ “Component Service Task Reference” on page 23
■ “Returning the Server to Operation”
■ Oracle ILOM Documentation Library
Understanding Diagnostics
These topics explain the diagnostic process and tools.
■ “Diagnostics Process” on page 25
■ “Tool Availability” on page 27
■ “Log In to Oracle ILOM (Service)” on page 28
■ “Oracle ILOM Service-Related Tools” on page 29
Diagnostics Process
Depending on the fault, you might need to perform all of the steps or just some of them. You
also might have to run diagnostic software that needs to be installed or enabled.
Note - The diagnostic tools you use, and the order in which you use them, depend on the nature
of the problem you are troubleshooting. However, for descriptive purposes, this table follows
the steps given in the illustration.
Step | Diagnostic Action | Possible Outcome | Links
1. | Confirm that the Power OK and AC OK LEDs are lit. | If these LEDs are not lit, check the power source and power connections to the server. | “Interpreting LEDs” on page 29
2. | Check the server for detected faults. | Use these tools to check for faults: ■ System LEDs on the front and rear panels. ■ fmadm faulty from the Oracle Solaris prompt or through the Oracle ILOM fault management shell. ■ show faulty from the Oracle ILOM prompt or through the Open Problems BUI. ■ Datacenter management tools, such as Oracle Enterprise Manager Ops Center. | “Check for Faults” on page 41
3. | Check the log files for fault information. | If system messages indicate a faulty component, replace it. | “Interpreting Log Files and System Messages” on page 45
4. | Run Oracle VTS software. | To run Oracle VTS, the server must be running the Oracle Solaris OS. ■ If Oracle VTS reports a faulty component, replace it. ■ If Oracle VTS does not report a faulty component, run POST. | ■ Refer to the Oracle VTS software documentation. ■ “Configuring POST” on page 35 ■ Contact technical support if the problem persists.
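The outcomes in the table can be read as a short decision sequence. The Python sketch below is one possible encoding of steps 1, 3, and 4. The function name and its boolean inputs are hypothetical, and step 2 (choosing a tool to check for faults) is noted only as a comment because the table gives it no single outcome.

```python
def next_diagnostic_action(power_leds_lit, logs_indicate_fault, vts_reports_fault):
    """Return the next action suggested by the diagnostics table.

    All three arguments are booleans supplied by the technician; nothing
    here queries the server.
    """
    # Step 1: Power OK and AC OK LEDs.
    if not power_leds_lit:
        return "Check the power source and power connections to the server."

    # Step 2 (not modeled): check for faults with the LEDs, fmadm faulty,
    # show faulty, or datacenter management tools.

    # Step 3: log files and system messages.
    if logs_indicate_fault:
        return "Replace the faulty component."

    # Step 4: Oracle VTS (requires the Oracle Solaris OS to be running).
    if vts_reports_fault:
        return "Replace the faulty component."
    return "Run POST."


if __name__ == "__main__":
    print(next_diagnostic_action(True, False, False))
```

As the note before the table says, the real order of tools depends on the problem being investigated, so this ordering is descriptive only.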
Related Information
■ “Tool Availability” on page 27
■ “Log In to Oracle ILOM (Service)” on page 28
■ “Oracle ILOM Service-Related Tools” on page 29
■ Oracle ILOM Documentation Library
Tool Availability
This table describes what tools are available at the different states in which the server operates.
Tool | Oracle ILOM Prompt | OpenBoot Prompt | Oracle Solaris Prompt
Status LEDs | Yes | Yes | Yes
PSH commands | Yes | No | Yes
Oracle ILOM logs and commands | Yes | No | No
OpenBoot commands | No | Yes | No
Oracle Solaris logs and commands | No | No | Yes
Oracle VTS | No | No | Yes (if installed)
Third-party software | No | No | Yes (if installed)
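For quick reference, the availability table can be expressed as a lookup. This Python sketch transcribes the table into a dictionary and lists the tools usable at a given prompt. The short keys "ilom", "openboot", and "solaris" and the function name are illustrative choices, not Oracle terminology.

```python
# Transcribed from the Tool Availability table. Oracle VTS and third-party
# software are available at the Oracle Solaris prompt only if installed.
TOOL_AVAILABILITY = {
    "Status LEDs": {"ilom": True, "openboot": True, "solaris": True},
    "PSH commands": {"ilom": True, "openboot": False, "solaris": True},
    "Oracle ILOM logs and commands": {"ilom": True, "openboot": False, "solaris": False},
    "OpenBoot commands": {"ilom": False, "openboot": True, "solaris": False},
    "Oracle Solaris logs and commands": {"ilom": False, "openboot": False, "solaris": True},
    "Oracle VTS": {"ilom": False, "openboot": False, "solaris": True},
    "Third-party software": {"ilom": False, "openboot": False, "solaris": True},
}


def tools_at(prompt):
    """Return the tools listed as available at the given prompt key."""
    return [tool for tool, states in TOOL_AVAILABILITY.items() if states[prompt]]


if __name__ == "__main__":
    for key in ("ilom", "openboot", "solaris"):
        print(key, "->", ", ".join(tools_at(key)))
```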
Related Information
■ “Diagnostics Process” on page 25
■ “Log In to Oracle ILOM (Service)” on page 28
■ “Oracle ILOM Service-Related Tools” on page 29
■ Oracle ILOM Documentation Library
Log In to Oracle ILOM (Service)
1. At the terminal prompt, type:
ssh root@IP-address
Password: password
Oracle (R) Integrated Lights Out Manager
Version 3.2.1.2 rXXXXX
Copyright (c) 2013, Oracle and/or its affiliates. All rights reserved.
->
Note - To enable first-time login and access to Oracle ILOM, a default Administrator account
and its password are provided with the system. To build a secure environment, you must change
the default password (changeme) for the default Administrator account (root) after your initial
login to Oracle ILOM. If this default Administrator account has since been changed, contact
your system administrator for an Oracle ILOM user account with Administrator privileges.
2. Enable the Oracle ILOM 3.0 legacy name spaces.
-> set /SP/cli legacy_targets=enabled
Note - In Oracle ILOM 3.1, the name spaces for /SYS and /STORAGE were replaced with
/System. You can still use the 3.0 legacy names in commands at any time, but to expose the
legacy names in the output, you must enable them. This manual uses the legacy names in the
command examples and shows the names in the output examples. For more information about
the new name spaces, see the Oracle ILOM documentation.
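To make the name space distinction in the note concrete, the Python sketch below classifies a target path by its leading element. It relies only on the prefixes named in the note (/SYS and /STORAGE for the 3.0 legacy names, /System for the 3.1 names). The function name and the returned labels are hypothetical.

```python
LEGACY_PREFIXES = ("/SYS", "/STORAGE")  # Oracle ILOM 3.0 legacy name spaces
NEW_PREFIX = "/System"                  # Oracle ILOM 3.1 name space


def name_space(target):
    """Classify an Oracle ILOM target path as legacy, 3.1, or other."""
    for prefix in LEGACY_PREFIXES:
        if target == prefix or target.startswith(prefix + "/"):
            return "legacy (3.0)"
    if target == NEW_PREFIX or target.startswith(NEW_PREFIX + "/"):
        return "3.1"
    # Targets such as /HOST or /SP are outside the renamed name spaces.
    return "other"


if __name__ == "__main__":
    for path in ("/SYS/MB/BAT", "/System/Memory", "/HOST"):
        print(path, "->", name_space(path))
```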
Related Information
■ “Diagnostics Process” on page 25
■ “Tool Availability” on page 27
■ “Oracle ILOM Service-Related Tools” on page 29
■ “Matching Devices to Device Names” in SPARC and Netra SPARC T5 Series Servers Administration Guide
■ Oracle ILOM Documentation Library
Oracle ILOM Service-Related Tools
You can use these Oracle ILOM shell commands when performing service-related tasks.
Oracle ILOM Command | Description
help [command] | Displays a list of all available commands with syntax and descriptions. Specifying a command name as an option displays help for that command.
set /HOST send_break_action=break | Takes the host server from the OS to either kmdb or OpenBoot prompt (equivalent to a Stop-A), depending on the mode in which the Oracle Solaris OS was booted.
start /HOST/console | Connects to the host.
show /HOST/console/history | Displays the contents of the host's console buffer.
set /HOST/bootmode property=value | Controls the method of booting for the host server's firmware. The value of property can be state, config, or script.
stop /System or stop /SYS | Powers off the host server.
start /System or start /SYS | Powers on the host server.
reset /System or reset /SYS | Generates a hardware reset on the host server.
reset /SP | Reboots the SP.
Related Information
■ “Diagnostics Process” on page 25
■ “Tool Availability” on page 27
■ “Log In to Oracle ILOM (Service)” on page 28
■ Oracle ILOM Documentation Library
Interpreting LEDs
Use these steps to determine if an LED indicates that a component has failed in the server.
Steps | Description | Links
1. | Check the LEDs on the front and rear of the server. | ■ “Front Panel Controls and LEDs” on page 31 ■ “Rear Panel Controls and LEDs” on page 33
2. | Check the LEDs on the individual components. Note - Component LEDs might not be lit even though the component is faulty. Use the instructions in these links to determine if the component has been diagnosed as being faulty. | ■ “Determine if the Main Module Is Faulty” on page 91 ■ “Determine Which Processor Module Is Faulty” on page 62 ■ “Determine Which DIMM Is Faulty (LEDs)” on page 75 ■ “Determine Which Hard Drive Is Faulty” on page 82 ■ “Determine Which Power Supply Is Faulty” on page 134 ■ “Determine Which Fan Module Is Faulty” on page 142 ■ “Determine Which PCIe Card Is Faulty” on page 155 ■ “Determine if the Rear I/O Module Is Faulty” on page 173
Related Information
■ “Understanding Diagnostics” on page 25
■ “Managing Faults” on page 41
Front Panel Controls and LEDs
No. | LED | Icon or Label | Description
1 | Locator LED and button (white) | | You can turn on the Locator LED to identify a particular server. When lit, the LED blinks rapidly. Turn on the Locator LED by pressing the Locator button, or see “Locate the Server” on page 53.
2 | Service Required LED (amber) | | The fmadm faulty command provides details about any faults that cause this indicator to light. See “Check for Faults” on page 41. Under some fault conditions, individual component fault LEDs are lit in addition to the Service Required LED.
3 | Power OK LED (green) | | Indicates these conditions: ■ Off – Server is not running in its normal state. Server power might be off. The SP might be running. ■ Steady on – Server is powered on and is running in its normal operating state. No service actions are required. ■ Fast blink – Server is running in standby mode and can be quickly returned to full function. ■ Slow blink – A normal but transitory activity is taking place. Slow blinking might indicate that server diagnostics are running or that the server is booting.
4 | Power button | | The recessed Power button toggles the server on or off. See “Power Off the Server (Power Button - Graceful)” on page 55.
5 | System Overtemp LED (amber) | | Indicates these conditions: ■ Off – Indicates a steady state, no service action is required. ■ Steady on – Indicates that a temperature failure event has been acknowledged and a service action is required.
6 | Fan Module Fault LED (amber) | Rear FM | Indicates these conditions: ■ Off – Indicates a steady state, no service action is required. ■ Steady on – Indicates that a fan module failure event has been acknowledged and a service action is required on at least one of the fan modules.
7 | PCIe Card Fault LED (amber) | Rear PCIe | Indicates these conditions: ■ Off – Indicates a steady state, no service action is required. ■ Steady on – Indicates that a failure event has been acknowledged and a service action is required on at least one of the PCIe cards.
Related Information
■ “Rear Panel Controls and LEDs” on page 33
■ “Understanding Diagnostics” on page 25
Rear Panel Controls and LEDs
No. | LED | Icon or Label | Description
1 | AC 0 (left) and AC 1 (right) power LEDs | | Indicates these conditions: ■ Off – No power is applied to the server. ■ Green – Power is applied to the server.
2 | Net MGT port link LED | | Indicates these conditions: ■ Off – No link is established. ■ On or blinking – A link is established.
3 | Net MGT port speed LED | | Indicates these conditions: ■ Off – The link is operating as a 10-Mbps connection. ■ On or blinking – The link is operating as a 100-Mbps connection.
4 | Network port link LED | | Indicates these conditions: ■ Off – No link is established. ■ Blinking – A link is established.
5 | Network port speed LED | | Indicates these conditions: ■ Off – The link is operating as a 10-Mbps connection or there is no link. ■ Amber on – The link is operating as a 100-Mbps connection. ■ Green on – The link is operating as a Gigabit connection (1000 Mbps).
6 | AC 2 (left) and AC 3 (right) power LEDs | | Indicates these conditions: ■ Off – No power is applied to the server. ■ Green – Power is applied to the server.
7 | Locator LED and button (white) | | Turn on the Locator LED by pressing the Locator button, or see “Locate the Server” on page 53. When lit, the LED blinks rapidly.
8 | Service Required LED (amber) | | The fmadm faulty command provides details about any faults that cause this indicator to light. See “Check for Faults” on page 41. Under some fault conditions, individual component fault LEDs are lit in addition to the Service Required LED.
9 | Power OK LED (green) | | Indicates these conditions: ■ Off – Server is not running in its normal state. System power might be off. The SP might be running. ■ Steady on – Server is powered on and is running in its normal operating state. No service actions are required. ■ Fast blink – Server is running in standby mode and can be quickly returned to full function. ■ Slow blink – A normal but transitory activity is taking place. Slow blinking might indicate that system diagnostics are running or that the system is booting.
10 | SP LED | SP | Indicates these conditions: ■ Off – AC power might have been connected to the power supplies. ■ Steady on, green – SP is running in its normal operating state. No service actions are required. ■ Blink, green – SP is initializing the Oracle ILOM firmware. ■ Steady on, amber – An SP error has occurred and service is required.
11 | Physical presence button | | This button can be used to prove physical presence in the case of log-in recovery.
12 | Overtemp LED (amber) | | Indicates these conditions: ■ Off – Indicates a steady state, no service action is required. ■ Steady on – Indicates that a temperature failure event has been acknowledged and a service action is required.
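The Power OK LED has the most states of any indicator in these tables, so a small lookup can help when scripting a checklist. The Python sketch below maps each observed state to the meaning given in the table. The state keys and the function name are illustrative, and the state must be observed by a person; nothing here reads the LED.

```python
# Meanings transcribed from the Power OK LED (green) row of the table.
POWER_OK_STATES = {
    "off": "Server is not running in its normal state. "
           "System power might be off. The SP might be running.",
    "steady on": "Server is powered on and is running in its normal "
                 "operating state. No service actions are required.",
    "fast blink": "Server is running in standby mode and can be quickly "
                  "returned to full function.",
    "slow blink": "A normal but transitory activity is taking place, such as "
                  "diagnostics running or the system booting.",
}


def describe_power_ok(state):
    """Return the table meaning for an observed Power OK LED state."""
    key = state.strip().lower()
    if key not in POWER_OK_STATES:
        raise ValueError(f"unknown Power OK LED state: {state!r}")
    return POWER_OK_STATES[key]


if __name__ == "__main__":
    print(describe_power_ok("Fast blink"))
```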
Related Information
■ “Front Panel Controls and LEDs” on page 31
■ “Understanding Diagnostics” on page 25
Configuring POST
These topics explain how to configure POST as a diagnostic tool.
■ “POST Overview” on page 35
■ “Oracle ILOM Properties That Affect POST Behavior” on page 35
■ “Configure POST” on page 39
■ “Run POST With Maximum Testing” on page 40
POST Overview
POST is a group of PROM-based tests that run when the server is powered on or when it is
reset. POST checks the basic integrity of the critical hardware components in the server.
You can also set other Oracle ILOM properties to control various other aspects of POST
operations. For example, you can specify the events that cause POST to run, the level of testing
POST performs, and the amount of diagnostic information POST displays. These properties are
described in “Oracle ILOM Properties That Affect POST Behavior” on page 35.
If POST detects a faulty component, the component is disabled automatically. If the server is
able to run without the disabled component, the server boots when POST completes its tests.
For example, if POST detects a faulty processor core, the core is disabled, POST completes its
test sequence, and the server boots using the remaining cores.
Related Information
■ “Oracle ILOM Properties That Affect POST Behavior” on page 35
■ “Configure POST” on page 39
■ “Run POST With Maximum Testing” on page 40
Oracle ILOM Properties That Affect POST Behavior
Note - The value of keyswitch_state must be normal when individual POST parameters are
changed.
Caution - Setting the verbosity values to max increases the time that POST takes to complete its testing of the server.
TABLE 1   /HOST keyswitch_state
Value     Description
normal    The server can power on and run POST (based on the other parameter
          settings). This parameter overrides all other commands.
diag      The server runs POST based on predetermined settings.
standby   The server cannot power on.
locked    The server can power on and run POST, but no flash updates can be made.
TABLE 2   /HOST/diag mode
Value     Description
off       POST does not run.
normal    POST runs according to diag level value.
/HOST/diag level
Value     Description
max       If diag mode=normal, runs all the minimum tests plus extensive
          processor and memory tests.
min       If diag mode=normal, runs minimum set of tests.
TABLE 3          /HOST/diag trigger
Value            Description
hw-change        (default) Runs POST following a FRU replacement or an AC power cycle.
all-resets       Runs POST on all resets.
error-reset      Runs POST on all error resets.
power-on reset   Runs POST on every power on.
none             Does not run POST on reset.
TABLE 4   /HOST/diag hw_change_level
Value     Description
max       Runs the maximum set of tests after a hardware change.
min       Runs the minimum set of tests after a hardware change.
TABLE 5   /HOST/diag hw_change_verbosity
Value     Description
min       (default) Displays the minimum level of output during the hardware
          change tests.
max       Displays information for each step.
normal    Displays a moderate amount of information, including component names
          and test results.
debug     Displays extensive debugging information.
none      Disables the output.
TABLE 6   /HOST/diag power_on_level
Value     Description
max       (default) Runs the maximum set of tests.
min       Runs the minimum set of tests.
TABLE 7   /HOST/diag power_on_verbosity
Value     Description
min       (default) Displays the minimum level of output.
max       Displays information for each step.
normal    Displays a moderate amount of information, including component names
          and test results.
debug     Displays extensive debugging information.
none      Disables the output.
TABLE 8   /HOST/diag error_reset_level
Value     Description
max       (default) Runs the maximum set of tests.
min       Runs a minimum set of tests.
TABLE 9   /HOST/diag error_reset_verbosity
Value     Description
min       (default) Displays the minimum level of output.
max       Displays information for each step.
normal    Displays a moderate amount of information, including component names
          and test results.
debug     Displays extensive debugging information.
none      Disables the output.
TABLE 10  /HOST/diag verbosity
Value     Description
normal    Displays all test and informational messages in POST output.
min       Displays functional tests with a banner and pinwheel in POST output.
max       Displays all test, informational, and some debugging messages in POST
          output.
debug     Displays extensive debugging information.
none      Does not display POST output.
This flowchart illustrates the same set of Oracle ILOM set command variables.
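The property tables above can be turned into a small validation helper. The following is a minimal sketch, not part of Oracle ILOM: the names DIAG_PROPERTIES and build_set_command are hypothetical, the value sets are transcribed from the tables in this section, and the trigger property is left out of the sketch. The helper only builds the command text; it does not connect to the SP.

```python
# Allowed values transcribed from the tables in this section.
# DIAG_PROPERTIES and build_set_command are illustrative names, not ILOM APIs.
VERBOSITY_VALUES = {"min", "max", "normal", "debug", "none"}
LEVEL_VALUES = {"min", "max"}

DIAG_PROPERTIES = {
    "mode": {"off", "normal"},
    "level": LEVEL_VALUES,
    "verbosity": VERBOSITY_VALUES,
    "hw_change_level": LEVEL_VALUES,
    "hw_change_verbosity": VERBOSITY_VALUES,
    "power_on_level": LEVEL_VALUES,
    "power_on_verbosity": VERBOSITY_VALUES,
    "error_reset_level": LEVEL_VALUES,
    "error_reset_verbosity": VERBOSITY_VALUES,
}


def build_set_command(prop, value):
    """Return the ILOM command text for one /HOST/diag property, or raise."""
    allowed = DIAG_PROPERTIES.get(prop)
    if allowed is None:
        raise KeyError(f"unknown /HOST/diag property: {prop}")
    if value not in allowed:
        raise ValueError(
            f"{value!r} is not listed for {prop}; expected one of {sorted(allowed)}"
        )
    return f"set /HOST/diag {prop}={value}"


if __name__ == "__main__":
    print(build_set_command("mode", "normal"))
    print(build_set_command("verbosity", "max"))
```

Checking a value against the table before typing it at the -> prompt catches mistakes such as setting a level property to debug, which is listed only for the verbosity properties.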
Related Information
■
“POST Overview” on page 35
■
“Configure POST” on page 39
■
“Run POST With Maximum Testing” on page 40
Configure POST
1.
Log in to Oracle ILOM.
See “Log In to Oracle ILOM (Service)” on page 28.
2.
Set the virtual keyswitch to the value that corresponds to the POST
configuration you want to run.
This example sets the virtual keyswitch to normal, which configures POST to run according to
other parameter values.
-> set /HOST keyswitch_state=normal
Set 'keyswitch_state' to 'Normal'
For possible values for the keyswitch_state parameter, see “Oracle ILOM Properties That
Affect POST Behavior” on page 35.
3.
If the virtual keyswitch is set to normal, and you want to define the mode, level,
verbosity, or trigger, set the respective parameters.
Syntax:
set /HOST/diag property=value
See “Oracle ILOM Properties That Affect POST Behavior” on page 35 for a list of
parameters and values.
Examples:
-> set /HOST/diag mode=normal
-> set /HOST/diag verbosity=max
4.
View the current values for settings.
Example:
-> show /HOST/diag
/HOST/diag
Targets:
Properties:
error_reset_level = max
error_reset_verbosity = normal
hw_change_level = max
hw_change_verbosity = normal
level = min
mode = normal
power_on_level = max
power_on_verbosity = normal
trigger = hw_change error-reset
verbosity = normal
Commands:
cd
set
show
->
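If you capture the output of show /HOST/diag in a text file or terminal log, a short script can extract the property values for review. This is a minimal sketch under that assumption: parse_properties is a hypothetical helper, and SAMPLE is an abbreviated copy of the example output in Step 4.

```python
# SAMPLE is an abbreviated copy of the example output in Step 4.
SAMPLE = """\
/HOST/diag
Targets:
Properties:
error_reset_level = max
level = min
mode = normal
trigger = hw_change error-reset
verbosity = normal
Commands:
cd
set
show
"""


def parse_properties(show_output):
    """Collect the name = value pairs listed under the Properties: heading."""
    props = {}
    in_properties = False
    for line in show_output.splitlines():
        stripped = line.strip()
        if stripped.endswith(":"):
            # A new heading starts; only the Properties block is of interest.
            in_properties = stripped == "Properties:"
            continue
        if in_properties and " = " in stripped:
            name, value = stripped.split(" = ", 1)
            props[name.strip()] = value.strip()
    return props


if __name__ == "__main__":
    for name, value in sorted(parse_properties(SAMPLE).items()):
        print(f"{name}: {value}")
```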
Related Information
■
“POST Overview” on page 35
■
“Oracle ILOM Properties That Affect POST Behavior” on page 35
■
“Run POST With Maximum Testing” on page 40
■
Oracle ILOM Documentation Library
Run POST With Maximum Testing
This procedure describes how to configure the server to run the maximum level of POST.
1.
Log in to Oracle ILOM.
See “Log In to Oracle ILOM (Service)” on page 28.
2.
Set the virtual keyswitch to diag so that POST runs in service mode.
Alternatively, you can use the /System target.
-> set /HOST keyswitch_state=diag
Set 'keyswitch_state' to 'Diag'
3.
Run POST.
Alternatively, you can use the /System target.
-> start /SYS
Are you sure you want to start /SYS (y/n)? y
Starting /SYS
4.
Start the system console to view the output of the tests.
-> start /HOST/console
Related Information
■
“POST Overview” on page 35
■
“Oracle ILOM Properties That Affect POST Behavior” on page 35
■
“Configure POST” on page 39
■
Oracle ILOM Documentation Library
Managing Faults
These topics describe the Predictive Self-Healing (PSH) feature.
■
“PSH Overview” on page 41
■
“Check for Faults” on page 41
■
“Clear a Fault” on page 44
PSH Overview
PSH provides problem diagnosis on the SP and the host. Regardless of where a fault occurs,
you can view and manage the fault diagnosis from the SP or the host.
When possible, PSH initiates steps to take the component offline. PSH also logs the fault to the
syslogd daemon and provides a fault notification with a message ID. You can use the message
ID to get additional information about the problem from the knowledge article database.
A PSH console message provides this information about each detected fault:
■
Type
■
Severity
■
Description
■
Automated response
■
Impact
■
Suggested action for system administrator
If PSH detects a faulty component, use the fmadm faulty command to display information
about the fault. See “Check for Faults” on page 41.
Related Information
■
“Check for Faults” on page 41
■
“Clear a Fault” on page 44
Check for Faults
The fmadm faulty command displays the list of faults detected by PSH. You can run this
command from either the host or through the Oracle ILOM fault management shell.
1.
Log in to Oracle ILOM.
See “Log In to Oracle ILOM (Service)” on page 28.
2.
Check for PSH-diagnosed faults.
This example shows how to check for faults through the Oracle ILOM fault management shell.
-> start /SP/faultmgmt/shell
Are you sure you want to start /SP/faultmgmt/shell (y/n)? y
faultmgmtsp> fmadm faulty
------------------- ------------------------------------ ------------- --------
2012-08-27/19:46:26 4ec16c8d-5cdb-c6ca-c949-e24d3637ef27 PCIEX-8000-8R Major
Problem Status : solved
Diag Engine : [unknown]
System
Manufacturer : Oracle Corporation
Name : SPARC T5-8
Part_Number : 12345678+11+1
Serial_Number : xxxxxxxxxx
---------------------------------------
Suspect 1 of 1
Fault class : fault.io.pciex.device-interr-corr
Certainty : 100%
Affects : hc:///chassis=0/motherboard=0/cpuboard=0/chip=0/hostbridge=0/pciexrc=0
Status : faulted but still in service
FRU
Status : faulty
Location : /SYS/PM0
Manufacturer : Oracle Corporation
Name : TLA,PN,NRM,T5 1.2
Part_Number : 7061001
Revision : 01
Serial_Number : 465769T+12445102WR
Chassis
Manufacturer : Oracle Corporation
Name : SPARC T5-8
Part_Number : 12345678+13+2
Serial_Number : xxxxxxxxxx
Description : A fault has been diagnosed by the Host Operation System.
Response : The service required LED on the chassis and on the affected
FRU may be illuminated.
Impact : No SP impact
Action : Refer to the associated reference document at
http://support.oracle.com/msg/PCIEX-8000-8R for the latest
service procedures and policies regarding this diagnosis.
faultmgmtsp>
In this example, a fault is displayed that includes these details:
■
Date and time of the fault (2012-08-27/19:46:26).
■
UUID (4ec16c8d-5cdb-c6ca-c949-e24d3637ef27), which is unique to each fault.
■
Message identifier (PCIEX-8000-8R), which can be used to obtain additional
fault information from Knowledge Base articles.
3.
Consider your next step:
■
If you are checking for faults while adding additional processor
modules, and no faults were detected, return to the procedure that
directed you here.
■
If a fault is detected, proceed to Step 4.
4.
Use the message ID to obtain more information about this type of fault.
a.
Obtain the message ID from console output.
b.
Go to https://support.oracle.com, and search on the message ID in the
Knowledge tab.
5.
Follow the suggested actions to repair the fault.
6.
Determine your next step.
■
If you found a fault that must be removed manually, go to “Clear a Fault” on page 44.
■
If you are upgrading the server and found no faults, return to “Server Upgrade
Process” on page 59.
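The summary line in the fault listing can also be split into its fields by a script, which is convenient when you save the console output with a service record. This is a minimal sketch: parse_fault_summary is a hypothetical helper, the pattern assumes the four-column layout shown in the Step 2 example, and the article address follows the form printed in the Action field of that example.

```python
import re

# Matches the summary line of the Step 2 example:
# time, UUID, message ID, severity, separated by whitespace.
SUMMARY = re.compile(
    r"^(\d{4}-\d{2}-\d{2}/\d{2}:\d{2}:\d{2})\s+"
    r"([0-9a-f-]{36})\s+(\S+)\s+(\S+)\s*$"
)


def parse_fault_summary(line):
    """Return the fields of one fault summary line, or None if it does not match."""
    match = SUMMARY.match(line.strip())
    if match is None:
        return None
    time, uuid, msgid, severity = match.groups()
    return {
        "time": time,
        "uuid": uuid,
        "msgid": msgid,
        "severity": severity,
        # Same address form as the Action field in the example output.
        "article": f"http://support.oracle.com/msg/{msgid}",
    }


if __name__ == "__main__":
    example = (
        "2012-08-27/19:46:26 4ec16c8d-5cdb-c6ca-c949-e24d3637ef27 "
        "PCIEX-8000-8R Major"
    )
    print(parse_fault_summary(example))
```

The uuid field is the value you later pass to fmadm acquit if the fault must be cleared manually.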
Related Information
■
“PSH Overview” on page 41
■
“Clear a Fault” on page 44
■
“Server Upgrade Process” on page 59
Clear a Fault
When PSH detects faults, the faults are logged and displayed on the console. In most cases,
after the fault is repaired, the corrected state is detected by the server, and the fault condition
is repaired automatically. However, this repair should be verified. In cases where the fault
condition is not automatically cleared, you must clear the fault manually.
1.
After replacing a faulty FRU, power on the server.
See “Returning the Server to Operation”.
2.
At the host prompt, determine whether the replaced FRU still shows a faulty
state.
See “Check for Faults” on page 41.
■
If no fault is reported, you do not need to do anything else. Do not perform
the subsequent steps.
■
If a fault is reported, continue to Step 3.
3.
Clear the fault from all persistent fault records.
In some cases, even though the fault is cleared, some persistent fault information remains
and results in erroneous fault messages at boot time. To ensure that these messages are not
displayed, type this PSH command:
faultmgmtsp> fmadm acquit UUID
4.
If required, reset the server.
In some cases, the output of the fmadm faulty command might include this message for the
faulty component:
faulted and taken out of service
If this message appears in the output, you must reset the server after you manually repair the
fault.
faultmgmtsp> exit
-> reset /System
Are you sure you want to reset /System? y
Resetting /System ...
5.
Clear the fault in the Oracle Enterprise Manager Ops Center software, if
applicable.
Clearing a fault with the fmadm acquit command does not clear that fault in the Oracle
Enterprise Manager Ops Center software. You must manually clear the fault (that is, incident).
For more information, see 9.9.10 Marking an Incident Repaired in the Oracle Enterprise
Manager Ops Center Feature Reference Guide at:
http://www.oracle.com/pls/topic/lookup?ctx=oc122
6.
Determine your next step.
■
If you are servicing a component, return to the procedure for that component.
■
If you are upgrading the server, return to “Server Upgrade Process” on page 59.
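The decision in Step 3 and Step 4 can be summarized in a few lines of code. This is a minimal sketch for planning purposes only: clear_fault_commands is a hypothetical helper that lists the commands from this procedure in order and does not run them.

```python
# The status text below is the message quoted in Step 4.
OUT_OF_SERVICE = "faulted and taken out of service"


def clear_fault_commands(uuid, status):
    """List the commands from Steps 3 and 4 for one fault, in order."""
    commands = [f"fmadm acquit {uuid}"]  # Step 3, typed at faultmgmtsp>
    if status.strip().lower() == OUT_OF_SERVICE:
        # Step 4: leave the fault management shell, then reset the server.
        commands += ["exit", "reset /System"]
    return commands


if __name__ == "__main__":
    uuid = "4ec16c8d-5cdb-c6ca-c949-e24d3637ef27"
    for command in clear_fault_commands(uuid, "faulted and taken out of service"):
        print(command)
```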
Related Information
■
“PSH Overview” on page 41
■
“Check for Faults” on page 41
■
“Server Upgrade Process” on page 59
Interpreting Log Files and System Messages
With the OS running on the server, you have the full complement of Oracle Solaris OS files and
commands available for collecting information and for troubleshooting.
If PSH does not indicate the source of a fault, check the message buffer and log files for
notifications for faults. Drive faults are usually captured by the Oracle Solaris message files.
These topics explain how to view the log files and system messages.
■
“Check the Message Buffer” on page 45
■
“Understanding Diagnostics” on page 25
■
“Managing Faults” on page 41
Check the Message Buffer
The dmesg command checks the system buffer for recent diagnostic messages and displays
them.
1.
Log in as superuser.
2.
Type:
# dmesg
Related Information
■
“View Log Files (Oracle Solaris)” on page 46
■
“View Log Files (Oracle ILOM)” on page 46
View Log Files (Oracle Solaris)
The error logging daemon, syslogd, automatically records various system warnings, errors, and
faults in message files. These messages can alert you to system problems such as a device that
is about to fail.
The /var/adm directory contains several message files. The most recent messages are in
the /var/adm/messages file. After a period of time (usually every week), a new messages
file is automatically created. The original contents of the messages file are rotated to a file
named messages.1. Over a period of time, the messages are further rotated to messages.2 and
messages.3, and then deleted.
1.
Log in as superuser.
2.
Type:
# more /var/adm/messages
3.
To view all logged messages, type:
# more /var/adm/messages*
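Because the rotated files carry older messages, reading them oldest first gives a chronological view. This is a minimal sketch, assuming the messages, messages.1, messages.2, messages.3 naming described above; rotation_key and ordered_message_files are hypothetical helper names, and the script must run with permission to read /var/adm.

```python
import glob
import os


def rotation_key(path):
    """Return 0 for the current messages file and N for messages.N."""
    name = os.path.basename(path)
    suffix = name[len("messages"):].lstrip(".")
    return int(suffix) if suffix.isdigit() else 0


def ordered_message_files(directory="/var/adm"):
    """Return the message files oldest first (highest rotation number first)."""
    paths = glob.glob(os.path.join(directory, "messages*"))
    return sorted(paths, key=rotation_key, reverse=True)


if __name__ == "__main__":
    for path in ordered_message_files():
        with open(path, errors="replace") as handle:
            for line in handle:
                print(line, end="")
```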
Related Information
■
“Check the Message Buffer” on page 45
■
“View Log Files (Oracle ILOM)” on page 46
View Log Files (Oracle ILOM)
1.
View the event log.
-> show /SP/logs/event/list
2.
View the audit log.
-> show /SP/logs/audit/list
Related Information
■
“Check the Message Buffer” on page 45
■
“View Log Files (Oracle Solaris)” on page 46
Preparing for Service
These topics explain how to prepare to service the server.
Step  Description                                      Links
1.    Review safety and handling information.          “Safety Information” on page 49
2.    Gather the tools for service.                    “Tools Needed for Service” on page 51
3.    Locate the server to be serviced.                “Locate the Server” on page 53
4.    Find the server serial number.                   “Find the Server Serial Number” on page 52
5.    Locate the component service information.        “Component Service Categories” on page 52
6.    For cold-service operations, shut down the OS,   “Removing Power From the Server” on page 53
      and remove the power from the server.
7.    Prevent ESD damage before you handle any         “Prevent ESD Damage” on page 57
      server component.
Related Information
■
“Identifying Components”
■
“Detecting and Managing Faults”
■
“Component Service Task Reference” on page 23
■
“Returning the Server to Operation”
Safety Information
For your protection, observe the following safety precautions when setting up your equipment:
■
Follow all cautions and instructions marked on the equipment and described in the
documentation shipped with your server.
■
Follow all cautions and instructions marked on the equipment and described in the SPARC
T5-8 Server Safety and Compliance Guide.
■
Ensure that the voltage and frequency of your power source match the voltage and
frequency inscribed on the equipment's electrical rating label.
■
Follow the ESD safety practices as described in this section.
Safety Symbols
Note the meanings of the following symbols that might appear in this document:
Caution - There is a risk of personal injury or equipment damage. To avoid personal injury and
equipment damage, follow the instructions.
Caution - Hot surface. Avoid contact. Surfaces are hot and might cause personal injury if
touched.
Caution - Hazardous voltages are present. To reduce the risk of electric shock and danger to
personal health, follow the instructions.
ESD Measures
ESD-sensitive devices, such as networking adapters, hard drives, and DIMMs require special
handling.
Caution - Circuit boards and hard drives contain electronic components that are extremely
sensitive to static electricity. Ordinary amounts of static electricity from clothing or the work
environment can destroy the components located on these boards. Do not touch the components
along their connector edges.
Caution - You must disconnect all power supplies before servicing any of the components that
are inside the chassis.
Antistatic Wrist Strap Use
Wear an antistatic wrist strap, and use an antistatic mat when you are handling components such
as hard drive assemblies, circuit boards, or networking adapters. When servicing or removing
server components, attach an antistatic strap to your wrist and then to a metal area on the
chassis. Following this practice equalizes the electrical potentials between you and the server.
Antistatic Mat
Place ESD-sensitive components such as motherboards, memory, and other PCBs on an
antistatic mat.
Related Information
■
“Removing Power From the Server” on page 53
■
“Tools Needed for Service” on page 51
Tools Needed for Service
You will need the following tools for most service operations:
■
Antistatic wrist strap
■
Antistatic mat
■
No. 2 Phillips screwdriver
■
Mechanical lift (for rear chassis subassembly removal if only one person is present)
Related Information
■
“Component Service Categories” on page 52
■
“Filler Panels and Modules” on page 51
Filler Panels and Modules
Depending on the configuration, the server can include the following types of filler panels and
modules:
■
Hard drive filler panels
■
DIMM filler panels (these are used only to ship new processor modules). You must remove
all of the DIMM filler panels and replace them with DIMMs before installing the new
processor modules. DIMM filler panels are not supported in running processor modules.
■
PCIe card carriers (these function like filler panels when a card is not installed)
■
Processor filler modules (located in slot 1 and slot 2 in a half-populated server)
Caution - To maintain the proper air flow, all filler panels and modules must remain in the
server unless you remove one to install a functioning component at the same time.
Related Information
■
“Safety Information” on page 49
■
“Component Service Categories” on page 52
■
“Server Upgrade Process” on page 59
Component Service Categories
The following table identifies the server components that are replaceable.
Component                  A/C Power Status   Authorized Service   Remove and Replace Instructions
                           for Removal        Personnel Only
Battery                    Off                                     “Servicing the Battery”
DIMMs                      Off                                     “Servicing DIMMs”
Fan modules                On or off                               “Servicing Fan Modules”
Front I/O assembly         Off                                     “Servicing the Front I/O Assembly”
Hard drives                On or off                               “Servicing Hard Drives”
Main module                Off                                     “Servicing the Main Module”
PCIe cards                 On or off                               “Servicing PCIe Cards”
Power supplies             On or off                               “Servicing Power Supplies”
Processor modules          Off                                     “Servicing Processor Modules”
Rear I/O module            Off                                     “Servicing the Rear I/O Module”
Rear chassis subassembly   Off                X                    “Servicing the Rear Chassis Subassembly”
SP                         Off                                     “Servicing the Service Processor Card”
Storage backplanes         Off                                     “Servicing the Storage Backplanes”
SCC PROM                   Off                X                    “Servicing the System Configuration PROM”
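The power status column lends itself to a quick lookup when you plan a service visit. This is a minimal sketch: the POWER_STATUS dictionary is transcribed from the table above, and requires_power_removal and authorized_personnel_only are hypothetical helper names.

```python
# Power status for removal, transcribed from the component table above.
POWER_STATUS = {
    "Battery": "Off",
    "DIMMs": "Off",
    "Fan modules": "On or off",
    "Front I/O assembly": "Off",
    "Hard drives": "On or off",
    "Main module": "Off",
    "PCIe cards": "On or off",
    "Power supplies": "On or off",
    "Processor modules": "Off",
    "Rear I/O module": "Off",
    "Rear chassis subassembly": "Off",
    "SP": "Off",
    "Storage backplanes": "Off",
    "SCC PROM": "Off",
}

# Components marked X in the Authorized Service Personnel Only column.
AUTHORIZED_ONLY = {"Rear chassis subassembly", "SCC PROM"}


def requires_power_removal(component):
    """True when the component can be removed only with A/C power off."""
    return POWER_STATUS[component] == "Off"


def authorized_personnel_only(component):
    """True when the table restricts service to authorized personnel."""
    return component in AUTHORIZED_ONLY


if __name__ == "__main__":
    for name in ("Fan modules", "Processor modules", "SCC PROM"):
        print(name, requires_power_removal(name), authorized_personnel_only(name))
```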
Related Information
■
“Removing Power From the Server” on page 53
■
“Returning the Server to Operation”
Find the Server Serial Number
If you need technical support for your server, you must provide the server's serial number.
Use one of the following options to find the serial number:
a.
Locate the manufacturing sticker on the front of the server or on the sticker
on the side of the server.
b.
At the Oracle ILOM prompt, type:
-> show /System
Properties:
health = OK
health_details = open_problems_count = 0
type = Rack Mount
model = SPARC T5-8
qpart_id = Q9527
part_number = 12345678+11+1
serial_number = xxxxxxxxxx
...
Locate the Server
1.
At the Oracle ILOM prompt, type:
-> set /SYS/LOCATE value=Fast_Blink
Alternatively, you can type:
-> set /System/locator_indicator on
The white Locator LEDs (one on the front panel and one on the rear panel) blink.
2.
After locating the server with the blinking Locator LED, turn it off by pressing the
Locator button.
Note - Alternatively, you can turn off the Locator LED by running the Oracle ILOM
set /SYS/LOCATE value=off command.
Removing Power From the Server
These topics describe different methods for removing power from the chassis.
■
“Prepare to Power Off the Server” on page 54
■
“Power Off the Server (SP Command)” on page 54
■
“Power Off the Server (Power Button - Graceful)” on page 55
■
“Power Off the Server (Emergency Shutdown)” on page 55
■
“Disconnect the Power Cords” on page 56
Prepare to Power Off the Server
1.
Notify any affected users that the server will be shut down.
Refer to the Oracle Solaris system administration documentation for additional information.
2.
Save any open files, and quit all running programs.
Refer to your application documentation for specific information for these processes.
3.
Shut down all logical domains.
Refer to the Oracle VM system administration documentation for details.
4.
Shut down the Oracle Solaris OS.
Refer to the Oracle Solaris administration documentation for details.
5.
Power off the server.
Related Information
■
“Power Off the Server (SP Command)” on page 54
■
“Power Off the Server (Power Button - Graceful)” on page 55
■
“Power Off the Server (Emergency Shutdown)” on page 55
■
“Configuring Boot and Restart Behavior” in SPARC and Netra SPARC T5 Series Servers
Administration Guide
Power Off the Server (SP Command)
You can use the SP to perform a graceful shutdown of the system. This type of shutdown
ensures that all of your data is saved and that the system is ready for restart.
Note - Additional information about powering off the system is provided in the SPARC T5
Series Servers Administration Guide.
1.
Log in as superuser or equivalent.
Depending on the type of problem, you might want to view system status or log files. You also
might want to run diagnostics before you shut down the system.
2.
Switch from the system console to the Oracle ILOM prompt by typing the #.
(Hash Period) key sequence.
3.
At the Oracle ILOM prompt, type the stop /System command.
4.
If you are servicing a cold-service component, or if you are upgrading the server,
disconnect the power cords.
See “Disconnect the Power Cords” on page 56.
Related Information
■
“Power Off the Server (Power Button - Graceful)” on page 55
■
“Disconnect the Power Cords” on page 56
■
“Configuring Boot and Restart Behavior” in SPARC and Netra SPARC T5 Series Servers
Administration Guide
Power Off the Server (Power Button - Graceful)
This procedure places the system in the power standby mode. To service cold-replaceable
components, you must remove the power.
1.
Press and release the recessed Power button.
The Power OK LED blinks rapidly.
2.
If you are servicing a cold-service component, or if you are upgrading the server,
disconnect the power cords.
See “Disconnect the Power Cords” on page 56.
Related Information
■
“Power Off the Server (SP Command)” on page 54
■
“Disconnect the Power Cords” on page 56
■
“Configuring Boot and Restart Behavior” in SPARC and Netra SPARC T5 Series Servers
Administration Guide
Power Off the Server (Emergency Shutdown)
This procedure places the system in the power standby mode. To service cold-replaceable
components, you must remove the power.
Caution - All applications and files are closed abruptly without saving changes. File system
corruption might occur.
1.
Press and hold the Power button for four seconds.
2.
If you are servicing a cold-service component, or if you are upgrading the server,
disconnect the power cords.
See “Disconnect the Power Cords” on page 56.
Related Information
■
“Prepare to Power Off the Server” on page 54
■
“Disconnect the Power Cords” on page 56
■
“Configuring Boot and Restart Behavior” in SPARC and Netra SPARC T5 Series Servers
Administration Guide
Disconnect the Power Cords
1.
Ensure that you have shut down the system.
See:
■
“Power Off the Server (SP Command)” on page 54
■
“Power Off the Server (Power Button - Graceful)” on page 55
2.
Disconnect all of the power cords.
Caution - Because 3.3 V standby power is always present in the server, you must unplug the
power cords before accessing any cold-serviceable components. See “Component Service
Categories” on page 52.
3.
Determine your next step.
■
If you are servicing a component, return to the procedure for that component.
■
If you are upgrading the server, return to “Server Upgrade Process” on page 59.
Related Information
■
“Prepare to Power Off the Server” on page 54
■
“Power Off the Server (SP Command)” on page 54
■
“Power Off the Server (Power Button - Graceful)” on page 55
■
“Power Off the Server (Emergency Shutdown)” on page 55
■
“Prevent ESD Damage” on page 57
■
“Server Upgrade Process” on page 59
Prevent ESD Damage
Many components housed within the chassis can be damaged by ESD. To protect these
components from damage, perform the following steps before opening the chassis for service.
1.
Prepare an antistatic surface to set parts on during the removal, installation, or
replacement process.
Place ESD-sensitive components, such as the printed circuit boards, on an antistatic mat. The
following items can be used as an antistatic mat:
■
Antistatic bag used to wrap a replacement part
■
ESD mat
■
A disposable ESD mat (shipped with some replacement parts or optional
server components)
2.
Attach an antistatic wrist strap.
When servicing or removing server components, attach an antistatic strap to your wrist and then
to a metal area on the chassis.
Related Information
■
“Safety Information” on page 49
■
“Servicing Processor Modules”
■
“Servicing DIMMs”
■
“Servicing Hard Drives”
■
“Servicing the Main Module”
■
“Servicing the Storage Backplanes”
■
“Servicing the Service Processor Card”
■
“Servicing the System Configuration PROM”
■
“Servicing the Battery”
■
“Servicing the Front I/O Assembly”
■
“Servicing PCIe Cards”
■
“Servicing the Rear I/O Module”
■
“Servicing the Rear Chassis Subassembly”
Servicing Processor Modules
The SPARC T5-8 server supports two configurations:
■
Fully-populated — four processor modules
■
Half-populated — two processor modules and two processor filler modules
Processor modules and processor filler modules are cold-service components that can be
replaced only after you remove all power from the system. For the location of the processor
modules, see “Front Panel Components (Service)” on page 14.
Caution - This procedure requires that you handle components that are sensitive to electrostatic
discharge. This discharge can cause server components to fail.
These topics describe how to service the processor modules:
■
“Server Upgrade Process” on page 59
■
“Processor Module LEDs” on page 61
■
“Determine Which Processor Module Is Faulty” on page 62
■
“Remove a Processor Module or Processor Filler Module” on page 63
■
“Install a Processor Module or Processor Filler Module” on page 67
■
“Verify the Processor Module” on page 70
Related Information
■
“Identifying Components”
■
“Detecting and Managing Faults”
■
“Preparing for Service”
■
“Component Service Task Reference” on page 23
■
“Returning the Server to Operation”
Server Upgrade Process
The SPARC T5-8 server supports two configurations:
■
Fully-populated — four processor modules
■
Half-populated — two processor modules and two processor filler modules
Processor modules and processor filler modules are cold-service components that can be
replaced only after you remove all power from the system. For the location of the processor
modules, see “Front Panel Components (Service)” on page 14.
Caution - This procedure requires that you handle components that are sensitive to electrostatic
discharge. This discharge can cause server components to fail.
This table contains the steps for upgrading the server to a fully-populated configuration. You
can also view an animated demonstration of the upgrade process at:
1.  Remove the upgrade components from their packaging, and place them on an antistatic mat.
2.  Remove the covers from the new processor modules.
    Step 8 in “Remove a Processor Module or Processor Filler Module” on page 63.
3.  Remove all of the DIMM filler panels in the processor modules.
    “Remove a DIMM or DIMM Filler Panel” on page 76.
    The steps to remove DIMM filler panels are the same as the steps for removing DIMMs.
4.  Install the DIMMs. All of the DIMMs must be either 16 or 32 GB, and they must match
    the size and capacity of the DIMMs that are already installed in the server.
    “Install a DIMM” on page 78.
5.  Replace the covers on the new processor modules.
    Step 1 in “Install a Processor Module or Processor Filler Module” on page 67.
6.  Check for faults. If any fault is present, you must correct the fault and clear it
    from the server.
    “Check for Faults” on page 41.
7.  Shut down the server.
    “Removing Power From the Server” on page 53.
8.  Remove the processor filler modules from slot 1 and slot 2.
    “Remove a Processor Module or Processor Filler Module” on page 63.
9.  Install the new processor modules in slot 1 and slot 2.
    “Install a Processor Module or Processor Filler Module” on page 67.
10. Return the server to operation.
    “Returning the Server to Operation”.
11. Check for faults. If any fault is present, you must correct the fault and clear it
    from the server.
    “Check for Faults” on page 41.
12. Review the root complex changes.
    “Root Complex Connections (Four Processor Modules)” on page 150
13. Review the PCIe card load balancing changes. Even though the load balancing guidelines
    change with the upgrade, you do not need to move any existing PCIe cards.
    “PCIe Card Installation Guidelines” on page 153
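The DIMM requirement in step 4 can be checked before you open the new processor modules. This is a minimal sketch of that rule only: dimm_upgrade_ok is a hypothetical helper, sizes are given in GB, and the check covers size alone, not other DIMM characteristics.

```python
SUPPORTED_SIZES_GB = (16, 32)  # sizes named in step 4


def dimm_upgrade_ok(new_dimms_gb, installed_dimm_gb):
    """True when every new DIMM is 16 or 32 GB and matches the installed size."""
    if installed_dimm_gb not in SUPPORTED_SIZES_GB:
        return False
    if not new_dimms_gb:
        return False  # nothing to install
    return all(size == installed_dimm_gb for size in new_dimms_gb)


if __name__ == "__main__":
    print(dimm_upgrade_ok([16, 16, 16, 16], installed_dimm_gb=16))
    print(dimm_upgrade_ok([16, 32], installed_dimm_gb=16))
```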
Related Information
■
“Remove a DIMM or DIMM Filler Panel” on page 76
■
“Install a DIMM” on page 78
■
“Check for Faults” on page 41
■
“Removing Power From the Server” on page 53
■
“Remove a Processor Module or Processor Filler Module” on page 63
■
“Install a Processor Module or Processor Filler Module” on page 67
■
“Returning the Server to Operation”
■
“Root Complex Connections (Four Processor Modules)” on page 150
■
“PCIe Card Installation Guidelines” on page 153
Processor Module LEDs
No.  LED                        Description
1    Ready to Remove (blue)     Indicates that a processor module can be removed.
2    Service Required (amber)   Indicates that the processor module has experienced a
                                fault condition.
3    OK (green)                 Indicates if the processor module is available for use.
                                ■ On – The server is running and the processor module is
                                powered up.
                                ■ Off – The server is powered down and the processor
                                module is in standby mode. If the server is powered on,
                                then this indicates that the processor module is powered
                                down (the blue Ready to Remove LED will be lit in this
                                case).
Related Information
■
“Remove a Processor Module or Processor Filler Module” on page 63
■
“Verify the Processor Module” on page 70
Determine Which Processor Module Is Faulty
The following LEDs are lit when a processor module fault is detected:
■
Front and rear System Fault (Service Required) LEDs
■
Service Required LED on the faulty processor module
Note - A faulty processor module at PM0 results in server shutdown and failure to reboot. If
your server experiences a fault at PM0 and you do not have a replacement processor module
available, you can move one of the other processor modules to PM0 and then boot the server in
a degraded state.
Caution - In order to maintain system cooling, all four processor module slots must be occupied
either with a processor module or a processor filler module.
1.
Determine if the Service Required LEDs are lit on the front panel or the rear I/O
module.
See “Interpreting LEDs” on page 29.
2.
From the front of the server, check the processor module LEDs to identify which
processor module needs to be replaced.
See “Processor Module LEDs” on page 61.
3.
Remove the faulty processor module.
See “Remove a Processor Module or Processor Filler Module” on page 63.
Related Information
■
“Remove a Processor Module or Processor Filler Module” on page 63
Page 63
■
“Verify the Processor Module” on page 70
■
“Understanding PCIe Root Complex Connections” on page 149
Remove a Processor Module or Processor Filler Module
The SPARC T5-8 server supports two configurations:
■
Fully-populated — four processor modules
■
Half-populated — two processor modules in PM0 and PM1, and two processor filler
modules in PM2 and PM3
The removal steps are the same for both components. Processor modules and processor filler
modules are cold-service components that can be replaced only after you power off the system.
For the location of the modules, see “Front Panel Components (Service)” on page 14.
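The two supported configurations, together with the rule that no slot may be left empty, can be expressed as a small check. The following Python sketch is illustrative only: the function name and the "PM" / "FILLER" / None slot encoding are assumptions made for this example and are not part of any Oracle tool.

```python
# Hypothetical encoding: "PM" = processor module, "FILLER" = processor filler
# module, None = empty slot. List index 0..3 corresponds to PM0..PM3.
def classify_configuration(slots):
    """Return 'fully-populated' or 'half-populated', or raise ValueError."""
    slots = list(slots)
    if len(slots) != 4:
        raise ValueError("the server has four processor module slots")
    if any(s is None for s in slots):
        # Every slot must hold a module or a filler to maintain cooling.
        raise ValueError("empty slot: install a processor module or a filler")
    if slots == ["PM", "PM", "PM", "PM"]:
        return "fully-populated"
    if slots == ["PM", "PM", "FILLER", "FILLER"]:
        return "half-populated"
    raise ValueError("unsupported configuration: %r" % (slots,))
```

For example, classify_configuration(["PM", "PM", "FILLER", "FILLER"]) reports the half-populated configuration, while any list that contains None is rejected.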
Caution - This procedure requires that you handle components that are sensitive to electrostatic
discharge. This discharge can cause server components to fail.
1.
Remove all of the power from the system.
See “Removing Power From the Server” on page 53.
2.
Take the necessary ESD precautions.
See “Prevent ESD Damage” on page 57.
3.
Determine which module you need to remove.
■
If you are replacing a faulty processor module or upgrading the memory, remove that
specific processor module.
■
If you are upgrading the server to a fully-populated configuration, start by removing the
processor filler module in slot 1.
Page 64
4.
Squeeze the release latches together on the two extraction levers, and pull the
extraction levers out to disengage the processor module or processor filler
module from the server.
5.
Pull the processor module or processor filler module halfway out of the server,
and then close the levers.
Page 65
This will keep the levers from getting damaged when you remove the module from the server.
Caution - Do not touch the connectors at the rear of the module.
6.
Using two hands, completely remove the processor module or processor filler
module, and place the module on an antistatic mat.
7.
Determine your next step:
a.
If you are replacing DIMMs in an existing processor module, go to Step 8.
b.
If you are upgrading the server to a fully-populated configuration, repeat
Step 4 through Step 6 to remove the second processor filler module in slot
2, and then go to “Server Upgrade Process” on page 59.
8.
Remove the cover:
Page 66
a.
Press down on the green button at the top of the cover to disengage the
cover from the processor module or modules.
b.
Keeping the button pressed down, push the cover toward the rear of the
processor module, and lift the cover up and away from the processor
module.
9.
Determine your next step:
■
If you are replacing DIMMs, see “Servicing DIMMs”.
■
If you are installing new processor modules to upgrade the server, return to
“Server Upgrade Process” on page 59.
■
If you are replacing a faulty processor module, follow these steps:
a.
Remove all of the DIMMs from the faulty processor module, and set
them in a safe place.
See “Remove a DIMM or DIMM Filler Panel” on page 76.
b.
Install the DIMMs into the new processor module.
See “Install a DIMM” on page 78.
c.
Install the processor module.
See “Install a Processor Module or Processor Filler Module” on page 67.
Page 67
Related Information
■
“Determine Which Processor Module Is Faulty” on page 62
■
“Install a Processor Module or Processor Filler Module” on page 67
■
“Verify the Processor Module” on page 70
■
“Server Upgrade Process” on page 59
Install a Processor Module or Processor Filler Module
1.
Determine your first step.
■
If you are replacing the cover as part of the upgrade process, or if you are
servicing a processor module, go to Step 2.
■
If you are installing new processor modules as part of the upgrade process,
go to Step 4.
2.
Place the cover back onto the processor module, and slide the cover forward
until the latch clicks into place.
3.
Determine your next step.
■
If you are installing a new processor module, upgrading the memory, or
replacing a faulty DIMM, go to Step 4.
■
If you are replacing the covers on the new processor modules to upgrade
the server, return to “Server Upgrade Process” on page 59.
Page 68
4.
Open the latches on the processor module or processor filler module, and insert
the module into the empty processor module slot in the server.
5.
Push the levers together toward the center of the processor module or
processor filler module, and press the levers firmly against the module to fully
seat the module back into the server.
Page 69
The levers should click into place when the module is fully seated in the server.
6.
Determine your next step.
■
If you replaced a faulty processor module or DIMM, see “Returning the
Server to Operation”.
■
If you installed new processor modules to upgrade the server, return to
“Server Upgrade Process” on page 59.
7.
Determine your next step.
■
If you replaced DIMMs, see “Verify the DIMM” on page 80.
■
If you replaced a processor module, see “Verify the Processor
Module” on page 70.
Page 70
Related Information
■
“Servicing DIMMs”
■
“Verify the Processor Module” on page 70
■
“Server Upgrade Process” on page 59
Verify the Processor Module
1.
Ensure that you have completed the following:
■
Applied power to the server.
See “Connect the Power Cords” on page 191.
■
Started the system.
See “Power On the Server (Oracle ILOM)” on page 192.
2.
If you replaced a faulty PM, start the faultmgmt shell, and use the fmadm faulty
command to determine if a fault on the PM is shown:
-> start /SP/faultmgmt/shell
Do you want to start the /SP/faultmgmt/shell (y/n)? y
faultmgmtsp> fmadm faulty
a.
If the output shows the replacement PM as enabled, go to Step 3.
b.
If the output shows the replacement PM as disabled, go to “Detecting and
Managing Faults” to clear the fault from the server.
3.
Verify that the OK LED is lit on the PM and that the Fault LED is not lit.
See “Processor Module LEDs” on page 61.
4.
Verify that the front and rear Service Required LEDs are not lit.
See “Front Panel Controls and LEDs” on page 31 and “Rear Panel Controls and
LEDs” on page 33.
5.
Perform one of the following tasks based on your verification results:
■
If a fault was detected, see “Diagnostics Process” on page 25.
■
If no fault was detected, then the processor module was installed
successfully.
Page 71
Servicing DIMMs
DIMMs are cold-service components that can be replaced after you remove the processor
module from the system. For the location of DIMMs, see “DIMM Locations” on page 16.
Caution - This procedure requires that you handle components that are sensitive to electrostatic
discharge. This discharge can cause server components to fail.
These topics describe service procedures for the DIMMs in the server.
Step  Description                              Links
1.    Understand how to configure the DIMMs.   “DIMM Configuration” on page 71
2.    Locate a faulty DIMM.                    ■ “Determine Which DIMM Is Faulty (FMA)” on page 73
                                               ■ “Determine Which DIMM Is Faulty (LEDs)” on page 75
                                               ■ “DIMM Configuration Fault Messages” on page 76
3.    Replace a DIMM.                          ■ “Remove a DIMM or DIMM Filler Panel” on page 76
                                               ■ “Install a DIMM” on page 78
                                               ■ “Verify the DIMM” on page 80
Related Information
■
“Identifying Components”
■
“Detecting and Managing Faults”
■
“Preparing for Service”
■
“Component Service Task Reference” on page 23
■
“Returning the Server to Operation”
DIMM Configuration
Consider these topics when installing, upgrading, or replacing DIMMs.
Page 72
DIMM Guidelines
You must follow these guidelines:
■
Use either 16-Gbyte or 32-Gbyte DDR3 DIMMs.
■
Use Oracle qualified DIMMs.
■
Fully populate every processor module (32 DIMMs per module).
Caution - If you ordered processor modules without memory to upgrade the server from a half-
populated configuration to a fully-populated configuration, you must install the same type and
size of DIMMs that are already in the existing processor modules.
If you are reviewing this information because you are upgrading the server, return to “Server
Upgrade Process” on page 59.
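The guidelines above can be checked mechanically before a processor module is reinstalled. The sketch below is a hypothetical helper, not an Oracle utility: the function name and its input format are assumptions, it reads the 32-DIMM figure as a per-module count, and it treats uniform capacity within a module as a requirement based on the cautions in this chapter. It cannot tell whether a DIMM is Oracle qualified.

```python
SUPPORTED_CAPACITIES_GB = (16, 32)   # capacities named in the guidelines
DIMMS_PER_PROCESSOR_MODULE = 32      # assumed: 32 DIMM slots per module


def check_dimm_population(capacities_gb):
    """Return a list of problems for one processor module (empty if none).

    capacities_gb holds one entry per DIMM slot; None marks an empty slot.
    """
    problems = []
    if len(capacities_gb) != DIMMS_PER_PROCESSOR_MODULE:
        problems.append("expected %d slots, got %d"
                        % (DIMMS_PER_PROCESSOR_MODULE, len(capacities_gb)))
    installed = [c for c in capacities_gb if c is not None]
    empty = len(capacities_gb) - len(installed)
    if empty:
        problems.append("%d empty slot(s)" % empty)
    unsupported = sorted({c for c in installed
                          if c not in SUPPORTED_CAPACITIES_GB})
    if unsupported:
        problems.append("unsupported capacity (Gbyte): %s" % unsupported)
    if len(set(installed)) > 1:
        problems.append("mixed DIMM capacities in one module")
    return problems
```

An empty list means that the module passes these checks.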
DIMM Locations
DIMM addresses, and consequently their NAC names, are based on their location on the
processor module motherboard, as well as the slot in which the processor is installed. For
example, the full address for the DIMM that is installed in the front-left corner of the processor
module that is installed in slot 0 is:
/System/Memory/DIMMs/DIMM_0.
or
/SYS/PM0/CM1/CMP/BOB0/CH0/D0.
This illustration shows the DIMM layout.
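Because the NAC name encodes the physical location, it can be split into its numbered parts when you need to match a reported fault to a slot. The following Python sketch is an assumption-laden illustration: the pattern is derived only from the single example name shown above, and the function name and the dictionary keys are hypothetical.

```python
import re

# Pattern modeled on the example NAC name: /SYS/PM0/CM1/CMP/BOB0/CH0/D0
NAC_DIMM = re.compile(
    r"^/SYS/PM(?P<pm>\d+)/CM(?P<cm>\d+)/CMP/BOB(?P<bob>\d+)"
    r"/CH(?P<ch>\d+)/D(?P<dimm>\d+)$"
)


def parse_dimm_nac(name):
    """Split a DIMM NAC name into its numeric indices."""
    match = NAC_DIMM.match(name.strip())
    if match is None:
        raise ValueError("not a DIMM NAC name: %r" % name)
    return {key: int(value) for key, value in match.groupdict().items()}
```

The "pm" value identifies the processor module that must be removed to reach the DIMM.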
Page 73
Related Information
■
“DIMM Configuration Fault Messages” on page 76
■
“Determine Which DIMM Is Faulty (FMA)” on page 73
■
“Determine Which DIMM Is Faulty (LEDs)” on page 75
■
“Install a DIMM” on page 78
■
“Server Upgrade Process” on page 59
Determine Which DIMM Is Faulty (FMA)
The FMA fmadm faulty command displays current server faults, including DIMM failures.
Type fmadm faulty at the faultmgmtsp prompt.
-> start /SP/faultmgmt/shell
Are you sure you want to start /SP/faultmgmt/shell (y/n)? y
faultmgmtsp> fmadm faulty
------------------- ------------------------------------ ---------------- --------
2013-01-18/21:04:40 7040d859-5b03-4a58-8dfd-e3a80875d62f SPSUN4V-8000-CQ MAJOR
Problem Status : solved
Diag Engine : fdd 1.0
System
Manufacturer : Oracle Corporation
Name : SPARC T5-8
Part_Number : 12345678+11+1
Serial_Number : xxxxxxxxxx
System Component
Manufacturer : Oracle Corporation
Name : SPARC T5-8
Part_Number : 12345678-+11+1
Serial_Number : xxxxxxxxxx
-------------------------------------------
Suspect 1 of 1
Fault class : fault.memory.dimm-ue
Certainty : 100%
Affects : /SYS/PM0/CM1/CMP/BOB0/CH0/D0
Status : faulted but still in service
Description: The number of correctable errors associated with this memory
module has exceeded acceptable levels.
Response : An attempt will be made to remove the affected memory from
service.
Impact : The dimm may be deconfigured at system restart which would
reduce total system memory capacity.
Action : Use 'fmadm faulty' to provide a more detailed view of this
event. Please refer to the associated reference document at
http://support.oracle.com/msg/SPSUN4V-8000-CQ for the latest
service procedures and policies regarding this diagnosis.
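If you capture this output to a file or a log, the suspect fields can be extracted with a few lines of code. The Python sketch below is an illustration, not an Oracle tool: the function name and the dictionary keys are assumptions, and the parser relies only on the "Fault class", "Affects", and "Status" labels that appear in the sample output above.

```python
import re

# Matches only lines that start with one of the three labels, so a line such
# as "Problem Status : solved" is ignored.
FIELD = re.compile(r"\s*(Fault class|Affects|Status)\s*:\s*(.+?)\s*$")


def parse_suspects(fmadm_output):
    """Return one dict per suspect found in captured fmadm faulty output."""
    suspects = []
    current = None
    for line in fmadm_output.splitlines():
        match = FIELD.match(line)
        if match is None:
            continue
        key, value = match.groups()
        if key == "Fault class":
            # A "Fault class" line starts a new suspect record.
            current = {"fault_class": value}
            suspects.append(current)
        elif current is not None:
            current["affects" if key == "Affects" else "status"] = value
    return suspects
```

Applied to the sample above, the single suspect record names /SYS/PM0/CM1/CMP/BOB0/CH0/D0 as the affected component, which is the DIMM to locate in the next procedure.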
Related Information
■
“Determine Which DIMM Is Faulty (LEDs)” on page 75
Page 75
■
“Remove a DIMM or DIMM Filler Panel” on page 76
■
Oracle ILOM documentation
Determine Which DIMM Is Faulty (LEDs)
1.
Check that the Service Required LED is lit on the front of the server.
See “Front Panel Controls and LEDs” on page 31.
2.
Check that the Service Required LED is lit on one of the processor modules.
See “Processor Module LEDs” on page 61.
3.
Remove the PM with the faulty DIMM.
See “Remove a Processor Module or Processor Filler Module” on page 63.
4.
Locate the DIMM Fault Remind button on the front right corner of the
motherboard.
5.
Verify that the DIMM Fault Remind Power LED next to the button is lit.
An illuminated DIMM Fault Remind Power LED indicates that there is power available to light
the faulty DIMM LED after you have pressed the DIMM Fault Remind button.
Page 76
6.
Press the DIMM Fault Remind button on the processor module.
This causes the DIMM Fault LED associated with the faulty DIMM to light for a few minutes.
7.
Confirm that the DIMM next to the illuminated DIMM Fault LED is the same DIMM
that was reported to be faulty by the fmadm faulty command.
See “Determine Which DIMM Is Faulty (FMA)” on page 73.
8.
Visually check to ensure that all of the other DIMMs are seated properly in their
slots.
Related Information
■
“Determine Which DIMM Is Faulty (FMA)” on page 73
■
“Remove a DIMM or DIMM Filler Panel” on page 76
DIMM Configuration Fault Messages
When the system boots, system firmware checks the memory configuration against the rules
described in “DIMM Configuration” on page 71. If it discovers any faults, one or more
rule-specific messages will be displayed in the POST output indicating the type of configuration
fault that has been discovered.
Related Information
■
“DIMM Configuration” on page 71
■
“Determine Which DIMM Is Faulty (FMA)” on page 73
■
“Determine Which DIMM Is Faulty (LEDs)” on page 75
■
“Remove a DIMM or DIMM Filler Panel” on page 76
Remove a DIMM or DIMM Filler Panel
DIMMs are cold-service components that can be replaced after you remove the processor
module from the server.
Caution - This procedure requires that you handle components that are sensitive to electrostatic
discharge. This discharge can cause server components to fail.
Before beginning this procedure, ensure that you are familiar with the cautions and safety
instructions described in “Safety Information” on page 49.
Page 77
Caution - Do not leave DIMM slots empty. All of the DIMM slots must have a DIMM.
1.
Take the necessary ESD precautions.
See “Prevent ESD Damage” on page 57.
2.
Remove the PM with the faulty DIMM.
See “Remove a Processor Module or Processor Filler Module” on page 63.
3.
Locate the DIMMs that need to be replaced.
See “Determine Which DIMM Is Faulty (FMA)” on page 73 or “Determine Which DIMM
Is Faulty (LEDs)” on page 75.
4.
Push down on the ejector tabs on each side of the DIMM until the DIMM is
released.
Caution - DIMMs and heat sinks on the motherboard might be hot.
5.
Grasp the top corners of the faulty DIMM, and lift it out of its slot.
6.
Place the DIMM on an antistatic mat.
7.
Repeat Step 4 through Step 6 for any other DIMMs that you intend to remove.
8.
Determine your next step.
■
If you are replacing a faulty DIMM, see “Install a DIMM” on page 78.
Page 78
All of the replacement DIMMs must be the same size and type. See “DIMM
Configuration” on page 71.
■
If you are upgrading the server, continue to remove all of the DIMM filler
panels from the new processor modules.
DIMM filler panels are not supported in running processor modules. After you have
removed all of the DIMM filler panels, you can install the new DIMMs. See “Install a
DIMM” on page 78.
Related Information
■
“DIMM Configuration” on page 71
■
“Determine Which DIMM Is Faulty (FMA)” on page 73
■
“Determine Which DIMM Is Faulty (LEDs)” on page 75
■
“Install a DIMM” on page 78
■
“Server Upgrade Process” on page 59
Install a DIMM
Before beginning this procedure, ensure that you are familiar with the information provided in
these topics:
■
“Safety Information” on page 49
■
“DIMM Configuration” on page 71
1.
Take the necessary ESD precautions.
See “Prevent ESD Damage” on page 57.
2.
Ensure that you have removed the processor module if you are replacing a faulty
DIMM.
See “Remove a Processor Module or Processor Filler Module” on page 63.
3.
Ensure that you have removed the faulty DIMM.
See “Remove a DIMM or DIMM Filler Panel” on page 76.
4.
Unpack the replacement DIMM, and place it on an antistatic mat.
Caution - If you ordered processor modules without memory to upgrade the server from a half-
populated configuration to a fully-populated configuration, you must install the same size and
capacity of DIMMs that are already in the existing processor modules.
Page 79
5.
Ensure that the ejector tabs on the connector that will receive the DIMM are in
the open position.
6.
Align the DIMM notch with the key in the connector.
Caution - Ensure that the orientation is correct. The DIMM might be damaged if the orientation
is reversed.
7.
Push the DIMM into the connector until the ejector tabs lock the DIMM in place.
If the DIMM does not easily seat into the connector, check the DIMM's orientation.
8.
Determine your next step.
■
If you replaced a faulty DIMM, go to Step 9.
■
If you are upgrading the server, repeat Step 5 through Step 7 until all of the new DIMMs are
installed. Then, go to “Server Upgrade Process” on page 59.
9.
Install the PM.
See “Install a Processor Module or Processor Filler Module” on page 67.
Related Information
■
“DIMM Configuration” on page 71
■
“Remove a DIMM or DIMM Filler Panel” on page 76
■
“Verify the DIMM” on page 80
■
“Server Upgrade Process” on page 59
Page 80
Verify the DIMM
1.
Ensure that you have completed the following:
■
Applied power to the server.
See “Connect the Power Cords” on page 191.
■
Started the system.
See “Power On the Server (Oracle ILOM)” on page 192.
2.
Log in to Oracle ILOM.
See “Log In to Oracle ILOM (Service)” on page 28.
3.
Start the faultmgmt shell.
-> start /SP/faultmgmt/shell
Are you sure you want to start the faultmgmt shell (y/n)? y
faultmgmtsp>
4.
Use the fmadm faulty command to determine if the server is operating normally.
■
If a fault was detected, the server is not operating normally.
See “Diagnostics Process” on page 25.
■
If no fault was detected, the DIMM was installed successfully.
Related Information
■
“DIMM Configuration” on page 71
■
“DIMM Configuration Fault Messages” on page 76
■
“Install a DIMM” on page 78
Page 81
Servicing Hard Drives
The storage devices in the server are hot-serviceable, meaning that the devices can be removed
and inserted while the server is powered on, depending on the state of the device and the
configuration of the data on that device.
A hard drive is hot-pluggable if the drive is in slot 1 to 7. The hard drive in slot 0 cannot be
removed without shutting down the server unless it is configured with an alternative I/O path.
Taking a drive offline prevents any applications from accessing it, and removes the logical
software links to it.
The following situations inhibit your ability to hot-service a drive:
■
If the drive contains the operating system, and the operating system is not mirrored on
another drive.
■
If the drive cannot be logically isolated from the online operations of the server.
If either of these conditions applies to the drive being serviced, you must take the server offline
(shut down the operating system) before you replace the drive.
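The conditions above amount to a short decision rule. The following Python sketch restates that rule under stated assumptions: the function and parameter names are hypothetical, and the caller supplies facts about the drive that the code cannot discover on its own.

```python
def must_shut_down_os(slot, has_alternate_io_path=False,
                      holds_unmirrored_os=False, can_be_isolated=True):
    """Return True if the OS must be shut down before the drive is replaced."""
    if not 0 <= slot <= 7:
        raise ValueError("drive slots are numbered 0 to 7")
    # The drive in slot 0 is hot-pluggable only with an alternative I/O path.
    if slot == 0 and not has_alternate_io_path:
        return True
    # Either inhibiting condition forces a shutdown.
    return holds_unmirrored_os or not can_be_isolated
```

For example, must_shut_down_os(3) returns False, while must_shut_down_os(0) returns True because no alternative I/O path was declared.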
For the location of the hard drives, see “Supported Storage Devices” on page 18.
These topics describe service procedures for the hard drives in the server.
Step  Description                       Links
1.    Understand the hard drive LEDs.   “Hard Drive LEDs” on page 82
2.    Replace a hard drive.             ■ “Determine Which Hard Drive Is Faulty” on page 82
                                        ■ “Remove a Hard Drive” on page 83
                                        ■ “Install a Hard Drive” on page 85
                                        ■ “Verify the Hard Drive” on page 86
3.    Add storage.                      ■ “Install a Hard Drive” on page 85
                                        ■ “Verify the Hard Drive” on page 86
Related Information
■
“Identifying Components”
■
“Detecting and Managing Faults”
■
“Preparing for Service”
Page 82
■
“Component Service Task Reference” on page 23
■
“Returning the Server to Operation”
Hard Drive LEDs
No.LEDIconDescription
1Ready to Remove
(blue)
2Service Required
(amber)
3OkayOKIndicates normal operation. Blinking indicates that the drive is
Indicates that a drive can be removed during a hot-service
operation.
Indicates the drive's availability for use.
■ On – Read or write activity is in progress.
■ Off – Drive is idle and available for use.
in use.
Related Information
■
“Determine Which Hard Drive Is Faulty” on page 82
■
“Remove a Hard Drive” on page 83
Determine Which Hard Drive Is Faulty
The following LEDs are lit when a hard drive fault is detected:
■
System Service Required LEDs on the front panel and rear I/O module
■
Service Required LED on the faulty drive
Page 83
1.
Determine if the System Service Required LEDs are lit on the front panel or the
rear I/O module.
See “Interpreting LEDs” on page 29.
2.
From the front of the server, check the drive LEDs to identify which drive needs
to be replaced.
See “Hard Drive LEDs” on page 82.
3.
Remove the faulty drive.
See “Remove a Hard Drive” on page 83.
Related Information
■
“Remove a Hard Drive” on page 83
■
“Verify the Hard Drive” on page 86
Remove a Hard Drive
Hard drives are hot-service components if they are in slots 1 to 7. The hard drive in slot 0
cannot be removed unless it has an alternate I/O path.
Caution - This procedure requires that you handle components that are sensitive to electrostatic
discharge. This discharge can cause server components to fail.
1.
Locate the drive in the server that you want to remove.
■
See “Front Panel Components (Service)” on page 14 for the locations of the
drives in the server.
■
See “Determine Which Hard Drive Is Faulty” on page 82 to locate a faulty
drive.
2.
Determine if you need to shut down the OS to replace the drive, and perform one
of the following actions:
■
If the drive cannot be taken offline without shutting down the OS, follow
instructions in “Power Off the Server (SP Command)” on page 54, and then
go to Step 4.
■
If the drive can be taken offline without shutting down the OS, go to Step 3.
3.
Take the drive offline:
Page 84
a.
At the Oracle Solaris prompt, type the cfgadm -al command to list all drives
in the device tree, including drives that are not configured:
# cfgadm -al
This command lists dynamically reconfigurable hardware resources and shows their
operational status. In this case, look for the status of the drive you plan to remove. This
information is listed in the Occupant column.
You must unconfigure any drive whose status is listed as configured, as described in Step
3b.
b.
Unconfigure the drive using the cfgadm -c unconfigure command.
Example:
# cfgadm -c unconfigure c2::w5000cca00a76d1f5,0
Replace c2::w5000cca00a76d1f5,0 with the drive name that applies to your situation.
c.
Verify that the drive's blue Ready to Remove LED is lit.
4.
Press the drive release button to unlock the drive.
Page 85
5.
Pull the drive out of the server.
Caution - The latch is not an ejector. Do not force the latch too far to the right. Doing so can
damage the latch.
6.
Install the replacement drive or a filler tray.
See “Install a Hard Drive” on page 85.
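The offline steps in this procedure (list the drives, find the ones whose Occupant state is configured, and unconfigure the one you plan to remove) can be sketched in Python as follows. This is an illustration only: the helper names are hypothetical, the code parses captured text and does not run cfgadm, and it assumes the usual cfgadm -al column order of Ap_Id, Type, Receptacle, Occupant, and Condition.

```python
def configured_disks(cfgadm_al_output):
    """Return the Ap_Ids of disk entries whose Occupant state is configured."""
    disks = []
    for line in cfgadm_al_output.splitlines():
        fields = line.split()
        if len(fields) < 5 or fields[0] == "Ap_Id":
            continue  # skip blank lines, short lines, and the header row
        ap_id, dev_type, _receptacle, occupant = fields[:4]
        if dev_type.startswith("disk") and occupant == "configured":
            disks.append(ap_id)
    return disks


def unconfigure_command(ap_id):
    """Build the argument list for the unconfigure step (nothing is run)."""
    return ["cfgadm", "-c", "unconfigure", ap_id]
```

Returning an argument list instead of executing the command keeps the sketch safe to try on any system. Confirm the Ap_Id against your own cfgadm -al output before you unconfigure a drive.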
Related Information
■
“Determine Which Hard Drive Is Faulty” on page 82
■
“Install a Hard Drive” on page 85
Install a Hard Drive
1.
Align the replacement drive to the drive slot, and slide the drive in until it is
seated.
Page 86
Drives are physically addressed according to the slot in which they are installed. If you are
replacing a drive, install the replacement drive in the same slot as the drive that was removed.
2.
Close the latch to lock the drive in place.
3.
Verify the installation.
See “Verify the Hard Drive” on page 86.
Verify the Hard Drive
1.
Determine whether you replaced or installed the hard drive in a running server.
■
If you replaced or installed a hard drive in a server that is running (if you hot-plugged the
hard drive), then no further action is necessary. The Oracle Solaris OS will automatically
configure the hard drive.
■
If you replaced or installed a hard drive in a powered-down server, then continue with these
steps to configure the hard drive.
2.
If the OS is shut down, and the drive you replaced was not the boot device, boot
the OS.
Depending on the nature of the replaced drive, you might need to perform administrative tasks
to reinstall software before the server can boot. Refer to the Oracle Solaris OS administration
documentation for more information.
3.
At the Oracle Solaris prompt, type the cfgadm -al command to list all drives in the
device tree, including any drives that are not configured:
# cfgadm -al
Page 87
This command helps you identify the drive you installed. Example:
4.
Perform one of the following tasks based on your verification results:
■
If the previous steps did not verify the drive, see “Diagnostics
Process” on page 25.
■
If the previous steps indicate that the drive is functioning properly, perform
the tasks required to configure the drive. These tasks are covered in the
Oracle Solaris OS administration documentation.
For additional drive verification, you can run the Oracle VTS software. Refer to the Oracle VTS
documentation for details.
Page 88
Related Information
■
“Determine Which Hard Drive Is Faulty” on page 82
■
“Install a Hard Drive” on page 85
Page 89
Servicing the Main Module
The main module is a cold-service component that can be replaced only after you have
powered off the server. For the location of the main module, see “Front Panel Components
(Service)” on page 14.
Caution - This procedure requires that you handle components that are sensitive to electrostatic
discharge. This discharge can cause server components to fail.
These topics explain how to service the main module.
Steps  Description                                  Links
1.     Replace the main module.                     ■ “Remove the Main Module” on page 91
                                                    ■ “Install the Main Module” on page 95
2.     Remove the main module as part of another    “Remove the Main Module” on page 91
       component's service operation.
3.     Install the main module as part of another   “Install the Main Module” on page 95
       component's service operation.
Related Information
■
“Identifying Components”
■
“Detecting and Managing Faults”
■
“Preparing for Service”
■
“Component Service Task Reference” on page 23
■
“Returning the Server to Operation”
Page 90
Main Module LEDs
No.  LED                           Icon  Description
1    Service Required LED (amber)        Indicates that service is required.
                                         The fmadm faulty command provides details about any
                                         faults that cause this indicator to light.
                                         Under some fault conditions, individual component fault
                                         LEDs are illuminated in addition to the Service Required
                                         LED.
2    Power OK LED (green)                Indicates these conditions:
                                         ■ Off – System is not running in its normal state.
                                           System power might be off. The SP might be running.
                                         ■ Steady on – System is powered on and is running in
                                           its normal operating state. No service actions are
                                           required.
                                         ■ Fast blink – System is running in standby mode and
                                           can be quickly returned to full function.
                                         ■ Slow blink – A normal but transitory activity is
                                           taking place. Slow blinking might indicate that system
                                           diagnostics are running or that the system is booting.
3    SP LED                        SP    Indicates these conditions:
                                         ■ Off – The AC power might have been disconnected
                                           from the power supplies.
                                         ■ Steady on, green – SP is running in its normal
                                           operating state. No service actions are required.
                                         ■ Blink, green – SP is initializing the Oracle ILOM
                                           firmware.
                                         ■ Steady on, amber – An SP error has occurred and
                                           service is required.
Page 91
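The Power OK and SP LED states in the table above map naturally onto a lookup table, which can be convenient in a monitoring or checklist script. The Python sketch below is illustrative only: the dictionary name, the key spelling, and the describe_led function are assumptions, and the descriptions are condensed from the table.

```python
# Keys are (LED name, observed state). Wording is condensed from the LED table.
MAIN_MODULE_LED_STATES = {
    ("Power OK", "off"): "System is not running in its normal state.",
    ("Power OK", "steady on"): "System is powered on and running normally.",
    ("Power OK", "fast blink"): "System is running in standby mode.",
    ("Power OK", "slow blink"): "A normal but transitory activity is taking place.",
    ("SP", "off"): "AC power might have been disconnected.",
    ("SP", "steady on, green"): "SP is running in its normal operating state.",
    ("SP", "blink, green"): "SP is initializing the Oracle ILOM firmware.",
    ("SP", "steady on, amber"): "An SP error has occurred and service is required.",
}


def describe_led(led, state):
    """Return the condensed meaning of an LED state, or a fallback string."""
    return MAIN_MODULE_LED_STATES.get((led, state.lower()),
                                      "state not listed in the table")
```

For example, describe_led("SP", "Steady on, amber") reports that service is required.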
Related Information
■
“Determine if the Main Module Is Faulty” on page 91
■
“Remove the Main Module” on page 91
Determine if the Main Module Is Faulty
Check the Service Required and SP LEDs on the main module.
See “Main Module LEDs” on page 90.
Related Information
■
“Remove the Main Module” on page 91
■
“Verify the Main Module” on page 97
Remove the Main Module
The main module is a cold-service component that can be replaced only after you have powered
off the server.
Caution - This procedure requires that you handle components that are sensitive to electrostatic
discharge. This discharge can cause server components to fail.
1.
(Optional) If you are replacing a faulty main module, back up the Oracle ILOM
configuration settings.
a.
Configure the SER MGT port to enable the configuration parameters to be
uploaded.
Refer to the ILOM documentation for network configuration instructions.
b.
Back up the ILOM configuration parameters.
See Oracle ILOM documentation.
2.
Shut down the server.
Page 92
See “Removing Power From the Server” on page 53.
3.
Locate the main module in the server.
See “Front Panel Components (Service)” on page 14.
4.
Squeeze the green latches together on the two extraction levers, and pull the
extraction levers out to disengage the main module from the server.
Page 93
5.
Pull the main module halfway out of the server.
6.
Press the levers back together, toward the center of the main module.
This will keep the levers from getting damaged when you pull the main module out.
7.
Remove the main module completely from the server.
8.
Press down on the green button at the top of the cover to disengage the cover
from the main module, and push the cover toward the rear of the module as you
lift the cover up and away from the chassis.
9.
Determine your next step.
Page 94
a.
If you are replacing a main module due to a faulty motherboard, remove all
of these internal components, and transfer them to the new motherboard.
Component                   Link
Front I/O subassembly       “Remove the Front I/O Assembly” on page 127
Hard drives                 “Remove a Hard Drive” on page 83
Storage backplanes          “Remove a Storage Backplane” on page 99
System battery              “Remove the Battery” on page 123
System configuration PROM   “Remove the System Configuration PROM” on page 118
System processor card       “Remove the Service Processor Card” on page 110
b.
If you are replacing a component inside the main module, use one of the
following links:
■
“Servicing the Service Processor Card”
■
“Servicing the Battery”
■
“Servicing the System Configuration PROM”
■
“Servicing the Front I/O Assembly”
■
“Servicing the Storage Backplanes”
Related Information
■
“Determine if the Main Module Is Faulty” on page 91
■
“Install the Main Module” on page 95
Page 95
Install the Main Module
1.
Place the cover back onto the main module, and slide the cover forward until the
latch clicks into place.
2.
Open the levers so that they are fully open.
3.
Insert the main module back into its slot in the server until the levers begin to
engage.
Page 96
4.
Press the levers back together toward the center of the module, and then press
the levers firmly against the module to fully seat the module back into the server.
The levers should click into place when the module is fully seated in the server.
5.
Determine your next step:
a.
If you replaced an internal component, return to the procedure for that
component.
■
“Verify the Battery” on page 126
■
“Verify the Front I/O Assembly” on page 131
■
“Verify the Service Processor Card” on page 114.
■
“Verify the Storage Backplane” on page 107
■
“Verify the System Configuration PROM” on page 120
b.
If you replaced the entire main module, see “Verify the Main
Module” on page 97.
6.
If you are replacing the main module with a new one, connect a terminal or a
terminal emulator (PC or workstation) to the SER MGT port.
The following message is delivered over the serial management port.
Unrecognized Chassis: This module is installed in an unknown or
unsupported chassis. You must upgrade the firmware to a newer
version that supports this chassis.
7.
Download the system firmware.
a.
Configure the SER MGT port to enable the firmware image to be
downloaded.
Refer to the Oracle ILOM documentation for network configuration instructions.
b.
Download the system firmware.
Follow the firmware download instructions in the Oracle ILOM documentation.
Note - You can load any supported system firmware version, including the firmware revision
that had been installed prior to the replacement of the main module. However, Oracle strongly
recommends installing the newest version of the system firmware.
8.
Power on the server.
See “Returning the Server to Operation”.
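The branch in step 5 can be summarized as a lookup from the replaced component to its verification procedure. The following Python snippet is only an illustrative sketch: the dictionary keys and the function name next_step are hypothetical labels, while the procedure titles and page numbers are copied from step 5.

```python
# Illustrative lookup of the follow-up procedures listed in step 5.
# The keys and the function name are hypothetical; titles and page
# numbers are transcribed from this manual.
VERIFY_PROCEDURES = {
    "battery": ("Verify the Battery", 126),
    "front_io_assembly": ("Verify the Front I/O Assembly", 131),
    "service_processor_card": ("Verify the Service Processor Card", 114),
    "storage_backplane": ("Verify the Storage Backplane", 107),
    "system_configuration_prom": ("Verify the System Configuration PROM", 120),
    "main_module": ("Verify the Main Module", 97),
}


def next_step(replaced: str) -> str:
    """Return the procedure to follow after reinstalling the main module."""
    if replaced not in VERIFY_PROCEDURES:
        raise KeyError(f"No verification procedure listed for: {replaced}")
    title, page = VERIFY_PROCEDURES[replaced]
    return f'"{title}" on page {page}'


if __name__ == "__main__":
    print(next_step("storage_backplane"))
```

Use the key main_module when the entire main module was replaced (step 5b), and one of the other keys when only an internal component was replaced (step 5a).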
Related Information
■
“Remove the Main Module” on page 91
■
“Verify the Main Module” on page 97
Verify the Main Module
1.
Ensure that you have completed the following:
■
Applied power to the server.
See “Connect the Power Cords” on page 191.
■
Started the system.
See “Power On the Server (Oracle ILOM)” on page 192.
2.
Start the faultmgmt shell.
-> start /SP/faultmgmt/shell
Are you sure you want to start the faultmgmt shell (y/n)? y
faultmgmtsp>
3.
Use the fmadm faulty command to determine if the server is operating normally.
■
If a fault was detected, see “Diagnostics Process” on page 25.
■
If no fault was detected, the main module was installed successfully.
Related Information
■
“Determine if the Main Module Is Faulty” on page 91
■
“Install the Main Module” on page 95
Servicing the Storage Backplanes
The storage backplanes are cold-service components that can be replaced after you remove
the main module. For the location of the storage backplanes, see “Main Module Internal
Component Locations” on page 17.
Caution - This procedure requires that you handle components that are sensitive to electrostatic
discharge. This discharge can cause server components to fail.
These topics describe service procedures for the storage backplanes in the server.
Steps  Description                   Links
1.     Remove a storage backplane.   “Remove a Storage Backplane” on page 99
2.     Install a storage backplane.  “Install a Storage Backplane” on page 103
3.     Verify the installation.      “Verify the Storage Backplane” on page 107
Related Information
■
“Identifying Components”
■
“Detecting and Managing Faults”
■
“Preparing for Service”
■
“Component Service Task Reference” on page 23
■
“Returning the Server to Operation”
Remove a Storage Backplane
The storage backplanes are cold-service components that can be replaced after you remove the
main module.
Caution - This procedure requires that you handle components that are sensitive to electrostatic
discharge. This discharge can cause server components to fail.
1.
Power off the server, and disconnect the power cords.
See “Removing Power From the Server” on page 53.
2.
Take the necessary ESD precautions.
See “Prevent ESD Damage” on page 57.
3.
Remove all the hard drives from the front of the server for the storage backplane
that you want to replace.
Note the locations of the drives before removing them so that you can install them in their
original slots. You have to remove only hard drives 0–3 or drives 4–7, depending on which
storage backplane you want to replace. See “Remove a Hard Drive” on page 83.
4.
Remove the main module from the server.
See “Remove the Main Module” on page 91.
5.
Locate the storage backplane that you want to remove.
No.  Description
1    Storage backplane for drives 4 through 7 (SAS_BP1)
2    Storage backplane for drives 0 through 3 (SAS_BP0)
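The relationship between drive slots and storage backplanes described in step 3 and in the table above can be expressed as a small lookup. This Python snippet is a minimal sketch for planning purposes only: the function names are hypothetical, and the only facts it encodes are that drives 0 through 3 belong to SAS_BP0 and drives 4 through 7 belong to SAS_BP1.

```python
from typing import List

# Drive slots served by each storage backplane, as stated in this procedure.
BACKPLANE_DRIVES = {
    "SAS_BP0": range(0, 4),  # drives 0 through 3
    "SAS_BP1": range(4, 8),  # drives 4 through 7
}


def backplane_for_drive(slot: int) -> str:
    """Return the backplane that serves the given hard drive slot."""
    for name, slots in BACKPLANE_DRIVES.items():
        if slot in slots:
            return name
    raise ValueError(f"Drive slot {slot} is outside the range 0-7")


def drives_to_remove(backplane: str) -> List[int]:
    """Return the drive slots to empty before replacing the given backplane."""
    if backplane not in BACKPLANE_DRIVES:
        raise ValueError(f"Unknown storage backplane: {backplane}")
    return list(BACKPLANE_DRIVES[backplane])


if __name__ == "__main__":
    print(backplane_for_drive(5))
    print(drives_to_remove("SAS_BP0"))
```

Recording the slot of each drive before removal, for example in a list produced by drives_to_remove, helps you return each drive to its original slot.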