Sun Microsystems, Inc. provided technical input and review on portions of this material.
Sun Microsystems, Inc. and Fujitsu Limited each own or control intellectual property rights relating to products and technology described in
this document, and such products, technology and this document are protected by copyright laws, patents and other intellectual property laws
and international treaties. The intellectual property rights of Sun Microsystems, Inc. and Fujitsu Limited in such products, technology and this
document include, without limitation, one or more of the United States patents listed at http://www.sun.com/patents and one or more
additional patents or patent applications in the United States or other countries.
This document and the product and technology to which it pertains are distributed under licenses restricting their use, copying, distribution,
and decompilation. No part of such product or technology, or of this document, may be reproduced in any form by any means without prior
written authorization of Fujitsu Limited and Sun Microsystems, Inc., and their applicable licensors, if any. The furnishing of this document to
you does not give you any rights or licenses, express or implied, with respect to the product or technology to which it pertains, and this
document does not contain or represent any commitment of any kind on the part of Fujitsu Limited or Sun Microsystems, Inc., or any affiliate of
either of them.
This document and the product and technology described in this document may incorporate third-party intellectual property copyrighted by
and/or licensed from suppliers to Fujitsu Limited and/or Sun Microsystems, Inc., including software and font technology.
Per the terms of the GPL or LGPL, a copy of the source code governed by the GPL or LGPL, as applicable, is available upon request by the End
User. Please contact Fujitsu Limited or Sun Microsystems, Inc.
This distribution may include materials developed by third parties.
Parts of the product may be derived from Berkeley BSD systems, licensed from the University of California. UNIX is a registered trademark
in the U.S. and in other countries, exclusively licensed through X/Open Company, Ltd.
Sun, Sun Microsystems, the Sun logo, Java, Netra, Solaris, Sun Ray, Answerbook2, docs.sun.com, OpenBoot, and Sun Fire are trademarks or
registered trademarks of Sun Microsystems, Inc., or its subsidiaries, in the U.S. and other countries.
Fujitsu and the Fujitsu logo are registered trademarks of Fujitsu Limited.
All SPARC trademarks are used under license and are registered trademarks of SPARC International, Inc. in the U.S. and other countries.
Products bearing SPARC trademarks are based upon architecture developed by Sun Microsystems, Inc.
SPARC64 is a trademark of SPARC International, Inc., used under license by Fujitsu Microelectronics, Inc. and Fujitsu Limited.
The OPEN LOOK and Sun™ Graphical User Interface was developed by Sun Microsystems, Inc. for its users and licensees. Sun acknowledges
the pioneering efforts of Xerox in researching and developing the concept of visual or graphical user interfaces for the computer industry. Sun
holds a non-exclusive license from Xerox to the Xerox Graphical User Interface, which license also covers Sun’s licensees who implement OPEN
LOOK GUIs and otherwise comply with Sun’s written license agreements.
United States Government Rights - Commercial use. U.S. Government users are subject to the standard government user license agreements of
Sun Microsystems, Inc. and Fujitsu Limited and the applicable provisions of the FAR and its supplements.
Disclaimer: The only warranties granted by Fujitsu Limited, Sun Microsystems, Inc. or any affiliate of either of them in connection with this
document or any product or technology described herein are those expressly set forth in the license agreement pursuant to which the product
or technology is provided. EXCEPT AS EXPRESSLY SET FORTH IN SUCH AGREEMENT, FUJITSU LIMITED, SUN MICROSYSTEMS, INC.
AND THEIR AFFILIATES MAKE NO REPRESENTATIONS OR WARRANTIES OF ANY KIND (EXPRESS OR IMPLIED) REGARDING SUCH
PRODUCT OR TECHNOLOGY OR THIS DOCUMENT, WHICH ARE ALL PROVIDED AS IS, AND ALL EXPRESS OR IMPLIED
CONDITIONS, REPRESENTATIONS AND WARRANTIES, INCLUDING WITHOUT LIMITATION ANY IMPLIED WARRANTY OF
MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE OR NON-INFRINGEMENT, ARE DISCLAIMED, EXCEPT TO THE
EXTENT THAT SUCH DISCLAIMERS ARE HELD TO BE LEGALLY INVALID. Unless otherwise expressly set forth in such agreement, to the
extent allowed by applicable law, in no event shall Fujitsu Limited, Sun Microsystems, Inc. or any of their affiliates have any liability to any
third party under any legal theory for any loss of revenues or profits, loss of use or data, or business interruptions, or for any indirect, special,
incidental or consequential damages, even if advised of the possibility of such damages.
DOCUMENTATION IS PROVIDED “AS IS” AND ALL EXPRESS OR IMPLIED CONDITIONS, REPRESENTATIONS AND WARRANTIES,
INCLUDING ANY IMPLIED WARRANTY OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE OR NON-INFRINGEMENT,
ARE DISCLAIMED, EXCEPT TO THE EXTENT THAT SUCH DISCLAIMERS ARE HELD TO BE LEGALLY INVALID.
Contents
Preface xiii
1. Safety Precautions for Maintenance 1–1
1.1 ESD Precautions 1–1
1.2 Server Precautions 1–3
1.2.1 Electrical Safety Precautions 1–3
1.2.2 Equipment Rack Safety Precautions 1–3
1.2.3 Component Handling Precautions 1–4
2. Hardware Overview 2–1
2.1 Name of Each Part 2–1
2.2 Operator Panel 2–5
2.2.1 Operator Panel Overview 2–6
2.2.2 Switches on the Operator Panel 2–7
2.2.3 LEDs on the Operator Panel 2–9
2.3 LED Functions of Components 2–11
2.4 External Interface Port on Rear Panel 2–13
2.5 Labels 2–17
3. Troubleshooting 3–1
3.1 Emergency Power Off 3–1
3.2 Failure Diagnostic Method 3–2
3.3 Checking the Server and System Configuration 3–4
3.3.1 Checking the Hardware Configuration and FRU Status 3–4
3.3.1.1 Checking the Hardware Configuration 3–5
3.3.2 Checking the Software and Firmware Configurations 3–6
3.3.2.1 Checking the Software Configuration 3–7
3.3.2.2 Checking the Firmware Configuration 3–7
3.3.2.3 Downloading Error Log Information 3–7
3.4 Error Conditions 3–8
3.4.1 Predictive Self-Healing Tools 3–8
3.4.2 Monitoring Output 3–10
3.4.3 Messaging Output 3–10
3.5 Using Troubleshooting Commands 3–11
3.5.1 Using the showhardconf Command 3–11
3.5.2 Using the showlogs Command 3–14
3.5.3 Using the showstatus Command 3–15
3.5.4 Using the fmdump Command 3–16
3.5.4.1 fmdump -V Command 3–16
3.5.4.2 fmdump -e Command 3–17
3.5.5 Using the fmadm Command 3–17
3.5.5.1 Using the fmadm faulty Command 3–17
3.5.5.2 fmadm repair Command 3–18
3.5.5.3 fmadm config Command 3–18
3.5.6 Using the fmstat Command 3–19
3.6 General Solaris Troubleshooting Commands 3–19
3.6.1 Using the iostat Command 3–20
3.6.1.1 Options 3–20
3.6.2 Using the prtdiag Command 3–21
SPARC Enterprise M3000 Server Service Manual • November 2009
3.6.2.1 Options 3–21
3.6.3 Using the prtconf Command 3–23
3.6.3.1 Options 3–24
3.6.4 Using the netstat Command 3–26
3.6.4.1 Options 3–26
3.6.5 Using the ping Command 3–27
3.6.5.1 Options 3–27
3.6.6 Using the ps Command 3–28
3.6.6.1 Options 3–29
3.6.7 Using the prstat Command 3–29
3.6.7.1 Options 3–30
4. FRU Replacement Preparation 4–1
4.1 Tools Required for Maintenance 4–1
4.2 FRU Replacement and Installation Methods 4–2
4.2.1 FRU Replacement 4–2
4.2.2 FRU Installation 4–3
4.3 Active Replacement/Active Addition 4–5
4.3.1 Releasing a FRU from a Domain 4–5
4.3.2 FRU Removal and Replacement 4–6
4.3.3 Configuring a FRU in a Domain 4–6
4.3.4 Verifying the Hardware Operation 4–7
4.4 Hot Replacement/Hot Addition 4–7
4.4.1 FRU Removal and Replacement 4–8
4.4.2 Verifying the Hardware Operation 4–10
4.5 Cold Replacement/Cold Addition 4–12
4.5.1 Powering off the Server 4–12
4.5.1.1 Power-off by Using the XSCF Command 4–12
4.5.1.2 Power-off by Using the Operator Panel 4–13
4.5.2 FRU Removal and Replacement 4–13
4.5.3 Powering on the Server 4–13
4.5.3.1 Power-on by Using the XSCF Command 4–13
4.5.3.2 Power-on by Using the Operator Panel 4–14
4.5.4 Verifying the Hardware Operation 4–15
5. Internal Components Access 5–1
5.1 Sliding the Server Into and Out of the Equipment Rack 5–1
5.1.1 Sliding the Server Out from the Equipment Rack 5–1
5.1.2 Sliding the Server into the Equipment Rack 5–3
5.2 Removing and Attaching the Top Cover 5–3
5.2.1 Removing the Top Cover 5–3
5.2.2 Attaching the Top Cover 5–4
5.3 Removing and Attaching the Air Duct 5–4
5.3.1 Accessing the Air Duct 5–5
5.3.2 Removing the Air Duct 5–5
5.3.3 Attaching the Air Duct 5–6
5.4 Removing and Attaching the Fan Cover 5–6
5.4.1 Removing the Fan Cover 5–6
5.4.2 Attaching the Fan Cover 5–7
6. Motherboard Unit Replacement 6–1
6.1 Accessing the Motherboard Unit 6–4
6.2 Removing the Motherboard Unit 6–7
6.3 Mounting the Motherboard Unit 6–8
6.4 Reassembling the Server 6–9
7. Replacement and Installation of Memory 7–1
7.1 Memory Mounting Rules 7–3
7.1.1 Confirmation of DIMM Information 7–3
7.1.2 Memory Mounting Conditions 7–4
7.2 Accessing the DIMMs 7–7
7.3 Removing the DIMMs 7–8
7.4 Installing the DIMMs 7–9
7.5 Reassembling the Server 7–9
8. Replacement and Installation of PCIe Cards 8–1
8.1 Accessing a PCIe Card 8–3
8.2 Removing a PCIe Card 8–3
8.3 Mounting a PCIe Card 8–4
8.4 Reassembling the Server 8–5
9. Replacement and Installation of a Hard Disk Drive (HDD) 9–1
9.1 Accessing a Hard Disk Drive 9–3
9.2 Removing a Hard Disk Drive 9–3
9.3 Installing a Hard Disk Drive 9–5
9.4 Reassembling the Server 9–5
10. Replacing the Hard Disk Drive Backplane 10–1
10.1 Accessing the Hard Disk Drive Backplane 10–2
10.2 Removing the Hard Disk Drive Backplane 10–3
10.3 Mounting the Hard Disk Drive Backplane 10–5
10.4 Reassembling the Server 10–6
11. CD-RW/DVD-RW Drive Unit (DVDU) Replacement 11–1
11.1 Accessing the CD-RW/DVD-RW Drive Unit 11–2
11.2 Removing the CD-RW/DVD-RW Drive Unit 11–3
11.3 Mounting the CD-RW/DVD-RW Drive Unit 11–4
11.4 Reassembling the Server 11–5
12. Power Supply Unit Replacement 12–1
12.1 Accessing a Power Supply Unit 12–3
12.2 Removing the Power Supply Unit 12–3
12.3 Mounting the Power Supply Unit 12–5
12.4 Reassembling the Server 12–5
13. Fan Unit Replacement 13–1
13.1 Accessing a Fan Unit 13–3
13.2 Removing a Fan Unit 13–3
13.3 Mounting a Fan Unit 13–5
13.4 Reassembling the Server 13–5
14. Fan Backplane Replacement 14–1
14.1 Accessing the Fan Backplane 14–2
14.2 Removing the Fan Backplane 14–5
14.3 Mounting the Fan Backplane 14–6
14.4 Reassembling the Server 14–6
15. Operator Panel Replacement 15–1
15.1 Accessing the Operator Panel 15–3
15.2 Removing the Operator Panel 15–4
15.3 Mounting the Operator Panel 15–5
15.4 Reassembling the Server 15–5
A. Components List A–1
B. FRU List B–1
B.1 Server Overview B–1
B.2 Motherboard Unit B–2
B.2.1 Memory (DIMM) B–3
B.2.2 PCIe Slot B–3
B.2.3 CPU B–4
B.2.4 XSCF Unit B–4
B.3 Drive B–5
B.3.1 Hard Disk Drive B–5
B.3.2 CD-RW/DVD-RW Drive Unit (DVDU) B–6
B.4 Power Supply Unit B–6
B.5 Fan Unit B–7
C. External Interface Specifications C–1
C.1 Serial Port C–2
C.2 UPC Port C–2
C.3 USB Port C–3
C.4 SAS Port C–3
C.5 Connection Diagram for Serial Cable C–4
D. UPS Controller D–1
D.1 Overview D–1
D.2 Signal Cable D–2
D.3 Configuration of Signal Lines D–3
D.4 Power Supply Conditions D–4
D.4.1 Input Circuit D–4
D.4.2 Output Circuit D–5
D.5 UPS Cable D–5
D.6 Connections D–6
E. DC Power Supply Model E–1
E.1 The Server Views E–2
E.2 LED Functions of Power Supply Unit E–4
E.3 Electrical Specifications E–5
E.4 Using the showhardconf Command E–6
Abbreviations Abbreviations–1
Index Index–1
Preface
This manual describes how to service the SPARC Enterprise™ M3000 server. It is written
for maintenance providers who have received training under a self-maintenance
contract.
This section includes:
■ “Glossary” on page xiii
■ “Structure and Contents of This Manual” on page xiv
■ “M3000 Server Documentation” on page xv
■ “Text Conventions” on page xviii
■ “Prompt Notations” on page xviii
■ “Syntax of the Command-Line Interface (CLI)” on page xix
■ “Environment Requirements for Using This Product” on page xix
■ “Conventions for Alert Messages” on page xx
■ “Notes on Safety” on page xxi
■ “Alert Labels” on page xxiv
■ “Product Handling” on page xxv
■ “Limitations and Cautions” on page xxvi
■ “Fujitsu Welcomes Your Comments” on page xxviii
Glossary
For the terms used in the “M3000 Server Documentation” on page xv, refer to the
SPARC Enterprise M3000/M4000/M5000/M8000/M9000 Servers Glossary.
Structure and Contents of This Manual
This manual is organized as described below:
■ CHAPTER 1 Safety Precautions for Maintenance
Provides safety precautions required for maintenance.
■ CHAPTER 2 Hardware Overview
Explains the names of components and also explains the LEDs on the operator
panel and rear panel.
■ CHAPTER 3 Troubleshooting
Explains fault diagnosis information.
■ CHAPTER 4 FRU Replacement Preparation
Explains the method of preparing for the safe replacement of FRUs.
■ CHAPTER 5 Internal Components Access
Explains how to access internal components.
■ CHAPTER 6 Motherboard Unit Replacement
Explains how to replace the motherboard unit.
■ CHAPTER 7 Replacement and Installation of Memory
Explains how to replace and install memory (DIMMs).
■ CHAPTER 8 Replacement and Installation of PCIe Cards
Explains how to replace and install PCIe cards.
■ CHAPTER 9 Replacement and Installation of a Hard Disk Drive (HDD)
Explains how to replace and install a hard disk drive.
■ CHAPTER 10 Replacing the Hard Disk Drive Backplane
Explains how to replace the hard disk drive backplane.
■ CHAPTER 11 CD-RW/DVD-RW Drive Unit (DVDU) Replacement
Explains how to replace the CD-RW/DVD-RW drive unit.
■ CHAPTER 12 Power Supply Unit Replacement
Explains how to replace a power supply unit.
■ CHAPTER 13 Fan Unit Replacement
Explains how to replace a fan unit.
■ CHAPTER 14 Fan Backplane Replacement
Explains how to replace the fan backplane.
■ CHAPTER 15 Operator Panel Replacement
Explains how to replace the operator panel.
■ APPENDIX A Components List
Explains the server nomenclature and component numbering.
■ APPENDIX B FRU List
Explains FRUs.
■ APPENDIX C External Interface Specifications
Explains connector specifications for external interfaces.
■ APPENDIX D UPS Controller
Explains the UPS controller (UPC) that controls the uninterruptible power
supply (UPS) unit.
■ APPENDIX E DC Power Supply Model
Describes the requirements specific to the DC power supply model.
■ Abbreviations
Provides the full spellings of abbreviations used in this manual.
■ Index
Provides keywords and corresponding reference page numbers so that the
reader can easily search for items in this manual as necessary.
M3000 Server Documentation
The manuals listed below are provided for reference.
Book titles                                          Manual codes
SPARC Enterprise M3000 Server Site Planning Guide    C120-H030
Text Conventions
This manual uses the following fonts and symbols to express specific types of
information.

Fonts/symbols: AaBbCc123
Meaning: What you type, when contrasted with on-screen computer output.
This font represents the example of command input in the frame.
Example:
XSCF> adduser jsmith

Fonts/symbols: AaBbCc123
Meaning: The names of commands, files, and directories; on-screen computer
output. This font represents the example of command output in the frame.
Example:
XSCF> showuser -p
User Name:  jsmith
Privileges: useradm
            auditadm

Fonts/symbols: Italic
Meaning: Indicates the name of a reference manual
Example: See the SPARC Enterprise M3000/M4000/M5000/M8000/M9000 Servers
XSCF User’s Guide

Fonts/symbols: " "
Meaning: Indicates names of chapters, sections, items, buttons, or menus
Example: See Chapter 2, "Hardware Overview."

Prompt Notations
The following prompt notations are used in this manual.

Shell                                     Prompt notations
XSCF                                      XSCF>
C shell                                   machine-name%
C shell super user                        machine-name#
Bourne shell and Korn shell               $
Bourne shell and Korn shell super user    #
OpenBoot™ PROM                            ok
Syntax of the Command-Line Interface (CLI)
The command syntax is as follows:
■ A variable that requires input of a value must be enclosed in <>.
■ An optional element must be enclosed in [ ].
■ A group of options for an optional keyword must be enclosed in [ ] and delimited
by |.
■ A group of options for a mandatory keyword must be enclosed in {} and
delimited by |.
■ The command syntax is shown in a box.
Example:
XSCF> showuser -a
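The notation rules above can be illustrated with a minimal Python sketch that classifies each element of a syntax line. The function name `classify` and the synopsis `examplecmd [-v] {-a|-b} <user>` are hypothetical and are not taken from any XSCF command; the sketch also assumes that brackets are not nested.

```python
import re

# One pattern per notation rule: [ ] optional, {} mandatory choice,
# <> variable. Anything else is treated as a literal word.
TOKEN = re.compile(r"\[[^\]]*\]|\{[^}]*\}|<[^>]*>|\S+")


def classify(synopsis):
    """Return (kind, choices) pairs for each element of a syntax line."""
    result = []
    for tok in TOKEN.findall(synopsis):
        inner = tok[1:-1]
        if tok.startswith("<"):
            # A variable that requires a value.
            result.append(("variable", [inner]))
        elif tok.startswith("["):
            # An optional element, or optional alternatives split by |.
            result.append(("optional", [c.strip() for c in inner.split("|")]))
        elif tok.startswith("{"):
            # Mandatory alternatives split by |.
            result.append(("mandatory", [c.strip() for c in inner.split("|")]))
        else:
            result.append(("literal", [tok]))
    return result


if __name__ == "__main__":
    # Hypothetical synopsis, used only to exercise each rule once.
    for kind, choices in classify("examplecmd [-v] {-a|-b} <user>"):
        print(kind, choices)
```

Reading a synopsis this way makes it easier to see which parts of a command line must be typed, which may be omitted, and where a value has to be supplied.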
Environment Requirements for Using This Product
This product is a computer that is intended to be used in a data center. For details of
the system requirements, refer to the SPARC Enterprise M3000 Server Installation Guide.
Conventions for Alert Messages
This manual uses the following conventions to show alert messages, which are
intended to prevent injury to the user or bystanders as well as property damage, and
important messages that are useful to the user.
WARNING:
This indicates a hazardous situation that could result in death or serious personal
injury (potential hazard) if the user does not perform the procedure correctly.
CAUTION:
This indicates a hazardous situation that could result in minor or moderate personal
injury if the user does not perform the procedure correctly. This signal also indicates
that damage to the product or other property may occur if the user does not perform
the procedure correctly.
IMPORTANT:
This indicates information that could help the user to use the product more
effectively.
Alert messages in the text
An alert message in the text consists of a signal indicating an alert level followed by
an alert statement. Alert messages are indented to distinguish them from regular
text. Also, a space of one line precedes and follows an alert statement.
WARNING:
The tasks listed below for this product and the optional products provided by Fujitsu
Computers should be performed only by a field engineer.
The user must not perform these tasks. Incorrect operation of these tasks may cause
electric shock, injury, or fire.
■ Installation and reinstallation of all components
■ Removal of front panel and top cover
■ Mounting/unmounting of optional internal devices
■ Connecting/disconnecting of external interface cables
■ Maintenance (repair and regular diagnosis and maintenance)
Also, important alert messages are shown in “Important Alert Messages” on
page xxi.
Notes on Safety
Important Alert Messages
This manual provides the following important alert signals:
Caution – The WARNING signal indicates a dangerous situation that could result in
death or serious injury if the user does not perform the procedure correctly.
Task: Normal operation
Warning: Electric shock, fire
Do not damage, break, or modify the power cords. Cord damage may
cause electric shock or fire.

Task: Emergency
Warning: Smoking or fire
If smoke or fire is generated from the server, press down the power switch
for 4 seconds or more, power off the server, and then remove the cord
clamp and disconnect the power cord.
Caution – The CAUTION signal indicates a hazardous situation that could result in
minor or moderate personal injury if the user does not perform the procedure
correctly. This signal also indicates that damage to the product or other property
may occur if the user does not perform the procedure correctly.
Task: Normal operation
Warning: Equipment damage
Be sure to follow the precautions below when installing the main unit.
Otherwise, the equipment may be damaged.
• Do not block ventilation slits.
• Avoid installing the equipment in a place exposed to direct sunlight or
near devices that become extremely hot.
• Avoid installing the equipment in a dusty place or a place directly
exposed to corrosive gas or salty air.
• Avoid installing the equipment in a place exposed to strong vibration.
Also, install the equipment on a level surface so that it is stable.
• The grounding resistance must not be greater than 10 Ω. The grounding
method varies by the building where you install the server. Make sure
that the facility administrator or a qualified electrician verifies the
grounding method for the building and performs the grounding work.
• Do not run any cable beneath any equipment. Also, prevent cables from
becoming taut. Never disconnect any power cord from the equipment
while power is being supplied to the equipment.
• Do not place anything on top of the main unit. Do not use the main unit
as a workspace.
• Avoid exposing the equipment to rapid changes in the ambient
temperature, such as a rapid increase during transport in winter. A rapid
increase in the ambient temperature causes moisture to condense in the
equipment. Use the equipment only after the difference between its
temperature and the ambient temperature is negligible.
• Avoid installing the equipment near a copy machine, air conditioner,
welding machine, or any other devices generating electronic noise.
• Take preventive action to minimize static electricity at the installation
location. Note that static electricity is easily generated in some carpets
and can cause the equipment to malfunction.
• Confirm that the power supply voltage and frequency during operation
match the rated values indicated on the equipment.
• Do not insert any object into an opening in the equipment. Components
inside the equipment use high voltage. Conductive foreign matter, such
as a metal object, inserted into the equipment, may cause a short circuit
between components, resulting in fire, electric shock, or equipment
damage.
• For maintenance of the equipment, contact your authorized service
personnel.
Task: Normal operation
Warning: Data destruction
Confirm the items listed below before turning off the power. Otherwise,
data may be destroyed.
• All applications have completed processing.
• No user is using the equipment.
• When the main unit power is turned off, the POWER LED on the
operator panel is turned off. Be sure to confirm that the POWER LED is
off before turning off the main power (uninterruptible power supply
[UPS], power input, etc.).
If necessary, back up files before turning off the system power.
Warning: Data destruction
Do not forcibly stop a domain that is operating normally. Otherwise, data
may be destroyed.
Warning: Data destruction
Do not disconnect the power cord from the power input while power is
being supplied. Otherwise, data stored on hard disk units may be
destroyed.

Task: Maintenance
Warning: Failure
Unpacking and maintenance of this equipment and Fujitsu optional
products must always be performed by a certified field engineer. Customers
must never perform this work themselves, as this could lead to failures.
Warning: Equipment damage
When handling parts, be sure to wear an antistatic wrist strap and connect
the clip to the grounding port of the server. Place removed parts on an
antistatic conductive mat. Failure to do so may result in serious damage or
injury.
Warning: Electric shock
Before doing the maintenance, unplug the power cords. This product uses
double pole/neutral fusing which could create an electric shock hazard.
Alert Labels
The following are labels attached to this product:
Caution – Never peel off the labels.
[Figure: alert label locations, SPARC Enterprise M3000 (Front View)]
Product Handling
Maintenance
Caution – Certain tasks in this manual should only be performed by a certified
service engineer. Users must not perform these tasks. Incorrect operation of these
tasks may cause electric shock, injury, or fire.
■ Installation and reinstallation of all components, and initial settings
■ Removal of front panel and top cover
■ Mounting/de-mounting of optional internal devices
■ Plugging or unplugging of external interface cards
■ Maintenance and inspections (repairing, and regular diagnosis and maintenance)
Caution – The following tasks regarding this product and the optional products
provided by Fujitsu should only be performed by a certified service engineer.
Users must not perform these tasks. Incorrect operation of these tasks may cause
malfunction.
■ Unpacking optional adapters and similar packages delivered to users
■ Plugging or unplugging of external interface cards
Remodeling/Rebuilding
Caution – Any modification and/or recycling of this product and its components
may be carried out only by a certified service engineer and must not be done by the
customer under any circumstances. Otherwise, electric shock, injury or fire may
result.
Emission of Laser Beam (Invisible)
Caution – The server contains modules that generate invisible laser radiation. Laser
beams are generated while the equipment is operating, even if an optical cable is
disconnected or a cover is removed. Do not look at any laser light source directly or
through an optical apparatus (e.g., magnifying glass, microscope).
Limitations and Cautions
Power Control and Operator Panel Mode Switch
When you use the remote power control utilizing the RCI function or the automatic
power control system (referred to below as APCS), you can disable this remote
power control or the APCS by switching to Service mode on the operator panel.
Disabling these features ensures that you do not unintentionally switch the system
power on or off during maintenance. Note that system power-off with the APCS cannot
be disabled with the mode switch. Therefore, be sure to turn off automatic power
control via APCS before starting maintenance.
If you switch the mode while using the RCI or the automatic power control, the
system power is controlled as follows.
Function: RCI
Mode switch Locked: Remote power-on/power-off operations are enabled.
Mode switch Service: Remote power-on/power-off operations are disabled.

Function: Automatic power control
Mode switch Locked: Automatic power-on/power-off operations are enabled.
Mode switch Service: Automatic power-on is disabled, but power-off remains enabled.

To use the RCI function, see the SPARC Enterprise
M3000/M4000/M5000/M8000/M9000 Servers RCI Build Procedure and the SPARC
Enterprise M3000/M4000/M5000/M8000/M9000 Servers RCI User’s Guide, which are
available on the manuals website.
To use the APCS, see the Enhanced Support Facility User's Guide for Machine
Administration Automatic Power Control Function (Supplement Edition).
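The mode-switch behavior described above can be summarized as a small lookup table. The following Python sketch is only an illustration: the names `POWER_CONTROL` and `needs_manual_disable` are hypothetical, and the values restate the Locked and Service entries from this section rather than anything queried from the server.

```python
# Whether each function can power the server on or off, per mode-switch
# position. Values are transcribed from this section; names are illustrative.
POWER_CONTROL = {
    ("RCI", "Locked"): {"power_on": True, "power_off": True},
    ("RCI", "Service"): {"power_on": False, "power_off": False},
    ("APCS", "Locked"): {"power_on": True, "power_off": True},
    ("APCS", "Service"): {"power_on": False, "power_off": True},
}


def needs_manual_disable(function):
    """True if Service mode still allows this function to power off the server."""
    return POWER_CONTROL[(function, "Service")]["power_off"]


if __name__ == "__main__":
    for function in ("RCI", "APCS"):
        print(function, "disable separately before maintenance:",
              needs_manual_disable(function))
```

Only the APCS entry remains able to power off the server in Service mode, which is why automatic power control must be turned off separately before maintenance starts.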
Fujitsu Welcomes Your Comments
If you have any comments or requests regarding this document or if you find any
unclear statements in the document, please state your points specifically on the form
at the following URL.
CHAPTER 1
Safety Precautions for Maintenance
This chapter provides safety precautions required for maintenance.
■ Section 1.1, “ESD Precautions”
■ Section 1.2, “Server Precautions”
1.1 ESD Precautions
To ensure that you and bystanders are not exposed to harm and to prevent damage
to the system, observe the following safety precautions.
TABLE 1-1 ESD Precautions
Item | Precaution
ESD connector/wrist strap | Connect the ESD connector to your server and wear the antistatic wrist strap when handling printed circuit boards. See FIGURE 1-1 for the wrist strap connection destination.
Conductive mat | An approved conductive mat provides protection from static damage when used with a wrist strap. The mat also cushions and protects small parts that are attached to printed circuit boards.
ESD safe packaging box | Place a printed board or component in the ESD safe packaging box after you remove it.
FIGURE 1-1 Wrist Strap Connection Destination
■ Hard disk drive or fan unit:
Connect to one of two thumbscrews
on the front of the server.
■ FRU* other than hard disk drive and fan unit:
Connect to either upper right on the front or upper left
on the rear of the server.
* FRU: Field Replaceable Unit
Caution – Do not connect the wrist strap cable to the conductive mat. Connect it
directly to the server.
The wrist strap and FRU must have the same level of potential.
1.2 Server Precautions
When maintaining the server, observe the following precautions for your protection.
■ Follow all cautions, warnings, and instructions marked on the server.
Caution – Do not insert any object in an opening of the server. If any object comes
into contact with a high-voltage part or short-circuits a component, fire or electric
shock might result.
■ Refer servicing of the server to the service engineer.
1.2.1 Electrical Safety Precautions
■ Ensure that the voltage and frequency of the power source to be used match the
electrical rating labels on the server.
■ Wear antistatic wrist straps when handling hard disk drives, motherboard units,
or other printed circuit boards.
■ Use grounded power outlets as described in the SPARC Enterprise M3000 Server
Installation Guide.
Caution – Do not make mechanical or electrical modifications. We are not
responsible for regulatory compliance of modified servers.
1.2.2 Equipment Rack Safety Precautions
■ The equipment racks must be anchored to the floor, ceiling, or to adjacent frames.
■ Some equipment racks are supplied with a Quake-Resistant Options Kit or
stabilizer, which supports the weight of the server when it is extended on its slide
rails. This prevents the equipment from toppling over during installation or
maintenance.
■ In the following cases, a safety evaluation must be conducted by the service
engineer prior to installation or maintenance work.
■ When no Quake-Resistant Options Kits or stabilizers are attached and the
equipment rack is not anchored to the floor, ensure safety by confirming that
the server does not fall over when it is pulled out from the slide rails.
■ When the equipment rack is mounted on a raised floor, ensure that the raised
floor has sufficient strength to withstand the weight upon it when the server is
extended on its slide rails. Fix the equipment rack through the raised floor to
the concrete floor below it, using a proprietary mounting kit for this purpose.
Caution – If more than one server is installed in an equipment rack, maintain the
servers one at a time.
For details of equipment racks, see the SPARC Enterprise Equipment Rack Mounting Guide.
1.2.3 Component Handling Precautions
Caution – The server is easily damaged by static electricity. To prevent damage to
printed circuit boards, wear a wrist strap and connect it to the server prior to
starting maintenance.
Caution – Do not bend the motherboard unit (MBU). Otherwise, the components mounted
on circuit boards might be damaged.
To prevent the motherboard unit from being bent, observe the following precautions:
■ Hold the motherboard unit by the handle, where the board stiffener is located.
■ When removing the motherboard unit from the packaging, keep the motherboard
unit horizontal until you lay it on the cushioned conductive mat.
■ Connectors and components on the motherboard unit have thin pins that bend
easily. Therefore, do not place the motherboard unit on a hard surface.
■ Be careful not to damage the small parts located on both sides of the motherboard unit.
Caution – The heat sinks can be damaged by incorrect handling. Do not touch the
heat sinks while replacing or removing motherboard units. If a heat sink is loose or
broken, obtain a replacement motherboard unit. When storing or carrying a
motherboard unit, ensure that the heat sinks have sufficient protection.
Caution – When removing a cable such as the LAN cable, if your fingers do not
reach the latch lock of the connector, use a flat head screwdriver to push the latch to
disconnect the cable. If you forcibly insert your fingers into the service clearance, the
LAN port of the motherboard unit or PCI Express (PCIe) cards may be damaged.
CHAPTER 2
Hardware Overview
This chapter explains the names of components and also explains the LEDs on the
operator panel and rear panel.
■ Section 2.1, “Name of Each Part”
■ Section 2.2, “Operator Panel”
■ Section 2.3, “LED Functions of Components”
■ Section 2.4, “External Interface Port on Rear Panel”
■ Section 2.5, “Labels”
2.1 Name of Each Part
This section explains the names of parts mounted on the M3000 server.
Among these parts, those which can be replaced in the field by a certified field
engineer are called Field Replaceable Units (FRU). For information on the actual
replacement/expansion procedure for FRUs, see Chapter 6 to Chapter 15.
The server consists of a chassis in which various components are mounted, top cover
to protect the mounted components, front panel, and rear panel. An operator panel
is located on the front panel, and ports used to connect external interfaces are
located on the rear panel. From the LEDs on the operator panel and rear panel, error
and other status information can be checked. For details, see Section 2.2, “Operator
Panel” on page 2-5 to Section 2.4, “External Interface Port on Rear Panel” on
page 2-13.
FIGURE 2-1, FIGURE 2-2 and FIGURE 2-3 are the internal view, front view, and rear view
of the server, respectively, and they indicate the names and abbreviated names of
main components.
FIGURE 2-1 Server (Internal View)
Components labeled in the figure:
■ Fan backplane (FANBP_B)
■ Fan unit (FAN_A)
■ DC-DC converter (DDC)
■ Hard disk drive backplane (HDDBP)
■ CPU
■ Memory (DIMM)
■ XSCF unit (XSCFU)
■ Motherboard unit
■ PCIe slot
■ PCIe card (PCIe)
■ Power supply unit (PSU)
■ CD-RW/DVD-RW drive unit (DVDU)
FIGURE 2-2 Server (Front View)
Location No. | Component
1 | Fan unit (FAN_A)
2 | Operator panel (OPNL)
3 | Hard disk drive (HDD) (2.5-inch SAS disk)
4 | CD-RW/DVD-RW drive unit (DVDU)
FIGURE 2-3 Server (Rear View) (AC Power Supply Model)
Location No. | Component
1 | Power supply unit (PSU)
2 | PCIe slot
3 | RCI port
4 | USB port (for XSCF)
5 | Serial port (for XSCF)
6 | LAN port (for XSCF)
7 | UPC port
8 | Serial Attached SCSI (SAS) port
9 | Gigabit Ethernet (GbE) port (for OS)
2.2 Operator Panel
The operator panel has the important function of controlling the power of the server.
The operator panel is usually locked with a key to prevent the server from being
mistakenly powered off during system operation.
Before starting maintenance work, ask the system administrator to unlock the
operator panel.
2.2.1 Operator Panel Overview
The system administrator or service engineer checks the operating status of the
server with LEDs or operates the power supply with the power switch.
FIGURE 2-4 shows the location of the operator panel.
FIGURE 2-4 Operator Panel Location
Location number | Component
1 | POWER LED
2 | XSCF STANDBY LED
3 | CHECK LED
4 | Power button
5 | Mode switch (key switch)
2.2.2 Switches on the Operator Panel
TABLE 2-1 depicts the functions of the switches on the operator panel.
The switches on the operator panel include the mode switch for setting the operation
mode and the power switch for turning on and off the server.
TABLE 2-1 Switches (Operator Panel)
Switch | Name | Description of function
Mode switch (key switch) | | This switch is used to set the operation mode for the server. Insert the special key that is under the customer's control, to switch between modes.
Mode switch (key switch) | Locked | Normal operation mode. • The system can be powered on with the power button, but it cannot be powered off with the power button. • The key can be pulled out at this key position.
Mode switch (key switch) | Service | Mode for maintenance. • The system can be powered on and off with the power button. • The key cannot be pulled out at this key position. • To stop and maintain the server, set the mode to Service.
Power button | | This button is used to turn on or turn off the power to the server (a domain). Power on and power off are controlled by pressing this button in different patterns, as described below.
Power button | Holding down the button for a short time (less than 4 seconds) | Regardless of the mode switch setting, the server is powered on.* If set in the XSCF, facility (air conditioners) power-on and warm-up processing is skipped.
Power button | Holding down the button for a long time in Service mode (4 seconds or longer) | • If power to the server is on, OS shutdown processing is executed for all domains before the system is powered off. • If the server is being powered on, the power-on processing is cancelled, and the server is powered off. • If the server is being powered off, the operation of the power button is ignored, and the power-off processing is continued.
* In normal operation, the server is powered on only when the data center environmental conditions satisfy the specified values. Then, the server remains in the reset state until the operating system is booted.
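The press patterns in TABLE 2-1 can be summarized as a decision function. The sketch below is one reading of the table, not firmware behavior: the names Mode, ServerState, and power_button_action are hypothetical, and any case the table does not describe is reported as such rather than guessed.

```python
from enum import Enum

LONG_PRESS_SECONDS = 4.0  # threshold stated in TABLE 2-1


class Mode(Enum):
    LOCKED = "Locked"
    SERVICE = "Service"


class ServerState(Enum):
    OFF = "off"
    POWERING_ON = "powering on"
    ON = "on"
    POWERING_OFF = "powering off"


def power_button_action(mode: Mode, state: ServerState, held_seconds: float) -> str:
    """Return the effect TABLE 2-1 describes, or a note when the table is silent."""
    if held_seconds < LONG_PRESS_SECONDS:
        # Short press: powers the server on regardless of the mode switch.
        if state is ServerState.OFF:
            return "power on"
        return "not described in TABLE 2-1"
    # Long press: the power-off patterns are documented for Service mode only.
    if mode is not Mode.SERVICE:
        return "no power-off (power-off by button is not available in Locked mode)"
    if state is ServerState.ON:
        return "shut down the OS on all domains, then power off"
    if state is ServerState.POWERING_ON:
        return "cancel power-on, then power off"
    if state is ServerState.POWERING_OFF:
        return "ignored; power-off continues"
    return "not described in TABLE 2-1"


if __name__ == "__main__":
    print(power_button_action(Mode.SERVICE, ServerState.ON, 5.0))
```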
TABLE 2-2 shows the function of the mode switch.
TABLE 2-2 Mode Switch Function
Function | Mode switch: Locked | Mode switch: Service
Inhibition of Break Signal Reception | Enabled. Reception of the Break signal can be enabled or disabled for each domain using the setdomainmode command. | Disabled
Power On/Off by power button | Only Power On is enabled. | Enabled
2.2.3 LEDs on the Operator Panel
TABLE 2-3 lists the server states displayed with the LEDs on the operator panel.
The three LED indicators on the operator panel indicate the following:
■ General system status
■ System error warning
■ System error location
Besides the states listed in TABLE 2-3, the operator panel also displays various states
of the server using combinations of the three LEDs. TABLE 2-4 indicates the states that
are displayed in the course of operation from power-on to power-off of the server.
The blinking interval is 1 second (1 Hz).
TABLE 2-3 LEDs on the Operator Panel
Name | Color | Description
POWER LED | Green | Indicates the server power status. • On: The power to the server (a domain) is on. • Off: The power to the server is off. • Blinking: The server is being powered off.
XSCF STANDBY LED | Green | Indicates the XSCF unit status. • On: XSCF unit is functioning normally. • Off: Input power source is off or has just been turned on, and XSCF unit is stopped. • Blinking: System initialization is in progress after power was turned on.
CHECK LED | Amber | Indicates that the server has detected an error. This is sometimes called a locator. • On: An error that hinders startup was detected. • Off: Normal, or power is not being supplied. • Blinking: Indicates that the unit is a maintenance target.
In service mode, break signals can be suppressed. If the key position is switched to
Service, the server will boot into service mode the next time it reboots. Service is
selected by default at the initial power-on.
TABLE 2-4 State Display by Combination of LEDs on the Operator Panel
POWER | XSCF STANDBY* | CHECK | Description
Off | Off | Off | Power is not being supplied.
Off | Off | On | Power has been turned on.
Off | Blinking | Off | The XSCF unit is being initialized.
Off | Blinking | On | An error occurred in the XSCF unit.
Off | On | Off | • The XSCF unit is in the standby state. • The server is waiting for power-on of the air conditioning facilities in the data center.
On | On | Off | • Warm-up standby processing is in progress (power is turned on after the end of processing). • The power-on sequence is in progress. • The server is in operation.
Blinking | On | Off | The power-off sequence is in progress. (The fan units are stopped after the end of processing.)
* READY LED is referred to when the XSCF unit status is indicated.
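When recording what the operator panel shows during maintenance, the combinations in TABLE 2-4 can be kept as a lookup keyed by the three LED states. The Python sketch below is a convenience illustration only; LED_COMBINATIONS and describe_leds are hypothetical names, and the descriptions are copied from the table.

```python
# Keys are (POWER, XSCF STANDBY, CHECK) states as listed in TABLE 2-4.
# Some combinations correspond to more than one server state, so each
# value is a list of the candidate descriptions.
LED_COMBINATIONS = {
    ("off", "off", "off"): ["Power is not being supplied."],
    ("off", "off", "on"): ["Power has been turned on."],
    ("off", "blinking", "off"): ["The XSCF unit is being initialized."],
    ("off", "blinking", "on"): ["An error occurred in the XSCF unit."],
    ("off", "on", "off"): [
        "The XSCF unit is in the standby state.",
        "The server is waiting for power-on of the air conditioning "
        "facilities in the data center.",
    ],
    ("on", "on", "off"): [
        "Warm-up standby processing is in progress.",
        "The power-on sequence is in progress.",
        "The server is in operation.",
    ],
    ("blinking", "on", "off"): ["The power-off sequence is in progress."],
}


def describe_leds(power: str, xscf_standby: str, check: str) -> list:
    """Return the candidate states for an observed LED combination.

    An empty list means the combination is not listed in TABLE 2-4.
    """
    key = tuple(s.strip().lower() for s in (power, xscf_standby, check))
    return LED_COMBINATIONS.get(key, [])


if __name__ == "__main__":
    for state in describe_leds("Off", "Blinking", "On"):
        print(state)
```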
2.3 LED Functions of Components
This section explains the LEDs of each component. When replacing a FRU, check in
advance the states of LEDs.
Normal system state can be confirmed by checking the operator panel. If an error
occurs in an individual hardware component in the server, the LEDs of the FRU
containing the hardware component which caused the error will indicate the error
location. However, some FRUs such as DIMMs do not have LEDs.
To check the state of a FRU that has no LEDs, use an XSCF Shell command such as
showhardconf in the maintenance terminal. For details, see TABLE 3-1.
TABLE 2-5 describes the component LEDs and their functions.
TABLE 2-5 Component LEDs and Their Functions
Component | Name | Color | Description
Motherboard unit (MBU) | POWER | | Indicates whether the MBU is operating.
Motherboard unit (MBU) | POWER | On (green) | Indicates that the motherboard is operating. The motherboard cannot be removed from the server while the POWER LED is on.
Motherboard unit (MBU) | POWER | Blinking (green) | Indicates that the MBU is being incorporated into the system or being disconnected from the system.
Motherboard unit (MBU) | POWER | Off | Indicates that the MBU is stopped. The MBU can be disconnected and replaced.
Motherboard unit (MBU) | CHECK | | Indicates the motherboard unit status.
Motherboard unit (MBU) | CHECK | On (amber) | Indicates that an error occurred in the MBU.
Motherboard unit (MBU) | CHECK | Off | Indicates that the MBU is in the normal state.
Hard disk drive (HDD) | | | Indicates that the hard disk drive can be removed. However, this LED is not used.
Hard disk drive (HDD) | CHECK | On (amber) | Indicates that an error occurred in the HDD. However, this LED stays on for several minutes (until initialization starts) immediately after power-on. This state does not indicate an error.
Hard disk drive (HDD) | CHECK | Blinking (amber) | Indicates that the HDD is ready to be replaced.
Hard disk drive (HDD) | CHECK | Off | Indicates that the HDD is in the normal state.
Hard disk drive (HDD) | READY | On (green) | Indicates that the HDD is operating. The HDD cannot be removed (cannot be replaced).
Hard disk drive (HDD) | READY | Blinking (green) | Indicates that the HDD is performing communication. The HDD cannot be removed (cannot be replaced).
Hard disk drive (HDD) | READY | Off | The HDD can be replaced.
Power supply unit (PSU) | DC | On (green) | Indicates that power is turned on and being supplied.
Power supply unit (PSU) | AC | On (green) | Indicates that input power is being supplied to the power supply unit.
Power supply unit (PSU) | AC | Off | Indicates that input power is not being supplied to the power supply unit.
Power supply unit (PSU) | CHECK | On (amber) | Indicates that an error occurred in the PSU.
Power supply unit (PSU) | CHECK | Blinking (amber) | Indicates that the power supply unit is ready to be replaced.
Power supply unit (PSU) | CHECK | Off | Indicates that the PSU is in the normal state.
Fan unit (FAN_A) | CHECK | On (amber) | Indicates that an error occurred in the fan unit.
Fan unit (FAN_A) | CHECK | Blinking (amber) | Indicates that the fan unit is ready to be replaced.
Fan unit (FAN_A) | CHECK | Off | Indicates that the fan unit is in the normal state.
LAN port display part | ACTIVE | On (green) | Indicates that communication is being performed through the LAN port.
LAN port display part | ACTIVE | Off | Indicates that communication is not being performed through the LAN port.
LAN port display part | LINK SPEED | On (amber) | Indicates that the communication speed of the LAN port is 1 Gbps.
LAN port display part | LINK SPEED | On (green) | Indicates that the communication speed of the LAN port is 100 Mbps.
LAN port display part | LINK SPEED | Off | Indicates that the communication speed of the LAN port is 10 Mbps.
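Two rows of TABLE 2-5 are often consulted during replacement work: the LINK SPEED LED of the LAN port and the READY LED of the hard disk drive. The following Python sketch restates those rows as helper functions. The function names are hypothetical, and the LED state is assumed to be entered by the person observing the hardware.

```python
# LINK SPEED LED color to LAN port speed in Mbps, as listed in TABLE 2-5.
LINK_SPEED_MBPS = {"amber": 1000, "green": 100, "off": 10}


def lan_link_speed_mbps(link_speed_led: str) -> int:
    """Translate the observed LINK SPEED LED state into a speed in Mbps."""
    return LINK_SPEED_MBPS[link_speed_led.strip().lower()]


def hdd_can_be_replaced(ready_led: str) -> bool:
    """READY on or blinking means the HDD cannot be removed; off means it can."""
    state = ready_led.strip().lower()
    if state not in ("on", "blinking", "off"):
        raise ValueError(f"unknown READY LED state: {ready_led!r}")
    return state == "off"


if __name__ == "__main__":
    print(lan_link_speed_mbps("Amber"), hdd_can_be_replaced("Blinking"))
```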
2.4 External Interface Port on Rear Panel
This section shows the location of the external interface ports located on the server
rear panel and explains their functions.
FIGURE 2-5 External Interface Port Locations
TABLE 2-6 External Interface Port Functions
Location number | Component | Description
1 | RCI port | Used to connect the server to a peripheral device having an RCI connector to enable power interlocking and error monitoring.
2 | USB port (for XSCF) | Exclusive for maintenance personnel. Cannot be connected to general-purpose USB devices.
3 | Serial port (for XSCF) | Connects to the XSCF unit through serial connection to set up and manage the server.
4, 5 | LAN port 1 (for XSCF), LAN port 0 (for XSCF) | Accommodates a 100Base-TX LAN cable to set up the server and display status. • XSCF Shell (command-line interface: CLI) • XSCF Web (browser user interface: BUI) Through CLI or BUI, the user or system administrator monitors the server, displays status, operates domains, and displays information on the console.
6, 7 | UPC port 1, UPC port 0 | By connecting an uninterruptible power supply (UPS) unit that has the UPS controller (UPC) interface, stable power supply is provided in the event of a failure in the power supply or even a large-scale power failure. If a single power feed is used, connect a UPS cable to UPC port 0. In a dual power feed, connect UPS cables to UPC ports 0 and 1.
8, 9, 10, 11 | GbE port 0 (for OS), GbE port 1 (for OS), GbE port 2 (for OS), GbE port 3 (for OS) | Up to 4 100Base-TX/1000Base-T cables can be connected to GbE ports. High-capacity data can be transferred at a high speed.
12 | SAS port | Accommodates external Serial Attached SCSI (SAS) devices such as a tape drive.
2.5 Labels
This section explains the labels and the card affixed to the server.
Note – The information on the label might differ from that shown on the affixed
labels.
■ The model number, serial number, and hardware version, all of which are
required for maintenance and management, are shown on the system faceplate
label.
■ The standards label is affixed close to the system faceplate label and shows the
approval standards.
■ Safety: NRTL/C
■ Radio wave: VCCI-A, FCC-A, DOC-A, MIC
■ Safety and radio wave: CE
A label-affixed card that can be inserted or extracted is provided near the power
supply unit at the right side at the rear of the server (see FIGURE 2-6). The card should
be inserted in such a way that the standards label faces the outside of the server and
the system faceplate label faces the inside of the server.
FIGURE 2-6 Label Locations
Inside: System faceplate label
Outside: Standards label
CHAPTER 3
Troubleshooting
This chapter provides the fault diagnosis information and the actions to take for
problems.
■ Section 3.1, “Emergency Power Off”
■ Section 3.2, “Failure Diagnostic Method”
■ Section 3.3, “Checking the Server and System Configuration”
■ Section 3.4, “Error Conditions”
■ Section 3.5, “Using Troubleshooting Commands”
■ Section 3.6, “General Solaris Troubleshooting Commands”
3.1 Emergency Power Off
This section explains how to power off in an emergency.
Caution – In an emergency (such as smoke or flames coming from the server),
immediately stop using the server and turn off the power supply. Regardless of the
type of business, give top priority to fire prevention measures.
1. Press the power switch for more than 4 seconds to power off the server.
2. Remove the power cord clamp and disconnect the cable.
FIGURE 3-1 Power-off Method
3.2 Failure Diagnostic Method
When an error occurs, a message is displayed on the maintenance monitor in many
cases. Use the flowchart in FIGURE 3-2 to find the correct methods for diagnosing
failures.
FIGURE 3-2 Diagnostic Method Flowchart
The flowchart runs from Start to End and contains the following decision points and actions.
Decision points (each answered YES or NO):
■ Is the power OK or AC OK LED off?
■ The XSCF mail function sent an E-mail message?
■ The XSCF console displays an error message?
■ OS panic or performance error?
■ FMA message?
■ Can the message ID be used?
■ Has the problem been solved?
Actions:
■ Check the power supply unit and its connection.
■ Check whether an error message is displayed on the OS console and XSCF console.
■ Execute showlogs or fmadm in the XSCF to display fault information.
■ Check /var/adm/messages in the Solaris OS.
■ Execute fmadm to display fault information.
■ Enter the message ID in http://sun.com/msg/ to refer to fault information.
■ Make a memo of the displayed fault information.
■ Contact your service engineer.
3.3 Checking the Server and System Configuration
The operating conditions must remain the same before and after maintenance. If an
error occurs in the server, save the system configuration and component status
information. Confirm that the recovered state after maintenance is the same as that
before maintenance.
If an error occurs in the server, a message is output to one of the following.
■ Solaris™ Operating System message file
■ XSCF Shell showhardconf(8) command and showstatus(8) command
■ Management console
■ Service processor log
3.3.1 Checking the Hardware Configuration and FRU Status
To replace a faulty FRU and perform the maintenance on the server, it is important
to check and understand the hardware configuration of the server and the state of
each hardware component.
The hardware configuration refers to information that indicates to which layer a
hardware component belongs.
The status of each hardware component refers to information on the conditions of a
standard or optional component in the server: temperature, power supply voltage,
CPU operating conditions, and other status information.
To check the hardware configuration and the status of each hardware component,
use XSCF Shell commands from the maintenance terminal. See TABLE 3-1 for the
commands used.
TABLE 3-1 Commands for Checking Hardware Configuration
Command | Description
showhardconf | Displays hardware configuration.
showstatus | Displays the status of a component. This command is used only when a faulty component is checked.
showboards | Displays information on the system board (XSB).
showdcl | Displays the hardware resource configuration information of a domain.
showfru | Displays the setting information of a device.
The status of each component can be checked based on the On or blinking state of
the component LEDs.
For the component types and LED states, see TABLE 2-3 and TABLE 2-5.
For details of commands, see the SPARC Enterprise
M3000/M4000/M5000/M8000/M9000 Servers XSCF User's Guide and the SPARC
Enterprise M3000/M4000/M5000/M8000/M9000 Servers XSCF Reference Manual.
3.3.1.1 Checking the Hardware Configuration
To check the hardware configuration, you need an XSCF user account with authority
to log in to the XSCF. The following procedure can be used to
check the hardware configuration from the maintenance terminal.
Ask the system administrator for the required information, such as the user account
and password. For details, see the SPARC Enterprise M3000/M4000/M5000/M8000/M9000 Servers XSCF User's Guide.
1. Log in to XSCF Shell.
2. Type showhardconf.
XSCF> showhardconf
The showhardconf command displays hardware configuration information. For
details, see the SPARC Enterprise M3000/M4000/M5000/M8000/M9000 Servers XSCF User's Guide.
3.3.2 Checking the Software and Firmware Configurations
The software and firmware configurations and versions affect the operation of the
server. To change the configuration or investigate a problem, check the latest
information and check for any problems in the software.
Software and firmware vary according to user conditions.
■ The software configuration and version can be checked in the Solaris Operating
System. Refer to the Solaris OS documentation for more information.
■ The firmware configuration and versions can be checked from the maintenance
terminal using XSCF Shell commands. Refer to the SPARC Enterprise
M3000/M4000/M5000/M8000/M9000 Servers XSCF User's Guide for more detailed
information.
Check the software and firmware configuration information with assistance from the
system administrator. However, if you have received login authority from the
system administrator, the following commands can be used from the maintenance
terminal for these checks:
TABLE 3-2 Commands for Checking the Software Configuration
Command | Description
showrev(1M) | Displays system configuration information and Solaris OS patch information.
uname(1) | Outputs current system information.
TABLE 3-3 Commands for Checking the XSCF Firmware Configuration
Command | Description
version(8) | Outputs current firmware version information.
showhardconf(8) | Outputs information on the components mounted on the server.
showstatus(8) | Displays the status of a component. This command is used only when a faulty component is checked.
showboards(8) | Displays XSB information. It can display information on an XSB that belongs to the specified domain and information on all XSBs mounted. An XSB combines hardware resources on physical system boards. The M3000 server consists of a single physical system board (Uni-XSB).
showdcl(8) | Displays the configuration information of a domain (hardware resource information).
showfru(8) | Displays the setting information of a device.
3.3.2.1 Checking the Software Configuration
The following procedure can be used to check the software configuration from the
domain console.
● Type showrev.
# showrev
The showrev command displays system configuration information on the screen.
3.3.2.2 Checking the Firmware Configuration
Login authority is required to check the firmware configuration. The procedure
below can be used to check the configuration from the maintenance terminal.
1. Log in with the account of the XSCF hardware maintenance engineer.
2. Type version.
XSCF> version
The version command displays firmware version information on the screen. For
details, see the SPARC Enterprise M3000/M4000/M5000/M8000/M9000 Servers XSCF User's Guide.
3.3.2.3 Downloading Error Log Information
To download error log information, use the XSCF log fetch function. The XSCF unit
has an interface with external units so that the service engineer can easily obtain
useful maintenance information such as error logs.
Connect the maintenance terminal, and use the XSCF Shell or XSCF Web to
download error log information to the maintenance terminal.
3.4 Error Conditions
This section describes error conditions and relevant corrective actions.
This work is explained in the following sections:
■ Section 3.4.1, “Predictive Self-Healing Tools”
■ Section 3.4.2, “Monitoring Output”
■ Section 3.4.3, “Messaging Output”
For details of the fault information, see the SPARC Enterprise
M3000/M4000/M5000/M8000/M9000 Servers XSCF User's Guide.
You can find more detailed descriptions of Solaris OS Predictive Self-Healing at the
website below:
Predictive self-healing is an architecture and methodology for automatically
diagnosing, reporting, and handling software and hardware error conditions. This
new technology reduces the time required to debug a hardware or software problem
and provides the administrator and service engineer with detailed data about each
error.
3.4.1 Predictive Self-Healing Tools
In the Solaris OS, Solaris Fault Manager runs in the background. When an error
occurs, the system software recognizes the error and attempts to determine the
faulty hardware component. The system software also takes steps to prevent the
faulty component from being used until it has been replaced. The system software
performs the following activities:
■ Receives telemetry information about errors detected by the system software.
■ Diagnoses the errors.
■ Initiates predictive self-healing activities. For example, Solaris Fault Manager can
disable faulty components.
■ When possible, causes the faulty FRU to provide an LED indication of the error in
addition to populating system console messages with more details.
TABLE 3-4 shows typical messages generated when an error occurs. Messages are
displayed on your console and are recorded in the /var/adm/messages file.
A message in TABLE 3-4 indicates that the fault has already been diagnosed. If there
was any corrective action that the system could take, the system has already taken it.
If your server is still running, the corrective action continues to be taken.
TABLE 3-4 Predictive Self-Healing Messages
Output displayed:
Nov 1 16:30:20 dt88-292 EVENT-TIME:Tue Nov 1 16:30:20 PST 2005
Nov 1 16:30:20 dt88-292 PLATFORM:SUNW,A70, CSN:-, HOSTNAME:dt88-292
Nov 1 16:30:20 dt88-292 SOURCE:eft, REV: 1.13
Nov 1 16:30:20 dt88-292 EVENT-ID:afc7e660-d609-4b2f-86b8-ae7c6b8d50c4
Nov 1 16:30:20 dt88-292 DESC:
Nov 1 16:30:20 dt88-292 A problem was detected in the PCI Express subsystem
Nov 1 16:30:20 dt88-292 Refer to http://sun.com/msg/SUN4-8000-0Y for more information.
Nov 1 16:30:20 dt88-292 AUTO-RESPONSE:One or more device instances may be disabled.
Nov 1 16:30:20 dt88-292 IMPACT:Loss of services provided by the device instances associated with this fault.
Nov 1 16:30:20 dt88-292 REC-ACTION:Schedule a repair procedure to replace the affected device. Use
Nov 1 16:30:20 dt88-292 fmdump -v -u EVENT_ID to identify the device or contact Sun for support.
Description of each field:
EVENT-TIME: The time stamp of the diagnosis
PLATFORM: A description of the server encountering the error
SOURCE: Information on the Diagnosis Engine used to determine the error
EVENT-ID: The Universally Unique event ID for this error
DESC: A basic description of the error
WEB SITE: Where to find specific information and actions for this error
AUTO-RESPONSE: What, if anything, the system did to alleviate any follow-on problems
IMPACT: A description of what is considered to be the impact of the fault
REC-ACTION: A brief description of the corrective action the system administrator should take
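The fields of a predictive self-healing message can be pulled out of /var/adm/messages with a short script, which is convenient when the event ID has to be passed to fmdump. The Python sketch below is an illustration under stated assumptions, not a supported tool: the function names and the MSG-ID key are hypothetical, the sample lines are the TABLE 3-4 output unwrapped to one entry per line, and the event ID is written in the standard 8-4-4-4-12 UUID layout.

```python
import re

# Syslog prefix: month, day, time, host name, then the message body.
PREFIX_RE = re.compile(r"^\w{3}\s+\d{1,2}\s+\d{2}:\d{2}:\d{2}\s+\S+\s+(.*)$")
# Field labels listed in TABLE 3-4; the space after the colon is optional.
FIELD_RE = re.compile(
    r"^(EVENT-TIME|PLATFORM|SOURCE|EVENT-ID|DESC|AUTO-RESPONSE|IMPACT|REC-ACTION):\s*(.*)$"
)
MSG_ID_RE = re.compile(r"http://sun\.com/msg/(\S+)")

SAMPLE = """\
Nov 1 16:30:20 dt88-292 EVENT-TIME:Tue Nov 1 16:30:20 PST 2005
Nov 1 16:30:20 dt88-292 PLATFORM:SUNW,A70, CSN:-, HOSTNAME:dt88-292
Nov 1 16:30:20 dt88-292 SOURCE:eft, REV: 1.13
Nov 1 16:30:20 dt88-292 EVENT-ID:afc7e660-d609-4b2f-86b8-ae7c6b8d50c4
Nov 1 16:30:20 dt88-292 DESC:
Nov 1 16:30:20 dt88-292 A problem was detected in the PCI Express subsystem
Nov 1 16:30:20 dt88-292 Refer to http://sun.com/msg/SUN4-8000-0Y for more information.
Nov 1 16:30:20 dt88-292 AUTO-RESPONSE:One or more device instances may be disabled.
Nov 1 16:30:20 dt88-292 REC-ACTION:Schedule a repair procedure to replace the affected device. Use
Nov 1 16:30:20 dt88-292 fmdump -v -u EVENT_ID to identify the device or contact Sun for support.
"""


def parse_psh_message(text: str) -> dict:
    """Collect the labeled fields; unlabeled lines continue the previous field."""
    fields = {}
    current = None
    for line in text.splitlines():
        prefix = PREFIX_RE.match(line.strip())
        if not prefix:
            continue
        body = prefix.group(1).strip()
        field = FIELD_RE.match(body)
        if field:
            current = field.group(1)
            fields[current] = field.group(2).strip()
        elif current is not None:
            fields[current] = (fields[current] + " " + body).strip()
    # "MSG-ID" is a key added by this sketch, taken from the URL in DESC.
    url = MSG_ID_RE.search(fields.get("DESC", ""))
    if url:
        fields["MSG-ID"] = url.group(1)
    return fields


def fmdump_command(fields: dict) -> list:
    """Build the fmdump invocation named in REC-ACTION for this event."""
    return ["fmdump", "-v", "-u", fields["EVENT-ID"]]


if __name__ == "__main__":
    parsed = parse_psh_message(SAMPLE)
    print(parsed["MSG-ID"], " ".join(fmdump_command(parsed)))
```

The command list is only built, not executed, so the sketch can be tried on any workstation before it is used against a real message file.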
3.4.2 Monitoring Output
To understand error conditions, collect monitoring output information. For the
collection of the information, use the commands shown in TABLE 3-5.
TABLE 3-5 XSCF Commands for Checking Monitoring Output
Command | Operand | Description
showlogs(8) | console | Displays the console of a domain.
showlogs(8) | monitor | Logs messages that are displayed in the message window.
showlogs(8) | panic | Logs output to the console during a panic.
showlogs(8) | ipl | Collects console data generated during the period from the power-on of a domain to the completion of the Solaris OS start.
3.4.3 Messaging Output
To understand error conditions, collect messaging output information. For the
collection of the information, use the commands shown in TABLE 3-6.
TABLE 3-6 Commands for Checking Messaging Output
Command | Operand | Description
showlogs(8) | env | Displays the temperature history log. The environmental temperature data and power status are indicated at 10-minute intervals. The data is stored for a maximum of six months.
showlogs(8) | power | Displays power and reset information.
showlogs(8) | event | Displays information reported to the system and stored as event logs.
showlogs(8) | error | Displays error logs.
fmdump(1M), fmdump(8) | | Displays FMA diagnostic results and errors. This command is provided as a Solaris OS command and XSCF Shell command.
Each error message logged by the predictive self-healing architecture has a message
ID and Web address associated with the message. From this message ID and Web
address, information on the most up-to-date corrective measures can be retrieved.
For details of predictive self-healing, see the Solaris OS documents.
3-10SPARC Enterprise M3000 Server Service Manual • November 2009
3.5 Using Troubleshooting Commands
When any message listed in TABLE 3-4 is displayed, detailed information on the error
may be required. For details on troubleshooting commands, see manual pages of the
Solaris OS or XSCF Shell. This section provides detailed explanations of the
following commands:
■ “Using the showhardconf Command” on page 3-11
■ “Using the showlogs Command” on page 3-14
■ “Using the showstatus Command” on page 3-15
■ “Using the fmdump Command” on page 3-16
■ “Using the fmadm Command” on page 3-17
■ “Using the fmstat Command” on page 3-19
3.5.1 Using the showhardconf Command
The showhardconf command displays information on each FRU. The following
information is displayed:
3.5.2 Using the showlogs Command
The showlogs command displays the contents of the specified log in time stamp
order, starting with the oldest entry. The showlogs command displays the
following logs:
■ Error log
■ Power log
■ Event log
■ Temperature and humidity record
■ Monitoring message log
■ Console message log
■ Panic message log
■ IPL message log
XSCF> showlogs error
Date: Jun 17 11:05:32 JST 2008 Code: 80000000-c3ff0000-0173000600000000
Status: Alarm Occurred: Jun 17 11:05:32.522 JST 2008
FRU: /PSU#1
Msg: PSU shortage
Date: Jun 17 13:41:46 JST 2008 Code: 80002080-7801c201-0130000000000000
Status: Alarm Occurred: Jun 17 13:41:44.861 JST 2008
FRU: /MBU_A,*
Msg: Board control error (MBC link error)
Date: Jun 17 13:46:31 JST 2008 Code: 60000000-cd01c701-0164010100000000
Status: Warning Occurred: Jun 17 13:46:31.158 JST 2008
FRU: /OPNL,/FANBP_B
Msg: TWI access error
XSCF>
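When a saved copy of the error log has to be searched, the records shown above can be split into fields. The following Python sketch is illustrative only. The function name parse_error_log is hypothetical, and the record layout (a Date/Code line followed by Status, FRU, and Msg lines) is an assumption taken from the sample output above.

```python
import re

SAMPLE = """\
Date: Jun 17 11:05:32 JST 2008 Code: 80000000-c3ff0000-0173000600000000
Status: Alarm Occurred: Jun 17 11:05:32.522 JST 2008
FRU: /PSU#1
Msg: PSU shortage
Date: Jun 17 13:46:31 JST 2008 Code: 60000000-cd01c701-0164010100000000
Status: Warning Occurred: Jun 17 13:46:31.158 JST 2008
FRU: /OPNL,/FANBP_B
Msg: TWI access error
"""


def parse_error_log(text):
    """Split saved 'showlogs error' text into one dict per record."""
    records = []
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("Date:"):
            m = re.match(r"Date:\s+(.*?)\s+Code:\s+(\S+)$", line)
            if m:
                records.append({"date": m.group(1), "code": m.group(2)})
        elif line.startswith("Status:") and records:
            m = re.match(r"Status:\s+(\S+)\s+Occurred:\s+(.*)$", line)
            if m:
                records[-1]["status"] = m.group(1)
        elif line.startswith("FRU:") and records:
            # Several FRUs may be listed on one line, separated by commas.
            records[-1]["fru"] = line[4:].strip().split(",")
        elif line.startswith("Msg:") and records:
            records[-1]["msg"] = line[4:].strip()
    return records


records = parse_error_log(SAMPLE)
alarms = [r for r in records if r.get("status") == "Alarm"]
for r in alarms:
    print(r["date"], r["fru"], r["msg"])
```

Filtering on the Status field in this way separates Alarm records from Warning records.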
3.5.3 Using the showstatus Command
The showstatus command displays information about the FRUs in the server that are
faulty or degraded, together with information on the units in the layer immediately
above them. Each displayed unit is marked with an asterisk (*), which indicates that
the unit is faulty, and one of the following status indicators appears after
"Status:".
■ Normal: Normal state
■ Faulted: The unit is faulty and is not operating.
■ Degraded: The unit is operating. The unit is partly faulty or degraded and some
error has been detected. Although a faulty state is displayed for the unit, it is
operating normally.
■ Deconfigured: There is no problem with the unit itself, but it is degraded due to a
configuration problem, environmental problem, or the degradation of another
unit.
■ Maintenance: Maintenance is being performed. replacefru(8) or addfru(8) is
3.5.4 Using the fmdump Command
The fmdump command displays the contents of the log managed by the Fault Manager
module.
This example assumes that only one error exists.
# fmdump
TIME UUID SUNW-MSG-ID
Nov 02 10:04:15.4911 0ee65618-2218-4997-c0dc-b5c410ed8ec2 SUN4-8000-0Y
3.5.4.1 fmdump -V Command
To get more detailed information you can use the -V option, as shown in the
following example.
# fmdump -V -u 0ee65618-2218-4997-c0dc-b5c410ed8ec2
TIME UUID SUNW-MSG-ID
Nov 02 10:04:15.4911 0ee65618-2218-4997-c0dc-b5c410ed8ec2 SUN4-8000-0Y
100% fault.io.fire.asic
FRU: hc://product-id=SUNW,A70/motherboard=0
rsrc: hc:///motherboard=0/hostbridge=0/pciexrc=0
The output method using the -V option displays at least three additional lines.
■ The first line is the same information shown for console messages above,
including a time stamp, UUID, and message ID.
■ The second line is a declaration of the certainty of diagnosis. In this case we are
100 percent sure the failure is in the ASIC described. If the diagnosis may involve
multiple components, you may see two lines here with 50% in each of the two
lines.
■ The "FRU" line indicates what component must be replaced to return the server to
a fully operational state.
■ The "rsrc" line indicates the component that has become unusable because of this
error.
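The lines described above can be read back from saved fmdump -V output. The following Python sketch is illustrative only. The function name parse_fmdump_verbose is hypothetical, and the line layout (an event line, then one certainty line followed by "FRU:" and "rsrc:" lines per suspect) is an assumption taken from the example above.

```python
import re

SAMPLE = """\
TIME UUID SUNW-MSG-ID
Nov 02 10:04:15.4911 0ee65618-2218-4997-c0dc-b5c410ed8ec2 SUN4-8000-0Y
100% fault.io.fire.asic
FRU: hc://product-id=SUNW,A70/motherboard=0
rsrc: hc:///motherboard=0/hostbridge=0/pciexrc=0
"""

UUID = re.compile(r"^[0-9a-f]{8}(-[0-9a-f]{4}){3}-[0-9a-f]{12}$")
SUSPECT = re.compile(r"^(\d+)%\s+(\S+)$")


def parse_fmdump_verbose(text):
    """Collect the UUID, message ID, and suspect list of one event."""
    event = {"suspects": []}
    for line in text.splitlines():
        line = line.strip()
        parts = line.split()
        m = SUSPECT.match(line)
        if m:
            # Certainty line, for example "100% fault.io.fire.asic".
            event["suspects"].append(
                {"certainty": int(m.group(1)), "class": m.group(2)})
        elif line.startswith("FRU:") and event["suspects"]:
            event["suspects"][-1]["fru"] = line[4:].strip()
        elif line.startswith("rsrc:") and event["suspects"]:
            event["suspects"][-1]["rsrc"] = line[5:].strip()
        elif len(parts) >= 2 and UUID.match(parts[-2]):
            event["uuid"], event["msg_id"] = parts[-2], parts[-1]
    return event


event = parse_fmdump_verbose(SAMPLE)
for s in event["suspects"]:
    print(s["certainty"], s["class"], s["fru"])
```

If a diagnosis names two suspects, the list returned here holds two entries, each with its own certainty value.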
3.5.4.2 fmdump -e Command
To get information of the errors that caused this failure you can use the -e option, as
shown in the following example.
# fmdump -e
TIME CLASS
Nov 02 10:04:14.3008 ereport.io.fire.jbc.mb_per
3.5.5 Using the fmadm Command
3.5.5.1 Using the fmadm faulty Command
The fmadm faulty command can be used by administrators and service personnel to
view and modify system configuration parameters that are maintained by the Solaris
fault manager. The command is primarily used to determine the status of a
component involved in a fault, as shown in the following example:
The PCIe slot has been degraded and it is associated with the same UUID as above.
Also, the "faulted" status may be displayed.
3.5.5.2 fmadm repair Command
When the fmadm faulty command displays a fault, the fmadm repair command
must be executed to clear the FRU information in the domain after replacement of
the motherboard unit that has encountered the error. If the fmadm repair
command is not executed, the error message is not cleared.
If the fmadm faulty command displays a fault, clearing the FMA resource cache
on the operating system side causes no problem. Data in the cache does not need to
match the hardware fault information held by the XSCF.
The fmadm config command output displays the version number and current
status of the diagnosis engine that is being used by the server. Whether the latest
engine is being used can be determined by consulting the SunSolve web site.
# fmadm config
MODULE VERSION STATUS DESCRIPTION
cpumem-diagnosis 1.6 active CPU/Memory Diagnosis
cpumem-retire 1.1 active CPU/Memory Retire Agent
disk-transport 1.0 active Disk Transport Agent
eft 1.16 active eft diagnosis engine
event-transport 2.0 active Event Transport Module
fabric-xlate 1.0 active Fabric Ereport Translater
fmd-self-diagnosis 1.0 active Fault Manager Self-Diagnosis
io-retire 1.0 active I/O Retire Agent
snmp-trapgen 1.0 active SNMP Trap Generation Agent
sysevent-transport 1.0 active SysEvent Transport Agent
syslog-msgs 1.0 active Syslog Messaging Agent
zfs-diagnosis 1.0 active ZFS Diagnosis Engine
zfs-retire 1.0 active ZFS Retire Agent
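To compare module versions between servers, the fmadm config listing can be read into a table. The following Python sketch is illustrative only. The function name parse_fmadm_config is hypothetical, and the four-column layout is an assumption taken from the example above.

```python
SAMPLE = """\
MODULE VERSION STATUS DESCRIPTION
cpumem-diagnosis 1.6 active CPU/Memory Diagnosis
eft 1.16 active eft diagnosis engine
zfs-retire 1.0 active ZFS Retire Agent
"""


def parse_fmadm_config(text):
    """Map each module name to its version, status, and description."""
    modules = {}
    for line in text.splitlines():
        parts = line.split(None, 3)  # the description may contain spaces
        if len(parts) == 4 and parts[0] != "MODULE":
            name, version, status, description = parts
            modules[name] = {"version": version, "status": status,
                             "description": description}
    return modules


modules = parse_fmadm_config(SAMPLE)
print(modules["eft"]["version"], modules["eft"]["status"])
```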
3.5.6 Using the fmstat Command
The fmstat command reports statistics for the modules associated with the Solaris
Fault Manager. By using the fmstat command, statistical information about the
diagnosis engines and agents that are currently involved in fault management can
be displayed.
The following output example shows that the fmd-self-diagnosis DE module
(displayed also on the console output) has received accepted events.
Superuser commands of this type are useful to determine whether there is a problem
with the server, network, or another server connected via the network.
This section explains the following commands:
■ “Using the iostat Command” on page 3-20
■ “Using the prtdiag Command” on page 3-21
■ “Using the prtconf Command” on page 3-23
■ “Using the netstat Command” on page 3-26
■ “Using the ping Command” on page 3-27
■ “Using the ps Command” on page 3-28
■ “Using the prstat Command” on page 3-29
Most of these commands are located in the /usr/bin directory or /usr/sbin
directory.
3.6.1 Using the iostat Command
The iostat command repeatedly reports terminal, drive, and I/O activity, as well
as CPU utilization.
3.6.1.1 Options
TABLE 3-7 lists the options of the iostat command and how those options can help
troubleshoot the server.
TABLE 3-7 Options for iostat
Option        Description                                   How it can help
No option     Reports status of local I/O devices.          A quick three-line output of device status
                                                            information.
-c            Reports the percentages of time the system    Quick report of CPU status
              has spent in user mode, in system mode,
              waiting for I/O, and idling.
-e            Displays device error summary statistics.     Provides a short table with accumulated
              Displays the total number of errors,          errors. Identifies suspect I/O devices.
              hardware errors, software errors, and
              transfer errors.
-E            Displays all device error statistics.         Provides information about devices:
                                                            manufacturer, model number, serial number,
                                                            size, and errors.
-n            Displays names in a descriptive format.       The descriptive format helps identify devices.
-x            Reports extended drive statistics of each     Similar to the -e option, but provides rate
              drive. The output is in a tabular form.       information. This helps identify internal
                                                            devices with poor performance and other I/O
                                                            devices with poor performance across the
                                                            network.
The following example shows output for the iostat command:
# iostat -En
c0t0d0 Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
Model: ST3120026A Revision: 8.01 Serial No: 3JT4H4C2
Size: 120.03GB <120031641600 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 0
c0t2d0 Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
Vendor: LITE-ON Product: COMBO SOHC-4832K Revision: O3K1 Serial No:
Size: 0.00GB <0 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 0 Predictive Failure Analysis: 0
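The error counters in this output can be checked by a script when many devices are listed. The following Python sketch is illustrative only. The function names error_counts and suspects are hypothetical, and the layout of the counter line is an assumption taken from the example above.

```python
import re

# First line of each device block, as it appears in the example output.
COUNTERS = re.compile(
    r"^(\S+)\s+Soft Errors:\s*(\d+)\s+Hard Errors:\s*(\d+)"
    r"\s+Transport Errors:\s*(\d+)")

SAMPLE = """\
c0t0d0 Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
Model: ST3120026A Revision: 8.01 Serial No: 3JT4H4C2
Size: 120.03GB <120031641600 bytes>
c0t2d0 Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
Vendor: LITE-ON Product: COMBO SOHC-4832K Revision: O3K1 Serial No:
"""


def error_counts(text):
    """Map each device name to its soft, hard, and transport error counts."""
    counts = {}
    for line in text.splitlines():
        m = COUNTERS.match(line.strip())
        if m:
            counts[m.group(1)] = {"soft": int(m.group(2)),
                                  "hard": int(m.group(3)),
                                  "transport": int(m.group(4))}
    return counts


def suspects(counts):
    """Return the devices for which at least one counter is nonzero."""
    return sorted(dev for dev, c in counts.items() if any(c.values()))


counts = error_counts(SAMPLE)
print(suspects(counts))
```

For the example output above, every counter is zero, so no device is reported.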
3.6.2 Using the prtdiag Command
The prtdiag command displays system configuration and diagnostic information.
The diagnostic information identifies any failed FRU in the system.
The prtdiag command is located in the /usr/platform/platform-name/sbin/ directory.
The prtdiag command may indicate a slot number different from that shown
elsewhere in this document. This is normal.
3.6.2.1 Options
TABLE 3-8 lists the options of the prtdiag command and how those options can help
troubleshooting.
TABLE 3-8 Options for prtdiag
Option        Description                                   How it can help
No option     Lists components.                             Shows CPU information, memory
                                                            configuration, PCIe cards installed, OBP
                                                            version, status of the mode switch, and CPU
                                                            operation mode.
-v            Verbose mode.                                 Provides the same information as no option.
                                                            Additionally, displays the detailed
                                                            information of PCIe cards.
The following example shows output for the prtdiag command in verbose mode:
# prtdiag -v
System Configuration: Sun Microsystems sun4u SPARC Enterprise M3000 Server
System clock frequency: 1064 MHz
Memory size: 7808 Megabytes
=================== Environmental Status ===================
Mode switch is in LOCK mode
=================== System Processor Mode ===================
SPARC64-VII mode
#
3.6.3 Using the prtconf Command
Similar to the show-devs command executed at the ok prompt, the prtconf
command displays the devices that are configured.
The prtconf command identifies hardware that is recognized by the Solaris OS. If
software applications are having problems with hardware but the hardware is not
suspected of being faulty, the prtconf command can be used to check whether the
Solaris software recognizes the hardware and whether a driver for the hardware is
loaded.
3.6.3.1 Options
TABLE 3-9 lists the options of the prtconf command and how those options can help
troubleshooting.
TABLE 3-9 Options for prtconf
Option        Description                                   How it can help
No option     Displays the device tree of devices           If a hardware device is recognized, then it is
              recognized by the operating system.           considered to be functioning properly. If the
                                                            message "(driver not attached)" is displayed
                                                            for the device or sub-device, then the driver
                                                            for the device is corrupt or missing.
-D            Similar to the output of no option, but       Lists the drivers needed or used by the
              device driver names are listed.               operating system to enable the device.
-p            Similar to the output of no option, yet is    Provides a brief list of the devices.
              abbreviated.
-V            Displays the version and date of the          Useful for a quick check of the firmware
              OpenBoot™ PROM firmware.                      version.
The following example shows output for the prtconf command:
# prtconf
System Configuration: Sun Microsystems sun4u
Memory size: 7616 Megabytes
System Peripherals (Software Nodes):
SUNW,SPARC-Enterprise
scsi_vhci, instance #0
packages (driver not attached)
SUNW,probe-error-handler (driver not attached)
SUNW,builtin-drivers (driver not attached)
deblocker (driver not attached)
disk-label (driver not attached)
terminal-emulator (driver not attached)
obp-tftp (driver not attached)
ufs-file-system (driver not attached)
chosen (driver not attached)
openprom (driver not attached)
client-services (driver not attached)
options, instance #0
aliases (driver not attached)
memory (driver not attached)
virtual-memory (driver not attached)
pseudo-console, instance #0
nvram (driver not attached)
pseudo-mc, instance #0
cmp (driver not attached)
core (driver not attached)
cpu (driver not attached)
cpu (driver not attached)
core (driver not attached)
cpu (driver not attached)
cpu (driver not attached)
core (driver not attached)
cpu (driver not attached)
cpu (driver not attached)
core (driver not attached)
cpu (driver not attached)
cpu (driver not attached)
pci, instance #0
ebus, instance #0
flashprom (driver not attached)
serial, instance #0
scfc, instance #0
panel, instance #0
pci, instance #0
pci, instance #0
pci, instance #1
scsi, instance #0
tape (driver not attached)
disk (driver not attached)
sd, instance #1
sd, instance #0
pci, instance #2
pci, instance #0
network, instance #0
network, instance #1 (driver not attached)
pci, instance #3
pci, instance #1
network, instance #2 (driver not attached)
network, instance #3 (driver not attached)
pci, instance #4
pci, instance #1
pci, instance #5
pci, instance #6
pci, instance #7
pci, instance #8
os-io (driver not attached)
iscsi, instance #0
pseudo, instance #0
#
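Because the listing is long, the nodes marked "(driver not attached)" can be picked out with a script. The following Python sketch is illustrative only. The function name unattached_nodes is hypothetical. The example output above shows many nodes with this message, so the result is a starting point for review rather than a list of faults.

```python
SUFFIX = "(driver not attached)"

SAMPLE = """\
scsi_vhci, instance #0
packages (driver not attached)
options, instance #0
network, instance #0
network, instance #1 (driver not attached)
"""


def unattached_nodes(text):
    """Return the node names that carry the '(driver not attached)' message."""
    nodes = []
    for line in text.splitlines():
        line = line.strip()
        if line.endswith(SUFFIX):
            # Drop the message and keep the node name (and instance, if any).
            nodes.append(line[:-len(SUFFIX)].strip())
    return nodes


for node in unattached_nodes(SAMPLE):
    print(node)
```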
3.6.4 Using the netstat Command
The netstat command displays the network status and protocol statistics.
3.6.4.1 Options
TABLE 3-10 lists the options of the netstat command and how those options can
help troubleshooting.
TABLE 3-10 Options for netstat
Option        Description                                   How it can help
-i            Displays the interface status. The            Provides a quick overview of the network
              information includes packets in/out, errors   status.
              in/out, collisions, and queues.
-i interval   Repeats the netstat command in the intervals  Identifies intermittent or long duration
              of as many seconds as specified after the -i  network events. By redirecting netstat output
              option.                                       to a file, overnight activity can be viewed
                                                            all at once.
-p            Displays the media table.                     Provides the MAC address for hosts on the
                                                            subnet.
-r            Displays the routing table.                   Provides routing information.
-n            Replaces host names with IP addresses and     Used when an IP address is more useful than a
              displays them.                                host name.
The following example shows the output for the netstat -p command:
# netstat -p
Net to Media Table: IPv4
Device IP Address Mask Flags Phys Addr
------ -------------------- --------------- -------- ---------------
bge0 san-ff1-14-a 255.255.255.255 o 00:14:4f:3a:93:61
bge0 san-ff2-40-a 255.255.255.255 o 00:14:4f:3a:93:85
sppp0 224.0.0.22 255.255.255.255
bge0 san-ff2-42-a 255.255.255.255 o 00:14:4f:3a:93:af
bge0 san09-lab-r01-66 255.255.255.255 o 00:e0:52:ec:1a:00
sppp0 192.168.1.1 255.255.255.255
bge0 san-ff2-9-b 255.255.255.255 o 00:03:ba:dc:af:2a
bge0 bizzaro 255.255.255.255 o 00:03:ba:11:b3:c1
bge0 san-ff2-9-a 255.255.255.255 o 00:03:ba:dc:af:29
bge0 racerx-b 255.255.255.255 o 00:0b:5d:dc:08:b0
bge0 224.0.0.0 240.0.0.0 SM 01:00:5e:00:00:00
#
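The media table can be reduced to a list of hosts and MAC addresses by a script. The following Python sketch is illustrative only. The function name media_table is hypothetical, and the column order is an assumption taken from the example above. Rows without a physical address, such as the sppp0 rows, are skipped.

```python
import re

# A MAC address as printed in the Phys Addr column of the example.
MAC = re.compile(r"^(?:[0-9a-f]{1,2}:){5}[0-9a-f]{1,2}$", re.IGNORECASE)

SAMPLE = """\
Device IP Address Mask Flags Phys Addr
------ -------------------- --------------- -------- ---------------
bge0 san-ff1-14-a 255.255.255.255 o 00:14:4f:3a:93:61
sppp0 224.0.0.22 255.255.255.255
bge0 224.0.0.0 240.0.0.0 SM 01:00:5e:00:00:00
"""


def media_table(text):
    """Return one dict per row that ends with a physical (MAC) address."""
    entries = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) >= 4 and MAC.match(parts[-1]):
            entries.append({"device": parts[0], "host": parts[1],
                            "mac": parts[-1].lower()})
    return entries


entries = media_table(SAMPLE)
for e in entries:
    print(e["device"], e["host"], e["mac"])
```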
3.6.5 Using the ping Command
The ping command sends an ICMP ECHO_REQUEST packet to a network host.
Depending on how the ping command is configured, troublesome network links or
nodes can be identified from the displayed output. The destination host is specified
in the variable hostname.
3.6.5.1 Options
TABLE 3-11 lists the options of the ping command and how those options can help
troubleshooting.
TABLE 3-11 Options for ping
Option        Description                                   How it can help
hostname      The probe packet is sent to hostname and      Verifies that a host is active on the network.
              returned.
-g hostname   Forcibly routes the probe packet through a    By sending the probe packet through different
              specified gateway.                            routes to the target host, individual routes
                                                            can be tested for quality.
-i interface  Specifies through which interface to send     Enables a simple check of secondary network
              and receive the probe packet.                 interfaces.
-n            Replaces host names with IP addresses and     Used when an IP address is more useful than a
              displays them.                                host name.
-s            Continues to repeat ping at intervals of 1    Helps identify intermittent or long duration
              second. Pressing CTRL-C stops the execution.  network events. By redirecting ping output to
              After it is stopped, statistics are           a file, overnight activity can be viewed all
              displayed.                                    at once.
-svR          Displays the route the probe packet followed  Indicates the probe packet route and number
              in 1-second intervals.                        of hops. Comparing multiple routes can
                                                            identify bottlenecks.
The following example shows output for the ping -s command:
# ping -s san-ff2-17-a
PING san-ff2-17-a: 56 data bytes
64 bytes from san-ff2-17-a (10.1.67.31): icmp_seq=0. time=0.427 ms
64 bytes from san-ff2-17-a (10.1.67.31): icmp_seq=1. time=0.194 ms
^C
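When ping -s output has been saved to a file, the reply lines can be summarized by a script. The following Python sketch is illustrative only. The function name summarize is hypothetical, and the reply format ("icmp_seq=N. time=T ms") is an assumption taken from the ping -s example in this section.

```python
import re

REPLY = re.compile(r"icmp_seq=(\d+)\.\s+time=([\d.]+)\s*ms")

SAMPLE = """\
PING san-ff2-17-a: 56 data bytes
64 bytes from san-ff2-17-a (10.1.67.31): icmp_seq=0. time=0.427 ms
64 bytes from san-ff2-17-a (10.1.67.31): icmp_seq=1. time=0.194 ms
"""


def summarize(text):
    """Return round-trip statistics and any sequence numbers with no reply."""
    replies = [(int(seq), float(ms)) for seq, ms in REPLY.findall(text)]
    if not replies:
        return None
    seqs = [seq for seq, _ in replies]
    times = [ms for _, ms in replies]
    # A gap in the sequence numbers means that a reply was not received.
    missing = sorted(set(range(min(seqs), max(seqs) + 1)) - set(seqs))
    return {"count": len(times), "min_ms": min(times), "max_ms": max(times),
            "avg_ms": sum(times) / len(times), "missing_seq": missing}


stats = summarize(SAMPLE)
print(stats)
```

Gaps in the sequence numbers, or a large difference between the minimum and maximum times, point to the intermittent network events mentioned for the -s option.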
3.6.6 Using the ps Command
The ps command lists the status of processes. If no option is specified, the ps
command outputs information about the processes that have the same effective user
ID as the user who is executing this command and are controlled from the same
control terminal as this command.
If any option is specified, the output information is controlled according to the
specified option.
3.6.6.1 Options
TABLE 3-12 lists the options of the ps command and how those options can help
troubleshooting.
TABLE 3-12 Options for ps
Option        Description                                   How it can help
-e            Displays information for every process.       Identifies the process ID and the executable
                                                            files.
-f            Generates a full listing.                     Provides the following process information:
                                                            user ID, parent process ID, time when
                                                            executed, and the paths to the executable
                                                            files.
-o option     Enables configurable output. The pid, pcpu,   Provides only the most important
              pmem, and comm options display process ID,    information. Knowing the percentage of
              percent CPU consumption, percent memory       resource consumption helps identify processes
              consumption, and the relevant executable      that are affecting performance and might be
              file, respectively.                           hung.
The following example shows output for the ps command:
# ps
PID TTY TIME CMD
101042 pts/3 0:00 ps
101025 pts/3 0:00 sh
#
When using sort with the -r option, the column headings are output so that the
value in the first column is equal to zero.
3.6.7 Using the prstat Command
The prstat utility repeatedly examines all the active processes in the system and
reports statistics based on the selected output mode and sort order. The prstat
command provides output similar to the ps command.
3.6.7.1 Options
TABLE 3-13 lists the options of the prstat command and how those options can help
troubleshooting.
TABLE 3-13 Options for prstat
Option        Description                                   How it can help
No option     Displays a list of the processes sorted in    Output identifies the process ID, user ID,
              descending order of CPU resource              used amount of memory, state, CPU
              consumption. The list is limited to the       consumption, and command name.
              height of the terminal window and the total
              number of processes. Output is automatically
              updated every 5 seconds. Pressing CTRL-C
              stops the execution.
-n number     Limits the number of output lines.            Limits the amount of data displayed and
                                                            displays processes consuming many resources.
-s key        Enables the sorting of list contents by key   Useful keys are cpu (default), time, and size.
              parameter.
-v            Verbose mode                                  Displays additional parameters.
The following example shows output for the prstat command:
CHAPTER 4
FRU Replacement Preparation
This chapter explains the method of preparing for the safe replacement of FRUs.
■ Section 4.1, “Tools Required for Maintenance” on page 4-1
■ Section 4.2, “FRU Replacement and Installation Methods” on page 4-2
■ Section 4.3, “Active Replacement/Active Addition” on page 4-5
■ Section 4.4, “Hot Replacement/Hot Addition” on page 4-7
■ Section 4.5, “Cold Replacement/Cold Addition” on page 4-12
4.1 Tools Required for Maintenance
The actual maintenance work described in Chapter 5 to Chapter 15 requires
maintenance software to confirm that the server and other components are operating
correctly and to collect status information and log data on the server and
components. Work for mounting, removing, or replacing a specific component
requires special tools, including screwdrivers and an antistatic wrist strap. These
items are generally named maintenance tools and are listed in TABLE 4-1.
TABLE 4-1 Maintenance Tools
Item  Part name                     Use
1     Phillips screwdriver (No. 2)
2     Wrist strap                   For electrostatic control
3     Conductive mat                For electrostatic control
4     SunVTS                        Test program
4.2 FRU Replacement and Installation Methods
This section explains how to replace and install FRUs.
4.2.1 FRU Replacement
There are three methods of replacing FRUs, as follows:
■ Active replacement
A target FRU is operated while the Solaris OS of the domain to which the FRU
belongs is operating.
The target FRU is operated by using Solaris OS commands or XSCF commands.
Because the power supply unit (PSU) and fan unit (FAN) do not belong to any
domain, they are operated by using XSCF commands regardless of the operating
state of the Solaris OS.
Note – The hard disk drive has a redundant configuration only when disk mirroring
software is used.
Note – If a hard disk drive is an unmirrored boot device, it must be replaced by
using the cold replacement procedure. However, if a boot device can be
disconnected by means of a Solaris OS function or disk mirroring software function,
active replacement can also be performed. The procedure for disconnecting a hard
disk drive varies depending on the software being used. For details, see the manuals
for the relevant software.
■ Hot replacement
A target FRU is operated while the domain to which the FRU belongs is stopped.
Depending on the target FRU, there are two cases as follows:
■ Power supply unit/Fan unit: operated with XSCF commands.
■ Hard disk drive: operated directly, not by using XSCF commands.
■ Cold replacement
After all the domains are stopped and then the server is powered off, a FRU is
operated.
Note – Do not operate a target FRU while the OpenBoot PROM is running (the ok
prompt is displayed). After stopping the relevant domain (power-off) or starting the
Solaris OS, operate the target FRU.
4.2.2 FRU Installation
For empty slots without hard disk drives or PCIe cards, the number of mounted
FRUs can be changed from 1 to the maximum number as required. There are some
components that are tentatively mounted physically in the server. If such a
component is a hard disk drive, it is called an HDD dummy, and if such a
component is a PCIe card, it is called a PCIe slot cover. These components are
necessary to protect the server from noise and to properly cool the server.
The same methods as those used for replacement are used for installation.
Note – When installing a new component in an empty slot, remove the HDD
dummy or PCIe slot cover and then install a new FRU.
TABLE 4-2 lists the access locations and applicable replacement methods for each
FRU.
TABLE 4-2 FRU Access Locations and Replacement Methods
FRU                                 Access     Cold         Hot          Active       Where to find
                                    location   replacement  replacement  replacement  the procedure
Hard disk drive (HDD)               Front      Yes          Yes*         Yes‡         Chapter 9
Hard disk drive backplane (HDDBP)   Top        Yes          No           No           Chapter 10
CD-RW/DVD-RW drive unit (DVDU)      Front/top  Yes          No           No           Chapter 11
Power supply unit (PSU)             Rear       Yes          Yes†         Yes†         Chapter 12
Fan unit (FAN_A)                    Top        Yes          Yes†         Yes†         Chapter 13
Fan backplane (FANBP_B)             Top        Yes          No           No           Chapter 14
Operator panel (OPNL)               Front/top  Yes          No           No           Chapter 15
* The FRU is operated directly, without using XSCF commands.
† The FRU is operated with XSCF commands.
‡ ■ The hard disk drive has a redundant configuration only when disk mirroring software is used.
  ■ If a hard disk drive is an unmirrored boot device, it must be replaced by using the cold
    replacement procedure. However, if a boot device can be disconnected by means of a Solaris OS
    function or disk mirroring software function, active replacement can also be performed. The
    procedure for disconnecting a hard disk drive varies depending on the software being used.
    For details, see the manuals for the relevant software.
TABLE 4-3 lists the access location and applicable installation methods for each FRU.
TABLE 4-3 FRU Access Locations and Installation Methods
FRU                                 Access     Cold         Hot          Active       Where to find
                                    location   addition     addition     addition     the procedure
Motherboard unit                    Top        No           No           No
(MBU_A, MBU_A_2, MBU_A_3, MBU_A_4)
Memory (DIMM)                       Top        Yes          No           No           Chapter 7
PCIe card (PCIe)                    Top        Yes          No           No           Chapter 8
Hard disk drive (HDD)               Front      Yes          Yes*         Yes†         Chapter 9
Hard disk drive backplane (HDDBP)   Top        No           No           No
CD-RW/DVD-RW drive unit (DVDU)      Front/top  No           No           No
Power supply unit (PSU)             Rear       No           No           No
Fan unit (FAN_A)                    Top        No           No           No
Fan backplane (FANBP_B)             Top        No           No           No
Operator panel (OPNL)               Front/top  No           No           No
* The FRU is operated directly, without using XSCF commands.
† The FRU is operated with XSCF commands.
4.3 Active Replacement/Active Addition
In active replacement, the target FRU is operated while the Solaris OS of the domain
to which the FRU belongs is operating.
The target FRU is operated using Solaris OS commands or XSCF commands.
Because the power supply unit (PSU) and fan unit (FAN) do not belong to any
domain, they are operated by using XSCF commands regardless of the operating
state of the Solaris OS.
Active replacement has the following four stages:
■ “Releasing a FRU from a Domain” on page 4-5
■ “FRU Removal and Replacement” on page 4-6
■ “Configuring a FRU in a Domain” on page 4-6
■ “Verifying the Hardware Operation” on page 4-7
For active installation, see Section 4.3.3, “Configuring a FRU in a Domain” on
page 4-6 and Section 4.3.4, “Verifying the Hardware Operation” on page 4-7.
4.3.1 Releasing a FRU from a Domain
1. From the Solaris OS, type the cfgadm command to obtain the component
status.
# cfgadm -a
2. Stop the application from using the component and disconnect the component
from the Solaris OS.
The READY LED (green) of the HDD goes off.
Note – If a hard disk drive is an unmirrored boot device, it must be replaced by
using the cold replacement procedure. However, if a boot device can be
disconnected by means of a Solaris OS function or disk mirroring software function,
active replacement can also be performed.
3. Type the cfgadm -c command to disconnect the component from the Solaris
OS.
# cfgadm -c unconfigure Ap_Id
4. Type the cfgadm -x command to confirm that the CHECK LED blinks.
# cfgadm -x led=fault,mode=blink Ap_Id
The Ap_Id is shown in the output of cfgadm (for example, disk#0).
The CHECK LED (amber) of the HDD blinks.
5. Type the cfgadm command to verify that the component has been
disconnected.
# cfgadm -a
The disconnected component is displayed as being unconfigured.
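The command sequence of this procedure can be prepared in advance for a given Ap_Id. The following Python sketch is illustrative only and performs a dry run: it prints the commands and does not execute them. The function name release_commands is hypothetical, the led=fault,mode=blink string is written as a single argument, and step 2 (stopping the application that uses the component) is a manual step that is not represented.

```python
import shlex


def release_commands(ap_id):
    """Return the cfgadm invocations of steps 1, 3, 4, and 5 as argument lists."""
    return [
        ["cfgadm", "-a"],                                 # step 1: component status
        ["cfgadm", "-c", "unconfigure", ap_id],           # step 3: disconnect
        ["cfgadm", "-x", "led=fault,mode=blink", ap_id],  # step 4: CHECK LED blinks
        ["cfgadm", "-a"],                                 # step 5: verify
    ]


# Dry run for the Ap_Id used as an example in this section.
for argv in release_commands("disk#0"):
    print("# " + shlex.join(argv))
```

Keeping each command as an argument list avoids quoting problems with an Ap_Id such as disk#0, which contains a character that the shell can treat specially.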
4.3.2 FRU Removal and Replacement
After the disconnection of a FRU from a domain, the same procedure as that for Hot
Replacement/Hot Addition applies. See Section 4.4, “Hot Replacement/Hot
Addition” on page 4-7.
4.3.3 Configuring a FRU in a Domain
This section explains the procedure for active replacement/installation by using
Solaris OS commands. For information on using the XSCF command, see Section 4.4,
“Hot Replacement/Hot Addition” on page 4-7.
1. Type the cfgadm -c command from the Solaris OS to integrate the component
into the Solaris OS.
# cfgadm -c configure Ap_Id
2. Type the cfgadm -x command to confirm that the CHECK LED is off.
# cfgadm -x led=fault,mode=off Ap_Id
The Ap_Id is shown in the output of cfgadm (for example, disk#0).
The CHECK LED (amber) of the HDD is turned off.
3. Type the cfgadm command to verify that the component has been configured.
# cfgadm -a
The configured component is displayed as being configured.
The READY LED (green) of the HDD goes on.
4.3.4 Verifying the Hardware Operation
■ Confirm the status of the LED indicators.
For information on the LED status, see TABLE 2-3 and TABLE 2-5.
4.4 Hot Replacement/Hot Addition
In hot replacement, the target FRU is operated while the domain to which the FRU
belongs is stopped.
Depending on the target FRU, there are two cases as follows:
■ Power supply unit/Fan unit: operated with XSCF commands.
■ Hard disk drive: operated directly, not by using XSCF commands.
For hot addition, do the same operation as that for hot replacement.
4.4.1 FRU Removal and Replacement
● Type the replacefru command from the XSCF Shell prompt.
XSCF> replacefru
---------------------------------------------------------------------
Maintenance/Replacement Menu
Please select a type of FRU to be replaced.
You are about to replace FAN_A#0.
Do you want to continue?[r:replace|c:cancel] :r
Please confirm the Check LED is blinking.
If this is the case, please replace FAN_A#0.
After replacement has been completed, please select[f:finish] :f
The replacefru command automatically tests the status of the component after the
completion of removal and replacement.
Diagnostic tests for FAN_A#0 have started.
[This operation may take up to 3 minute(s)]
(progress scale reported in seconds)
0..... 30..done
---------------------------------------------------------------------
Maintenance/Replacement Menu
Status of the replaced FRU.
FRU Status
------------- -------
FAN_A#0 Normal
---------------------------------------------------------------------
The replacement of FAN_A#0 has completed normally.[f:finish] :f
---------------------------------------------------------------------
Maintenance/Replacement Menu
Please select a type of FRU to be replaced.
For details, see the manual pages of showhardconf.
2. Confirm the state of the status LEDs of the FRU.
For information on the LED status, see TABLE 2-3 and TABLE 2-5.
4.5 Cold Replacement/Cold Addition
In cold replacement, all business operations must be stopped. When accessing the
server, power off the server and disconnect the power cord to ensure safety.
For cold addition, do the same operation as that for cold replacement.
4.5.1 Powering off the Server
This section explains how to power off the server.
4.5.1.1 Power-off by Using the XSCF Command
1. Notify users that the server is being powered off.
2. Back up the system files and data to tape, if necessary.
3. Log in to the XSCF Shell as a user with platadm or fieldeng privileges and
enter the poweroff command.
XSCF> poweroff -a
The poweroff command performs the following actions:
■ The Solaris OS shuts down completely.
■ The server is powered off and enters standby mode. (Power to the XSCF unit
remains on.)
For details, see the SPARC Enterprise M3000/M4000/M5000/M8000/M9000 Servers XSCF User's Guide.
4. Verify that the POWER LED on the operator panel is off.
5. Disconnect all the power cords from the power outlets.
Caution – There is a risk of electrical failure if the power cords are not
disconnected. All the power cords must be disconnected to completely cut the power
to the server.
4.5.1.2 Power-off by Using the Operator Panel
1. Notify users that the server is being powered off.
2. Back up the system files and data to tape, if necessary.
3. Turn the mode switch on the operator panel to the Service position.
4. Press the power switch on the operator panel for 4 seconds or more.
5. Verify that the POWER LED on the operator panel is off.
6. Disconnect all the power cords from the power outlets.
Caution – There is a risk of electrical failure if the power cords are not
disconnected. All the power cords must be disconnected to completely cut the power
to the server.
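Both power-off procedures end with the same two checks before a FRU may be touched: the POWER LED is off and every power cord is disconnected. As an illustration only, the following Python sketch models that gate. The class and field names are hypothetical, and the code does not communicate with the server; it only evaluates observations that a technician has recorded by hand.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ServerState:
    """Hand-recorded observations (hypothetical model, not an XSCF API)."""
    power_led_on: bool          # POWER LED on the operator panel
    connected_power_cords: int  # power cords still plugged into outlets


def blocking_conditions(state: ServerState) -> List[str]:
    """List the reasons cold replacement must not start yet."""
    problems = []
    if state.power_led_on:
        problems.append("POWER LED is still on")
    if state.connected_power_cords > 0:
        problems.append(
            f"{state.connected_power_cords} power cord(s) still connected"
        )
    return problems


# Powered off, but one cord was left plugged in.
print(blocking_conditions(ServerState(power_led_on=False, connected_power_cords=1)))
```

An empty list means both conditions from the procedure are satisfied.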
4.5.2 FRU Removal and Replacement
In cold replacement, a FRU is removed and replaced while the power is turned off.
After the FRU replacement, power on the server.
4.5.3 Powering on the Server
This section explains how to power on the server.
4.5.3.1 Power-on by Using the XSCF Command
1. Verify that the server has enough power supply units to operate in the desired
configuration.
2. Connect all the power cords to power outlets.
3. Verify that the XSCF STANDBY LED on the operator panel is on.
4. Turn the mode switch on the operator panel to the desired mode position
(Locked or Service).
5. Log in to the XSCF Shell as a user with platadm or fieldeng privileges and
type the poweron command.
XSCF> poweron -a
Shortly afterward, the following occurs:
■ The POWER LED on the operator panel turns on.
■ The power-on self-test (POST) is executed.
The server is then fully powered on.
Note – If the Solaris OS is configured to start automatically, use the sendbreak -d
domain_id command of the XSCF Shell to display the ok prompt. Issue the command
after the console banner is displayed but before the system starts booting the Solaris OS.
For details, see the SPARC Enterprise M3000/M4000/M5000/M8000/M9000 Servers XSCF User's Guide.
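The XSCF command lines used in sections 4.5.1.1 and 4.5.3.1 can be assembled by a small helper when preparing a maintenance script. The following Python sketch is an assumption-laden illustration: the function names are hypothetical, it only formats the command strings shown in this chapter (poweron -a, poweroff -a, sendbreak -d domain_id), and it does not connect to an XSCF or verify real account privileges.

```python
# Privileges that this chapter names as required for poweron/poweroff.
ALLOWED_PRIVILEGES = {"platadm", "fieldeng"}


def power_command(action: str, privileges: set) -> str:
    """Return the XSCF command line for powering the server on or off."""
    if action not in ("poweron", "poweroff"):
        raise ValueError(f"unsupported action: {action}")
    if not ALLOWED_PRIVILEGES & set(privileges):
        raise PermissionError("platadm or fieldeng privilege is required")
    return f"{action} -a"


def sendbreak_command(domain_id: int) -> str:
    """Return the sendbreak command line for the given domain ID."""
    if domain_id < 0:
        raise ValueError("domain_id must be non-negative")
    return f"sendbreak -d {domain_id}"


print(power_command("poweron", {"fieldeng"}))
print(sendbreak_command(0))
```

Checking the privilege before formatting the command mirrors the order of the written step, where the login requirement precedes the command itself.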
4.5.3.2 Power-on by Using the Operator Panel
1. Verify that the server has enough power supply units to operate in the desired
configuration.
2. Connect all the power cords to power outlets.
3. Verify that the XSCF STANDBY LED on the operator panel is on.
4. Turn the mode switch on the operator panel to the desired mode position
(Locked or Service).
5. Press the power switch on the operator panel.
Shortly afterward, the following occurs:
■ The POWER LED on the operator panel turns on.
■ The power-on self-test (POST) is executed.
The server is then fully powered on.
Note – If the Solaris OS is configured to start automatically, use the sendbreak -d
domain_id command of the XSCF Shell to display the ok prompt. Issue the command
after the console banner is displayed but before the system starts booting the Solaris OS.
4.5.4 Verifying the Hardware Operation
1. At the ok prompt, press the ENTER key, type "#" (the default escape character),
and then press the "." (period) key.
The display switches from the domain console to the XSCF console.
2. Use the showhardconf command to confirm that the new component has been
installed.