Sun Microsystems T1000 User Manual

Sun Fire™ T1000 Server
Service Manual
Sun Microsystems, Inc. www.sun.com
Part No. 819-3248-10 January 2006, Revision A
Submit comments about this document at: http://www.sun.com/hwdocs/feedback
Copyright 2006 Sun Microsystems, Inc., 4150 Network Circle, Santa Clara, California 95054, U.S.A. All rights reserved.
Sun Microsystems, Inc. has intellectual property rights relating to technology that is described in this document. In particular, and without limitation, these intellectual property rights may include one or more of the U.S. patents listed at http://www.sun.com/patents and one or more additional patents or pending patent applications in the U.S. and in other countries.
This document and the product to which it pertains are distributed under licenses restricting their use, copying, distribution, and decompilation. No part of the product or of this document may be reproduced in any form by any means without prior written authorization of Sun and its licensors, if any.
Third-party software, including font technology, is copyrighted and licensed from Sun suppliers.
Parts of the product may be derived from Berkeley BSD systems, licensed from the University of California. UNIX is a registered trademark in the U.S. and in other countries, exclusively licensed through X/Open Company, Ltd.
Sun, Sun Microsystems, the Sun logo, AnswerBook2, docs.sun.com, Java, OpenBoot, SunSolve, SunVTS, Sun Fire, and Solaris are trademarks or registered trademarks of Sun Microsystems, Inc. in the U.S. and in other countries.
All SPARC trademarks are used under license and are trademarks or registered trademarks of SPARC International, Inc. in the U.S. and in other countries. Products bearing SPARC trademarks are based upon an architecture developed by Sun Microsystems, Inc.
The OPEN LOOK and Sun™ Graphical User Interface was developed by Sun Microsystems, Inc. for its users and licensees. Sun acknowledges the pioneering efforts of Xerox in researching and developing the concept of visual or graphical user interfaces for the computer industry. Sun holds a non-exclusive license from Xerox to the Xerox Graphical User Interface, which license also covers Sun’s licensees who implement OPEN LOOK GUIs and otherwise comply with Sun’s written license agreements.
U.S. Government Rights - Commercial use. Government users are subject to the Sun Microsystems, Inc. standard license agreement and applicable provisions of the FAR and its supplements.
DOCUMENTATION IS PROVIDED "AS IS" AND ALL EXPRESS OR IMPLIED CONDITIONS, REPRESENTATIONS AND WARRANTIES, INCLUDING ANY IMPLIED WARRANTY OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE OR NON-INFRINGEMENT, ARE DISCLAIMED, EXCEPT TO THE EXTENT THAT SUCH DISCLAIMERS ARE HELD TO BE LEGALLY INVALID.

Contents

Preface vii
1. Sun Fire T1000 Server Overview 1
Sun Fire T1000 Server Features 1
Chip-Multithreaded (CMT) Multicore Processor and Memory Technology 2
Performance Enhancements 2
Remote Manageability With ALOM 3
System Reliability, Availability, and Serviceability 4
Environmental Monitoring 5
Error Correction and Parity Checking 5
Predictive Self-Healing 6
Chassis Identification 6
Additional Service Related Information 7
2. Sun Fire T1000 Server Diagnostics 9
Overview of Sun Fire T1000 Server Diagnostics 9
Using LEDs to Identify the State of Devices 14
Front and Rear Panel LEDs 16
Power Supply LEDs 17
Using ALOM For Diagnosis and Repair Verification 17
Running ALOM Service-Related Commands 19
Connecting to ALOM 19
Switching Between the System Console and ALOM 20
Service-Related ALOM Commands 20
To Run the showfaults Command 21
To Run the showenvironment Command 22
To Run the showfru Command 24
Running POST 27
Controlling How POST Runs 27
To Change POST Parameters 30
Reasons to Run POST 31
Routine Sanity Check of the Hardware 31
Diagnosing the System Hardware 31
To Run POST 31
Using the Solaris Predictive Self-Healing Feature 35
To Use the fmdump Command to Identify Faults 37
Collecting Information From Solaris OS Files and Commands 39
To Check the Message Buffer 39
To View System Message Log Files 39
Managing System Components with Automatic System Recovery Commands 40
To Run the showcomponent Command 41
To Run the disablecomponent Command 42
To Run the enablecomponent Command 43
Exercising the System with SunVTS 43
Checking Whether SunVTS Software Is Installed 43
To Check Whether SunVTS Software Is Installed 44
Exercising the System Using SunVTS Software 44
To Exercise the System Using SunVTS Software 45
For further information, refer to the documents that accompany the SunVTS
software 49
3. Removing and Replacing FRUs 51
Safety Information 51
Safety Symbols 52
Electrostatic Discharge Safety 52
Use an Antistatic Wrist Strap 53
Use an Antistatic Mat 53
Common Procedures for Parts Replacement 53
Required Tools 53
To Shut the System Down 53
To Remove the Server From a Rack 55
To Perform Electrostatic Discharge (ESD) Prevention Measures 56
To Remove the Top Cover 57
Removing and Replacing CRUs 57
To Remove the Optional PCI Express Card 58
To Add or Replace the Optional PCI Express Card 60
To Remove the Fan Tray Assembly 60
To Replace the Fan Tray Assembly 61
To Remove the Power Supply 61
To Replace the Power Supply 62
To Remove the Hard Drive 63
To Replace the Hard Drive 64
To Remove DIMMs 65
To Add or Replace DIMMs 66
To Remove the Motherboard and Chassis 68
To Replace the Motherboard and Chassis Assembly 69
To Remove the Clock Battery on the Motherboard 70
To Replace the Clock Battery on the Motherboard 71
Common Procedures for Finishing Up 72
To Replace the Top Cover 72
To Reinstall the Server Chassis in the Rack 73
To Apply Power to the Server 73
A. Field-Replaceable Units (FRUs) 75

Preface

The Sun Fire T1000 Service Manual provides information to aid in troubleshooting problems with and replacing components within the Sun Fire™ T1000 server.
This manual is written for technicians, service personnel, and system administrators who service and repair computer systems. The person qualified to use this manual:
Can open a system chassis and identify and replace internal components.
Understands the Solaris Operating System and the command-line interface.
Has superuser privileges for the system being serviced.
Understands typical hardware troubleshooting tasks.
How This Book Is Organized
This guide is organized into the following chapters:
Chapter 1 describes the main features of the Sun Fire T1000 server.
Chapter 2 describes the diagnostics that are available for monitoring and troubleshooting the Sun Fire T1000 server.
Chapter 3 describes how to remove and replace the FRUs.
Appendix A lists the customer-replaceable components in the Sun Fire T1000 server.
Using UNIX Commands
This document might not contain information on basic UNIX® commands and procedures such as shutting down the system, booting the system, and configuring devices.
See one or more of the following for this information:
Solaris Handbook for Sun Peripherals
AnswerBook2™ online documentation for the Solaris™ operating environment
Other software documentation that you received with your system
Typographic Conventions
Typeface¹ | Meaning | Examples
AaBbCc123 | The names of commands, files, and directories; on-screen computer output | Edit your .login file. Use ls -a to list all files. % You have mail.
AaBbCc123 | What you type, when contrasted with on-screen computer output | % su Password:
AaBbCc123 | Book titles, new words or terms, words to be emphasized. Replace command-line variables with real names or values. | Read Chapter 6 in the User’s Guide. These are called class options. You must be superuser to do this. To delete a file, type rm filename.
¹ The settings on your browser might differ from these settings.
Shell Prompts
Shell | Prompt
C shell | machine-name%
C shell superuser | machine-name#
Bourne shell and Korn shell | $
Bourne shell and Korn shell superuser | #
Sun Fire T1000 Server Documentation
You can view and print the following documents from the Sun documentation web site at http://www.sun.com/documentation
Title | Description | Part Number
Sun Fire T1000 Server Site Planning Data Guide | Site planning information for the Sun Fire T1000 server | 819-3246
Sun Fire T1000 Server Product Notes | Late-breaking information about the server. The latest notes are posted at: http://www.sun.com/documentation | 819-3244
Sun Fire T1000 Server Product Overview | Provides an overview of the features of this server | 819-3247
Sun Fire T1000 Server Getting Started Guide | Information about where to find documentation to get your system installed and running quickly | 819-3249
Sun Fire T1000 Server Installation Guide | Detailed rack mounting, cabling, power-on, and configuration information | 819-3248
Sun Fire T1000 Server System Administration Guide | How to perform administrative tasks that are specific to the Sun Fire T1000 server | 819-3250
Advanced Lights Out Management (ALOM) CMT v1.1 Guide | How to use the Advanced Lights Out Manager (ALOM) software on the Sun Fire T1000 server | 819-3246
Accessing Sun Documentation
You can view, print, or purchase a broad selection of Sun™ documentation, including localized versions, at:
http://www.sun.com/documentation
Third-Party Web Sites
Sun is not responsible for the availability of third-party web sites mentioned in this document. Sun does not endorse and is not responsible or liable for any content, advertising, products, or other materials that are available on or through such sites or resources. Sun will not be responsible or liable for any actual or alleged damage or loss caused by or in connection with the use of or reliance on any such content, goods, or services that are available on or through such sites or resources.
Contacting Sun Technical Support
If you have technical questions about this product that are not answered in this document, go to:
http://www.sun.com/service/contacting
Sun Welcomes Your Comments
Sun is interested in improving its documentation and welcomes your comments and suggestions. You can submit your comments by going to:
http://www.sun.com/hwdocs/feedback
Please include the title and part number of your document with your feedback:
Sun Fire T1000 Server Service Manual, part number 819-3248-10
CHAPTER
1

Sun Fire T1000 Server Overview

This chapter provides an overview of the features of the Sun Fire T1000 server.
The following topics are covered:
“Sun Fire T1000 Server Features” on page 1
“Chassis Identification” on page 6

Sun Fire T1000 Server Features

The Sun Fire T1000 server (FIGURE 1-1) is a high-performance, entry-level server that is highly scalable and very reliable.
FIGURE 1-1 Sun Fire T1000 Server

Chip-Multithreaded (CMT) Multicore Processor and Memory Technology

The UltraSPARC® T1 multicore processor is the basis of the Sun Fire T1000 server. The UltraSPARC T1 processor is based on chip multithreading (CMT) technology that is optimized for highly threaded transactional processing. The UltraSPARC T1 processor improves throughput while using less power and dissipating less heat than conventional processor designs.
Depending on the model purchased, the processor has six or eight UltraSPARC cores. Each core equates to a 64-bit execution pipeline capable of running four threads. The result is that the 8-core processor handles up to 32 active threads concurrently.
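The thread count follows directly from the figures above: each core runs four threads, so the total is the number of cores multiplied by four. The following is a minimal sketch of that arithmetic; the function and constant names are illustrative only and are not part of any Sun tool.

```python
THREADS_PER_CORE = 4  # from the text: each core can run four threads


def hardware_threads(cores: int) -> int:
    """Return the number of concurrent hardware threads for a core count."""
    if cores not in (6, 8):
        # The manual describes only 6-core and 8-core processor models.
        raise ValueError("expected a 6-core or 8-core processor")
    return cores * THREADS_PER_CORE


print(hardware_threads(8))  # 32, matching the figure quoted in the text
print(hardware_threads(6))  # 24
```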
Additional processor components, such as the DDR2 memory controllers, L1 cache, L2 cache, and the JBus I/O interface, have been carefully tuned for optimal performance.
FIGURE 1-2 shows the major components in the Sun Fire T1000 server.
FIGURE 1-2 Sun Fire T1000 Server Components
Callouts: DIMMs, Fan tray assembly, Hard disk drive, PCI-E socket and slot, Motherboard and chassis assembly, UltraSPARC T1 multicore processor, Power supply

Performance Enhancements

The Sun Fire T1000 server introduces several new technologies with its sun4v architecture and multicore, multithreaded UltraSPARC T1 processor.
TABLE 1-1 lists feature specifications for the Sun Fire T1000 server.
TABLE 1-1 Sun Fire T1000 System Features
Feature | Description
Processor | 1 UltraSPARC T1 multicore processor (6 or 8 cores)
Memory | 8 slots that can be populated with one of the following types of DDR-2 DIMMs: • 512 MB (4 GB maximum) • 1 GB (8 GB maximum) • 2 GB (16 GB maximum)
Ethernet ports | 4 ports, 10/100/1000 Mbit auto-negotiating. Each of the 4 Ethernet RJ45s includes two LEDs: • A green Link indicator, lit when a link is established at any speed • A yellow Activity indicator, which blinks during packet transfers
DB-9 serial port | 1 DB-9 serial port
Internal hard disk drive | 1 SATA disk drive, 3.5-inch form factor. Support for hardware-embedded RAID 1 (mirroring)
Cooling | 4 fans in a single assembly
PCI interface | 1 PCI-Express (PCI-E) slot for low-profile cards (supports 1x, 4x, and 8x width cards)
Power | 1 power supply (PS)
Remote management | ALOM system controller (integrated on motherboard) with a serial and 10/100 Mbit Ethernet port
Firmware | OpenBoot™ PROM for reset and POST support. ALOM-CMT for remote management administration
Operating system | Solaris 10 1/06 or later Operating System preinstalled on the hard disk drive
Other software | Java™ Enterprise System with a 90-day trial license
For additional information on the Sun Fire T1000 server features refer to the Sun Fire T1000 Server Product Overview.
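The memory maximums in TABLE 1-1 are the slot count multiplied by the DIMM size. As a quick check of those figures, the sketch below assumes that all eight slots hold DIMMs of the same size; the names are illustrative.

```python
DIMM_SLOTS = 8  # TABLE 1-1: the server has 8 DIMM slots


def max_memory_gb(dimm_size_mb: int) -> float:
    """Total memory in GB when every slot holds a DIMM of the given size."""
    return DIMM_SLOTS * dimm_size_mb / 1024


for size_mb in (512, 1024, 2048):
    print(f"{size_mb} MB DIMMs: {max_memory_gb(size_mb):g} GB maximum")
```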

Remote Manageability With ALOM

The Sun Advanced Lights Out Manager (ALOM) feature is a system controller (SC) that enables you to remotely manage and administer the Sun Fire T1000 server.
The ALOM-CMT software is preinstalled as firmware, and therefore, ALOM initializes as soon as you apply power to the system. You can customize ALOM to work with your particular installation.
ALOM enables you to monitor and control your server over a network, or by using a dedicated serial port for connection to a terminal or terminal server. ALOM provides a command-line interface that you can use to remotely administer geographically distributed or physically inaccessible machines. In addition, ALOM enables you to run diagnostics (such as POST) remotely that would otherwise require physical proximity to the server’s serial port.
You can configure ALOM to send email alerts of hardware failures, hardware warnings, and other events related to the server or to ALOM. The ALOM circuitry runs independently of the server, using the server’s standby power. Therefore, ALOM firmware and software continue to function when the server operating system goes offline or when the server is powered off. ALOM monitors the following Sun Fire T1000 server components:
Hard disk drive status
Enclosure thermal conditions
Power supply status
Voltage levels
Faults detected by POST (Power-On Self-Test)
Solaris OS Predictive Self Healing (PSH) diagnostic facilities
For information about configuring and using the ALOM system controller, refer to the Sun Fire T1000 Server Advanced Lights Out Manager (ALOM) Guide.

System Reliability, Availability, and Serviceability

Reliability, availability, and serviceability (RAS) are aspects of a system’s design that affect its ability to operate continuously and to minimize the time necessary to service the system. Reliability refers to a system’s ability to operate continuously without failures and to maintain data integrity. System availability refers to the ability of a system to recover to an operational state after a failure, with minimal impact. Serviceability relates to the time it takes to restore a system to service following a system failure. Together, reliability, availability, and serviceability features provide for near continuous system operation.
To deliver high levels of reliability, availability, and serviceability, the Sun Fire T1000 server offers the following features:
Environmental monitoring
Error detection and correction for improved data integrity
Easy access for most component replacements
Extensive POST tests that automatically delete faulty components from the configuration.
PSH automated run-time diagnosis capability that takes faulty components offline.
For more information about using RAS features, refer to the Sun Fire T1000 Server System Administration Guide.
Environmental Monitoring
The Sun Fire T1000 server features an environmental monitoring subsystem designed to protect the server and its components against:
Extreme temperatures
Lack of adequate airflow through the system
Power supply failure
Hardware faults
Temperature sensors throughout the system monitor the ambient temperature of the system and internal components. The software and hardware ensure that the temperatures within the enclosure do not exceed predetermined safe operating ranges. If the temperature observed by a sensor falls below a low-temperature threshold or rises above a high-temperature threshold, the monitoring subsystem software lights the amber Service required LEDs on the front and back panels. If the temperature condition persists and reaches a critical threshold, the system initiates a graceful system shutdown.
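The threshold behavior described above can be sketched as a small classification function. The threshold values below are hypothetical placeholders, since this manual does not state the real limits, and the sketch models only a high critical threshold. It illustrates the logic and is not the firmware's implementation.

```python
from enum import Enum


class ThermalAction(Enum):
    NONE = "no action"
    SERVICE_LED = "light the amber Service required LEDs"
    SHUTDOWN = "initiate a graceful system shutdown"


# Hypothetical thresholds in degrees Celsius; the real values are set by
# the system firmware and are not given in this manual.
LOW_WARNING_C = 5.0
HIGH_WARNING_C = 45.0
CRITICAL_C = 55.0


def thermal_action(temp_c: float) -> ThermalAction:
    """Map one sensor reading to the response described in the text."""
    if temp_c >= CRITICAL_C:
        return ThermalAction.SHUTDOWN
    if temp_c < LOW_WARNING_C or temp_c > HIGH_WARNING_C:
        return ThermalAction.SERVICE_LED
    return ThermalAction.NONE


print(thermal_action(25.0).value)
print(thermal_action(50.0).value)
print(thermal_action(60.0).value)
```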
All error and warning messages are sent to the ALOM system controller system console and logged in the ALOM log file. Additionally, some FRUs, such as the power supply, provide LEDs that are lit to indicate a failure within the FRU.
Error Correction and Parity Checking
The UltraSPARC T1 multicore processor provides parity protection on its internal cache memories, including tag parity and data parity on the D-cache and I-cache. The internal 3 MB L2 cache has parity protection on the tags, and ECC protection of the data.
Advanced ECC, also called Chipkill, detects up to 4 bits in error.
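As a general illustration of parity checking, the sketch below computes an even-parity bit over a data word and shows that a single flipped bit is detected. It is a textbook example and makes no claim about how the processor's cache logic is actually built.

```python
def even_parity_bit(value: int) -> int:
    """Return the bit that makes the total number of 1 bits even."""
    return bin(value).count("1") % 2


def parity_ok(value: int, stored_parity: int) -> bool:
    """Check a data word against the parity bit stored alongside it."""
    return even_parity_bit(value) == stored_parity


data = 0b1011_0010                 # four 1 bits, so the parity bit is 0
parity = even_parity_bit(data)
corrupted = data ^ 0b0000_1000     # flip a single bit

print(parity_ok(data, parity))       # True
print(parity_ok(corrupted, parity))  # False
```

Parity of this kind detects an odd number of flipped bits but cannot correct them. ECC adds enough redundancy to correct some errors, which is consistent with the text's use of ECC for the L2 cache data.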

Predictive Self-Healing

The Sun Fire T1000 server features the latest fault management technologies. With the Solaris 10 Operating System (OS), Sun is introducing a new architecture for building and deploying systems and services capable of predictive self-healing. Self-healing technology enables Sun systems to accurately predict component failures and mitigate many serious problems before they actually occur. This technology is incorporated into both the hardware and software of the Sun Fire T1000 server.
At the heart of the predictive self-healing capabilities is the Solaris Fault Manager, a new service that receives data relating to hardware and software errors, and automatically and silently diagnoses the underlying problem. Once a problem is diagnosed, a set of agents automatically responds by logging the event and, if necessary, taking the faulty component offline. Because problems are diagnosed automatically, business-critical applications and essential system services can continue uninterrupted in the event of software failures or major hardware component failures.

Chassis Identification

FIGURE 1-3 and FIGURE 1-4 show the physical characteristics of the Sun Fire T1000 server.
FIGURE 1-3 Sun Fire T1000 Server Front Panel
Callouts: Power OK LED and Power On/Off button, Service required LED, Locator LED/button
FIGURE 1-4 Sun Fire T1000 Server Rear Panel
Callouts: Power supply LEDs, Power OK LED, Service required LED, Locator LED/button, PCI-E slot, Ethernet ports, DB9 serial port, System console ports

Additional Service Related Information

In addition to this document, the following resources are available to help you keep your server running optimally:
Product Notes – The Sun Fire T1000 Server Product Notes (819-3244) contain late-breaking information about the system including required software patches, updated hardware and compatibility information, and solutions to known issues. The product notes are available online at:
http://www.sun.com/documentation
Release Notes – The Solaris OS Release Notes contain important information
about the Solaris operating system. The release notes are available online at:
http://www.sun.com/documentation
SunSolve™ Online – Provides a collection of support resources. Depending on
the level of your service contract, you have access to Sun patches, the Sun System Handbook, the SunSolve knowledge base, the Sun Support Forum, and additional documents, bulletins, and related links. Access this site at:
http://sunsolve.sun.com
Predictive Self-Healing Knowledge Database – You can access the knowledge
article corresponding to a self-healing message by taking the Sun Message Identifier (SUNW-MSG-ID) and entering it into the field on this page:
http://www.sun.com/msg
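When a fault message is captured in a log or a terminal session, the identifier can be pulled out of the text before it is entered on the knowledge article page. The sketch below assumes the identifier appears after the label "SUNW-MSG-ID:"; the sample message line is made up for illustration and is not output from a real system.

```python
import re

# Assumed format: the label "SUNW-MSG-ID:" followed by a hyphenated ID.
MSG_ID_PATTERN = re.compile(r"SUNW-MSG-ID:\s*([A-Z0-9]+(?:-[A-Z0-9]+)+)")


def extract_msg_ids(text: str) -> list[str]:
    """Return every Sun Message Identifier found in the given text."""
    return MSG_ID_PATTERN.findall(text)


# Illustrative sample only.
sample = "SUNW-MSG-ID: SUN4V-8000-E2, TYPE: Fault, VER: 1, SEVERITY: Critical"
print(extract_msg_ids(sample))  # ['SUN4V-8000-E2']
```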
CHAPTER
2

Sun Fire T1000 Server Diagnostics

This chapter describes the diagnostics that are available for monitoring and troubleshooting the Sun Fire T1000 server. This chapter does not provide detailed troubleshooting procedures, but instead describes the Sun Fire T1000 server diagnostics facilities and how to use them.
This chapter is intended for technicians, service personnel, and system administrators who service and repair computer systems.
The following topics are covered:
“Overview of Sun Fire T1000 Server Diagnostics” on page 9
“Using LEDs to Identify the State of Devices” on page 14
“Using ALOM For Diagnosis and Repair Verification” on page 17
“Running POST” on page 27
“Using the Solaris Predictive Self-Healing Feature” on page 35
“Collecting Information From Solaris OS Files and Commands” on page 39
“Managing System Components with Automatic System Recovery Commands”
on page 40
“Exercising the System with SunVTS” on page 43

Overview of Sun Fire T1000 Server Diagnostics

There are a variety of diagnostic tools, commands, and indicators you can use to troubleshoot a Sun Fire T1000 server.
LEDs – Provide a quick visual notification of the status of the server and of some of the FRUs.
ALOM-CMT firmware – The system firmware that runs on the system controller. In addition to providing the interface between the hardware and OS, ALOM also tracks and reports the health of key server components. ALOM works closely with POST and Solaris predictive self-healing technology to keep the system up and running even when there is a faulty component.
Power-on self-test (POST) – Performs diagnostics on system components upon system reset to ensure the integrity of those components. POST is configurable and works with ALOM to take faulty components offline if needed and blacklist them in the asr-db.
Solaris OS predictive self-healing (PSH) – Continuously monitors the health of the CPU and memory, and works with ALOM to take a faulty component offline if needed.
Log files and console messages – Provide the standard Solaris OS log files and investigative commands that can be accessed and displayed on the device of your choice.
SunVTS™ – An application that exercises the system, provides hardware validation, and discloses possible faulty components with recommendations for repair.
The LEDs, ALOM, Solaris OS PSH, and many of the log files and console messages are integrated. For example, when the Solaris PSH software detects a fault, it displays the fault, logs it, and passes information to ALOM, where it is also logged. Depending on the fault, one or more LEDs might also be illuminated.
The diagnostic flowchart in FIGURE 2-1 and TABLE 2-1 describe an approach for using the server's diagnostics that is likely to identify a faulty field-replaceable unit (FRU). The diagnostics you use, and the order in which you use them, depend on the nature of the problem you are troubleshooting, so you might not follow this flow step-by-step.
The flowchart assumes that you have already performed some rudimentary troubleshooting, such as verifying proper installation, visually inspecting cables and power, and possibly resetting the server. For details, refer to the Sun Fire T1000 Server Installation Guide and the Sun Fire T1000 Server System Administration Guide.
Use this flowchart to understand what diagnostics are available to troubleshoot faulty hardware, and use TABLE 2-1 to find more information about each diagnostic in this chapter.
For many faults, service can be deferred, either because the faulty component has been disabled by automatic system recovery (ASR), the fault is being corrected, or the fault is predictive.
FIGURE 2-1 Diagnostic Flow Chart
Numbers in this flowchart correspond to the Action numbers in TABLE 2-1. Starting from "Suspect faulty hardware", the flowchart contains the following steps:
1. Is the power supply fault LED lit?
2. Connect power cord or replace faulty power supply.
3. Are any faults reported by the showfaults command?
4. Is a fault message ID (MSG-ID) displayed?
5. Enter the message ID into the Sun Knowledge Article web site for recommended actions.
6. Did the article recommend a FRU replacement?
7. Does the showenvironment command report an overtemp condition?
8. Find cause of overtemp condition.
9. Do the Solaris logs indicate a faulty FRU?
10. Identify and replace faulty FRU.
11. Does POST report any faulty devices?
12. Does SunVTS report any faulty devices?
13. Perform recommended corrective actions. If needed, contact Sun for support.
TABLE 2-1 Diagnostic Flow Chart Actions
Action No. | Diagnostic Action | Resulting Action | For more information, see these sections
1. | Check the power supply fault LED. | The amber Fault LED indicates that the power cord is unplugged or the power supply is faulty. • If the Fault LED is lit, go to Action 2. |
2. | Check the power cord. | Connect the power cord. • If the Fault LED is still lit, replace the faulty power supply. • If the green LEDs are lit, go to Action 3. | “To Remove the Power Supply” on page 61 and “To Replace the Power Supply” on page 62
3. | Run the ALOM showfaults command. | The showfaults command displays faults detected by the system firmware. • If faults are displayed, go to Action 4. • If no faults are displayed, go to Action 6. | “To Run the showfaults Command” on page 21
4. | Check fault message for a Sun Message ID. | Sun Message IDs (SUNW-MSG-ID) indicate that information is available from Sun’s knowledge article database. • If you have a message ID number, go to Action 5. • If you do not have a message ID number, go to Action 10. | “Using the Solaris Predictive Self-Healing Feature” on page 35
5. | Enter the Sun Message ID into the Sun Knowledge Article web site. | Enter the Sun Message ID number into the knowledge article web site at http://www.sun.com/msg and go to Action 6. |
6. | Analyze the suggested actions. | In some cases, fault-related messages are identified with suggested actions. • If the suggested action recommends replacing a FRU, go to Action 9. • If the suggested action does not recommend replacing a FRU, perform the suggested action. Contact Sun for additional support, if needed. | Sun Support information: http://www.sun.com/service/contacting
7. | Run the ALOM showenvironment command. | The showenvironment command reports over-temperature conditions when the ambient room temperature exceeds the upper limit. | “To Run the showenvironment Command” on page 22
8. | Identify the cause of the over-temperature condition. | The over-temperature condition may be caused by excessive ambient room temperature, an overheating power supply, or a faulty fan tray assembly. • If ambient room temperature is too high, reduce room temperature. • If the over-temperature condition still exists, go to Action 9. • If the over-temperature condition does not exist, go to Action 10. | “To Remove the Fan Tray Assembly” on page 60 and “To Replace the Fan Tray Assembly” on page 61. “To Remove the Power Supply” on page 61 and “To Replace the Power Supply” on page 62
9. | Identify the faulty FRU. | The FRUs require that you shut down the server to perform a cold-swap. After replacing the faulty FRU, go to Action 14. |
10. | Check the Solaris log files for fault information. | The Solaris message buffer and log files record system events and can provide information about faults. • If system messages indicate a faulty device, replace the FRU (Action 11). • To obtain more diagnostic information, go to Action 7. | “Collecting Information From Solaris OS Files and Commands” on page 39
11. | Run POST. | POST performs basic tests of the server components and reports faulty FRUs. • If POST indicates a faulty FRU, replace the FRU (Action 9). • If POST does not indicate a faulty FRU, go to Action 12. | “Running POST” on page 27
12. | Run SunVTS. | SunVTS provides tests used to exercise and diagnose FRUs. To run SunVTS, the server must be running the Solaris OS. • If SunVTS reports a faulty device, replace the FRU (Action 9). • If SunVTS does not report a faulty device, go to Action 11. | “Exercising the System with SunVTS” on page 43
13. | Replace faulty FRU. | The FRUs require that you shut down the server to perform a cold-swap. After replacing the faulty FRU, go to Action 14. | “Removing and Replacing FRUs” on page 51
14. | Verify the repair. | Various commands and utilities can be used to verify the functionality of the system components. Two useful commands are: • The ALOM showfaults command • The ASR showcomponent command. If the FRU is blacklisted, you can manually remove it from the blacklist with the enablecomponent command. If the fault is cleared, and the component is not blacklisted, the repair is verified well enough to boot the server. For added assurance, you can run the SunVTS diagnostic software. | “To Run the showfaults Command” on page 21. “Managing System Components with Automatic System Recovery Commands” on page 40. “Exercising the System with SunVTS” on page 43
15. | Contact Sun for Support. | The majority of hardware faults are detected by the server’s diagnostics. In rare cases it is possible that a problem requires additional troubleshooting. If you are unable to determine the cause of the problem, contact Sun for support. | Sun Support information: http://www.sun.com/service/contacting
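The order of checks in TABLE 2-1 can be summarized as a small decision function. The sketch below is one simplified reading of the table, not a replacement for it: every field name is hypothetical, each field stands for something a technician has observed, and several branches of the table are collapsed.

```python
from dataclasses import dataclass


@dataclass
class Observations:
    """Findings gathered so far; all field names are hypothetical."""
    ps_fault_led_lit: bool = False
    showfaults_reports_fault: bool = False
    has_sun_message_id: bool = False
    overtemp_reported: bool = False
    solaris_logs_show_faulty_fru: bool = False
    post_reports_fault: bool = False
    sunvts_reports_fault: bool = False


def next_step(obs: Observations) -> str:
    """Suggest the next diagnostic step, following the order of TABLE 2-1."""
    if obs.ps_fault_led_lit:
        return "Check the power cord; replace the power supply if the LED stays lit"
    if obs.showfaults_reports_fault:
        if obs.has_sun_message_id:
            return "Look up the Sun Message ID in the knowledge article database"
        return "Check the Solaris log files"
    if obs.overtemp_reported:
        return "Identify the cause of the over-temperature condition"
    if (obs.solaris_logs_show_faulty_fru
            or obs.post_reports_fault
            or obs.sunvts_reports_fault):
        return "Replace the faulty FRU, then verify the repair"
    return "Contact Sun for support"


print(next_step(Observations(showfaults_reports_fault=True,
                             has_sun_message_id=True)))
```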

Using LEDs to Identify the State of Devices

The Sun Fire T1000 server provides the following groups of LEDs:
Front and rear panel LEDs (FIGURE 2-2, FIGURE 2-3, and TABLE 2-2)
Power supply LEDs (FIGURE 2-3 and TABLE 2-3)
These LEDs provide a quick visual check of the state of the system.
FIGURE 2-2 Sun Fire T1000 Server Front Panel
Callouts: Power OK LED/power on/off button, Service required LED, Locator LED
FIGURE 2-3 Sun Fire T1000 Server Rear Panel LEDs
Callouts: Fault LED, AC OK LED, DC OK LED, Power OK LED, Service required LED, Locator LED, DB9 serial port, Ethernet ports, System console ports, Link LEDs, Activity LEDs