Sun Microsystems Fire X4100, Fire X4200 M2, Fire X4100 M2, Fire X4200 User Manual

Sun Fire™ X4100/X4100 M2
and X4200/X4200 M2
Servers Diagnostics Guide
Sun Microsystems, Inc. www.sun.com
Part No. 819-3284-17 May 2007, Revision A
Submit comments about this document at: http://www.sun.com/hwdocs/feedback
limitation, these intellectual property rights may include one or more of the U.S. patents listed at http://www.sun.com/patents and one or more additional patents or pending patent applications in the U.S. and in other countries.
This document and the product to which it pertains are distributed under licenses restricting their use, copying, distribution, and decompilation. No part of the product or of this document may be reproduced in any form by any means without prior written authorization of Sun and its licensors, if any.
Third-party software, including font technology, is copyrighted and licensed from Sun suppliers. Parts of the product may be derived from Berkeley BSD systems, licensed from the University of California. UNIX is a registered trademark in the U.S. and in other countries, exclusively licensed through X/Open Company, Ltd. Sun, Sun Microsystems, the Sun logo, Java, AnswerBook2, docs.sun.com, Sun Fire, SunVTS, and Solaris are trademarks or registered trademarks of Sun Microsystems, Inc. in the U.S. and in other countries. All SPARC trademarks are used under license and are trademarks or registered trademarks of SPARC International, Inc. in the U.S. and in other countries. Products bearing SPARC trademarks are based upon an architecture developed by Sun Microsystems, Inc.
The OPEN LOOK and Sun™ Graphical User Interface was developed by Sun Microsystems, Inc. for its users and licensees. Sun acknowledges the pioneering efforts of Xerox in researching and developing the concept of visual or graphical user interfaces for the computer industry. Sun holds a non-exclusive license from Xerox to the Xerox Graphical User Interface, which license also covers Sun’s licensees who implement OPEN LOOK GUIs and otherwise comply with Sun’s written license agreements.
U.S. Government Rights - Commercial use. Government users are subject to the Sun Microsystems, Inc. standard license agreement and applicable provisions of the FAR and its supplements.
DOCUMENTATION IS PROVIDED "AS IS" AND ALL EXPRESS OR IMPLIED CONDITIONS, REPRESENTATIONS AND WARRANTIES, INCLUDING ANY IMPLIED WARRANTY OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE OR NON-INFRINGEMENT, ARE DISCLAIMED, EXCEPT TO THE EXTENT THAT SUCH DISCLAIMERS ARE HELD TO BE LEGALLY INVALID.

Contents

Preface vii
1. Initial Inspection of the Server 1
Service Visit Troubleshooting Flowchart 1
Gathering Service Visit Information 3
Serial Number Locations 3
System Inspection 4
Troubleshooting Power Problems 4
Externally Inspecting the Server 4
Internally Inspecting the Server 5
Troubleshooting DIMM Problems 8
How DIMM Errors Are Handled By the System 8
Uncorrectable DIMM Errors 8
Correctable DIMM Errors 9
BIOS DIMM Error Messages 9
DIMM Fault LEDs 11
DIMM Population Rules 14
Sun Fire X4100/X4200 Rules 14
Sun Fire X4100 M2/X4200 M2 Rules 15
Isolating and Correcting DIMM ECC Errors 16
2. Diagnostic Testing Software 19
SunVTS Diagnostic Tests 19
SunVTS Documentation 20
Diagnosing Server Problems With the Bootable Diagnostics CD 20
Requirements 20
Using the Bootable Diagnostics CD 21
A. BIOS Event Logs and POST Codes 23
Viewing BIOS Event Logs 23
Power-On Self-Test (POST) 25
How BIOS POST Memory Testing Works 25
Redirecting Console Output 26
Changing POST Options 27
POST Codes 28
POST Code Checkpoints 30
B. Status Indicator LEDs 35
External Status Indicator LEDs 35
Internal Status Indicator LEDs 39
C. Using the ILOM SP GUI to View System Information 43
Making a Serial Connection to the SP 44
Viewing ILOM SP Event Logs 45
Interpreting Event Log Time Stamps 47
Viewing Replaceable Component Information 48
Viewing Temperature, Voltage, and Fan Sensor Readings 50
D. Using IPMItool to View System Information 55
About IPMI 56
About IPMItool 56
iv Sun Fire X4100/X4100 M2 and X4200/X4200 M2 Servers Diagnostics Guide • May 2007
IPMItool Man Page 56
Connecting to the Server With IPMItool 57
Enabling the Anonymous User 57
Changing the Default Password 58
Configuring an SSH Key 58
Using IPMItool to Read Sensors 59
Reading Sensor Status 59
Reading All Sensors 59
Reading Specific Sensors 60
Using IPMItool to View the ILOM SP System Event Log 62
Viewing the SEL With IPMItool 62
Clearing the SEL With IPMItool 63
Using the Sensor Data Repository (SDR) Cache 64
Sensor Numbers and Sensor Names in SEL Events 64
Viewing Component Information With IPMItool 65
Viewing and Setting Status LEDs 66
LED Sensor IDs 66
LED Modes 68
LED Sensor Groups 68
Using IPMItool Scripts For Testing 69
E. Error Handling 71
Handling of Uncorrectable Errors 71
Handling of Correctable Errors 74
Handling of Parity Errors (PERR) 76
Handling of System Errors (SERR) 79
Handling Mismatching Processors 81
Hardware Error Handling Summary 82
Preface
This Guide contains information and procedures for troubleshooting problems with the servers.
Before You Read This Document
It is important that you review the safety guidelines in the Sun Fire X4100/X4100 M2 and X4200/X4200 M2 Servers Safety and Compliance Guide (819-1161).
Using UNIX Commands
This document might not contain information about basic UNIX® commands and procedures such as shutting down the system, booting the system, and configuring devices. Refer to the following for this information:
Software documentation that you received with your system
Solaris™ Operating System documentation, which is at:
http://docs.sun.com
Related Documentation
For a description of the document set for these servers, see the Where To Find Documentation sheet that is packed with your system and also posted at the product's documentation site. See the following URL, then navigate to your product.
http://www.sun.com/documentation
Translated versions of some of these documents are available in French, Simplified Chinese, Traditional Chinese, Korean, and Japanese at the web site described above. English documentation is revised more frequently and might be more up-to-date than the translated documentation.
For all Sun hardware documentation, see the following URL:
http://www.sun.com/documentation
For Solaris and other software documentation, see the following URL:
http://docs.sun.com
Typographic Conventions

Typeface*   Meaning                                  Examples
AaBbCc123   The names of commands, files, and        Edit your .login file.
            directories; on-screen computer output   Use ls -a to list all files.
                                                     % You have mail.
AaBbCc123   What you type, when contrasted with      % su
            on-screen computer output                Password:
AaBbCc123   Book titles, new words or terms, words   Read Chapter 6 in the User’s Guide.
            to be emphasized. Replace command-line   These are called class options.
            variables with real names or values.     You must be superuser to do this.
                                                     To delete a file, type rm filename.

* The settings on your browser might differ from these settings.

Third-Party Web Sites
Sun is not responsible for the availability of third-party web sites mentioned in this document. Sun does not endorse and is not responsible or liable for any content, advertising, products, or other materials that are available on or through such sites or resources. Sun will not be responsible or liable for any actual or alleged damage or loss caused by or in connection with the use of or reliance on any such content, goods, or services that are available on or through such sites or resources.
Sun Welcomes Your Comments
Sun is interested in improving its documentation and welcomes your comments and suggestions. You can submit your comments by going to:
http://www.sun.com/hwdocs/feedback
Please include the title and part number of your document with your feedback:
Sun Fire X4100/X4100 M2 and X4200/X4200 M2 Servers Diagnostics Guide, part number 819-3284-17
CHAPTER
1

Initial Inspection of the Server

Note – This chapter applies to all Sun Fire X4100/X4100 M2 and X4200/X4200 M2
servers, unless otherwise noted.

Service Visit Troubleshooting Flowchart

Use the following flowchart as a guideline for using the subjects in this book to troubleshoot the server.
To perform each task, refer to the sections listed under it:
Gather initial service visit information.
    “Gathering Service Visit Information” on page 3
Investigate any powering-on problems.
    “Troubleshooting Power Problems” on page 4
Perform external visual inspection and internal visual inspection.
    “Externally Inspecting the Server” on page 4
    “Internally Inspecting the Server” on page 5
    “Troubleshooting DIMM Problems” on page 8
View BIOS event logs and POST messages.
    “Viewing BIOS Event Logs” on page 23
    “Power-On Self-Test (POST)” on page 25
View service processor logs and sensor information.
    “Using the ILOM SP GUI to View System Information” on page 43
View service processor logs and sensor information.
    “Using IPMItool to View System Information” on page 55
Run SunVTS diagnostics.
    “Diagnosing Server Problems With the Bootable Diagnostics CD” on page 20
FIGURE 1-1 Troubleshooting Flowchart

Gathering Service Visit Information

The first step in determining the cause of the problem with the server is to gather whatever information you can from the service call paperwork or the on-site personnel. Use the following general guidelines when you begin troubleshooting.
1. Collect information about the following items:
Events that occurred prior to the failure
Whether any hardware or software was modified or installed
Whether the server was recently installed or moved
How long the server exhibited symptoms
The duration or frequency of the problem
2. Document the server settings before you make any changes.
If possible, make one change at a time, in order to isolate potential problems. In this way, you can maintain a controlled environment and reduce the scope of troubleshooting.
3. Take note of the results of any change you make. Include any errors or informational messages.
4. Check for potential device conflicts before you add a new device.
5. Check for version dependencies, especially with third-party software.

Serial Number Locations

The system serial number is located on a sticker that is attached to the front bezel (see
FIGURE 1-2 or FIGURE 1-3 for the location).
If the bezel is missing, a second serial number label is affixed to the system:
For Sun Fire X4100/X4100 M2 servers, the second sticker is attached to the top of the chassis.
For Sun Fire X4200/X4200 M2 servers, the second sticker is attached to the side of the chassis. If you are facing the chassis front, the sticker is on the left side near the front.
Chapter 1 Initial Inspection of the Server 3

System Inspection

Improperly set controls and loose or improperly connected cables are common causes of problems with hardware components.

Troubleshooting Power Problems

If the server will power on, skip this section and go to “Externally Inspecting the
Server” on page 4.
If the server will not power on, check this list of items:
1. Check that AC power cords are attached firmly to the server’s power supplies and to the AC source.
2. Check that both the main cover and rear cover are firmly in place.
There is an intrusion switch on the front I/O board that automatically shuts down the server power to standby mode when the covers are removed.

Externally Inspecting the Server

To perform a visual inspection of the external system:
1. Inspect the external status indicator LEDs, which can indicate component malfunction.
For the LED locations and descriptions of their behavior, see “External Status
Indicator LEDs” on page 35.
2. Verify that nothing in the server environment is blocking air flow or making a contact that could short out power.
3. If the problem is not evident, continue with “Internally Inspecting the Server” on
page 5.

Internally Inspecting the Server

Perform a visual inspection of the internal system by following these steps. Stop when you identify the problem.
1. Choose a method for shutting down the server from main power mode to standby power mode.
Graceful shutdown: Use a ballpoint pen or other stylus to press and release the
Power button on the front panel. This causes Advanced Configuration and Power Interface (ACPI) enabled operating systems to perform an orderly shutdown of the operating system. Servers not running ACPI-enabled operating systems will shut down to standby power mode immediately.
Emergency shutdown: Use a ballpoint pen or other stylus to press and hold the
Power button for four seconds to force main power off and enter standby power mode.
When main power is off, the Power/OK LED on the front panel will begin flashing, indicating that the server is in standby power mode.
Caution – When you use the Power button to enter standby power mode, power is
still directed to the graphics-redirect and service processor (GRASP) board and power supply fans, indicated when the Power/OK LED is flashing. To completely power off the server, you must disconnect the AC power cords from the back panel of the server.
FIGURE 1-2 Sun Fire X4100/X4100 M2 Server Front Panel (callouts: Power button, Power/OK LED, serial number sticker on bezel)
FIGURE 1-3 Sun Fire X4200/X4200 M2 Server Front Panel (callouts: Power button, Power/OK LED, serial number sticker on bezel)
2. Remove the server covers, as required.
For instructions on removing system covers, refer to the Sun Fire X4100/X4100 M2 and Sun Fire X4200/X4200 M2 Servers Service Manual, 819-1157.
3. Inspect the internal status indicator LEDs, which can indicate component malfunction.
For the LED locations and descriptions of their behavior, see “Internal Status
Indicator LEDs” on page 39.
Note – You can hold down the Locate button on the server back panel or front panel
for 5 seconds to initiate a “push-to-test” mode that illuminates all other LEDs both inside and outside of the chassis for 15 seconds.
4. Verify that there are no loose or improperly seated components.
5. Verify that all cable connectors inside the system are firmly and correctly attached to their appropriate connectors.
6. Verify that any after-factory components are qualified and supported.
For a list of supported PCI cards and DIMMs, refer to the Sun Fire X4100/X4100 M2 and Sun Fire X4200/X4200 M2 Servers Service Manual, 819-1157.
7. Check that the installed DIMMs comply with the supported DIMM population rules and configurations, as described in “Troubleshooting DIMM Problems” on
page 8.
8. Replace the server covers.
9. To restore main power mode to the server (all components powered on), use a ballpoint pen or other pointed object to press and release the Power button on the server front panel. See FIGURE 1-2 or FIGURE 1-3.
When main power is applied to the full server, the Power/OK LED next to the Power button lights and remains lit.
10. If the problem with the server is not evident, you can try viewing the power-on self test (POST) messages and BIOS event logs during system startup. Continue with “Viewing BIOS Event Logs” on page 23.

Troubleshooting DIMM Problems

Use this section to troubleshoot problems with memory modules, or DIMMs.
Note – For information on Sun’s DIMM replacement policy for x64 servers, contact
your Sun Service representative.

How DIMM Errors Are Handled By the System

Uncorrectable DIMM Errors
For all operating systems (OS), the behavior is the same:
When an uncorrectable (UC) error occurs, the memory controller causes an immediate reboot of the system.
During the reboot, the BIOS checks the NorthBridge memory controller’s “Machine Check” registers, determines that the previous reboot was caused by an uncorrectable ECC error (this also applies to PERR and SERR), and reports this in POST after the memtest stage:
A Hypertransport Sync Flood occurred on last boot.
The event is also recorded in the service processor’s System Event Log (SEL), as follows:
# ipmitool -H 10.6.77.249 -U root -P changeme -I lanplus sel list
f000 | 02/16/2006 | 03:32:38 | OEM #0x12 |
f100 | OEM record e0 | 00000000040f0c0200200000a2
f200 | OEM record e0 | 01000000040000000000000000
f300 | 02/16/2006 | 03:32:50 | Memory | Uncorrectable ECC | CPU 1 DIMM 0
f400 | 02/16/2006 | 03:32:50 | Memory | Memory Device Disabled | CPU 1 DIMM 0
f500 | 02/16/2006 | 03:32:55 | System Firmware Progress | Motherboard initialization
f600 | 02/16/2006 | 03:32:55 | System Firmware Progress | Video initialization
f700 | 02/16/2006 | 03:33:01 | System Firmware Progress | USB resource configuration
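When many records are present, it can help to filter the SEL listing for memory events. The following is a minimal sketch, assuming the pipe-delimited layout shown in the listing above; the names `SelEvent`, `parse_sel`, and `memory_faults` are hypothetical helpers for illustration and are not part of IPMItool.

```python
from dataclasses import dataclass
from typing import List, Optional

# Sample text copied from the SEL listing in this section.
SAMPLE = """\
f000 | 02/16/2006 | 03:32:38 | OEM #0x12 |
f100 | OEM record e0 | 00000000040f0c0200200000a2
f200 | OEM record e0 | 01000000040000000000000000
f300 | 02/16/2006 | 03:32:50 | Memory | Uncorrectable ECC | CPU 1 DIMM 0
f400 | 02/16/2006 | 03:32:50 | Memory | Memory Device Disabled | CPU 1 DIMM 0
f500 | 02/16/2006 | 03:32:55 | System Firmware Progress | Motherboard initialization
"""


@dataclass
class SelEvent:
    record_id: str
    date: str
    time: str
    sensor: str
    event: str
    detail: Optional[str]


def parse_sel(text: str) -> List[SelEvent]:
    """Parse timestamped SEL lines; lines without a date field are skipped."""
    events = []
    for line in text.splitlines():
        fields = [f.strip() for f in line.split("|")]
        # Assumption: timestamped records carry id, date (MM/DD/YYYY), time, sensor.
        if len(fields) < 4 or "/" not in fields[1]:
            continue  # OEM records and blank lines
        rest = fields[4:]
        event = rest[0] if rest else ""
        detail = rest[1] if len(rest) > 1 and rest[1] else None
        events.append(
            SelEvent(fields[0], fields[1], fields[2], fields[3], event, detail)
        )
    return events


def memory_faults(events: List[SelEvent]) -> List[SelEvent]:
    """Keep only the records whose sensor field is 'Memory'."""
    return [e for e in events if e.sensor == "Memory"]


if __name__ == "__main__":
    for e in memory_faults(parse_sel(SAMPLE)):
        print(e.date, e.time, e.event, e.detail)
```

The sketch works on saved text, so it can be tried without access to a service processor; the field layout of other firmware versions might differ.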
Correctable DIMM Errors
At this time, correctable errors are not logged in the server’s system event logs. They are reported or handled in the supported operating systems as follows:
Windows server:
A Machine Check error message bubble appears on the taskbar.
To see the errors, you must open Event Viewer manually:
Start > Administrative Tools > Event Viewer
Open individual errors (listed by time) to see the details of each error.
Solaris:
There is no reporting of correctable errors in Solaris x86 at this time.
Linux:
There is no reporting of correctable errors in the Linux distributions that we
support at this time.
BIOS DIMM Error Messages
BIOS will display and log three types of error messages:
NODE-n Memory Configuration Mismatch
The following conditions will cause this error message:
DIMMs are not paired (running in 64-bit mode instead of 128-bit mode)
DIMM speeds are not the same
DIMMs do not support ECC
DIMMs are not registered
MCT stopped due to errors in a DIMM
DIMM module type (buffer) mismatch
DIMM generation (I/II) mismatch
DIMM CL/T mismatch
Banks on a two-sided DIMM mismatch
DIMM organization mismatch (128-bit)
SPD is missing Trc or Trfc information
NODE-n Paired DIMMs Mismatch
The following conditions will cause this error message:
Paired DIMMs are not the same (checksum mismatch)
NODE-n DIMMs Manufacturer Mismatch
The following conditions will cause this error message:
The DIMM manufacturer is not supported
Only Samsung, Micron, Infineon, and SMART DIMMs are supported
For example, this message is displayed when you add Hitachi DIMMs
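The three message types and their causes can be kept as a small lookup table for use when reviewing captured POST output. This is an illustrative sketch only: it assumes the messages appear in the text exactly as listed above and that the "n" in NODE-n is a decimal number; `CAUSES` and `explain` are hypothetical names.

```python
import re
from typing import List, Optional, Tuple

# Causes transcribed from the lists in this section.
CAUSES = {
    "Memory Configuration Mismatch": [
        "DIMMs are not paired (64-bit mode instead of 128-bit mode)",
        "DIMM speeds are not the same",
        "DIMMs do not support ECC",
        "DIMMs are not registered",
        "MCT stopped due to errors in a DIMM",
        "DIMM module type (buffer) mismatch",
        "DIMM generation (I/II) mismatch",
        "DIMM CL/T mismatch",
        "Banks on a two-sided DIMM mismatch",
        "DIMM organization mismatch (128-bit)",
        "SPD is missing Trc or Trfc information",
    ],
    "Paired DIMMs Mismatch": [
        "Paired DIMMs are not the same (checksum mismatch)",
    ],
    "DIMMs Manufacturer Mismatch": [
        "DIMM manufacturer is not supported",
    ],
}

# Assumption: the node number is decimal and the message ends the line.
_PATTERN = re.compile(r"NODE-(\d+)\s+(.+)$")


def explain(message: str) -> Optional[Tuple[int, str, List[str]]]:
    """Return (node, message type, possible causes), or None if unrecognized."""
    match = _PATTERN.search(message.strip())
    if match is None:
        return None
    kind = match.group(2).strip()
    causes = CAUSES.get(kind)
    if causes is None:
        return None
    return int(match.group(1)), kind, causes


if __name__ == "__main__":
    print(explain("NODE-1 Paired DIMMs Mismatch"))
```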

DIMM Fault LEDs

The ejectors on the DIMM slots on the motherboard contain DIMM fault LEDs.
Note the following differences between the Sun Fire X4100/X4200 and the X4100 M2/X4200 M2 servers regarding the power requirements for viewing the DIMM fault LEDs:
Sun Fire X4100/X4200 servers only: To see the DIMM fault LEDs, you must put
the server in standby power mode, with the AC power cords attached. See “Internally Inspecting the Server” on page 5.
Sun Fire X4100 M2/X4200 M2 servers only: You can view the DIMM fault LEDs
without the power cords attached. These LEDs can be lit by a capacitor on the motherboard for up to one minute. To light the DIMM fault LEDs from the capacitor, push the small button on the motherboard labeled “DIMM SW2.” See
FIGURE 1-5.
Note – The DIMM fault LEDs always indicate a failed DIMM pair, with the LEDs lit
on both slots of the pair that contains the failed DIMM. See “Isolating and Correcting
DIMM ECC Errors” on page 16 for a procedure to determine which DIMM of the pair
is faulty.
FIGURE 1-4 shows the numbering of the Sun Fire X4100/X4200 DIMM slots.
FIGURE 1-5 shows the numbering of the Sun Fire X4100 M2/X4200 M2 DIMM slots.
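Because the fault LEDs always light on both slots of a pair, a reported slot can be mapped to the two slots whose LEDs should be lit. The sketch below uses the pairings given with FIGURE 1-4 and FIGURE 1-5; the model keys and the function name `lit_slots` are hypothetical and chosen only for this illustration.

```python
from typing import Dict, List, Tuple

# Pairings as stated with FIGURE 1-4 and FIGURE 1-5.
PAIRS: Dict[str, List[Tuple[str, str]]] = {
    "X4100/X4200": [("0", "1"), ("2", "3")],
    "X4100 M2/X4200 M2": [("B1", "A1"), ("B0", "A0")],
}


def lit_slots(model: str, failed_slot: str) -> Tuple[str, str]:
    """Return both slots of the pair that contains the failed DIMM."""
    for pair in PAIRS[model]:
        if failed_slot in pair:
            return pair
    raise ValueError(f"unknown slot {failed_slot!r} for {model}")


if __name__ == "__main__":
    print(lit_slots("X4100/X4200", "1"))
    print(lit_slots("X4100 M2/X4200 M2", "A0"))
```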
FIGURE 1-4 Sun Fire X4100/X4200 DIMM Slot Locations
Figure callouts: back panel of server; CPU1 and CPU0; DIMM slots labeled DIMM 3, DIMM 1, DIMM 2, DIMM 0 for each CPU; DIMM fault LEDs in the DIMM ejector levers; board labels FT0, FT1, FM0, FM1, FM2.
Pair 0 = DIMM 0 + DIMM 1
Pair 1 = DIMM 2 + DIMM 3
FIGURE 1-5 Sun Fire X4100 M2/X4200 M2 DIMM Slot Locations
Figure callouts: DIMM SW2 button; back panel of server; CPU1 and CPU0; DIMM slots labeled DIMM A0, DIMM B0, DIMM A1, DIMM B1 for each CPU; DIMM fault LEDs in the DIMM ejector levers; board labels FT0, FT1, FM0, FM1, FM2.
Pair 0 = DIMM B1 + DIMM A1
Pair 1 = DIMM B0 + DIMM A0

DIMM Population Rules

Note – The Sun Fire X4100/X4200 servers use only DDR1 DIMMs. The Sun Fire
X4100 M2/X4200 M2 servers use only DDR2 DIMMs.
Sun Fire X4100/X4200 Rules
The DIMM population rules for the Sun Fire X4100/X4200 servers are listed here:
Each CPU can support a maximum of four DDR1 DIMMs.
Each pair of DIMMs must be identical (same manufacturer, size, and speed).
The DIMM slots are paired and the DIMMs must be installed in pairs (0 and 1,
2 and 3). The memory sockets are colored black or white to indicate which slots are paired by matching colors.
CPUs with only a single pair of DIMMs must have those DIMMs installed in that
CPU’s white DIMM slots (0 and 1).
See TABLE 1-1 for supported DIMM configurations.
TABLE 1-1 Sun Fire X4100/X4200 Supported DIMM Configurations (DDR1 Only)
Slot 3    Slot 1    Slot 2    Slot 0    Total Memory Per CPU
0         512 MB    0         512 MB    1 GB
512 MB    512 MB    512 MB    512 MB    2 GB
512 MB    1 GB      512 MB    1 GB      3 GB
512 MB    2 GB      512 MB    2 GB      5 GB
512 MB    4 GB      512 MB    4 GB      9 GB
0         1 GB      0         1 GB      2 GB
1 GB      512 MB    1 GB      512 MB    3 GB
1 GB      1 GB      1 GB      1 GB      4 GB
1 GB      2 GB      1 GB      2 GB      6 GB
1 GB      4 GB      1 GB      4 GB      10 GB
0         2 GB      0         2 GB      4 GB
2 GB      512 MB    2 GB      512 MB    5 GB
2 GB      1 GB      2 GB      1 GB      6 GB
2 GB      2 GB      2 GB      2 GB      8 GB
2 GB      4 GB      2 GB      4 GB      12 GB
0         4 GB      0         4 GB      8 GB
4 GB      4 GB      4 GB      4 GB      16 GB
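The written population rules for the Sun Fire X4100/X4200 can be expressed as a short check. This is a sketch under stated assumptions, not a supported tool: it compares DIMM sizes only (the rules also require matching manufacturer and speed), the allowed sizes are those that appear in TABLE 1-1, and `check_cpu_dimms` is a hypothetical name. TABLE 1-1 remains the authority; it does not list every combination that passes this check.

```python
from typing import Dict

PAIRS = ((0, 1), (2, 3))                   # slots 0 and 1 are the white pair
ALLOWED_MB = {0, 512, 1024, 2048, 4096}    # sizes seen in TABLE 1-1; 0 = empty


def check_cpu_dimms(sizes_mb: Dict[int, int]) -> int:
    """Validate one CPU's four slots and return the total memory in MB.

    sizes_mb maps slot number (0 to 3) to DIMM size in MB.
    Raises ValueError when a written population rule is broken.
    """
    if set(sizes_mb) != {0, 1, 2, 3}:
        raise ValueError("exactly slots 0, 1, 2, and 3 must be given")
    for slot, size in sizes_mb.items():
        if size not in ALLOWED_MB:
            raise ValueError(f"slot {slot}: unexpected size {size} MB")
    for a, b in PAIRS:
        # Size only; manufacturer and speed must also match per the rules.
        if sizes_mb[a] != sizes_mb[b]:
            raise ValueError(f"slots {a} and {b} must hold identical DIMMs")
    if sizes_mb[0] == 0 and sizes_mb[2] != 0:
        raise ValueError("a single pair must be installed in slots 0 and 1")
    return sum(sizes_mb.values())


if __name__ == "__main__":
    # Row "512 MB, 4 GB, 512 MB, 4 GB" of TABLE 1-1 (slots 3, 1, 2, 0).
    print(check_cpu_dimms({3: 512, 1: 4096, 2: 512, 0: 4096}))
```

Checking the returned total against the last column of the table is a quick way to confirm that a row was transcribed correctly.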
Sun Fire X4100 M2/X4200 M2 Rules
The DIMM population rules for the Sun Fire X4100 M2/X4200 M2 servers are listed here:
Each CPU can support a maximum of four DDR2 DIMMs.
Each pair of DIMMs must be identical (same manufacturer, size, and speed).
The DIMM slots are paired and the DIMMs must be installed in pairs (A1 and B1,
A0 and B0). The memory sockets are colored black or white to indicate which slots are paired by matching colors.
CPUs with only a single pair of DIMMs must have those DIMMs installed in that
CPU’s white DIMM slots (A1 and B1).
See TABLE 1-2 for supported DIMM configurations.
TABLE 1-2 Sun Fire X4100 M2/X4200 M2 Supported DIMM Configurations (DDR2 Only)
Slot A1   Slot B1   Slot A0   Slot B0   Total Memory Per CPU
1 GB      1 GB      0         0         2 GB
1 GB      1 GB      1 GB      1 GB      4 GB
2 GB      2 GB      1 GB      1 GB      6 GB
4 GB      4 GB      1 GB      1 GB      10 GB
2 GB      2 GB      0         0         4 GB
2 GB      2 GB      2 GB      2 GB      8 GB
4 GB      4 GB      2 GB      2 GB      12 GB
4 GB      4 GB      0         0         8 GB
4 GB      4 GB      4 GB      4 GB      16 GB

Isolating and Correcting DIMM ECC Errors

If your log files report an ECC error or a problem with a DIMM, complete the steps below until you can isolate the fault.
Note – The slot numbers given in the following example use the slot numbering
from Sun Fire X4100/X4200 servers. The pair 0+1 is equivalent to pair A1+B1, and pair 2+3 is equivalent to pair A0+B0, in the Sun Fire X4100 M2/X4200 M2 servers.
In this example, the log file reports an error with the DIMM in CPU0, slot 1. The fault LEDs on CPU0, slots 0+1 are lit.
1. If you have not already done so, shut down your server to standby power mode and remove the main cover.
Refer to the Sun Fire X4100 and Sun Fire X4200 Servers Service Manual, 819-1157.
2. Inspect the installed DIMMs to ensure that they comply with the “DIMM
Population Rules” on page 14.
3. Inspect the fault LEDs on the DIMM slot ejectors and the CPU LEDs on the motherboard. See FIGURE 1-4.
If any of these LEDs are lit, they can indicate the component with the fault.
4. Disconnect the AC power cords from the server.
Caution – Before handling components, attach an ESD wrist strap to a chassis
ground (any unpainted metal surface). The system’s printed circuit boards and hard disk drives contain components that are extremely sensitive to static electricity.
5. Remove the DIMMs.
6. Visually inspect the DIMMs for physical damage, dust, or any other contamination on the connector or circuits.
7. Visually inspect the DIMM slot for physical damage. Look for cracked or broken plastic on the slot.
8. Dust off the DIMMs, clean the contacts, and reseat them.
9. If there is no obvious damage, exchange the individual DIMMs between the two slots of a given pair. Ensure that they are inserted correctly with ejector latches secured.
Using the example, remove the DIMMs from CPU0, slots 0+1 then reinstall the DIMM from slot 1 into slot 0; reinstall the DIMM from slot 0 into slot 1.
10. Reconnect AC power cords to the server.
11. Power on the server and run the diagnostics test again.
12. Review the log file.
If the error now appears in CPU0, slot 0 (opposite to the original error in slot 1),
the problem is related to the individual DIMM. In this case, return both DIMMs (the pair) to the Support Center for replacement.
If the error still appears in CPU0, slot 1 (as the original error did), the problem is
not related to an individual DIMM. Instead, it might be caused by CPU0 or by the DIMM slot. Continue with the next step.
13. Shut down the server again and disconnect the AC power cords.
14. Remove both DIMMs of the pair and install them into paired slots on the opposite CPU.
Using the example, install the two DIMMs from CPU0, slots 0+1 into CPU1, slots 0+1 or CPU1, slots 2+3.
15. Reconnect AC power cords to the server.
16. Power on the server and run the diagnostics test again.
17. Review the log file.
If the error now appears under the CPU that manages the DIMM slots you just
installed, the problem is with the DIMMs. Return both DIMMs (the pair) to the Support Center for replacement.
If the error remains with the original CPU, there is a problem with that CPU.
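The decision points in Steps 12 and 17 can be summarized as two small functions. This is a sketch of the logic only, using the Sun Fire X4100/X4200 slot numbering from the example; the function names and the result strings are hypothetical, and any outcome the procedure does not describe is reported as inconclusive.

```python
from typing import Tuple

Location = Tuple[int, int]            # (CPU number, DIMM slot number)
PARTNER = {0: 1, 1: 0, 2: 3, 3: 2}    # other slot of the same pair

REPLACE_PAIR = "return both DIMMs of the pair for replacement"
MOVE_PAIR = "move the pair to the other CPU and retest"
SUSPECT_CPU = "suspect the original CPU"
INCONCLUSIVE = "inconclusive; review the logs again"


def after_swap(original: Location, observed: Location) -> str:
    """Step 12: interpret the log after swapping the DIMMs within the pair."""
    cpu, slot = original
    if observed == (cpu, PARTNER[slot]):
        return REPLACE_PAIR           # the error followed the DIMM
    if observed == original:
        return MOVE_PAIR              # the error stayed with the slot or CPU
    return INCONCLUSIVE


def after_move(original_cpu: int, target_cpu: int, observed_cpu: int) -> str:
    """Step 17: interpret the log after moving the pair to the other CPU."""
    if original_cpu == target_cpu:
        raise ValueError("the pair must be moved to a different CPU")
    if observed_cpu == target_cpu:
        return REPLACE_PAIR           # the error followed the DIMMs
    if observed_cpu == original_cpu:
        return SUSPECT_CPU            # the error stayed with the original CPU
    return INCONCLUSIVE


if __name__ == "__main__":
    # Example from the text: original error in CPU0, slot 1.
    print(after_swap((0, 1), (0, 0)))
    print(after_move(0, 1, 0))
```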
CHAPTER
2

Diagnostic Testing Software

This chapter contains information about diagnostic software tools that you can use.
Note – This chapter applies to all Sun Fire X4100/X4100 M2 and X4200/X4200 M2
servers, unless otherwise noted.

SunVTS Diagnostic Tests

The servers are shipped with a Bootable Diagnostics CD (705-1439) that contains SunVTS™ software.
SunVTS is the Sun Validation Test Suite, which provides a comprehensive diagnostic tool that tests and validates Sun hardware by verifying the connectivity and functionality of most hardware controllers and devices on Sun platforms. SunVTS software can be tailored with modifiable test instances and processor affinity features.
Only the following tests are supported on x86 platforms. The current x86 support is for the 32-bit operating system only.
CD DVD Test (cddvdtest)
CPU Test (cputest)
Disk and Floppy Drives Test (disktest)
Data Translation Look-aside Buffer Test (dtlbtest)
Floating Point Unit Test (fputest)
Network Hardware Test (nettest)
Ethernet Loopback Test (netlbtest)
Physical Memory Test (pmemtest)
Serial Port Test (serialtest)
System Test (systest)
Universal Serial Bus Test (usbtest)
Virtual Memory Test (vmemtest)
SunVTS software has a sophisticated graphical user interface (GUI) that provides test configuration and status monitoring. The user interface can be run on one system to display the SunVTS testing of another system on the network. SunVTS software also provides a TTY-mode interface for situations in which running a GUI is not possible.

SunVTS Documentation

For the most up-to-date information on SunVTS software, go to this site:
http://docs.sun.com/app/docs/coll/1140.2

Diagnosing Server Problems With the Bootable Diagnostics CD

SunVTS software is preinstalled on these servers. The server is also shipped with the Bootable Diagnostics CD (705-1439). This CD is designed so that the server will boot from the CD. This CD will boot the Solaris™ Operating System and start SunVTS software. Diagnostic tests will run and write output to log files that the service technician can use to determine the problem with the server.
Requirements
To use the Bootable Diagnostics CD, you must have a keyboard, mouse, and
monitor attached to the server on which you are performing diagnostics.