FUJITSU T5120 User Manual

SPARC® Enterprise
T5120 and T5220 Servers
Product Notes
Order No. U41752-J-Z816-1-76 Part No. 875-4197-10 October 2007, Revision
Copyright 2007 Sun Microsystems, Inc., 4150 Network Circle, Santa Clara, California 95054, U.S.A. All rights reserved. FUJITSU LIMITED provided technical input and review on portions of this material. Sun Microsystems, Inc. and Fujitsu Limited each own or control intellectual property rights relating to products and technology described in
this document, and such products, technology and this document are protected by copyright laws, patents and other intellectual property laws and international treaties. The intellectual property rights of Sun Microsystems, Inc. and Fujitsu Limited in such products, technology and this document include, without limitation, one or more of the United States patents listed at http://www.sun.com/patents and one or more additional patents or patent applications in the United States or other countries.
This document and the product and technology to which it pertains are distributed under licenses restricting their use, copying, distribution, and decompilation. No part of such product or technology, or of this document, may be reproduced in any form by any means without prior written authorization of Fujitsu Limited and Sun Microsystems, Inc., and their applicable licensors, if any. The furnishing of this document to you does not give you any rights or licenses, express or implied, with respect to the product or technology to which it pertains, and this document does not contain or represent any commitment of any kind on the part of Fujitsu Limited or Sun Microsystems, Inc., or any affiliate of either of them.
This document and the product and technology described in this document may incorporate third-party intellectual property copyrighted by and/or licensed from suppliers to Fujitsu Limited and/or Sun Microsystems, Inc., including software and font technology.
Per the terms of the GPL or LGPL, a copy of the source code governed by the GPL or LGPL, as applicable, is available upon request by the End User. Please contact Fujitsu Limited or Sun Microsystems, Inc.
This distribution may include materials developed by third parties. Parts of the product may be derived from Berkeley BSD systems, licensed from the University of California. UNIX is a registered trademark
in the U.S. and in other countries, exclusively licensed through X/Open Company, Ltd. Sun, Sun Microsystems, the Sun logo, Java, Netra, Solaris, Sun StorEdge, docs.sun.com, OpenBoot, SunVTS, Sun Fire, SunSolve, CoolThreads,
J2EE, and Sun are trademarks or registered trademarks of Sun Microsystems, Inc. in the U.S. and other countries. Fujitsu and the Fujitsu logo are registered trademarks of Fujitsu Limited. All SPARC trademarks are used under license and are registered trademarks of SPARC International, Inc. in the U.S. and other countries.
Products bearing SPARC trademarks are based upon architecture developed by Sun Microsystems, Inc. SPARC64 is a trademark of SPARC International, Inc., used under license by Fujitsu Microelectronics, Inc. and Fujitsu Limited. The OPEN LOOK and Sun™ Graphical User Interface was developed by Sun Microsystems, Inc. for its users and licensees. Sun acknowledges
the pioneering efforts of Xerox in researching and developing the concept of visual or graphical user interfaces for the computer industry. Sun holds a non-exclusive license from Xerox to the Xerox Graphical User Interface, which license also covers Sun’s licensees who implement OPEN LOOK GUIs and otherwise comply with Sun’s written license agreements.
United States Government Rights - Commercial use. U.S. Government users are subject to the standard government user license agreements of Sun Microsystems, Inc. and Fujitsu Limited and the applicable provisions of the FAR and its supplements.
Disclaimer: The only warranties granted by Fujitsu Limited, Sun Microsystems, Inc. or any affiliate of either of them in connection with this document or any product or technology described herein are those expressly set forth in the license agreement pursuant to which the product or technology is provided. EXCEPT AS EXPRESSLY SET FORTH IN SUCH AGREEMENT, FUJITSU LIMITED, SUN MICROSYSTEMS, INC. AND THEIR AFFILIATES MAKE NO REPRESENTATIONS OR WARRANTIES OF ANY KIND (EXPRESS OR IMPLIED) REGARDING SUCH PRODUCT OR TECHNOLOGY OR THIS DOCUMENT, WHICH ARE ALL PROVIDED AS IS, AND ALL EXPRESS OR IMPLIED CONDITIONS, REPRESENTATIONS AND WARRANTIES, INCLUDING WITHOUT LIMITATION ANY IMPLIED WARRANTY OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE OR NON-INFRINGEMENT, ARE DISCLAIMED, EXCEPT TO THE EXTENT THAT SUCH DISCLAIMERS ARE HELD TO BE LEGALLY INVALID. Unless otherwise expressly set forth in such agreement, to the extent allowed by applicable law, in no event shall Fujitsu Limited, Sun Microsystems, Inc. or any of their affiliates have any liability to any third party under any legal theory for any loss of revenues or profits, loss of use or data, or business interruptions, or for any indirect, special, incidental or consequential damages, even if advised of the possibility of such damages.
DOCUMENTATION IS PROVIDED “AS IS” AND ALL EXPRESS OR IMPLIED CONDITIONS, REPRESENTATIONS AND WARRANTIES, INCLUDING ANY IMPLIED WARRANTY OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE OR NON-INFRINGEMENT, ARE DISCLAIMED, EXCEPT TO THE EXTENT THAT SUCH DISCLAIMERS ARE HELD TO BE LEGALLY INVALID.

Contents

Important Information About the SPARC Enterprise T5120 and T5220 Servers
Support for the SPARC Enterprise T5120 and T5220 Servers
Technical Support
Downloading Documentation
Fujitsu Siemens Computers Welcomes Your Comments
Supported Versions of Solaris and Sun Java Enterprise System Software and System Firmware
Updating System Firmware
Preinstalled and Preloaded Software
Cool Tools for Servers With CoolThreads Technology
Logical Domains
Sun Java Enterprise Server and Solaris OS
Solaris Live Upgrade
Sun Studio - C, C++ & Fortran Compilers and Tools
Mandatory Patch Information
To Download Patches
Patches for Option Cards
General Functionality Issues and Limitations
Cryptographic Function
RAID Function
LDOM Manager
Hardware and Mechanical Issues
DVD and USB Module on Front Panel
Replacing Components in the System Chassis
Hotswapping Fan Modules
Unexpected LED Behavior
Solaris OS, Firmware, and General Software Issues
Supported Sun Explorer Utility Version
Solaris OS Issues
Login Prompt Resets Five Seconds After Solaris OS Boots (CR 6607315)
Inconsistent Console Behavior When Not Using the Virtual Console (CR 6581309)
XAUI and CPU Resources Added After Initial LDoms Setup Are Not Available to LDoms Manager (CR 6597815)
Problem When N2 PCIe Link Fails to Train as x8 (CR 6556505)
Diagnosed FBR and FBU Errors to Branch (CR 6536482)
Panic in nxge_start When dupb Fails (CR 6567838)
RX Jumbo Frame Throughput of nxge Drops to 30 Mbps Due to Packet Dropping (CR 6554478)
Physical-Platform Information Missing From prtpicl and prtdiag (CR 6586624)
Solaris Shutdown Hangs and Fewer System Services Are Seen (CR 6588499)
prtdiag Without -v Option Does Not Show Failures in Output (CR 6586847)
Device Paths Wrong in prtdiag When Running e1000g Driver (CR 6479347)
prtdiag -v Formatting Issues (CR 6587389)
Solaris locator Command Not Working (CR 6564180)
prtdiag -v Voltage Indicators Section Has Non-Voltage Indicators (CR 6587380)
prtdiag -v Slow To Respond on 1U (CR 6588550)
Control-C from prtdiag, Blank Environmental Data Fields When Run Again (CR 6552999)
Firmware and General Software Issues
IPMI Chassis Power Cycle Does Not Always Work (CR 6602913)
No keyboard support found; Can’t open USB keyboard package (CR 6601900)
Ierrs Generated When 100Mb/Full With Forced Speed/Duplex is Set in e1000g.conf (CR 6555486)
ereport.fm.ferg.invalid DMU Core and Block Error Status(0) No bits Set (CR 6598381)
USB Device Causes Panic on DMA to Address 0 (CR 6555956)
VBSC Did Not Detect Memory One Time: ERROR: MB/CMP0/BR3/CH0/D0 must be populated (CR 6604305)
rm-io Followed by Multiple set-vcpu Operations Could Cause HV Abort or ldmd Core Dump (CR 6597761)
cryptotest Intermittent SUNW_RANDOM Generate Failure (CR 6572985)
modunload While the nxge Port is Running, Could Cause a System Panic (CR 6551509)
When Either NIU Port in the N2 CPU is Disabled the Corresponding XAUI is Configured (CR 6599334)
Booting Solaris OS From External USB DVD-ROM Drive Could Cause Panic and Fail to Boot (CR 6588452)
SPARC Enterprise T52x0 Hangs When Trying to Boot With Infiniband HBA Card Installed (CR 6578410)
mpt Handles SMART Event From the LSI Firmware (CR 6574127)
Setting Properties for N2/NIU nxge Devices Could Fail (CR 6561389)
ldc_skt and ds Might Not Handle LDC Resets Properly (CR 6583567)
ID 216524 daemon.error Registration With DMI Failed err = 831 (CR 6237994)
Domain ETM and LDC Deadlock When Transmit Queue Full (CR 6594506)
raidctl -h and Man Page Display Some Unsupported Features
Unable to Remove RAID 1 Volume After RAID 1 and RAID 0 Volumes Are Created (CR 6592238)
raidctl -l Continuously Outputs Disk: 0.0.0 (CR 6589612)
IO-Domain-Reset: ERROR: Last Trap: Watchdog Reset (CR 6593547)
XAUI and CPU Resources Added After Initial LDoms Setup Are Not Available to LDoms Manager (CR 6597815)
OpenBoot PROM Banner Shows the Same Memory Number After Disabling a DIMM and Resetting All (CR 6579390)
Guest Domain wanboot miniroot Download Could Take More Than 30 Minutes (CR 6543749)
OpenBoot PROM Variables Cannot Be Modified by eeprom at the OS Prompt When ldmd Is Running (CR 6540368)
Options True False Menu Interrupts OpenBoot PROM Reset (CR 6594395)
Changing OBP nvram Parameters Does Not Take Affect After resetsc (CR 6596594)
Changes to OpenBoot PROM Variables With Nondefault LDoms Configuration Do Not Persist (CR 6593132)
Occasional LDom Warning Message After POST (CR 6592934)
L2 Cache ue Error Injections Produce dau Ereports and Memory Faults (CR 6592272)
Temporary PCIe Link Failure During Boot Causes Fatal Error Later (CR 6553515)
Processor Always Starts on Lowest Available Strand Regardless If asr Disabled (CR 6541482)
Integrated Lights Out Manager (ILOM) Issues
Missing CPU Cores and Strands Shown in PICL Physical-Platform Tree (CR 6596503)
showfaults Shows the Motherboard as Faulty Instead of the DIMM (CR 6582853)
If Socketed EEProm (SCC) is Changed, SP Does Not Always Read Some SP Properties From New EEProm (CR 6596430)
If All Platform Identity Checks Fail (CR 6593801)
Login to SP Times Out Intermittently (CR 6568750)
Use setdefaults Command to Clear Last POST Run From showfaults (CR 6573354)
Mouse Redirection is Extremely Slow or Not Usable With RHEL4U4 Host and Windows XP Client (CR 6502777)
Fewer Than 21 Entries in the Event Log Causes showlogs to Display None of the Events (CR 6589043)
XAUI0 or XAUI1 Disabled With disablecomponent but the Devices are Still Created in OpenBoot PROM (CR 6599333)
uadmin 2 0 and reboot Reads Old Bootmode Settings (CR 6585340)
ALOM CMT Compatibility CLI Displays Incorrect Guest Status Sent From OpenBoot PROM and Solaris (CR 6567748)
Intermittent POST PIU0 Link Train Errors (CR 6571886)
SP Serial Line Terminal Server break Command Does Not Work (CR 6577528)
Write to Virtual Blade Server Control (VBSC) Illegal Seek (CR 6582340)
DIMM FRUs Must Be Presented With the IPMI Interface (CR 6591367)
Intermittent Issue With netsc commit and showsc Hang (CR 6549028)
Use of consolehistory -e Results in SP Becoming Unusable (CR 6587869)
reset /SP With SP ILOM CLI SSH Connection Results in Error Message (CR 6588999)
Missing Fan Board Module Output is Displayed in prtdiag and ILOM CLI and BUI (CR 6595955)
Remove Boot and Run Options From consolehistory Usage (RFE 6510082)
SP useradd and usershow Errors Followed by Hang (CR 6585114)
Cleanup Output Display When Resetting ILOM (CR 6585292)
Documentation Errata

Important Information About the SPARC Enterprise T5120 and T5220 Servers

These product notes contain important and late-breaking information about the SPARC® Enterprise T5120 and T5220 servers.
The following sections are included:
“Support for the SPARC Enterprise T5120 and T5220 Servers”
“Supported Versions of Solaris and Sun Java Enterprise System Software and System Firmware”
“Preinstalled and Preloaded Software”
“Mandatory Patch Information”
“Hardware and Mechanical Issues”
“Solaris OS, Firmware, and General Software Issues”
“Documentation Errata”

Support for the SPARC Enterprise T5120 and T5220 Servers

This section includes where to obtain technical support, software, and documentation.

Technical Support

If you have technical questions or issues that are not addressed in the SPARC Enterprise T5120 or T5220 servers documentation, contact a sales representative or a certified service engineer.

Downloading Documentation

Instructions for installing, administering, and using your servers are provided in the SPARC Enterprise T5120 and T5220 servers documentation sets. The entire documentation set is available for download from the following web site:
http://manuals.fujitsu-siemens.com/
Note – Information in these product notes supersedes the information in the SPARC
Enterprise T5120 and T5220 documentation sets.

Fujitsu Siemens Computers Welcomes Your Comments

If you have any comments or requests regarding this document, or if you find any unclear statements in it, please state your points specifically and forward them to a sales representative or a certified service engineer.
Please include the title and part number of your document with your feedback.

Supported Versions of Solaris and Sun Java Enterprise System Software and System Firmware

The following are the minimum supported versions of firmware and software for this release of the SPARC Enterprise T5120 and T5220 servers:
Solaris 10 8/07 Operating System (OS)
Sun Java Enterprise System 5 software (Sun Java ES 5)
System firmware 7.0.3, which includes Integrated Lights Out Manager (ILOM) 2.0
software, OpenBoot™ 4.27.0 firmware, and Hypervisor software.

Updating System Firmware

The system firmware includes Integrated Lights Out Manager (ILOM) software, OpenBoot firmware, and Hypervisor software.
Firmware updates are available from the following web site through patch releases.
http://support.fujitsu-siemens.com
For details on how to update your system firmware, refer to the SPARC Enterprise T5120 and T5220 Servers Installation Guide. For more details on the flashupdate command, refer to the Integrated Lights Out Manager 2.0 (ILOM 2.0) Supplement for SPARC Enterprise T5120 and T5220 Servers.
Note – Updating your system firmware also updates your ILOM software and
OpenBoot firmware.

Preinstalled and Preloaded Software

This section lists and describes the preinstalled and preloaded software on your server. The preinstalled software is ready to use. The preloaded software must first be installed from the preloaded location.
Note – The Solaris OS is preinstalled both in root disk Slice 0 for normal operations,
and in Slice 3 along with Live Upgrade software to provide an Alternate Boot Environment (ABE). The ABE allows upgrading the OS or performing system maintenance tasks without reducing performance. Just the core OS is installed for an ABE in Slice 3.
Note – Of the following software, the SPARC Enterprise T5120 and T5220 servers
support Solaris OS only.
The following table lists the software preinstalled on your server.
TABLE 1 Preinstalled Software
Software              Location                                                        Function
Solaris 10 8/07       Root disk Slice 0 (and just the core OS on Slice 3 for an ABE)  Operating system
Cool Tools GCC        /opt/gcc and /opt/SUNW0scgfss                                   GCC compiler for SPARC Systems
Corestat              /opt/corestat.v.1.2.1                                           Measures core usage
CMT Tools             /opt/SUNWspro/extra/bin                                         Sun Studio Developer Tools
Sun Studio            /opt/SUNWspro                                                   C, C++, and Fortran compiler
LDoms Manager         /opt/LDoms_Manager-1_0_1-RR/Product and /opt/SUNWldm/           Manages Logical Domains
LDoms MIB             /opt/ldoms_mib and /opt/SUNWldmib                               LDoms Management Information Base
SunUpdate Connection  /usr/platform/sun4v/sbin/sysfwdownload                          Downloads system firmware from Solaris OS host
The following table lists the software preloaded on your server. To use this software you must first install it from the preloaded location.
TABLE 2 Preloaded Software
Software                    Location                             Function
CoolTuner                   /var/spool/stage/cooltuner           Tuning tool for CoolThreads
Cool Stack                  /var/spool/pkg                       Increases server performance
Sun Java Enterprise Server  /var/spool/stage/JES5/Solaris_sparc  Optimizes software investment

Cool Tools for Servers With CoolThreads Technology

Cool Tools provide a collection of freely available tools designed to enable fast and efficient development and deployment of optimally configured software solutions on CoolThreads™ servers.
These tools significantly improve performance and time-to-market for applications running on UltraSPARC® processor-based servers.
An overview of the Cool Tools and full documentation is available at the following URL:
http://www.sun.com/servers/coolthreads/overview/cooltools.jsp
Not all of the Cool Tools listed on the Cool Tools web page are available on your server. The following are not included:
Consolidation Tool
Cooltst
Sun Application Porting Assistant
Note – The Cool Tools GCC compiler and Corestat tools are preinstalled. The
CoolTuner and Cool Stack software is preloaded and must be installed from the preloaded location before using. See TABLE 1 and TABLE 2.

Logical Domains

Using Logical Domains (LDoms) increases your server usage, efficiency, and return on investment, and also reduces your server footprint. The LDoms Manager software creates and manages logical domains, and maps logical domains to physical resources.
Note – The LDoms MIB must be configured before it is ready to use. A README
file with configuration instructions is located in the LDoms MIB installation directory, /opt/ldoms_mib.
For more information on LDoms, go to:
http://www.sun.com/servers/coolthreads/ldoms/

Sun Java Enterprise Server and Solaris OS

The Sun Java Enterprise Server is a comprehensive set of software and lifecycle services that make the most of your software investment.
For an overview and documentation, go to:
http://www.sun.com/service/javaes/index.xml

Solaris Live Upgrade

Solaris Live Upgrade technology significantly reduces service outage during an OS upgrade. This technology enables the Solaris OS to run normally during an upgrade or normal maintenance on an inactive boot environment.
Your server is configured with a liveupgrade partition on slice 3 that contains an exact duplicate of the Solaris OS that is preinstalled in the root partition. This liveupgrade partition is an Alternate Boot Environment (ABE).
For more information about Solaris Live Upgrade, go to:
http://www.sun.com/software/solaris/liveupgrade/

Sun Studio - C, C++ & Fortran Compilers and Tools

Sun Studio delivers high performance by optimizing C, C++, and Fortran compilers for the Solaris OS on multi-core systems.
For an overview and documentation, go to:
http://developers.sun.com/sunstudio/index.jsp

Mandatory Patch Information

Patches are available at:
http://support.fujitsu-siemens.com
The following patches are required for your server:
TABLE 3 Mandatory Patches for Both Servers
Patch IDs             Description                              Fixes Provided
• 122642-01 or later  System panics with n2cp alignment error  These patches fix Change Request (CR) 6590132: System panics (n2cp alignment error) in IPsec testing
• 127753-01 or later
• 113886-46           OpenGL 1.3                               These OpenGL patches fix the following CRs:
• 113887-46           OpenGL 1.3                               • CR 6579285: FTC: overall pass rate decreased between XVR-100 and XVR-300 (by 25%+)
• 120812-20           OpenGL 1.5                               • CR 6579303: CONFORM: mustpass tests fail
• 127741-01 or later  Data integrity in the nxge driver        Fixes issues reported by Sun Alert ID 103076
• 127745-01 or later  IPsec performance                        Fixes CR 6568352: IPsec performance does not scale using hardware crypto providers
Note – Before contacting support, ensure that all mandatory patches are installed on
your server. In addition to applying the periodic PTF, check the above web site on a regular basis for the availability of new patches.
To determine if a patch is present, see “To Download Patches”.
Note – These patches might not be included in some versions of the preinstalled or
preloaded software on your server. If the patches are missing from your server, download them as described in “To Download Patches”.

To Download Patches

1. Determine whether the patches have been installed on your system.
For example, using the showrev command, type the following for each patch number:
# showrev -p | grep "Patch: 119578"
If you see patch information listed for the queried patch, and the dash extension (the last two digits) matches or exceeds the required version, your system has the proper patches already installed and no further action is required. For example, if patch 119578-16 or later is installed, your system has the required version of this patch.
If you do not see patch information listed for the queried patch, or if the dash extension is lower than the required version, go to Step 2. For example, if no version of the 119578 patch is installed, or a version with an extension of -15 or earlier is installed, you must download and install the new patch.
2. Access the above-mentioned web site to download the patches.
3. Follow the installation instructions provided in a specific patch’s README file.
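The revision check in Step 1 can also be scripted. The following is a minimal sketch, not part of the patch tooling: patch_status is a hypothetical helper name, and the script assumes that showrev -p prints one line per installed patch in the form "Patch: ID-REV ...", which is the form the grep example in Step 1 relies on.

```shell
#!/bin/sh
# patch_status ID MINREV  (hypothetical helper, for illustration only)
# Reads `showrev -p`-style text on standard input and prints:
#   ok       if patch ID is installed at revision MINREV or later
#   old      if only earlier revisions of patch ID are installed
#   missing  if no revision of patch ID is listed
# Assumption: each installed patch appears as "Patch: ID-REV ...".
# Assumption: on Solaris 10 you may need nawk or /usr/xpg4/bin/awk,
# because the legacy /usr/bin/awk might not accept the -v option.
patch_status() {
    awk -v id="$1" -v min="$2" '
        $1 == "Patch:" {
            n = split($2, part, "-")          # part[1] = ID, part[2] = revision
            if (n == 2 && part[1] == id) {
                found = 1
                # Remember the highest installed revision of this patch.
                if (part[2] + 0 > best + 0) best = part[2]
            }
        }
        END {
            if (!found)                    print "missing"
            else if (best + 0 >= min + 0)  print "ok"
            else                           print "old"
        }
    '
}
```

On the server, a call such as showrev -p | patch_status 119578 16 would print ok when revision -16 or later is installed. Any result other than ok means you continue with Step 2.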

Patches for Option Cards

If you add option cards to your server, refer to the documentation and README files for each card to determine if additional patches are needed.

General Functionality Issues and Limitations

This section describes general issues known to exist at this release of the SPARC Enterprise T5120 and T5220 servers.

Cryptographic Function

The SPARC Enterprise T5120 and T5220 servers do not currently support the cryptographic function of the UltraSPARC T2 multicore processor.

RAID Function

A hardware RAID function is provided as standard in SPARC Enterprise T5120 and T5220 servers. However, with regard to data protection, reliability, and serviceability, Fujitsu Siemens Computers does not support this function.
Fujitsu Siemens Computers recommends use of software RAID functions for internal disks as specified below:
PRIMECLUSTER GDS
Solaris Volume Manager (attached to Solaris OS)

LDOM Manager

The SPARC Enterprise T5120 and T5220 servers do not currently support the LDoms Manager function.

Hardware and Mechanical Issues

This section describes hardware issues known to exist at this release of the SPARC Enterprise T5120 and T5220 servers.

DVD and USB Module on Front Panel

Some DVD/USB modules do not have a pull-tab feature. Instead, a finger detent in the floor of the DVD/USB module is used to remove the unit. Thus, some DVD/USB containers can be inadvertently disconnected from the disk backplane when a direct connect USB device or USB cable is pulled from a front panel USB port.
Workaround: Apply counter-pressure to the DVD assembly when removing a USB device. In addition, do not remove a USB device while a DVD or CD is inserted and operating.

Replacing Components in the System Chassis

FBDIMMs and the heatsink on the UltraSPARC T2 processor might be hot to the touch immediately after powering down the system (CR 6550166). Wait approximately one minute for the components to cool down before performing service actions that require access to components in the system chassis.

Hotswapping Fan Modules

When removing a fan module, hold the adjacent fan module in place to avoid unintentionally dislodging it.

Unexpected LED Behavior

When you create a volume, the LEDs of all of the disks in the volume blink at the same time about every 16 seconds. This is normal behavior and can be ignored.

Solaris OS, Firmware, and General Software Issues

This section describes firmware and software issues known to exist at this release of the SPARC Enterprise T5120 and T5220 servers.

Supported Sun Explorer Utility Version

The SPARC Enterprise T5120 and T5220 servers are supported by the Sun Explorer 5.10 (or later) data collection utility, but are not supported by earlier releases of the utility. Installing Sun Cluster or Sun Net Connect software from the preinstalled Java ES package could automatically install an earlier version of the utility on your system. After installing any of the Java ES software, determine whether an earlier version of the Sun Explorer product has been installed on your system by typing the following:
# pkginfo -l SUNWexplo
For information on how to get the Sun Explorer Utility, please contact a certified service engineer.
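As a rough illustration of the version check, the following Python sketch extracts the VERSION field from saved pkginfo -l SUNWexplo output and compares it with 5.10. The function names and the sample text are hypothetical, and the sketch assumes that the VERSION value begins with a dotted numeric version. It is not part of the Sun Explorer product.

```python
import re

MINIMUM = (5, 10)  # Sun Explorer 5.10 is the earliest supported release


def explorer_version(pkginfo_text):
    """Return the dotted version on the VERSION line as a tuple of ints, or None."""
    match = re.search(r"^\s*VERSION:\s*(\d+(?:\.\d+)*)", pkginfo_text, re.MULTILINE)
    if match is None:
        return None
    return tuple(int(part) for part in match.group(1).split("."))


def is_supported(pkginfo_text):
    """True when the reported version is 5.10 or later."""
    version = explorer_version(pkginfo_text)
    return version is not None and version >= MINIMUM


# Hypothetical sample; real pkginfo output contains additional fields.
sample = "   PKGINST:  SUNWexplo\n   VERSION:  5.7,REV=2007.01.01\n"
print(is_supported(sample))
```

The version is compared as a tuple of integers rather than as text, so that 5.10 is treated as later than 5.9.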

Solaris OS Issues

Login Prompt Resets Five Seconds After Solaris OS Boots (CR 6607315)
When using the keyboard as the input device (input-device=keyboard), if you log in immediately after getting the Solaris login prompt, about five seconds later you are logged out. This issue does not occur with the virtual-console.
Workaround: Use the virtual-console as the input device.
Inconsistent Console Behavior When Not Using the Virtual Console (CR 6581309)
Console behavior on the control domain is inconsistent when a graphics device and keyboard are specified for console use. This occurs when the OpenBoot variables input-device and output-device are set to anything other than the default value of virtual-console.
If the control domain is set this way, some console messages are sent to the graphics console and others are sent to the virtual console. This results in incomplete information on either console. In addition, when the system is halted, or a break is sent to the console, control is passed to the virtual console which requires keyboard input over the virtual console. As a result, the graphics console appears to hang.
Workaround: To avoid this problem, use only the virtual console. From OpenBoot, ensure that the default value of virtual-console is set for both the input-device and output-device variables.
Once the graphics console appears hung, connect to the virtual console from the system processor to provide the required input. Press carriage return on the virtual console keyboard once to see the output on the virtual console.
If this workaround does not work, contact a certified service engineer.
XAUI and CPU Resources Added After Initial LDoms Setup Are Not Available to LDoms Manager (CR 6597815)
When you add CPU or XAUI resources to a machine that has been configured to use logical domains, you must revert to the factory default machine configuration to allow the LDoms Manager to allocate those resources to guest domains.
Problem When N2 PCIe Link Fails to Train as x8 (CR 6556505)
During a power-on or reset sequence, the I/O Bridge (PCIe root complex) of the UltraSPARC-T2 CPU might fail to train at all, or might train at a lane width less than 8. No error or fault is generated to indicate that this problem has been encountered.
Identifying the problem:
Although no error or fault is reported, the problem is easy to identify: when it occurs, no PCIe I/O devices are available to the system. For example, if you power on the system or reset the domain and then try to boot from a disk or network device, you get an error similar to the following:
{0} ok boot disk
Boot device: /pci@0/pci@0/pci@2/scsi@0/disk@0 File and args:
ERROR: boot-read fail
Can't locate boot device
{0} ok
At the OpenBoot prompt, issue the command show-devs. Check the output for PCIe devices. If there are no PCIe devices, you have encountered this problem.
Note – All PCIe devices begin with the path /pci@0/pci@0.
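The check described above can be sketched in Python for show-devs output that has been captured to a text file. This is an illustrative sketch only: the function names and the sample device lists are hypothetical, and the only rule taken from this document is that PCIe device paths begin with /pci@0/pci@0.

```python
PCIE_PREFIX = "/pci@0/pci@0"  # prefix stated in the note above


def pcie_devices(show_devs_output):
    """Return the device paths that begin with the PCIe prefix."""
    paths = (line.strip() for line in show_devs_output.splitlines())
    return [path for path in paths if path.startswith(PCIE_PREFIX)]


def pcie_link_missing(show_devs_output):
    """True when no PCIe device paths are listed at all."""
    return not pcie_devices(show_devs_output)


# Hypothetical captures, shortened for illustration.
healthy = "/virtual-devices@100\n/pci@0/pci@0/pci@2/scsi@0\n"
affected = "/virtual-devices@100\n/cpu@0\n"
print(pcie_link_missing(healthy), pcie_link_missing(affected))
```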
Corrective Action:
If this problem is encountered you should take down all domains and power off the system. Next run Power On Self Test (POST) to identify whether this is a persistent failure or not.
To enable POST use the SP ALOM CMT compatibility CLI command setsc and enable POST to run at max level.
For example:
sc> setsc diag_mode normal
sc> setsc diag_level max
Next, power on the system. POST runs and tests the CPU, memory, and I/O subsystems. If the problem is persistent, POST fails the PCIe root complex and disables the component /SYS/MB/PCIE.
If POST does detect the problem, the system's motherboard will need to be replaced. Please contact your sales representative to schedule a motherboard replacement.
Diagnosed FBR and FBU Errors to Branch (CR 6536482)
Currently, the cpumem diagnosis engine (DE) does not diagnose the fbr/fbu errors.
Panic in nxge_start When dupb Fails (CR 6567838)
If jumbo frames are enabled, it is theoretically possible for the system to panic as a result of a NULL pointer reference. This scenario is only possible with frame size larger than 4076. Extensive testing of jumbo frames with MTU=9194 did not result in a system panic.
Workaround: Because disabling jumbo frames or using jumbo frames with a smaller MTU impacts system performance, apply the following workarounds only if the system does panic.
The first workaround is to disable jumbo frames.
Open the following file:
/platform/sun4v/kernel/drv/nxge.conf
Ensure that any line with accept_jumbo=1; is commented out. Also ensure that there is no set nxge:nxge_jumbo_enable=1 in the /etc/system file.
The second workaround, if jumbo frames must remain enabled, is to set the MTU to a value equal to or smaller than 4076. Using port1 as an example:
1. Create a /etc/hosts file and add the following line in it:
99.99.9.1 nxge-port1
Here nxge-port1 is the name you give to the interface. 99.99.9.1 is the IP address you want to assign to the interface.
2. Create a /etc/hostname.nxge1 file and place the following two lines in it:
nxge-port1
nxge-port1 mtu 4076
The MTU set this way persists across reboots. You can also set the MTU with the command ifconfig nxgeX mtu 4076 (where X is the instance number), but the value reverts to the default after a reboot.
RX Jumbo Frame Throughput of nxge Drops to 30 Mbps Due to Packet Dropping (CR 6554478)
Receive-side performance of the nxge driver drops significantly if the following two conditions are true:
1. Jumbo frame is enabled because the following line is present and not commented out in the file nxge.conf:
accept_jumbo=1
Note – Refer to the Sun Quad GbE UTP x8 PCIe ExpressModule User’s Guide, the Sun Dual 10GbE XFP PCIe ExpressModule User’s Guide, the Sun x8 Express Dual 10 Gigabit Ethernet Fiber XFP Low Profile Adapter User’s Guide, or the Sun x8 Express Quad Gigabit Ethernet UTP Low Profile Adapter User’s Guide, as appropriate, for details.
The file nxge.conf is under the /platform/sun4v/kernel/drv directory on sun4v systems, and it is under the /platform/sun4u/kernel/drv directory on sun4u systems.
2. Maximum Transmission Unit (MTU) is set to a value larger than 8172. Note that, when jumbo frames are enabled, the MTU size defaults to 9194.
Workaround: If jumbo frames are to be enabled, set the MTU to 8172 by doing the following (using interface port1 as an example):
1. Create a file /etc/hosts and add the following line in it:
99.99.9.1 nxge-port1
nxge-port1 is the name you give to the interface.
99.99.9.1 is the IP address you want to assign to the interface.
2. Create a file /etc/hostname.nxge1 and place the following two lines in it:
nxge-port1
nxge-port1 mtu 8172
3. If you want the system to set the netmask to a special value automatically, add the following line in /etc/netmasks (using netmask FFFFFF00 as an example):
99.99.9.1 255.255.255.0
4. Reboot the system.
The nxge1 interface will be automatically plumbed with IP address 99.99.9.1, MTU value 8172 and netmask ffffff00. ifconfig -a should show something similar to the following:
# ifconfig -a
nxge1: flags=1201000802<BROADCAST,MULTICAST,IPv4,CoS,FIXEDMTU> mtu 8172 index 3
        inet 99.99.9.1 netmask ffffff00 broadcast 99.255.255.255
        ether 0:14:4f:6c:88:5
If you want to set parameters permanently for other interfaces, then create /etc/hostname.nxge0, /etc/hostname.nxge2 and /etc/hostname.nxge3 similarly and add their name IP-address pairs to the same /etc/hosts file and add their netmasks to the same /etc/netmasks file.
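When several nxge interfaces need the same treatment, the file contents from the steps above can be generated rather than typed by hand. The following Python sketch only builds the text of each entry and writes nothing to disk. The function names and the nxge-portN naming scheme are hypothetical (the scheme extends the nxge-port1 example used for nxge1), and the 8172 limit is the value given in this workaround.

```python
MAX_SAFE_MTU = 8172  # MTU values larger than this trigger the receive-side drop


def mtu_avoids_rx_drop(mtu):
    """True when the MTU is at or below the limit given in the workaround."""
    return mtu <= MAX_SAFE_MTU


def nxge_entries(instance, ip_address, mtu=MAX_SAFE_MTU, netmask="255.255.255.0"):
    """Return a mapping of file name to the text to add for one nxge instance."""
    name = f"nxge-port{instance}"  # hypothetical naming scheme
    return {
        "/etc/hosts": f"{ip_address} {name}",
        f"/etc/hostname.nxge{instance}": f"{name}\n{name} mtu {mtu}",
        "/etc/netmasks": f"{ip_address} {netmask}",
    }


for path, text in nxge_entries(1, "99.99.9.1").items():
    print(path)
    print(text)
```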
Physical-Platform Information Missing From prtpicl and prtdiag (CR 6586624)
The Solaris command prtdiag will not always display Environmental Status and FRU Status. Moreover, if the -v option (verbose) is specified with this command, the firmware version and the Chassis Serial Number might also not be displayed. In addition, the Solaris command prtpicl will not always display the physical-platform section.
Workaround: System information that is supposed to be obtained using the Solaris command prtdiag -v, can be obtained by using the following ALOM CMT compatibility CLI commands:
sc> showenvironment - displays the system’s environmental status
sc> showfru component NAC - displays a component’s FRU status
sc> showplatform - displays the Chassis Serial Number
sc> showhost - displays the firmware version
The physical-platform node properties that are supposed to be obtained using the Solaris command, prtpicl -v, can be obtained by walking through the targets of the show SYS command with the ILOM CLI and the ILOM graphical user interface. Refer to the Integrated Lights Out Manager (ILOM) 2.0 User’s Guide for details.
Solaris Shutdown Hangs and Fewer System Services Are Seen (CR 6588499)
Very rarely, a Solaris shutdown performed immediately after the Solaris OS boots could cause the system to hang. This is due to some system services attempting to stop while others are still in the process of starting. The hang will occur after svc.startd tries to stop the system services with a message similar to the following:
svc.startd: The system is coming down. Please wait
svc.startd: 74 system services are now being stopped.
Note – The number of system services being stopped will vary.
Workaround: Reboot the system by dropping to the service processor (SP) and then power cycling the host system with the following commands from the ALOM CMT compatibility CLI:
sc> poweroff
sc> poweron
sc> powercycle
Or, use the following commands from the ILOM CLI:
-> stop /SYS
-> start /SYS
prtdiag Without -v Option Does Not Show Failures in Output (CR 6586847)
The prtdiag command entered with no options does not display current failures that are reported with the prtdiag -v option.
Workaround: Always use the -v option when running the prtdiag command.
Device Paths Wrong in prtdiag When Running e1000g Driver (CR 6479347)
Device paths could be incorrect in the prtdiag utility for three out of four on board network devices when running the e1000g driver.
Workaround: Force load all instances of the e1000g driver and then restart the picld daemon. For example:
devfsadm -i e1000g
svcadm restart svc:/system/picl
Another workaround is to use the -r option when booting or rebooting the system.
prtdiag -v Formatting Issues (CR 6587389)
Some of the information produced by the prtdiag(1M) utility is difficult to read when the -v option is used. White space is missing between the first and second fields in the report.
The following formatting issues could be displayed in the prtdiag -v command output:
Fan sensors – Missing spaces/tab between the Location and Sensor columns
Temperature sensors – DIMMs missing spaces/tab between the Location and Sensor columns
LEDs – Location missing for SERVICE, LOCATE, ACT, PS_FAULT, TEMP_FAULT, and FAN_FAULT DIMMs. Missing spaces/tab between Location and LED.
In addition, the locations of sensors have the first portion of their location truncated, resulting in no location being reported for some items such as system status LEDs.
To see this formatting information, use the showenvironment command in the ALOM CMT compatibility CLI:
sc> showenvironment - displays the system environmental status
Solaris locator Command Not Working (CR 6564180)
The locator command sets or queries the state of the system locator if such a device exists. Your server does have a locator LED. However, this command incorrectly returns an error message similar to the following.
# locator -n
’system’ locator not found
Workaround: To obtain locator LED information, use the following commands.
In the ALOM CMT compatibility CLI, to show the locator position:
sc> showlocator
In the ALOM CMT compatibility CLI, to set the locator LED to on or off:
sc> setlocator on
OR
sc> setlocator off
In the ILOM CLI, to show the locator position:
-> show /SYS/LOCATE/
In the ILOM CLI, to set the locator LED to on or off:
-> set /SYS/LOCATE value=off
OR
-> set /SYS/LOCATE value=Fast_Blink
prtdiag -v Voltage Indicators Section Has Non-Voltage Indicators (CR 6587380)
Non-voltage indicators (such as PS0/TEMP_FAULT) will appear under the Voltage Indicators section of the prtdiag output. The information reported under the condition column is accurate and represents the current condition of the components.
prtdiag -v Slow To Respond on 1U (CR 6588550)
The prtdiag -v command is slow and could thus appear to hang. It could take up to five minutes for the command to complete.
Control-C from prtdiag, Blank Environmental Data Fields When Run Again (CR 6552999)
When the verbose (-v) option is specified to the prtdiag command in the control domain, additional environmental status information is displayed. If the output of this information is interrupted by issuing a Control-C, the picld(1M) daemon may enter a state that prevents it from supplying the environmental status information to prtdiag from that point onward, and the additional environmental data will no longer be displayed.
Workaround: Restart the picld SMF service in the control domain with the following command:
# svcadm restart picl

Firmware and General Software Issues

IPMI Chassis Power Cycle Does Not Always Work (CR 6602913)
Sometimes IPMI power off or power cycle operations might fail. These operations instruct the Virtual Blade Server Control (VBSC) to power off or power cycle but return immediately without waiting for the operation to complete. If an operation fails, repeat the IPMI power off or power cycle operation, or use one of the other available interfaces to perform the operation.
No keyboard support found; Can’t open USB keyboard package (CR 6601900)
When the OBP variable is set to input-device=keyboard, warning messages could be seen when the system host is powered on or reset. A U.S. keyboard will work as expected, but international keyboards (French, German, ...) will behave as U.S. keyboards.
{0} ok setenv input-device keyboard
input-device = keyboard
{0} ok reset-all
No keyboard support found
Can’t open USB keyboard package
Workaround: Do not use the USB keyboard and instead use a virtual console by setting the input-device variable to virtual-console.
Ierrs Generated When 100Mb/Full With Forced Speed/Duplex is Set in e1000g.conf (CR 6555486)
These Ierrs are caused by the Forced Speed/Duplex parameter. When the port is configured to 100Mb Full Duplex with Auto-Negotiation, the Ierrs are not generated.
Workaround: Use Auto-Negotiation to set the Link Speed/Duplex. For example, to set 100Mb Full Duplex for an e1000g0 device, change the settings in the e1000g.conf file as follows:
ForceSpeedDuplex=7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7;
# This will force Speed and Duplex for following settings for a typical instance.
# 1 will set the 10 Mbps speed and Half Duplex mode.
# 2 will set the 10 Mbps speed and Full Duplex mode.
# 3 will set the 100 Mbps speed and half Duplex mode.
# 4 will set the 100 Mbps speed and Full Duplex mode.
# 7 will let adapter autonegotiate.
AutoNegAdvertised=8,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0;
# This parameter determines the speed/duplex options that will be
# advertised during auto-negotiation. This is a bitmap with the
# following settings.
# Bit    | 7   | 6   | 5     | 4   | 3    | 2    | 1   | 0
# Setting| N/A | N/A | 1000F | N/A | 100F | 100H | 10F | 10H
#
# For example:
# To advertise 10 Half only AutoNegAdvertised = 1
# To advertise 10 Full only AutoNegAdvertised = 2
# To advertise 10 Half/Full AutoNegAdvertised = 3
# To advertise 100 Half only AutoNegAdvertised = 4
# To advertise 100 Full only AutoNegAdvertised = 8
# To advertise 100 Half/Full AutoNegAdvertised = 12
# To advertise 1000 Full only AutoNegAdvertised = 32
# To advertise all speeds AutoNegAdvertised = 47
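The AutoNegAdvertised value is a bitmap, so each example value in the comments above is the sum of the bits for the modes to advertise. The following Python sketch shows the arithmetic. The bit positions come from the table in the e1000g.conf comments. The function name and the mode labels used as dictionary keys are hypothetical.

```python
# Bit positions taken from the AutoNegAdvertised table (bits 4, 6, and 7 are N/A).
ADVERTISE_BITS = {"10H": 0, "10F": 1, "100H": 2, "100F": 3, "1000F": 5}


def autoneg_advertised(*modes):
    """Combine the requested speed/duplex modes into one bitmap value."""
    value = 0
    for mode in modes:
        value |= 1 << ADVERTISE_BITS[mode]
    return value


print(autoneg_advertised("100F"))          # 100 Full only
print(autoneg_advertised(*ADVERTISE_BITS)) # all speeds
```

For 100Mb Full Duplex only, the result is the value placed in the first position of the AutoNegAdvertised line in the workaround.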
ereport.fm.ferg.invalid DMU Core and Block Error Status(0) No bits Set (CR 6598381)
In rare circumstances, the PIU (PCIe Interface Unit) might issue a spurious error interrupt.
The symptom of this happening will be events with the following diagnostics.
SUNW-MSG-ID: FMD-8000-0W, TYPE: Defect, VER: 1, SEVERITY: Minor
EVENT-TIME: Mon Aug 27 10:07:33 EDT 2007
PLATFORM: SUNW,SPARC-Enterprise-T5220, CSN: -, HOSTNAME: xxxxxxx
SOURCE: fmd-self-diagnosis, REV: 1.0
EVENT-ID: dd9a4415-9be4-cb55-d061-8804b8009d3c
The following is an example of dumping the event:
# fmdump -eV -u dd9a4415-9be4-cb55-d061-8804b8009d3c
TIME                           CLASS
Aug 27 2007 10:06:15.496599680 ereport.fm.ferg.invalid
nvlist version: 0
        class = ereport.fm.ferg.invalid
        ena = 0xd4e233fe480002
        info = DMU Core and Block Error Status(0): No bits set
        raw-data = 0x2 0x1a62441a01d844 0x30000000000005 0x4b63c07df9ff 0x3e002421030607 0x3e 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0
        __ttl = 0x0
        __tod = 0x46d2da57 0x1d998280
These events are harmless and can be ignored.
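When matching such an event against other logs, it can help to convert the __tod field to a readable time. The following Python sketch assumes that the two __tod values are seconds since the UNIX epoch and a nanosecond remainder. That interpretation is an assumption, although it agrees with the TIME column in the example dump. The function name is hypothetical.

```python
from datetime import datetime, timezone


def decode_tod(seconds_hex, nanoseconds_hex):
    """Convert the two __tod words to a UTC datetime and a nanosecond count."""
    seconds = int(seconds_hex, 16)
    nanoseconds = int(nanoseconds_hex, 16)
    return datetime.fromtimestamp(seconds, tz=timezone.utc), nanoseconds


when, nanos = decode_tod("0x46d2da57", "0x1d998280")
print(when.isoformat(), nanos)
```

For the values in the example, the decoded time is 14:06:15 UTC on Aug 27 2007, which is 10:06:15 in the EDT time zone used by the example system.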
USB Device Causes Panic on DMA to Address 0 (CR 6555956)
Very rarely a panic could occur during reboot with the following message:
"Fatal error has occurred in: PCIe root complex."
The panic only occurs on reboot and has never been observed on the reboot that follows the panic. (If an endless panic/reboot cycle does occur, call a sales representative or a certified service engineer for service.)
Workaround: Ensure that the system is set to automatically reboot after a panic. At the ALOM CMT compatibility CLI enter the following command:
sc> bootmode bootscript="setenv auto-boot? true"
VBSC Did Not Detect Memory One Time: ERROR: MB/CMP0/BR3/CH0/D0 must be populated (CR 6604305)
Very rarely, the Virtual Blade Server Control (VBSC) probing of DIMMs could fail due to ILOM simultaneously updating DIMM information. When the DIMM probing fails, the host either boots with a reduced memory configuration, or fails to boot. This situation is not likely to happen when the service processor (SP) is reset, because VBSC will have already probed the DIMMs before ILOM starts the dynamic fruid updates, but it can happen when the host is being repeatedly powered on and off without resetting the SP.
Workaround: Power off the host, reset the SP, and power on the host again.
rm-io Followed by Multiple set-vcpu Operations Could Cause HV Abort or ldmd Core Dump (CR 6597761)
During a single delayed reconfiguration operation, do not attempt to add CPUs to a domain if any were previously removed during the same delayed reconfiguration. Either cancel the existing delayed reconfiguration first (if possible), or commit it (by rebooting the target domain), and then apply the CPU addition.
Failure to heed this restriction can, under certain circumstances, lead to the Hypervisor returning a parse error to the LDoms Manager, resulting in the LDoms Manager aborting. Additionally, if any VIO devices had been removed during the same delayed reconfiguration operation, when the LDoms Manager restarts after the abort, it will incorrectly detect the need to perform a recovery operation, resulting in a corrupt configuration being created, and leading to a Hypervisor abort and machine power-down.
cryptotest Intermittent SUNW_RANDOM Generate Failure (CR 6572985)
During long test runs, SunVTS cryptotest could fail intermittently with an error similar to the following:
"cryptotest.FATAL n2rng0: SUNW_RANDOM generate failed: values generated fall outside statistcal tolerance":
Workaround: Install SunVTS Patch 127294-01.
modunload While the nxge Port is Running, Could Cause a System Panic (CR 6551509)
If you modunload the nxge driver while it is running, the system could panic. Due to an issue in the nxge driver, it is possible, though very unlikely, that the nxge driver could cause a panic during a system reboot. This panic occurs if the system is still transferring substantial amounts of network data over an nxge interface while it is going down. It is very unlikely that this condition will occur in normal circumstances.
The panic message would be mutex_enter: bad mutex, ... The panic stack will include the two nxge driver functions nxge_freeb() and nxge_post_page().
If such a panic occurs, the system will recover, and continue to reboot normally. The system, including the nxge interfaces, will come back up with no further panics.
Workaround: Unplumb the interfaces prior to unloading the driver.
It is usually not necessary to unload a driver from a running kernel, but in those rare cases where it might be called for, you must unplumb all driver instances prior to unloading it.
First, find out which nxge instances are plumbed (active):
# ifconfig -a
lo0: flags=2001000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4,VIRTUAL> mtu 8232 index 1
        inet 127.0.0.1 netmask ff000000
bge0: flags=201000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4,CoS> mtu 1500 index 2
        inet 129.153.54.82 netmask ffffff00 broadcast 129.153.54.255
        ether 0:14:4f:2a:9f:6a
nxge2: flags=201000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4,CoS> mtu 1500 index 19
        inet 129.153.54.175 netmask ffffff00 broadcast 129.153.54.255
        ether 0:14:4f:6c:85:aa
nxge3: flags=201000803<UP,BROADCAST,MULTICAST,IPv4,CoS> mtu 1500 index 20
        inet 129.153.54.171 netmask ffffff00 broadcast 129.153.54.255
        ether 0:14:4f:6c:85:ab
Then, for each active port (each port named nxge plus an instance number, for example, nxge2, nxge3, ...), unplumb it:
# ifconfig nxge2 unplumb
# ifconfig nxge3 unplumb
If you run ifconfig -a once again, you see that there are no active nxge interfaces:
# ifconfig -a
lo0: flags=2001000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4,VIRTUAL> mtu 8232 index 1
        inet 127.0.0.1 netmask ff000000
bge0: flags=201000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4,CoS> mtu 1500 index 2
        inet 129.153.54.82 netmask ffffff00 broadcast 129.153.54.255
        ether 0:14:4f:2a:9f:6a
It is now safe to unload the nxge driver.
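On systems with many interfaces, the list of unplumb commands can be derived from saved ifconfig -a output. The following Python sketch is one possible way to do that. It only builds the command strings and runs nothing. The function names are hypothetical, and the sample text is an abbreviated copy of the output shown above.

```python
import re


def plumbed_nxge(ifconfig_output):
    """Return the nxge interface names, in order of appearance, without duplicates."""
    names = re.findall(r"^(nxge\d+):", ifconfig_output, re.MULTILINE)
    return list(dict.fromkeys(names))


def unplumb_commands(ifconfig_output):
    """Return one 'ifconfig <name> unplumb' command per plumbed nxge interface."""
    return [f"ifconfig {name} unplumb" for name in plumbed_nxge(ifconfig_output)]


# Abbreviated sample based on the output shown above.
sample = (
    "lo0: flags=2001000849 mtu 8232 index 1\n"
    "        inet 127.0.0.1 netmask ff000000\n"
    "bge0: flags=201000843 mtu 1500 index 2\n"
    "nxge2: flags=201000843 mtu 1500 index 19\n"
    "nxge3: flags=201000803 mtu 1500 index 20\n"
)
for command in unplumb_commands(sample):
    print(command)
```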
When Either NIU Port in the N2 CPU is Disabled the Corresponding XAUI is Configured (CR 6599334)
When either NIU port in the CPU is disabled, the corresponding XAUI should not be configured. However, the ASR database for the corresponding XAUI is not updated if either NIU port is disabled using the disablecomponent SP CLI command or due to failures detected by POST. In this case, the corresponding XAUI card is still available in the OpenBoot device tree.
Booting Solaris OS From External USB DVD-ROM Drive Could Cause Panic and Fail to Boot (CR 6588452)
Booting the Solaris 10 OS from an external USB DVD-ROM drive could panic the server and fail to boot the OS. This happens because the Solaris OS names the device storage@1, and the system firmware names the device cdrom@1.
Both the OpenBoot firmware and Solaris OS follow the 1275 USB bindings rules to name nodes. For example:
TABLE 4 1275 USB Bindings Rules for Naming Nodes
bInterface Class   bInterface Subclass   bInterface Protocol   Name
0x08               1                     Any                   storage
0x08               2                     Any                   cdrom
0x08               3                     Any                   tape
0x08               4                     Any                   floppy
0x08               5                     Any                   storage
0x08               6                     Any                   storage
0x08               Any                   Any                   storage
The Solaris 10 OS always names the node as storage@n. As a result, a storage device with a subclass of 2, 3, or 4 (on SPARC Enterprise T5120 and T5220 servers, the subclass must be 2) cannot boot from the Solaris 10 OS DVD.
Workaround: Use a drive whose subclass is not 2, 3, or 4 as a replacement.
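The naming mismatch can be expressed as a small lookup. The following Python sketch encodes the rules from TABLE 4 for interface class 0x08 and reports whether the firmware name agrees with the storage name that the Solaris 10 OS always uses. The function names are hypothetical, and the sketch covers only the rows listed in the table.

```python
# Subclasses that the table maps to a name other than "storage" (class 0x08 only).
SPECIAL_SUBCLASS_NAMES = {2: "cdrom", 3: "tape", 4: "floppy"}


def firmware_node_name(subclass):
    """Node name chosen by the firmware for a class 0x08 device, per TABLE 4."""
    return SPECIAL_SUBCLASS_NAMES.get(subclass, "storage")


def names_agree(subclass):
    """True when the firmware name matches the 'storage' name used by Solaris 10."""
    return firmware_node_name(subclass) == "storage"


for subclass in range(1, 7):
    print(subclass, firmware_node_name(subclass), names_agree(subclass))
```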
SPARC Enterprise T52x0 Hangs When Trying to Boot With Infiniband HBA Card Installed (CR 6578410)
The server could hang when trying to boot with an Infiniband HBA card installed.
Workaround: Add the following setting to the /etc/system file.
set tavor:tavor_iommu_bypass = 0
mpt Handles SMART Event From the LSI Firmware (CR 6574127)
If you encounter failed disks in RAID0 or RAID1 configurations, and you see the following scenario, the disk drive should be replaced.
The Fault LED is lit on a disk drive that is part of a RAID0 or RAID1 volume.
No related error messages appear on the console or in the system log.
The error condition may also be seen remotely by running the showenvironment command on the service processor.
The disk drive that has the Fault LED illuminated displays a status of Failed and the service indicator is set to ON.
Corrective Action: The disk drive should be replaced.
Setting Properties for N2/NIU nxge Devices Could Fail (CR 6561389)
Setting a property for a N2/NIU nxge device node might not work correctly. The following is an example:
name="SUNW,niusl" parent="/niu@80" unit-address="0" accept_jumbo=1;
name="SUNW,niusl" parent="/niu@80" unit-address="1" accept_jumbo=1;
Entries from /etc/path_to_inst:
"/niu@80" 0 "niumx"
"/niu@80/network@0" 0 "nxge"
"/niu@80/network@1" 1 "nxge"
Entries from /etc/driver_aliases:
niumx "SUNW,niumx"
nxge "SUNW,niusl"
Workaround: Use the global declaration without the device path in the nxge.conf file. For example, add the following line to the nxge.conf file.
accept_jumbo = 1;
ldc_skt and ds Might Not Handle LDC Resets Properly (CR 6583567)
Very rarely, a communication channel between the primary domain and the service processor (SP) could hang and disable communication over the channel.
Workarounds:
If the channel is one that is used by a primary domain service or application other than the Fault Management daemon (fmd), for example the LDoms Manager ldmd, you could see warning or error messages concerning communication failures. In this case, the channel can be brought back up by restarting the affected service or application.
If the channel is the one used by fmd, there are no warning or error messages. fmd will not receive reports and diagnosis of the errors does not occur.
If a domain crashes or a service spontaneously restarts without any associated fault messages, you must recover as follows to minimize potential loss of error telemetry.
1. Restart fmd on the primary domain.
2. Wait 30 seconds.
3. Reset the SP with either of the following commands:
sc> resetsc -y [ALOM CMT compatibility CLI]
OR
-> reset /SP [ILOM CLI]
4. Restart fmd on the primary domain. Enter the following command from the Solaris OS:
# svcadm restart svc:/system/fmd:default
If the channel is the one used by the operating system (Solaris) to communicate with the SP, you could see warning or error messages regarding failure to obtain the PRI; failure to access ASR data; or failure to set LDoms variables or failure in SNMP communication. In this case, the channel can be brought back up by resetting the SP. If the SP is reset, restart the fmd on the primary domain. If resetting the SP fails to bring the channel back up, then it might also be necessary to reboot the primary domain.
ID 216524 daemon.error Registration With DMI Failed err = 831 (CR 6237994)
Registration with DMI could fail with an err=831 message.
Workaround: Disable DMI service at the startup. For example:
% mv /etc/rc3.d/S77dmi /etc/rc3.d/_S77dmi
Domain ETM and LDC Deadlock When Transmit Queue Full (CR 6594506)
After certain hardware error events, it is possible that PSH events are no longer transported between the Service Processor (SP) and the domain. The following scenarios are subject to this CR:
In a non-LDoms environment, an unrecoverable error in the Solaris domain
In an LDoms environment, an unrecoverable error in the control domain
In either an LDoms or non-LDoms environment, a fatal error in the system (a fatal error resets the system at the HW level)
Note – In an LDoms environment, unrecoverable errors in a non-control LDoms guest domain are not subject to this CR.
For example, an unrecoverable error in the control domain causes Solaris to panic. Messages similar to the following are reported to the control domain console:
SUNW-MSG-ID: SUNOS-8000-0G, TYPE: Error, VER: 1, SEVERITY: Major
EVENT-TIME: 0x46c61864.0x318184c6 (0x1dfeda2137e)
PLATFORM: SUNW,SPARC-Enterprise-T5220, CSN: -, HOSTNAME: wgs48-100
SOURCE: SunOS, REV: 5.10 Generic_Patch
DESC: Errors have been detected that require a reboot to ensure system integrity. See http://www.sun.com/msg/SUNOS-8000-0G for more information.
AUTO-RESPONSE: Solaris will attempt to save and diagnose the error telemetry
IMPACT: The system will sync files, save a crash dump if needed, and reboot
REC-ACTION: Save the error summary below in case telemetry cannot be saved
Or, an unrecoverable error causes the Hypervisor to abort and messages similar to the following are reported to the SP console when logged into the ALOM CMT compatibility CLI console:
Aug 17 22:09:09 ERROR: HV Abort: <Unknown?> (228d74) - PowerDown
After the control domain recovers, there is a diagnosis performed. Messages to the console indicate the cause of the unrecoverable error. For example:
SUNW-MSG-ID: SUN4V-8000-UQ, TYPE: Fault, VER: 1, SEVERITY: Critical
EVENT-TIME: Fri Aug 17 18:00:57 EDT 2007
PLATFORM: SUNW,SPARC-Enterprise-T5220, CSN: -, HOSTNAME: wgs48-100
SOURCE: cpumem-diagnosis, REV: 1.6
EVENT-ID: a8b0eb18-6449-c0a7-cc0f-e230a1d27243
DESC: The number of level 2 cache uncorrectable data errors has exceeded acceptable levels. Refer to http://sun.com/msg/SUN4V-8000-UQ for more information.
AUTO-RESPONSE: No automated response.
IMPACT: System performance is likely to be affected.
REC-ACTION: Schedule a repair procedure to replace the affected resource, the identity of which can be determined using fmdump -v -u <EVENT_ID>.
At this point, CR 6594506 might have been encountered. This will prevent future PSH events (for example, new HW errors, correctable or uncorrectable) from being transported into the domain and properly diagnosed.
Workaround: After the domain recovers and the diagnosis message is printed to the Solaris console, reset the SP:
sc> resetsc -y [ALOM CMT compatibility CLI]
OR
-> reset /SP [ILOM CLI]
Once the SP is restarted and you are able to log in as the admin user (which means all daemons are ready), execute the following in the Solaris control domain:
# fmadm unload etm
# fmadm load /usr/platform/sun4v/lib/fm/fmd/plugins/etm.so
raidctl -h and Man Page Display Some Unsupported Features
The SPARC Enterprise T5120 and T5220 servers currently support only RAID 0 and RAID 1 when using the onboard 1068E LSI SAS controller. The raidctl man page and the raidctl -h options incorrectly display features that might exist on future products and other SAS controllers. The raidctl utility can be used to create and delete RAID 0 and RAID 1 volumes.
Unable to Remove RAID 1 Volume After RAID 1 and RAID 0 Volumes Are Created (CR 6592238)
When two volumes are created over an mpt controller, the raidctl utility is unable to delete one of the RAID volumes and cannot list the correct disk information. The following is the error message:
# raidctl -l
Device record is invalid.
raidctl -l Continuously Outputs Disk: 0.0.0 (CR 6589612)
In this case, if you run raidctl -l, you get the following output:
# raidctl -l
Controller: 1
Volume:c1t0d0
Volume:c1t2d0
Disk: 0.0.0
Disk: 0.0.0
...
You must use the Control-C keyboard sequence to stop the output.
IO-Domain-Reset: ERROR: Last Trap: Watchdog Reset (CR 6593547)
When attempting to boot, you might see the following error in either the I/O domain or the control domain, and the auto-boot sequence is aborted:
"ERROR: Last Trap: Watchdog Reset".
Workaround: Type boot at the OpenBoot ok prompt to proceed.
XAUI and CPU Resources Added After Initial LDoms Setup Are Not Available to LDoms Manager (CR 6597815)
When you add CPU or XAUI resources to a server configured to use logical domains, you must revert to the factory default configuration to allow the LDoms Manager software to allocate those resources to guest domains.
OpenBoot PROM Banner Shows the Same Memory Number After Disabling a DIMM and Resetting All (CR 6579390)
If you manually disable any CPU or memory resource with the system ASR functionality (for example, with the disablecomponent command from the service processor) while the host is powered on, then a host power cycle is required for the change to take effect. If the host is rebooted without a power cycle, messages might be printed on the host console stating that the resources have been disabled, even though the resources are still present. These messages can be ignored, but to allow Solaris to boot, set the LDoms variable auto-boot-on-error? to true.
Guest Domain wanboot miniroot Download Could Take More Than 30 Minutes (CR 6543749)
During boot or installation over wide area networks, the time it takes to download the miniroot could significantly increase when using a virtual network device. Early tests showed the miniroot download to be five to six times slower compared to similar boot or installation over physical network devices.
This performance degradation is relevant only when trying to boot or install over wide area networks using a virtual network device. A similar boot or installation using a physical network device works as expected, as does a traditional local area network boot or installation from a virtual network device.
OpenBoot PROM Variables Cannot Be Modified by eeprom at the OS Prompt When ldmd Is Running (CR 6540368)
LDom variables for a domain can be specified using any of the following methods:
At the OpenBoot prompt
Using the Solaris OS eeprom(1M) command
Using the Logical Domains Manager CLI (ldm)
Modifying, in a limited fashion, from the service processor (SP) using the bootmode command; that is, only certain variables, and only when in the factory-default configuration.
The goal in all cases is for variable updates made using any of these methods to always persist across reboots of the domain, and to always be reflected in any subsequent logical domain configurations saved to the SP.
In Logical Domains 1.0 software, there are a few cases where variable updates do not persist:
When running in a factory-default configuration, variable updates specified through the Solaris OS eeprom(1M) command persist across a reboot of the primary domain into the same factory-default configuration, but do not persist into a configuration saved to the SP. Conversely, in this scenario, variable updates specified using the Logical Domains Manager do not persist across reboots, but are reflected in a configuration saved to the SP. When running the factory-default configuration, if you want a variable update to persist across a reboot into the same factory-default configuration, use the eeprom command. If you want it saved as part of a new logical domains configuration saved to the SP, use the appropriate Logical Domains Manager command.
All methods of updating a variable (OpenBoot firmware, eeprom command, ldm subcommand) persist across reboots of that domain, but not across a power-cycle of the system, unless a subsequent logical domain configuration is saved to the SP. In addition, in the control domain, updates made using OpenBoot firmware persist across a power-cycle of the system. That is, updates persist even without subsequently saving a new logical domain configuration to the SP.
When reverting to the factory-default configuration from a configuration generated by the Logical Domains Manager, all LDoms variables start with their default values.
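The factory-default case described above can be summarized as a small lookup table. The following Python sketch is only a restatement of that one case. The dictionary keys ("reboot", "saved_config") and the function name are hypothetical labels introduced here for illustration.

```python
# Persistence of a variable update made while running the factory-default
# configuration in Logical Domains 1.0, as described in the text above.
# The key names are illustrative labels, not LDoms terminology.
FACTORY_DEFAULT_PERSISTENCE = {
    "eeprom": {"reboot": True, "saved_config": False},
    "ldm": {"reboot": False, "saved_config": True},
}


def method_for(goal: str) -> str:
    """Return the update method whose change survives the given goal.

    goal is "reboot" (reboot into the same factory-default configuration)
    or "saved_config" (a new configuration saved to the SP).
    """
    for method, result in FACTORY_DEFAULT_PERSISTENCE.items():
        if result[goal]:
            return method
    raise KeyError(goal)
```

For example, method_for("reboot") selects the eeprom command, and method_for("saved_config") selects the Logical Domains Manager.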
Options True False Menu Interrupts OpenBoot PROM Reset (CR 6594395)
The ldm set-variable command enables you to set an LDoms variable to any arbitrary string. However, many LDoms variables have only a small set of valid values. For example, Boolean variables such as auto-boot? and diag-switch? accept only the values true or false. If an LDoms variable is set to a value that is not valid, OpenBoot issues a warning message during boot with a list of correct values, but without giving the name of the variable in question. For example:
Options: true More [<space>,<cr>,q,n,p,c] ?
The preceding message is given by OpenBoot if auto-boot? is set to a NULL string. The boot stops at this point and waits for input. If you enter a space or a carriage return, the complete error message is displayed and the boot process continues:
Options: true false
A common way to encounter this error is to omit the = sign when using the ldm set-variable command.
# ldm set-variable auto-boot? true guest_domain
The previous command results in two NULL LDoms variables.
auto-boot?=
true=
auto-boot? is a boolean variable and setting it to NULL results in an OpenBoot warning during boot. The proper format for the preceding command is as follows:
# ldm set-variable auto-boot?=true guest_domain
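Because OpenBoot does not name the offending variable, it can help to check the arguments before running ldm set-variable. The following Python sketch illustrates such a check under stated assumptions: the function name is hypothetical, the last argument is taken to be the domain name, and the set of Boolean variables contains only the two named in the text.

```python
from typing import List

# Boolean LDoms variables named in the text; a real system has more.
BOOLEAN_VARIABLES = {"auto-boot?", "diag-switch?"}


def check_set_variable_args(args: List[str]) -> List[str]:
    """Return a list of problems found in the arguments to ldm set-variable.

    args holds everything after "ldm set-variable". The last item is assumed
    to be the domain name, and every earlier item should be name=value.
    """
    problems: List[str] = []
    if len(args) < 2:
        return ["expected at least one name=value pair and a domain name"]
    for item in args[:-1]:
        name, sep, value = item.partition("=")
        if not sep:
            # No "=" at all: the whole word becomes a variable with a NULL value.
            problems.append(f"'{item}' has no '=' and would become a NULL variable")
        elif name in BOOLEAN_VARIABLES and value not in ("true", "false"):
            problems.append(f"'{name}' accepts only true or false, got '{value}'")
    return problems
```

With the faulty command shown earlier, the arguments ["auto-boot?", "true", "guest_domain"] yield two problems, one for each word that would become a NULL variable.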
Changing OBP nvram Parameters Does Not Take Effect After resetsc (CR 6596594)
If the service processor is reset while the control domain is at the ok prompt, OpenBoot loses its ability to store nonvolatile LDoms variables or security keys until the host is reset. Guest domains are not affected by this problem. Attempts to update LDoms variables or security keys result in the following warning messages:
{0} ok setenv auto-boot? false
WARNING: Unable to update LDOM Variable
OR
{0} ok set-security-key wanboot-key 545465
WARNING: Unable to store Security key
Workaround: To recover from this state, reset the control domain using the reset-all OpenBoot command.
{0} ok reset-all
Changes to OpenBoot PROM Variables With Nondefault LDoms Configuration Do Not Persist (CR 6593132)
If an LDoms variable is set to a nondefault value when an LDoms configuration is saved to the service processor, and then later changed back to its default value, the change to its default value will not survive a power-cycle. For example:
1. The system is booted in the factory-default configuration.
2. The user changes an LDoms variable from the ok prompt or from Solaris.
ok setenv auto-boot? false
(or from Solaris)
# eeprom auto-boot?=false
3. The user saves an LDoms configuration.
# ldm set-spconfig my-new-config
4. The user then sets the variable back to its default value.
ok set-default auto-boot?
(or from Solaris)
# eeprom auto-boot?=true
At this point, the LDoms variable auto-boot? retains the default value of true across reboot and reset events, but if the system is power-cycled, the variable returns to the value it held immediately before the ldm set-spconfig command. This behavior affects only variables that are being set back to their default values. If auto-boot? was true before the LDoms configuration was saved, and you then changed its value to false (nondefault), auto-boot? would retain the value of false, even across power-cycles.
Workaround: To ensure that LDoms variable changes persist across power cycles, resave the SP configuration after changing an LDoms variable. In this example, after changing auto-boot? back to its default value, resave the SP configuration:
# ldm remove-spconfig my-new-config
# ldm add-spconfig my-new-config
Alternatively, you could manually change the LDoms variables back to their default values following a power cycle.
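If the resave step is scripted, the two commands can be generated from the configuration name so that the remove and add always refer to the same configuration. The following Python sketch only builds the command strings shown in the workaround. The function name is a hypothetical helper, and running the commands is left to the caller.

```python
from typing import List


def resave_spconfig_commands(config_name: str) -> List[str]:
    """Build the remove/add command pair used to resave an SP configuration."""
    if not config_name or any(ch.isspace() for ch in config_name):
        # Reject names that would split into several shell words.
        raise ValueError("configuration name must be a single non-empty word")
    return [
        f"ldm remove-spconfig {config_name}",
        f"ldm add-spconfig {config_name}",
    ]
```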
Occasional LDom Warning Message After POST (CR 6592934)
In the unlikely event that POST times out before completing its test cycle, the Virtual Blade Server Control (VBSC) will issue the following message to the console:
ERROR: POST timed out. Not all system components tested.
The system will continue to boot, but in a degraded state. During the boot process the following error messages will be printed:
WARNING: Unable to connect to Domain Service providers
WARNING: Unable to get LDOM Variable Updates
WARNING: Unable to update LDOM Variable
Any programs or services that depend on an LDC channel will also run in a degraded state, or not at all. Some programs that require working LDC to function are ldmd, fmd, and eeprom.
Workaround: If the following error is observed on the console during boot, power-cycle the system, and ensure that POST runs to completion:
ERROR: POST timed out. Not all system components tested.
You can also boot without running POST.
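A captured console log can be checked for this condition automatically. The following Python sketch searches a log for the messages quoted above. The function names are illustrative assumptions, and the sketch assumes the log has already been captured as text.

```python
from typing import List

POST_TIMEOUT = "ERROR: POST timed out. Not all system components tested."
LDC_WARNINGS = (
    "WARNING: Unable to connect to Domain Service providers",
    "WARNING: Unable to get LDOM Variable Updates",
    "WARNING: Unable to update LDOM Variable",
)


def post_timed_out(console_text: str) -> bool:
    """Return True if the POST timeout error appears in the console text."""
    return POST_TIMEOUT in console_text


def degraded_warnings(console_text: str) -> List[str]:
    """Return the degraded-boot warnings that appear in the console text."""
    return [warning for warning in LDC_WARNINGS if warning in console_text]
```

If post_timed_out returns True, the workaround above applies: power-cycle the system and confirm that POST runs to completion.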
L2 Cache ue Error Injections Produce dau Ereports and Memory Faults (CR 6592272)
It is possible that after an uncorrectable L2 writeback error, a bogus memory fault message (SUN4V-8000-E2) is reported to the console. For example:
SUNW-MSG-ID: SUN4V-8000-E2, TYPE: Fault, VER: 1, SEVERITY: Critical
EVENT-TIME: Wed Sep 5 18:49:35 EDT 2007
PLATFORM: SUNW,SPARC-Enterprise-T5220, CSN: -, HOSTNAME: wgs48-100
SOURCE: cpumem-diagnosis, REV: 1.6
EVENT-ID: 59bf6418-5dcb-c1b0-b06a-f26fa18e4ee7
DESC: The number of errors associated with this memory module has exceeded acceptable levels. Refer to http://sun.com/msg/SUN4V-8000-E2 for more information.
AUTO-RESPONSE: Pages of memory associated with this memory module are being removed from service as errors are reported.
IMPACT: Total system memory capacity will be reduced as pages are retired.
Workaround: Schedule a repair procedure to replace the affected memory module. Use fmdump -v -u event_id to identify the module.
Use fmdump -eV -u uuid with the UUID from the console message to determine if the memory error is bogus. For example:
# fmdump -eV -u 59bf6418-5dcb-c1b0-b06a-f26fa18e4ee7 | grep dram-esr
dram-esr = 0x1000000000008221
If the dram-esr value is 0x1000000000008221, CR 6592272 has been encountered. The memory error can be ignored, and no memory component needs to be replaced. Use fmadm repair uuid to repair the bogus memory UE.
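The comparison against the known dram-esr value can be automated when many events need to be reviewed. The following Python sketch parses text in the form shown in the fmdump example above and compares the value numerically. The function names are illustrative assumptions, and the sketch assumes the fmdump output has already been captured as a string.

```python
import re
from typing import Optional

# dram-esr value that identifies CR 6592272, taken from the text above.
BOGUS_DRAM_ESR = 0x1000000000008221


def extract_dram_esr(fmdump_output: str) -> Optional[int]:
    """Return the first dram-esr value found in the text, or None."""
    match = re.search(r"dram-esr\s*=\s*(0x[0-9a-fA-F]+)", fmdump_output)
    return int(match.group(1), 16) if match else None


def is_bogus_memory_fault(fmdump_output: str) -> bool:
    """Return True if the dram-esr value matches the CR 6592272 signature."""
    return extract_dram_esr(fmdump_output) == BOGUS_DRAM_ESR
```

Comparing integers rather than strings avoids false mismatches caused by letter case or spacing in the captured output.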
Temporary PCIe Link Failure During Boot Causes Fatal Error Later (CR 6553515)
If a temporary PCIe link failure occurs during boot or any time later, the system could fail. Note that if the link is up and working again before the hypervisor (HV) gets control, the failure results from a problem in how the HV handles the leftover error status. The following is an example of the error message:
{0} ok 4000 dload users/bog/rustn2obp_0502
Boot device: /pci@0/pci@0/pci@1/pci@0/pci@2/network@0:,users|bog|rustn2obp_0502 File and args:
FATAL: /pci@0/pci@0/pci@1/pci@0/pci@2/network@0: Last Trap: Non-Resumable Error
TL: 1
%TL:1 %TT:7f %TPC:f0238978 %TnPC:f023897c
%TSTATE:820001600 %CWP:0
%PSTATE:16 AG:0 IE:1 PRIV:1 AM:0 PEF:1 RED:0 MM:0 TLE:0 CLE:0 MG:0 IG:0
%ASI:20 %CCR:8 XCC:nzvc ICC:Nzvc
%TL:2 %TT:3f %TPC:f024327c %TnPC:f0243280
%TSTATE:14414000400 %CWP:0
%PSTATE:4 AG:0 IE:0 PRIV:1 AM:0 PEF:0 RED:0 MM:0 TLE:0 CLE:0 MG:0 IG:0
%ASI:14 %CCR:44 XCC:nZvc ICC:nZvc
Normal GL=1
0: 0 0
1: f0200000 0
2: f0200000 0
3: fff78000 0
4: fec320fc 3ffe60000
5: f02833e4 3ffe60000
6: fee826c8 3ffe60600
7: fee817d8 f02432bc
%PC f0238978 %nPC f023897c
%TBA f0200000 %CCR 8200016 XCC:nzvC ICC:nZVc
{0} ok
Workaround: If the system fails to boot because of this problem, retry the boot.
Processor Always Starts on Lowest Available Strand Regardless If asr Disabled (CR 6541482)
If processor strand 0 of the first available physical core has been marked disabled (as seen in the list of disabled devices in the output of the showcomponent command from the sc> prompt), a new master strand is selected by the initialization process, and the disabled strand is taken offline. However, system initialization and execution of the power-on self-test (POST) still take place on the disabled strand, because after power-on and resets, execution always starts on strand 0 of the first available physical core.
When this happens, the system might fail to run the diagnostics and the system might fail in an unpredictable manner. The system might not start the required firmware and software components as a result.
Workaround: If strand 0 of the first physical core is known to be good, then it can be enabled by using the enablecomponent ILOM CLI command followed by a power-on reset of the system (poweroff followed by poweron ILOM CLI commands).
If strand 0 of the first physical core is known to be bad, then there is no workaround. The entire processor will need to be replaced.

Integrated Lights Out Manager (ILOM) Issues

ILOM is the default service processor (SP) command-line interface (CLI). The default user is root and the default password is changeme.
ILOM also provides an Advanced Lights Out Management (ALOM) CMT compatibility CLI. To use the ALOM CMT compatibility CLI, you must first log in to an ILOM CLI as root and create an admin account with administrator privileges, and assign alom as the default CLI (cli_mode=alom).
Refer to the Integrated Lights Out Manager 2.0 (ILOM 2.0) Supplement for SPARC Enterprise T5120 and T5220 Servers and the SPARC Enterprise T5120 and T5220 Servers Administration Guide for more information on ILOM.
The following are known ILOM CLI and ALOM CMT compatibility CLI (on ILOM) issues.
Missing CPU Cores and Strands Shown in PICL Physical-Platform Tree (CR 6596503)
The output of the prtpicl command used with the -v option might show CPU cores or strands with an OperationalStatus of enabled when, in fact, they do not exist.
Workaround: Use the output from the prtdiag or prtpicl -c cpu commands, which do show the correct information.
showfaults Shows the Motherboard as Faulty Instead of the DIMM (CR 6582853)
In a system with DIMMs or PCI-E adapters that have been faulted by PSH (Predictive Self-Healing) diagnosis on the host, the ALOM showfaults command displays the faulty FRU as the motherboard (/SYS/MB) instead of the DIMM or PCI-E adapter. This problem occurs for the following PSH message IDs (MSGIDs):
SUN4V-8000-E2, SUN4V-8000-DX, SUN4-8000-4P, SUN4-8000-A2, SUN4-8000-75, SUN4-8000-9J, SUN4-8000-D4, PCIEX-8000-0A, PCIEX-8000-DJ, PCIEX-8000-HS
The following is an example from the ALOM CMT compatibility CLI illustrating the problem:
sc> showfaults -v
Last POST Run: Jul 13 18:32:11 2007
Post Status: Passed all devices
ID Time FRU Class Fault
0 Jul 13 19:31:34 /SYS/MB Host detected fault, MSGID: SUN4V-8000-DX UUID: 7b471945-ceef-eea0-c3ad-85ca140be5b2
In addition to the problem with the FRU displayed by the ALOM showfaults command, the output displayed by the ILOM CLI command show /SYS/faultmgmt, the fault_state property of components, and the Faulted Components listed under the Fault Management tab in the ILOM web interface will be incorrect for the PSH message IDs listed above. Also, the FB-DIMM fault indicator will not operate and the FRUID for the motherboard will have a fault recorded.
Workaround: Use the Fault Management utilities on the host to find the location of the faulty DIMM(s) or PCI-E adapters. Instructions for using these utilities for these faults can be found in the Predictive Self-Healing Knowledge Articles located at:
http://www.sun.com/msg/MSGID
where MSGID is one of the PSH message IDs listed above and displayed by the ALOM showfaults command.
Once the faulty DIMM(s) have been replaced and the PSH fault has been cleared, the entry in showfaults will be deleted and the fault recorded in the motherboard FRUID will be cleared.
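When showfaults output is processed by a script, the affected message IDs can be checked before the reported FRU is trusted. The following Python sketch encodes the list of message IDs given above and builds the knowledge article URL in the form shown in the workaround. The function names are illustrative assumptions.

```python
# PSH message IDs for which showfaults reports /SYS/MB instead of the
# DIMM or PCI-E adapter (CR 6582853), copied from the list above.
AFFECTED_MSGIDS = frozenset({
    "SUN4V-8000-E2", "SUN4V-8000-DX", "SUN4-8000-4P", "SUN4-8000-A2",
    "SUN4-8000-75", "SUN4-8000-9J", "SUN4-8000-D4",
    "PCIEX-8000-0A", "PCIEX-8000-DJ", "PCIEX-8000-HS",
})


def fru_may_be_misreported(msgid: str) -> bool:
    """Return True if the FRU shown for this message ID might be wrong."""
    return msgid.strip().upper() in AFFECTED_MSGIDS


def knowledge_article_url(msgid: str) -> str:
    """Build the knowledge article URL for a PSH message ID."""
    return "http://www.sun.com/msg/" + msgid.strip().upper()
```

For the showfaults example above, fru_may_be_misreported("SUN4V-8000-DX") is True, so the /SYS/MB entry should be confirmed with the Fault Management utilities on the host.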
If Socketed EEProm (SCC) is Changed, SP Does Not Always Read Some SP Properties From New EEProm (CR 6596430)
If the SP configuration variable sc_backupuserdata is set to false, the following user configuration values are not backed up to the Socketed EEProm:
if_emailalerts, mgt_mailhost, mgt_mailalert, sc_customerinfo, sc_powerondelay, sc_powerstatememory, sc_backupuserdata
Workaround: Manually record the preceding user settings before replacing the motherboard in a motherboard swap scenario. After the swap is complete, manually set the user parameters.
If All Platform Identity Checks Fail (CR 6593801)
When all platform identity checks fail, system power on should be disabled. If enough of the FRUs are corrupted, the system cannot determine its identity, which makes certain components nonoperational and could crash the server.
Login to SP Times Out Intermittently (CR 6568750)
Logging in to the SP could intermittently fail and time out after 60 seconds. If this error occurs, you see an error message similar to the following:
SUNSP00123456789F login: username
Password: password
Starting VBSC Server
VBSC server started..
Logging out after 60 seconds.
Workaround: None. This is not seen during normal login, only when logging in with a script.
Use setdefaults Command to Clear Last POST Run From showfaults (CR 6573354)
After POST is run, showfaults displays the status. The only way to clear the status is to enter the setdefaults command. For users familiar with ALOM CMT, the previous way to clear the status was to enter the resetsc command.
Mouse Redirection is Extremely Slow or Not Usable With RHEL4U4 Host and Windows XP Client (CR 6502777)
Mouse redirection is extremely slow and could be unusable through javarconsole. The keyboard redirection is slow but is usable.
Workaround: Use an alternate javarconsole client (Solaris or Linux) to display the RHEL4U4 server.
Fewer Than 21 Entries in the Event Log Causes showlogs to Display None of the Events (CR 6589043)
If there are fewer than 21 entries in the event log, the showlogs command displays none of the events. This situation is known to occur in the following scenarios:
After a fresh installation of the system (out-of-box), the Service Processor (SP) event log is very likely to have fewer than 21 entries.
After you clear the SP event log with the Browser User Interface (BUI) or ILOM CLI, the ALOM CMT compatibility CLI showlogs command displays no new events until at least 21 new events are logged.
Workaround: In either of the preceding scenarios, use the showlogs -v option to display the logs. After 21 or more events are logged in the log file, you can revert to using showlogs with no options.
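The rule in this workaround reduces to a single threshold check. The following Python sketch returns the showlogs form to use for a given number of logged events. The function name is hypothetical, and the sketch assumes that the caller already knows the event count, which the text does not explain how to obtain.

```python
# Number of event log entries below which plain showlogs displays nothing
# (CR 6589043), taken from the text above.
SHOWLOGS_THRESHOLD = 21


def showlogs_command(event_count: int) -> str:
    """Return the showlogs invocation suited to the given event count."""
    if event_count < 0:
        raise ValueError("event_count cannot be negative")
    if event_count < SHOWLOGS_THRESHOLD:
        return "showlogs -v"
    return "showlogs"
```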
XAUI0 or XAUI1 Disabled With disablecomponent but the Devices are Still Created in OpenBoot PROM (CR 6599333)
The ASR database does not support disabling XAUI devices. When a XAUI device is disabled with the disablecomponent command or due to failures detected by POST, the corresponding network device is still available in the OpenBoot firmware.
uadmin 2 0 and reboot Reads Old Bootmode Settings (CR 6585340)
You can change LDoms variables in the control domain in one of three ways: with the OpenBoot setenv command in the control domain, with the Solaris eeprom command in the control domain, or with the ILOM bootmode bootscript option. Changes made with the setenv and eeprom commands take effect immediately. Changes made with the bootmode command are supposed to take effect on the next reset, no matter what kind of reset it is.
Changes made in any of these three ways are supposed to stay in effect until the next change, also made in any of these three ways. That is, it doesn’t matter how the value of an LDoms variable is changed—once changed, the value is supposed to stay in effect until it is changed again.
However, due to this issue, changes made with the bootmode command will become effective only after a power-on reset and will, on every reset (other than a power-on reset) that follows, override any intervening change made with the setenv or eeprom commands. That is, the changes made by the bootmode command require a power-on reset to be effective. Changes made with the setenv or eeprom commands will only persist until the next reset, at which point the variable will revert to the value set by the last bootmode command. This stickiness of the bootmode setting will persist until the machine is power-cycled. Upon power-cycling, the prior bootmode setting will not take effect. Any subsequent change made by the setenv or eeprom command will now persist over resets, at least until the next bootmode command followed by a power-cycle.
Workaround: Restart the control domain with a power-on reset right after the bootmode command is executed, and restart it again with a power-on reset after the control domain boots to either OpenBoot or Solaris. The first power-on reset makes the bootmode command effective, and the second power-on reset works around the stickiness issue.
The control domain can be reset with a power-on reset by using the ALOM CMT compatibility CLI powercycle command. If the control domain is booted to the Solaris OS, remember to shut down the OS properly before executing the powercycle command.
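The interaction between bootmode, setenv, and the two kinds of reset is easier to follow as a small state model. The following Python sketch is a toy model built only from the description above. It is not an ILOM or OpenBoot interface, the class and method names are hypothetical, and the model assumes that a power cycle with no new bootmode command leaves the current value unchanged.

```python
from typing import Optional


class ControlDomainVariable:
    """Toy model of one LDoms variable under CR 6585340 (not a real API)."""

    def __init__(self, value: str) -> None:
        self.value = value
        self._pending: Optional[str] = None  # set by bootmode, not yet effective
        self._sticky: Optional[str] = None   # bootmode value re-applied on resets

    def bootmode(self, value: str) -> None:
        # With this issue, the change waits for a power-on reset.
        self._pending = value

    def setenv(self, value: str) -> None:
        # setenv and eeprom changes take effect immediately.
        self.value = value

    def reset(self) -> None:
        # Any reset other than a power-on reset re-applies the sticky value.
        if self._sticky is not None:
            self.value = self._sticky

    def power_on_reset(self) -> None:
        if self._pending is not None:
            # First power-on reset after bootmode: the change becomes effective
            # and starts overriding later setenv or eeprom changes.
            self.value = self._pending
            self._sticky = self._pending
            self._pending = None
        else:
            # Assumption: a power cycle with no new bootmode command only
            # ends the stickiness and keeps the current value.
            self._sticky = None
```

In this model, the workaround corresponds to calling bootmode, then power_on_reset twice. After the second call, a value set with setenv survives an ordinary reset.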
ALOM CMT Compatibility CLI Displays Incorrect Guest Status Sent From OpenBoot PROM and Solaris (CR 6567748)
The domain status information output of the ALOM CMT compatibility CLI showplatform command could be ambiguous. The domain status for a domain could indicate Running when the OS is not running. For example, the following output might be seen while the domain is at the OpenBoot ok prompt.
Domain Status
------ ------
S0     Running
Similar ambiguity exists for the domain status information displayed by the ILOM BUI and ILOM CLI.
The ambiguity also exists in the ILOM control MIB, but not in the platform entity MIB. Thus, the ambiguous domain status might be visible to third party systems monitoring tools if they monitor this entry.
Workaround: Ignore the domain status information from all CLIs and BUI output as well as from the domain status entry in the ILOM control MIB. Retrieve the true status of the domain manually by accessing the domain console.
Intermittent POST PIU0 Link Train Errors (CR 6571886)
POST could encounter intermittent PIU0 link train errors during power-cycle testing.
Workaround: If this problem persists, power-cycle the system as follows (in the ALOM CMT compatibility CLI):
sc> poweroff -fy
sc> clearasrdb
sc> poweron -c
SP Serial Line Terminal Server break Command Does Not Work (CR 6577528)
If you use Telnet to connect to the SP serial line with a terminal server (such as the Cisco ASM series) and try to send a break to the Solaris host, the break command does not work and is ignored by the SP. Use the break command from the SP CLI to send a break to the Solaris host.
The following is sample output of sending a break to the Solaris host from the ALOM CMT compatibility CLI:
1. Log into the host with the console command.
sc> console
2. Enter #. to return to the host prompt.
sc> #.
Solaris-host-prompt>
3. Enter #. to escape to the SP ALOM CMT compatibility CLI. The escape sequence is not echoed.
Solaris-host-prompt> #.
sc>
4. Enter the break command.
sc> break -c -y
5. Enter #. to return to the SP ALOM CMT compatibility CLI.
sc> #.
c)ontinue, s)ync, r)eboot, h)alt?
The following is sample output of sending a break to the Solaris host from the SP ILOM CLI:
1. Log into the host with the ILOM console command.
-> start /SP/console
Are you sure you want to start /SP/console (y/n)? y
Serial console started. To stop, type #.
Solaris-host-prompt>
2. Enter #. to escape to the SP ILOM CLI. The escape sequence is not echoed.
Solaris-host-prompt> #.
->
3. Enter the break command as follows.
-> set /HOST send_break_action=break
->
4. Log back into the Solaris host with the console command.
-> start /SP/console
Are you sure you want to start /SP/console (y/n)? y
Serial console started. To stop, type #.
c)ontinue, s)ync, r)eboot, h)alt?
Refer to the Integrated Lights Out Manager 2.0 User’s Guide and the Integrated Lights Out Manager (ILOM) Supplement for SPARC Enterprise T5120 and T5220 Servers for details on how to use the break command from the SP CLIs.
Write to Virtual Blade Server Control (VBSC) Illegal Seek (CR 6582340)
When you are connected to the virtual console and enter the escape character sequence to reach the SP CLI, two error messages might be printed before the CLI prompt appears:
read: Connection reset by peer
Write to vbsc: Illegal seek
sc>
This situation occurs when there is a lot of output through the console, and could indicate that the console is in use when it is not.
Workaround: If you are refused write access the next time you initiate a connection to the host with the console command, enter console -f (the force option) to get read and write access.
DIMM FRUs Must Be Presented With the IPMI Interface (CR 6591367)
You cannot obtain the system DIMM FRU information with the ipmitool utility. Obtain the DIMM FRU information with the SP ALOM CMT compatibility CLI (with the showfru command) or the SP ILOM CLI (with the show fru-name command) on the SP. Refer to the Integrated Lights Out Manager 2.0 User’s Guide and the Integrated Lights Out Manager (ILOM) Supplement for SPARC Enterprise T5120 and T5220 Servers for details on the preceding commands to obtain FRU information on the SP.
Intermittent Issue With netsc commit and showsc Hang (CR 6549028)
The netsc_commit command could cause the system to hang. For example:
sc> setsc netsc_ipaddr 12x.14x.4x.1x
sc> showsc netsc_ipaddr
12x.14x.4x.1x
sc> setsc netsc_ipnetmask 255.255.255.0
sc> showsc netsc_ipnetmask
255.255.255.0
sc> setsc netsc_commit true <--- HUNG
Through a serial connection, the network might not be visible.
sc> showsc netsc_ipaddr
12x.14x.4x.1x
sc> shownetwork
SC network configuration is:
^C ^C ^C ^C
IP Address:
Gateway address:
Netmask:
Ethernet Address:
sc> resetsc -y
Performing hard reset on the SC failed
resetsc: No such inventory
sc>
Workaround: Reboot the system:
# init 6
If rebooting fails to reset the SP, AC power-cycle the system to recover the SP. Note that you will lose active domains.
Use of consolehistory -e Results in SP Becoming Unusable (CR 6587869)
Do not use consolehistory -e with a value greater than 1000 lines. If you want to see the entire consolehistory log, use the -v option instead.
If you run consolehistory -e with a value greater than 1000 lines, some SP commands might appear not to function and might report Unknown or Unable to get value errors. If this happens, the only way to recover is to reboot the SP (resetsc from the ALOM CMT compatibility CLI, or reset /SP from the ILOM CLI). Refer to the Integrated Lights Out Manager 2.0 User’s Guide for details on how to reboot the SP.
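A script that issues consolehistory on behalf of a user can enforce the 1000-line limit before the command reaches the SP. The following Python sketch chooses between the -e and -v forms described above. The function name is a hypothetical helper, and only the command string is built.

```python
from typing import Optional

# Largest line count considered safe for consolehistory -e (CR 6587869).
MAX_SAFE_LINES = 1000


def consolehistory_command(lines: Optional[int] = None) -> str:
    """Return a consolehistory invocation that respects the 1000-line limit.

    Passing None requests the entire log, which uses the -v option.
    """
    if lines is None or lines > MAX_SAFE_LINES:
        # Larger requests fall back to -v, as the text recommends.
        return "consolehistory -v"
    if lines < 1:
        raise ValueError("lines must be a positive number")
    return f"consolehistory -e {lines}"
```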
reset /SP With SP ILOM CLI SSH Connection Results in Error Message (CR 6588999)
When you connect to the SP ILOM CLI with SSH and the SP is reset, you could see an error message similar to the following:
Performing hard reset on /SP failed
reset: Transport error - check errno for transport error
This error can be ignored. The command actually succeeds and the SP is reset. When the SP resets, you lose the SSH connection to the SP.
Missing Fan Board Module Output is Displayed in prtdiag and ILOM CLI and BUI (CR 6595955)
If a component is not physically present on a system (such as the fan module), the status field in the prtdiag -v (Environmental Status Section) output shows no value and is blank.
Remove Boot and Run Options From consolehistory Usage (RFE 6510082)
For users familiar with ALOM CMT and using the ALOM CMT compatibility CLI, the help for the consolehistory command displays both a boot and a run option. Both options result in the same output because ILOM makes no distinction between the run and boot buffers.
SP useradd and usershow Errors Followed by Hang (CR 6585114)
During automated testing, the SP could encounter problems with the useradd and usershow commands, and soon afterward all login attempts could fail.
Workaround: AC power-cycle the system.
Cleanup Output Display When Resetting ILOM (CR 6585292)
Some extraneous and misleading warning messages are displayed in the ALOM CMT compatibility CLI resetsc command output.
The following warning messages can be ignored.
sc> resetsc
...
Linux version 2.4.22 (kbellew@sanpen-rh4-0) (gcc version 3.3.4) #2 Wed Jul 18 19:25:18 PDT 2007 r21410
Loading modules: fpga Warning: loading /lib/modules/2.4.22/misc/fpga/fpga.o will taint the kernel: non-GPL license - Proprietary
See http://www.tux.org/lkml/#export-tainted for information about tainted modules
...
Module fpga loaded, with warnings
fpga_flash Warning: loading /lib/modules/2.4.22/misc/fpga_flash/fpga_flash.o will taint the kernel: no license
See http://www.tux.org/lkml/#export-tainted for information about tainted modules
Module fpga_flash loaded, with warnings
immap Warning: loading /lib/modules/2.4.22/misc/immap/immap.o will taint the kernel: no license
Refer to: http://www.tux.org/lkml/#export-tainted for information about tainted modules
Module immap loaded, with warnings
...
EXT3-fs warning: maximal mount count reached, running e2fsck is recommended
EXT3 FS 2.4-0.9.19, 19 August 2002 on tffs(100,1), internal journal
EXT3-fs: mounted filesystem with ordered data mode.
kjournald starting. Commit interval 5 seconds
EXT3-fs warning: maximal mount count reached, running e2fsck is recommended
...
ipt_recent v0.3.1: ... < ... >. http://snowman.net/projects/ipt_recent/
arp_tables: (C) 2002
802.1Q VLAN Support v1.8 ... < ... >
All bugs added by ...

Documentation Errata

Table 2-6 on page 43 of the SPARC Enterprise T5120 and T5220 Server Administration Guide is incorrect. The device identifier /SYS/MB/NETport_number should be replaced with the device identifier /SYS/MB/GBEcontroller_number.
The device description for /SYS/MB/GBEcontroller_number should be as follows:
GBE controllers (Number: 0-1)
GBE0 controls NET0 and NET1
GBE1 controls NET2 and NET3
In other words, replace this row of Table 2-6:
Device Identifiers              Devices
/SYS/MB/NETport_number          Ethernet ports (Number: 0-3)
with this row:
Device Identifiers              Devices
/SYS/MB/GBEcontroller_number    GBE controllers (Number: 0-1)
                                GBE0 controls NET0 and NET1
                                GBE1 controls NET2 and NET3