Sun Microsystems StorEdge™ 5310 NAS User Manual

Sun StorEdge™ 5310 NAS
Troubleshooting Guide
Sun Microsystems, Inc. www.sun.com
Part No. 817-7513-11 August 2004, Revision A
Submit comments about this document at: http://www.sun.com/hwdocs/feedback
Copyright 2004 Sun Microsystems, Inc., 4150 Network Circle, Santa Clara, California 95054, U.S.A. All rights reserved. Sun Microsystems, Inc. has intellectual property rights relating to technology that is described in this document. In particular, and without
limitation, these intellectual property rights may include one or more of the U.S. patents listed at http://www.sun.com/patents and one or more additional patents or pending patent applications in the U.S. and in other countries.
This document and the product to which it pertains are distributed under licenses restricting their use, copying, distribution, and decompilation. No part of the product or of this document may be reproduced in any form by any means without prior written authorization of Sun and its licensors, if any.
Third-party software, including font technology, is copyrighted and licensed from Sun suppliers. Parts of the product may be derived from Berkeley BSD systems, licensed from the University of California. UNIX is a registered trademark in
the U.S. and in other countries, exclusively licensed through X/Open Company, Ltd. Sun, Sun Microsystems, the Sun logo, AnswerBook2, docs.sun.com, Sun StorEdge, Java, and Solaris are trademarks or registered trademarks of
Sun Microsystems, Inc. in the U.S. and in other countries. Mozilla, Netscape, and Netscape Navigator are trademarks or registered trademarks of Netscape Communications Corporation in the United
States and other countries. All SPARC trademarks are used under license and are trademarks or registered trademarks of SPARC International, Inc. in the U.S. and in other
countries. Products bearing SPARC trademarks are based upon an architecture developed by Sun Microsystems, Inc. The OPEN LOOK and Sun™ Graphical User Interface was developed by Sun Microsystems, Inc. for its users and licensees. Sun acknowledges
the pioneering efforts of Xerox in researching and developing the concept of visual or graphical user interfaces for the computer industry. Sun holds a non-exclusive license from Xerox to the Xerox Graphical User Interface, which license also covers Sun’s licensees who implement OPEN LOOK GUIs and otherwise comply with Sun’s written license agreements.
U.S. Government Rights—Commercial use. Government users are subject to the Sun Microsystems, Inc. standard license agreement and applicable provisions of the FAR and its supplements.
DOCUMENTATION IS PROVIDED "AS IS" AND ALL EXPRESS OR IMPLIED CONDITIONS, REPRESENTATIONS AND WARRANTIES, INCLUDING ANY IMPLIED WARRANTY OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE OR NON-INFRINGEMENT, ARE DISCLAIMED, EXCEPT TO THE EXTENT THAT SUCH DISCLAIMERS ARE HELD TO BE LEGALLY INVALID.

Contents

1. Troubleshooting Overview 1
How to Use This Manual 1
Important Notices and Information on the Sun StorEdge 5310 NAS 2
Troubleshooting Tools 3
Troubleshooting Procedures 4
Troubleshooting Flow Charts 6
Diagnostic Information Sources 8
StorEdge Diagnostic Email 8
Data Collection for Escalations 10
Log Error Messages 19
SYSLOG 19
Error Codes from the Sun StorEdge 5310 NAS LCD Display and syslog 20
About SysMon Error Notification 21
Sun StorEdge 5310 NAS Error Messages 21
UPS Subsystem Errors 22
File System Errors 24
PEMS Events 24
Maintenance Precautions 26
Static Electricity Precautions 27
Contents iii
2. NAS Head 1
Hardware 1
Contacting Technical Support 1
Problems With Initial System Startup 2
Resetting the Server 3
Preparing the System for Diagnostic Testing 4
Troubleshooting the Server Using Built-In Tools 10
Diagnosing System Errors 10
LEDs 11
Beep Codes 11
POST Screen Messages 11
LEDs and Pushbuttons 11
Front Panel LEDs and Pushbuttons 12
Rear Panel LEDs 16
Front-Panel System Status LED 18
Rear Panel Power Supply Status LED 20
Server Main Board Fault LEDs 21
System ID LEDs 23
Power-On Self Test (POST) 24
POST Screen Messages 24
POST Error Beep Codes 27
POST Progress Code LED Indicators 30
OS Operations 36
Filesystem Check (fsck) Procedure 36
StorEdge Network Capture Utility 37
Upgrades 38
Cacls - Access Control List 38
Proc filesystem 39
iv Sun StorEdge 5310 NAS Troubleshooting Guide • December 2004
FTP Server 40
Updating the OS on the Sun StorEdge 5310 NAS 40
Sun StorEdge 5310 NAS Firmware 40
Operating System 40
Common Problems Encountered on the Sun StorEdge 5310 NAS 42
CIFS/SMB/Domain 43
NFS Issues 61
Network Issues 66
NIC speed and duplex negotiation issues. 67
File System Issues 70
Drive Failure Messages 74
File and Volume Operations 76
Administration Interfaces 78
StorEdge Features and Utilities 82
Hardware Warning Messages 84
Backup Issues 88
Direct Attached Tape Libraries 90
Frequently Asked Questions 92
CIFS/SMB/Domain Issues 92
NIS/NIS+ Issues 104
TCP/IP and Network Configuration 106
Quota Configuration 109
Checkpoint Configuration 115
Volume Creation and Expansion 120
Reserved Filesystems and Directories 123
NFS Issues 124
Administration Interfaces and Utilities 128
Backup and Migration Issues 142
Macintosh Connectivity 146
Miscellaneous Log Messages 147
Direct Attached Tape Libraries 148
SCSI ID Settings 148
StorEdge File Replicator 149
StorEdge File Replicator Issues 152
3. Storage Arrays 1
Fibre Channel FC 1
Array Overview 1
Using the Array 8
Troubleshooting and Recovery 22
Troubleshooting the Module 22
Recovering from an Overheated Power Supply 26
Setting the Tray ID Switch 29
Verifying the Link Rate Setting 30
Relocating a Command Module 31
Upgrade Requirements 31
Adding New Drives to Empty Slots 33
Replacing All Drives at the Same Time 36
Replacing One Drive at a Time 39
Relocation Considerations 43
Raid Storage Manager (RSM) 44
Updating Firmware and NVSRAM on the Array 95
Updating ESM Firmware 99
4. StorEdge File Replicator 1
Overview 1
Real-time Mirroring 3
Pseudo Real-time Mirroring 3
StorEdge File Replicator 3
Mirroring Variations 7
Operational State 9
Mirror Creation 10
Mirror Replication 11
Mirror Sequencing 12
Link Down and Idle Conditions 12
Cracked and Broken Mirrors 12
Cannot perform first-time synchronization of mirror system: 13
Filesystem errors, such as run check, directory broken, etc.: 13
Error messages, panics or hang condition when enabling mirror: 13
5. Clustering 1
Overview 1
6. Checkpoints/Snapshots 1
Overview 1
Volumes 1
Checkpoint Lifecycle 3
Object Checkpoint Restore 16
StorEdge cp Command 17
7. FRU/CRU Replacement Procedures 1
Tools and Supplies Needed 1
Determining a Faulty Component 2
Safety: Before You Remove the Cover 2
Removing and Replacing the Cover 2
Field Replaceable Unit (FRU) Procedures 4
NAS Head FRU Replacement Procedures 4
Opening the Front Bezel 5
Memory 6
Power Supply Unit 7
Fan Module 9
High Profile Riser PCI Cards 12
Gigabit Ethernet Card 13
Low Profile Riser PCI Cards 15
Qlogic HBA Removal and Replacement 16
LCD Display Module 17
Flash Disk Module 18
System FRU (Super FRU) 22
Array FRU replacement Procedures 23
Replacing a Controller 23
Replacing a Controller Battery 29
Replacing a Drive 36
Replacing a Fan 39
Replacing a Power Supply 41
Replacing an SFP Transceiver 44
Tables
TABLE 1-1 List of Adapters 16
TABLE 1-2 Routing Table 16
TABLE A-3 UPS Error Messages 22
TABLE A-4 File System Errors 24
TABLE A-5 PEMS Error Messages 24
TABLE 2-1 Index to Problems 4
TABLE 2-2 Bootup Beep Codes 6
TABLE 2-3 Server LEDs 11
TABLE 2-4 Front Panel LEDs 13
TABLE 2-5 Front Panel Pushbuttons 15
TABLE 2-6 Rear Panel LEDs 16
TABLE 2-7 System Status LED States 18
TABLE 2-8 Power Supply Status LED States 20
TABLE 2-9 Standard POST Error Messages and Codes 24
TABLE 2-10 Extended POST Error Messages and Codes 26
TABLE 2-11 BMC-Generated POST Beep Codes 27
TABLE 2-12 BIOS-Generated Boot Block POST Beep Codes 28
TABLE 2-13 Memory 3-Beep and LED POST Error Codes 29
TABLE 2-14 BIOS Recovery Beep Codes 30
TABLE 2-15 Boot Block POST Progress LED Code Table (Port 80h Codes) 31
TABLE 2-16 POST Progress LED Code Table (Port 80h Codes) 32
TABLE 2-17 Status LED Indicators 87
TABLE 2-18 Supported Tape Libraries and Tape Drives 148
TABLE 3-1 Lights on the Back of a Command Module 14
TABLE 3-2 Lights on the Front of a Command Module 23
TABLE 3-3 Lights on the Back of a Command Module 24
TABLE 3-4 Enterprise Management Window Menus 48
TABLE 3-5 Enterprise Management Window Toolbar Buttons 49
TABLE 3-6 Array Management Window Tabs 52
TABLE 3-7 Array Management Window Menus (1 of 2) 53
TABLE 3-8 Array Management Window Toolbar Buttons 54
TABLE 3-9 RAID Level Configurations 58
TABLE 3-10 Mappings View Tab 69
TABLE 3-11 Volume-to-LUN Terminology 69
TABLE 3-12 Storage Array Status Icon Quick Reference 86
TABLE 4-1 Standard Terms 2
Figures
FIGURE 2-1 Front Panel Pushbuttons and LEDs 13
FIGURE 2-2 Rear Panel LEDs 16
FIGURE 2-3 Location of Front-Panel System Status LED 18
FIGURE 2-4 Location of Rear-Panel Power Supply Status LEDs 20
FIGURE 2-5 Fault and Status LEDs on the Server Board 21
FIGURE 2-6 Location of Front-Panel ID Pushbutton and LED 23
FIGURE 2-7 Examples of POST LED Coding 31
FIGURE 2-8 The Update Software Panel 41
FIGURE 3-1 Controller 2
FIGURE 3-2 Label Locations on the Controller 3
FIGURE 3-3 Battery Charging/Charged and Cache Active Lights 4
FIGURE 3-4 Drives and Lights 4
FIGURE 3-5 Drive Numbering – Rackmount Module 5
FIGURE 3-6 Fans and Airflow 5
FIGURE 3-7 Power Supplies 6
FIGURE 3-8 SFP Transceiver and fibre Optic Cable 7
FIGURE 3-9 Tray ID Switch 8
FIGURE 3-10 Removing and Replacing a Deskside Module Back Cover 9
FIGURE 3-11 Power Supply Switches 10
FIGURE 3-12 Lights on the Back of a Command Module 13
FIGURE 3-13 Alarm Mute Button 20
FIGURE 3-14 Lights on the Front of a Command Module 23
FIGURE 3-15 Lights on the Back of a Command Module 24
FIGURE 3-16 Power Supply Switches 28
FIGURE 3-17 Setting the Tray ID Switch 30
FIGURE 3-18 Verifying the Link Rate Setting 31
FIGURE 3-19 Removing and Installing a Drive 35
FIGURE 3-20 Power Supply Switches 38
FIGURE 3-21 Removing and Installing a Drive 38
FIGURE 3-22 Removing and Installing a Drive 42
FIGURE 3-23 Enterprise Management Window 45
FIGURE 3-24 Array Management Window 45
FIGURE 3-25 Enterprise Management Window 46
FIGURE 3-26 Device Tree Example 47
FIGURE 3-27 Array Management Window 51
FIGURE 3-28 Unconfigured and Free Capacity Nodes 66
FIGURE 3-29 Mappings View Window 68
FIGURE 3-30 SANshare Storage Partitioning Example 73
FIGURE 3-31 Host Port Definitions Dialog 75
FIGURE 3-32 Heterogeneous Hosts Example 76
FIGURE 3-33 DVE Modification Operation in Progress 79
FIGURE 3-34 Persistent Reservations Dialog 83
FIGURE 3-35 Monitoring Storage Array Health Using the Enterprise Management Window 85
FIGURE 3-36 Event Monitor Configuration 87
FIGURE 3-37 Event Monitor Example 88
FIGURE 3-38 Problem Notification in the Array Management Window 91
FIGURE 3-39 Displaying the Recovery Guru Window 92
FIGURE 3-40 Recovery Guru Window Example 93
FIGURE 3-41 Status Changes During an Example Recovery Operation 94
FIGURE 3-42 Status Changes When The Example Recovery Operation is Completed 95
FIGURE 4-1 The lifecycle of a transaction in StorEdge File Replicator 4
FIGURE 4-2 Write ordering on the Mirror 5
FIGURE 4-3 Lost transaction handling on the Mirror 6
FIGURE 4-4 The Mirror Log and Primary Journal 7
FIGURE 6-1 Physical and Logical Volume Relationship 2
FIGURE 6-2 The Copy-On-Write Mechanism for Checkpoints 4
FIGURE 6-3 Mappings for Block n Before Modification 5
FIGURE 6-4 Mappings for Block n After Modification 6
FIGURE 6-5 Creating a hardlink when a volume is checkpointed and has active checkpoints 8
FIGURE 6-6 Mappings for Block n After Deleting ckpti-1 10
FIGURE 6-7 After Deleting ckpti+1. 10
FIGURE 6-8 Accessing .chkpnt in UNIX 13
FIGURE 6-9 Accessing ".chkpnt" in Windows Explorer 15
FIGURE 6-10 Viewing ".chkpnt" in Windows Explorer 16
FIGURE 6-11 Sharing Blocks Between Live and Checkpoint File Systems 17
FIGURE 6-12 Windows File Copy Error Message During a Checkpoint Restore Operation 19
FIGURE 6-13 Windows Excel Open Error Message During a Checkpoint Restore Operation 19
FIGURE 7-1 Removing the Cover 3
FIGURE 7-2 Sun StorEdge 5310 NAS Bezel Replacement 5
FIGURE 7-3 Sun StorEdge 5310 NAS Expansion Unit 6
FIGURE 7-4 Replacing the Power Supply 8
FIGURE 7-5 Removing the Fan Module 10
FIGURE 7-6 The Gigabit Ethernet Card in the Low Profile Riser Slot 14
FIGURE 7-7 Connecting the LCD Display 18
FIGURE 7-8 The Flash Disk 20
FIGURE 7-9 Removing an SFP Transceiver and fibre Optic Cable 25
FIGURE 7-10 Removing and Replacing a Controller 25
FIGURE 7-11 Removing the Controller Cover (Upside Down View) 26
FIGURE 7-12 Replacing the Controller Battery 27
FIGURE 7-13 Label Locations for the Controller 28
FIGURE 7-14 Controller Host Link, Drive Link, and Fault Lights 29
FIGURE 7-15 Removing the SFP Transceiver and fibre Optic Cable 31
FIGURE 7-16 Removing and Replacing a Controller 31
FIGURE 7-17 Removing the Controller Cover (Upside Down View) 33
FIGURE 7-18 Removing and Installing the Controller Battery 33
FIGURE 7-19 Label Locations on the Controller 34
FIGURE 7-20 Drive Link, Host Link, Battery, and Fault Lights 36
FIGURE 7-21 Replacing a Drive 38
FIGURE 7-22 Replacing a Fan 40
FIGURE 7-23 Replacing a Power Supply 43
FIGURE 7-24 Replacing an SFP Transceiver 45

Preface

This Troubleshooting Guide provides information on how to identify, isolate, and fix problems with the Sun StorEdge™ 5310 NAS. It also explains how to remove and replace certain key server components.
Topics in this chapter include:
“Who Should Use This Book” on page xvi
“How This Manual is Organized” on page xvi
“Typographic Conventions” on page xvi
“Related Documentation” on page xvii
“Ordering Sun Documents” on page xvii
“Shell Prompts in Command Examples” on page xviii
“Sun Welcomes Your Comments” on page xviii
Who Should Use This Book
The intended audience for this book is Sun field service personnel who are responsible for maintaining Sun StorEdge 5310 NAS.
How This Manual is Organized
This manual contains the following chapters:
Chapter 1, “Troubleshooting Overview” on page 1-1
Chapter 2, “NAS Head” on page 2-1
Chapter 3, “Storage Arrays” on page 3-1
Chapter 4, “StorEdge File Replicator” on page 4-1
Chapter 5, “Clustering” on page 5-1
Chapter 6, “Checkpoints/Snapshots” on page 6-1
Chapter 7, “FRU/CRU Replacement Procedures” on page 7-1
Typographic Conventions
The following table describes the typographic conventions used in this book.
TABLE P-1 Typographic Conventions
Typeface or Symbol: courier font
Meaning: Names of commands; names of files; on-screen computer output
Example: Use ls -a to list all files. Edit your .login file. machine_name% You have mail.
Typeface or Symbol: italics
Meaning: Book titles, new words; terms to be emphasized; variables that you replace with a real value
Example: Read Chapter 6 in the User’s Guide. These are called class options. You must be root to do this. To delete a file, type rm filename.
Typeface or Symbol: boldface courier font
Meaning: What you type
Example: machine_name% su
Related Documentation
These documents contain information related to the tasks described in this book:
Sun StorEdge 5310 NAS Quick Reference Manual
Sun StorEdge 5310 NAS Hardware Installation, Configuration, and User Guide
Sun StorEdge 5310 NAS Software Installation, Configuration, and User Guide
Sun StorEdge 5310 NAS Setup Poster
Ordering Sun Documents
The SunDocs℠ program provides more than 250 manuals from Sun Microsystems, Inc. If you are in the United States, Canada, Europe or Japan, you can purchase documentation sets or individual manuals by using this program.
For a list of documents and how to order them, see the catalog section of the SunExpress™ Internet site at http://store.sun.com.
Accessing Sun Documentation Online
The http://docs.sun.com Web site enables you to access the Sun technical documentation online. You can browse the docs.sun.com archive or search for a specific book title or subject.
Preface xvii
Shell Prompts in Command Examples
The following table shows the default system prompt and superuser prompt for the C, Bourne, and Korn shells.
TABLE P-2 Shell Prompts
Shell                                             Prompt
Bourne shell and Korn shell prompt                machine_name$
Bourne shell and Korn shell superuser prompt      machine_name#
Sun Welcomes Your Comments
Sun is interested in improving its documentation and welcomes your comments and suggestions. You can email your comments to Sun at:
docfeedback@sun.com
Please include the part number (8xx-xxxx-xx) of your document in the subject line of your email.
CHAPTER
1

Troubleshooting Overview

This chapter provides an overview of diagnostic functions and tools needed for troubleshooting the Sun StorEdge 5310 NAS.
This chapter contains the following sections:
“How to Use This Manual” on page 1-1
“Important Notices and Information on the Sun StorEdge 5310 NAS” on page 1-2
“Diagnostic Information Sources” on page 1-8

1.1 How to Use This Manual

Before working through this manual, check the following to rule out common problems.
Are both of the power cords plugged in?
Are green LEDs displaying on the power sources? If no, check the power source.
Does the LCD Display panel show the system name and CPU% on it? If no, check the power source.
Can you ping the system? If no, check the network cables and the IP address on the LCD Display. If you are still having problems, check with your system administrator.
If the user can’t access shares, are the shares set up on the system? Check the shares section to make sure that the shares are set up with the proper name.
Is an NFS client having permissions issues on a CIFS file, or vice versa? Check the FAQ for file permission issues to resolve the problem.
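The reachability question in the checklist above can be scripted from an administration workstation. The following is a minimal sketch, not a Sun-supplied tool. The address 192.0.2.10 is a placeholder, the helper names are hypothetical, and the use of TCP port 23 assumes that Telnet access to the CLI is enabled on its standard port.

```python
import platform
import socket


def ping_command(host: str, count: int = 3) -> list:
    """Build the argument list for the operating system's ping utility."""
    # Windows uses -n for the packet count; Linux, macOS, and the BSDs use -c.
    # Other systems may take different arguments.
    count_flag = "-n" if platform.system() == "Windows" else "-c"
    return ["ping", count_flag, str(count), host]


def tcp_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    import subprocess

    nas_host = "192.0.2.10"  # placeholder; use the IP address shown on the LCD display
    result = subprocess.run(ping_command(nas_host), capture_output=True, text=True)
    print("ping:", "reachable" if result.returncode == 0 else "no reply")
    # Port 23 is the standard Telnet port (assumed to be enabled here).
    print("telnet port:", "open" if tcp_port_open(nas_host, 23) else "closed or filtered")
```

If ping fails but the TCP check succeeds, ICMP may simply be filtered on the network path, so treat the two results as separate pieces of evidence.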

1.2 Important Notices and Information on the Sun StorEdge 5310 NAS

Caution – Do not plug a USB keyboard into the front USB connector. Doing so causes the system to crash.
Caution – Do not power on the Sun StorEdge 5310 NAS until two minutes after the JBOD has been powered up, to ensure that the disk drives have finished spinning up.
Caution – The /dvol/etc folder contains configuration information and must be backed up to ensure that all configuration information is available after a failure. Back up the /dvol/etc folder to an existing LUN on the Sun StorEdge 5310 NAS.
Note – You must enable FTP from the CLI using the load ftpd command. Currently, enabling FTP from the web interface does not work.
Note – When configuring the Sun StorEdge 5310 NAS through a firewall, ensure that the required ports are not blocked. Refer to “StorEdge Web Admin does not work properly through a firewall.” on page 2-80 for more details.
Note – A strip of tape must be removed before the fan tray can be removed.
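The /dvol/etc backup called for in the caution above can be scheduled from an administration host. The sketch below is one possible approach under stated assumptions: it assumes that /dvol/etc and the destination volume are both reachable as mounted paths on that host, and both mount points shown are hypothetical. It does not describe a built-in StorEdge backup feature.

```python
import tarfile
import time
from pathlib import Path


def backup_directory(source: Path, dest_dir: Path) -> Path:
    """Archive `source` into a timestamped .tar.gz file inside `dest_dir`."""
    if not source.is_dir():
        raise FileNotFoundError(f"{source} is not a directory")
    dest_dir.mkdir(parents=True, exist_ok=True)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    archive = dest_dir / f"{source.name}-backup-{stamp}.tar.gz"
    with tarfile.open(archive, "w:gz") as tar:
        # arcname stores paths relative to the folder name, not the full mount path.
        tar.add(source, arcname=source.name)
    return archive


if __name__ == "__main__":
    # Both paths are hypothetical mount points on an administration host.
    src = Path("/mnt/nas/dvol/etc")
    dst = Path("/mnt/nas/vol1/config-backups")
    print("wrote", backup_directory(src, dst))
```

A timestamped archive name keeps earlier backups intact, which helps when a configuration change itself is the cause of a failure.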

1.3 Troubleshooting Tools

1.3.0.1 Storage Automated Diagnostic Environment (StorAde)

If you have the Storage Automated Diagnostic Environment installed in the host, check the internal status of the array with this tool. See the documentation for this tool for further information.
All that you need to use the Storage Automated Diagnostic Environment is web browser access to the host where it is installed.

1.3.0.2 Command Line Interface (CLI)

The CLI can be accessed through the MENU system or by using Telnet. The CLI is useful for troubleshooting many types of issues. It is also where you load tools like FTP. See the Diagnostic Tools and Procedures section for details.

1.3.0.3 Log Error Messages

Both the Sun StorEdge 5310 NAS and attached hosts create log files that record error messages, system conditions, and events. These log files are the most useful immediate tools for troubleshooting.

1.3.0.4 Sun StorEdge 5310 NAS Generated Messages

A syslog daemon in the array writes system error message logs to a location determined by the site system administrator. Consult with the site system administrator to obtain access to this log.

1.3.0.5 Client Generated Messages

CIFS clients that have attached shares on the Sun StorEdge 5310 NAS display messages on the monitor. These messages are useful in determining the cause of issues that arise.
NFS clients log messages in their /var/adm/messages file.
Chapter 1 Troubleshooting Overview 1-3
A variety of software logging tools monitor the various branches of the storage network. When an error is detected, the error’s severity level is categorized and classified. Errors are reported or logged according to severity level.

1.3.0.6 Log Message Severity Levels

Emergency—Specifies emergency messages. These messages are not distributed to all users. Emergency priority messages are logged into a separate file for reviewing.
Alert—Specifies important messages that require immediate attention. These messages are distributed to all users.
Critical—Specifies critical messages not classified as errors, such as hardware problems. Critical and higher-priority messages are sent to the system console.
Error—Specifies any messages that represent error conditions, such as an unsuccessful disk write.
Warning—Specifies any messages for abnormal, but recoverable, conditions.
Notice—Specifies important informational messages. Messages without a priority designation are mapped into this priority message.
Information—Specifies informational messages. These messages are useful in analyzing the system.
Debug—Specifies debugging messages.
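When reviewing collected logs, it can help to treat the severity levels above as an ordered scale and filter on a threshold. The sketch below illustrates the idea. The numeric values follow the conventional syslog numbering (0 for the most severe through 7 for debug), which is an assumption about how these levels are encoded on the NAS, and the sample messages are made up.

```python
from enum import IntEnum


class Severity(IntEnum):
    """Severity levels in the order listed above; lower numbers are more severe."""
    EMERGENCY = 0
    ALERT = 1
    CRITICAL = 2
    ERROR = 3
    WARNING = 4
    NOTICE = 5
    INFORMATION = 6
    DEBUG = 7


def at_least(entries, threshold):
    """Keep (severity, message) pairs that are as severe as `threshold` or worse."""
    return [(sev, msg) for sev, msg in entries if sev <= threshold]


sample = [
    (Severity.NOTICE, "volume mounted"),          # made-up message text
    (Severity.ERROR, "disk write unsuccessful"),  # made-up message text
    (Severity.CRITICAL, "fan failure"),           # made-up message text
]
for sev, msg in at_least(sample, Severity.ERROR):
    print(f"{sev.name:<11} {msg}")
```

Filtering at Error first and widening to Warning or Notice only when needed keeps the initial review short.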

1.4 Troubleshooting Procedures

1.4.0.1 High-Level Troubleshooting Tasks

This section lists the high-level steps you can take to isolate and troubleshoot problems in the array. It offers a methodical approach, and lists the tools and resources available at each step.
1. Discover the error by checking one or more of the following messages or files:
Storage Automated Diagnostic Environment alerts or email messages, if available
“event log” from the Sun StorEdge 5310 NAS
/var/adm/messages file at the host system
CIFS clients messages
2. Determine the extent of the problem by using one or more of the following methods:
Review the Storage Automated Diagnostic Environment topology view
Using the Storage Automated Diagnostic Environment revision checking functionality, determine whether the package or patch is installed
3. Check the status of a Sun StorEdge 5310 NAS by using one or more of the following methods:
Review the status of the light-emitting diodes (LEDs) on the array
Run the commands that check and display the configuration
Manually open a telnet session to the array and check the system status
Review the Storage Automated Diagnostic Environment device monitoring reports, if available
4. Test and isolate field-replaceable units (FRUs) using the following tools:
Storage Automated Diagnostic Environment diagnostic tests, if available (these tests might require a loopback cable for isolation)
The procedures in this Troubleshooting Guide, to help isolate FRU failures
Note – These tests isolate the problem to a FRU that must be replaced. Follow the instructions in the Sun StorEdge 5310 NAS Troubleshooting Guide for proper FRU replacement procedures.
5. Replace the failed FRU.
6. Verify the fix using the following tools:
Storage Automated Diagnostic Environment GUI Topology View and Diagnostic Tests, if available
/var/adm/messages on the data host
CIFS client access
Array LEDs
syslog file

1.4.0.2 Initial Troubleshooting Guidelines

To begin a problem analysis, check one or more of the following information sources:
The LEDs can help you quickly identify whether a problem is occurring. See the Hardware Troubleshooting section to help isolate the failed component.
Sun StorEdge 5310 NAS messages, found in the syslog file, that indicate a problem. See the Error Messages section for more information about array-generated messages.
Host-generated messages, found in the /var/adm/messages file. CIFS clients may also show errors on their monitor or in the event log.
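A first pass over the host-side /var/adm/messages file mentioned above can be automated with a simple keyword scan. The following is a rough sketch only: the keyword list is an illustrative assumption, not a list of messages documented for this product, and it should be adjusted to the messages actually seen at the site.

```python
import re
from pathlib import Path

# Keywords are illustrative assumptions; adjust them to the site's log content.
PATTERN = re.compile(
    r"\b(error|fail(ed|ure)?|panic|timed out|not responding)\b", re.IGNORECASE
)


def suspicious_lines(lines):
    """Yield (line_number, text) for lines that match the keyword pattern."""
    for number, line in enumerate(lines, start=1):
        if PATTERN.search(line):
            yield number, line.rstrip("\n")


if __name__ == "__main__":
    log_path = Path("/var/adm/messages")  # host-side log named in the text above
    with log_path.open(errors="replace") as handle:
        for number, text in suspicious_lines(handle):
            print(f"{number}: {text}")
```

The line numbers make it easier to quote the surrounding context when the findings are added to an escalation.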

1.5 Troubleshooting Flow Charts

Use the flow charts below to diagnose problems.
Follow the steps below to diagnose hardware problems.
Follow the steps below to diagnose software problems.

1.6 Diagnostic Information Sources

1.6.1 StorEdge Diagnostic Email

The diagnostic email includes information about the StorEdge system configuration, disk subsystem, file system, network configuration, SMB shares, backup/restore information, /etc information, system log, environment data and administrator information. The diagnostics are a primary tool for checking configuration and troubleshooting.
Before you can send email diagnostics from the StorEdge, SMTP (email) must be configured. Please see the FAQ, “How do I set up SMTP (email)?”
To collect diagnostics, proceed as follows:
1. Access the StorEdge via Telnet or serial console.
2. Press [Enter] at the [menu] prompt and enter the administrator password.
3. Press the spacebar until “Diagnostics” is displayed under “Extensions” at the lower right.
4. Select the letter corresponding to “Diagnostics”.
5. Wait a few seconds while the StorEdge builds the diagnostic.
6. Select option “2”, Send Email.
7. Select option “1”, Edit problem description.
8. Enter a precise description of the problem.
9. Press [Enter].
10. Select option “8”, Send Email.
The diagnostic is sent.
If an email server is not configured or not available, it is also possible to save the diagnostics to a file on the StorEdge. To do this, proceed as above to access the “Diagnostics” menu.
1. Select option “1”, Save File.
2. Select option “1”, Edit path.
3. Enter a valid path name in the path box. The format is /<volumename>/<directory>/<new filename>.
4. Press [Enter].
5. Select option “2”, Save diagnostics file.
The system responds that the diagnostic has been saved.
6. Access the volume where you saved the file using SMB or NFS.
7. Copy the file to a local workstation.
Important – Saving the diagnostics file locally does not include the necessary attachments. When escalating an issue with diagnostics, you must also include the contents of the /etc directory and the contents of /cvol/log.
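The path format given in the save procedure above, /<volumename>/<directory>/<new filename>, can be checked before it is typed into the menu. The sketch below is a hypothetical helper for that purpose. It assumes that nested directories are acceptable, which the text above does not confirm, and the example names are placeholders.

```python
from pathlib import PurePosixPath


def split_diag_path(path: str):
    """Split '/<volume>/<directory>/<file>' into its three parts or raise ValueError."""
    pure = PurePosixPath(path)
    parts = pure.parts
    # parts[0] is "/" for an absolute path; a volume, a directory, and a file
    # name must follow it.
    if not pure.is_absolute() or len(parts) < 4:
        raise ValueError("expected /<volumename>/<directory>/<new filename>")
    volume = parts[1]
    directory = "/".join(parts[2:-1])
    filename = parts[-1]
    return volume, directory, filename


if __name__ == "__main__":
    # Placeholder volume, directory, and file names.
    print(split_diag_path("/vol1/diag/nas-diag.txt"))
```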
This functionality is also available through the StorEdge Web Admin. To access these settings, log in, and click the envelope icon on the top taskbar. All of the options described above are available.

1.6.2 Data Collection for Escalations

1.6.2.1 Collecting Information from the Sun StorEdge 5310 NAS
The following are important considerations for data collection. Data collection is critical in cases that require escalation. Always collect enough data to resolve the worst-case scenario, because that data will also cover every lesser scenario. The worst case is an issue that has never been seen before and must be recreated in the lab. Recreating it requires information about the client systems, the workload, the network, and so on.
1.6.2.2 Accurately quantify the problem
First, the problem must be quantified. A negative behavior of some type has been identified; to resolve it, precisely identify the scope of the problem and all available details. For example, if the StorEdge has a performance issue, measure the performance exactly, identify which clients or operations exhibit the problem, and determine under what circumstances the problem occurs.
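For a performance complaint, a repeatable measurement gives the escalation a number to work with. The following sketch times a sequential write to a mounted share from a client. It is an illustration only: the mount point is hypothetical, the 16 MiB default size is arbitrary, and a single sequential write does not represent every workload.

```python
import os
import tempfile
import time


def measure_write_throughput(directory: str, size_mb: int = 16) -> float:
    """Write size_mb MiB to a temporary file in `directory`; return MiB per second."""
    block = b"\0" * (1024 * 1024)
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        start = time.perf_counter()
        with os.fdopen(fd, "wb") as handle:
            for _ in range(size_mb):
                handle.write(block)
            handle.flush()
            os.fsync(handle.fileno())  # ask the OS to push the data out of its cache
        elapsed = time.perf_counter() - start
    finally:
        os.remove(path)  # remove the test file whether or not the write succeeded
    return size_mb / elapsed


if __name__ == "__main__":
    share = "/mnt/nas/vol1"  # hypothetical mount point of a share on the NAS
    for run in range(3):
        print(f"run {run + 1}: {measure_write_throughput(share):.1f} MiB/s")
```

Running the same measurement from several clients, and at different times of day, helps establish which systems and circumstances exhibit the problem.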
1.6.2.3 Collect general data
The first part of the data collection is to collect information that will be useful in every case. Much of this is contained in the StorEdge system diagnostics. From the diagnostics, we can see the StorEdge OS version, internal settings, recent log activity, and more. It is very important to generate the diagnostics during or immediately after the manifestation of the problem. Otherwise, the log and statistics will not show any data on the failure. Always collect a diagnostic email when escalating issues.
Also collect any error messages generated by the problem, a list of the steps already taken to resolve it, and the results of those steps.
1.6.2.4 Collect specific data
Based on the above data, additional information may be required. This document will help you to tailor this data collection. Here are some examples:
Version(s) of software on client system(s)
Version(s) of software on server system(s)
Network topology
Steps and/or sequence of events leading to the failure
What was the user doing or attempting to do when the failure occurred?
Problem symptom (error codes, failed operation, crash)
Syslog data
Network traces
Diagnostic email
1.6.2.5 Check remote access capabilities
In some cases, it is useful for one of your escalation resources to directly access the system. This can be a way to greatly simplify advanced data collection. Please note that this step is not always necessary or useful, but it can be a very valuable tool at times. When you know that advanced investigation will be required, it’s always wise to ask if remote access via TCP/IP or dial-up is available.
1.6.2.6 Data Collection for Specific Issues
Software compatibility issues
Some applications do not function properly when StorEdge is used in place of a server running a native operating system. Most, but not all, of these issues can be resolved with data collection and troubleshooting. It may be necessary to upgrade the application, the client operating system, or the StorEdge operating system. Keep in mind that the problem may lie in any of these, or a combination of all three.
The first step is to do research. Check to see if a newer version of the application or the StorEdge operating system is available. Check the release notes to see if the compatibility issue is addressed. If either version is far out of date, perform an upgrade to see if the problem is resolved. Another useful step is to try the operation on other available network clients.
To escalate the issue, begin data collection by generating a system diagnostic with all attachments. If there is a specific symptom which can be identified, generate the system diagnostic as close as possible after this time, so that any effects can be observed in the logs and statistics.
The procedure for this can be found later in this document under Diagnostic Procedures. Next, it is necessary to collect as much data as possible on the client and application. At a minimum, the following information is required:
Client operating system version, including any service packs or minor revisions
Software version, including any service packs, options, or minor revisions
Client configuration information: mount options, NIC configuration, platform, etc.
Network information: topology, switch and router information, path from client to StorEdge
Server information: detailed information on any application or authentication servers, including all of the above details
An exact set of steps to reproduce the problem. This should be very detailed, including every menu selection and text entry
Details on any symptoms experienced by the client
The goal of this data set is to allow someone in a remote location to reproduce and resolve the issue without impacting the customer.
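One way to keep track of this minimum data set is a simple checklist. The sketch below is illustrative only: the field names are hypothetical labels for the items listed above and are not part of any StorEdge tool.

```python
# Hypothetical field names for the minimum data set listed above.
REQUIRED_FIELDS = [
    "client_os_version",
    "software_version",
    "client_configuration",
    "network_information",
    "server_information",
    "reproduction_steps",
    "client_symptoms",
]


def missing_fields(collected):
    """Return the required items that are absent or empty in `collected`."""
    return [name for name in REQUIRED_FIELDS if not collected.get(name)]


# Example: an escalation record that is still incomplete (values are made up).
record = {
    "client_os_version": "example client OS, service pack 2",
    "reproduction_steps": "1. Mount the share. 2. Open the file. 3. Save.",
}
print(missing_fields(record))
```

Running the check before escalating shows at a glance which items still need to be gathered from the customer.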
The next step is to verify the problem and collect network traces. If possible, copy the data residing on the StorEdge to another server temporarily. Verify that it works as expected. If it still exhibits the same symptom, the issue likely resides with the application.
Use a network capture utility to capture the network traffic generated by the failure condition between the client, the StorEdge and any other server involved in the issue. Define traffic filters so that only this traffic is captured.
Next, repeat the network capture using the server on which the application runs successfully. This will allow engineering to make a direct comparison of a successful operation and an unsuccessful operation.
StorEdge has a built-in network monitoring tool. Details on the operation of this tool can be found in the Diagnostic Procedures section of this document. However, in this case it would be best to use a network analysis tool on the client. The main reason for this is that the StorEdge tool will not be able to capture the data when an alternate server is used for comparison.
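As a rough illustration of defining a traffic filter on the client, the sketch below builds a filter expression that matches only traffic between the hosts involved. It assumes the client-side capture tool accepts pcap filter syntax (as tcpdump does); the function name and the IP addresses are hypothetical.

```python
from itertools import combinations


def build_capture_filter(hosts):
    """Build a pcap-style filter matching traffic between any two listed hosts."""
    pairs = combinations(hosts, 2)
    clauses = [f"(host {a} and host {b})" for a, b in pairs]
    return " or ".join(clauses)


# Hypothetical addresses: client, StorEdge, and an authentication server.
print(build_capture_filter(["10.10.35.20", "10.10.35.2", "10.10.35.5"]))
```

The same expression can be reused unchanged for the comparison capture by swapping the StorEdge address for that of the alternate server.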
1.6.2.7 Security Issues
When troubleshooting security problems, it is useful to experiment. Try other workstations, other operating systems and different user accounts, including a root or a Domain Admin account. These are very useful in locating the source of the problem.
When escalating a security issue collect the following data:
Cacls
For issues with access to a file or directory, collect the output of the cacls command. This command is available from the CLI. At the CLI, enter “cacls <full pathname>”. The full pathname should begin with the volume name, as in this example: “cacls /vol1/testfile.txt”.
Cacls output contains the following information:
First, the basic mode information and UID/GID of the owner is displayed. Here is an example:
drwxrw---- 34 22 /vol1/data
In this case, we can see that the item is a directory, with 760 permissions: Read/write/execute (7) for the owner (UID 34), Read/write (6) for members of the owner’s group (GID 22), and no permissions (0) for everyone else.
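The mapping from a mode string to its octal digits can be checked mechanically. The following is a minimal sketch, assuming the standard r=4, w=2, x=1 weighting; the function name is hypothetical and special bits (such as setuid or sticky) are not handled.

```python
def mode_to_octal(mode):
    """Convert a mode string such as 'drwxrw----' to its octal digits."""
    perms = mode[1:10]  # skip the leading file-type character
    digits = ""
    for start in (0, 3, 6):
        r, w, x = perms[start:start + 3]
        digits += str(
            (4 if r == "r" else 0) + (2 if w == "w" else 0) + (1 if x == "x" else 0)
        )
    return digits


print(mode_to_octal("drwxrw----"))  # prints 760
```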
Listed next are Creation time, FS Creation time, and FS mtime. These are timestamps associated with the file and the filesystem, generally only useful for troubleshooting timestamp issues.
Next is the Windows security descriptor. In its simplest form, it will read “No security descriptor”. This means that no Windows security is present, and that Windows will simulate security based on the above NFS permissions.
If a Windows security descriptor is present, the following information is displayed:
Security Descriptor: The type of security descriptor. This can be disregarded.
Owner: The user name or SID of the owner.
Primary Group: The group name or SID of the group owner.
Discretionary Access Control List (DACL): A list of users who have access to the file, by SID.
A SID is a number that uniquely identifies a user or group. The data to the right of the final dash identifies the user within the domain and is known as the RID (relative ID); the rest of the number indicates domain and type of account information. The RID is the number used for user mapping. It can be cross-referenced with the StorEdge user or group mapping data to determine the user/group name and NFS UID/GID.
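Separating the RID from the rest of a SID is a simple string split on the final dash. The sketch below shows this; the function name is hypothetical and the SID is a made-up value for illustration only.

```python
def split_sid(sid):
    """Split a SID string into (domain_part, rid) at the final dash."""
    domain_part, _, rid = sid.rpartition("-")
    return domain_part, int(rid)


# Made-up SID for illustration only.
domain_part, rid = split_sid("S-1-5-21-1111111111-2222222222-3333333333-1013")
print(domain_part, rid)
```

The resulting RID is the value to look up in the StorEdge user or group mapping data.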
User access token
For issues with the access of a particular user, it may be useful to capture the access token. The access token identifies an SMB user along with other details such as domain and group memberships. See the instructions under /proc filesystem. This item is particularly useful when the issue involves group membership. Note that this data is only useful for SMB issues.
Proc filesystem
The /proc filesystem is a virtual filesystem used to collect system data. The location of some of the more useful data is listed below. To collect the data, copy the file, or use the “cat” CLI command to dump it to the screen while logging the terminal session.
/proc/cifs/DOMAIN.USER.6789ABCD…
These are user access tokens. They may be useful in troubleshooting SMB issues.
These file names begin with the domain name, then the username, then some hexadecimal digits. The hexadecimal digits are a representation of the IP address, which can be used to discern between multiple logins for a user. If you do not see the user token that you need, it may be necessary to log the user off for thirty seconds, and then back on in order to capture the token.
/proc/cifs/pdc
The currently connected domain, domain controller, and the IP address of the domain controller.
/proc/cifs/ntdomain
A list of all trusted domains, their related SIDs, and the local machine and local domain SIDs.
Network trace
A network trace can be very valuable towards diagnosing problems that involve network communication. Set the trace to filter traffic between StorEdge, the client, and any authentication server. In this case, it is usually best to use the StorEdge built-in packet capture utility.
1.6.2.8 StorEdge network capture utility
StorEdge includes a built-in network monitoring tool. This allows you to capture packets from the network and save them to a file. This can be a valuable troubleshooting tool.
To configure network monitoring, it must first be loaded at the StorEdge CLI.
1. To access the StorEdge CLI, connect to the StorEdge via Telnet or serial console, and type “admin” at the [menu] prompt and enter the administrator password.
2. At the CLI, enter “load netm”. Then type “menu” to configure capture and capture packets.
3. Press the spacebar until “Packet Capture” is displayed under “Extensions” at the lower right.
4. Select the letter corresponding to “Packet Capture”.
5. Select option “1”, Edit Fields.
The available options are as follows:
Capture File: Where to save the capture file, in the form /volumename/directory/filename.
Frame Size (B): Size in bytes of each frame to capture. The default is normally used.
IP Packet Filter: “No” captures all traffic; “Yes” allows you to filter what is received. A filter allows you to select which IP address or addresses you will capture traffic from. You can also filter on a particular TCP or UDP port.
Dump Enable: Select “Yes” to allow StorEdge to save the capture in the event of a problem.
6. After configuring these options, select option “7”, “Start Capture”.
7. Reproduce the network event you wish to capture.
8. Select option “7”, “Stop Capture”.
9. Access the file via NFS or SMB and copy the file as needed.
Client and Server data
Collect all possible information on the client system having the issue and any authentication or application servers involved in the issue. This information should include operating system version, patch level and platform.
Duplication instructions
If possible, provide a step-by-step procedure to recreate this problem. Include every setting and every configuration detail.
1.6.2.9 TCP/IP Connectivity problems
A good tool to investigate network connectivity problems is the netstat command. This command is available from the StorEdge CLI. Simply type “netstat” at the CLI and a list of all network interfaces and routes is displayed, along with several useful statistics. Two tables are displayed, as follows:
TABLE 1-1 List of Adapters
Name Mtu Netmask Address Ipackets Ierr Opackets Oerr Coll
lo0 1536 255.0.0.0 127.0.0.1 77 0 77 0 0
fxp1 1500 255.255.255.0 10.10.35.2 269947 0 97815 0 0
fxp2 1500 --no-address-- 0 0 0 0 0 0
The first table is a list of adapters and statistics for each.
TABLE 1-2 Routing Table
Netmask Destination Gateway Interf Flags Refs Use
0.0.0.0 0.0.0.0 64.60.56.1 fxp1 ug 5 70796
255.255.255.0 64.60.56.0 10.10.35.2 fxp1 uc 0 0
255.255.255.255 127.0.0.1 127.0.0.1 lo0 uh 0 77
The second table is the routing table. The adapter “lo0” is the loopback device and does not represent a physical adapter. The route “0.0.0.0” is the default gateway. The following should be checked in this display:
Check for typos in IP addresses and netmasks.
Check “Ierr”, “Oerr”, and “Coll”. These are all packet errors. They may indicate a bad NIC or cable, connected to the StorEdge or elsewhere, or possibly, in the case of the “Coll” statistic, an incorrect speed and duplex setting.
Check Ipackets and Opackets for the appropriate network adapter. These are packets received and sent by each adapter. A disconnected or bad cable will result in no Ipackets for the connected interface. No Opackets may indicate that there is no route defined which uses this interface.
Check for modified gateways. A “d” or “m” in the flags column indicates a dynamically added or dynamically modified route. If an important route is modified, it may no longer be able to send packets to the desired destination. Such changes are the result of an ICMP message from another router or firewall, typically due to misconfiguration of that device. It is also possible to configure StorEdge to ignore ICMP requests to change the default gateway.
Check the “Use” statistic in the routing table. This statistic indicates how many times a route has been used. If you have defined a route for a specific purpose, such as mirroring, and this counter is not incrementing, then the route was most likely not defined correctly.
Also, check the basics. Try another client on the same subnet, try another cable, try another switch port for both client and StorEdge.
To escalate TCP/IP connectivity issues, collect a network trace from the StorEdge, using the internal utility, and also from the client or server attempting to connect to the StorEdge if possible. Also include details on the client system, especially network configuration information and operating system version. A network diagram which includes IP addresses and information on switches and router hardware on the network is also very helpful.
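The netstat checks described in this section can be sketched as a short script run against a saved copy of the output. This is a minimal sketch under stated assumptions: each adapter row has exactly nine whitespace-separated fields in the column order shown in Table 1-1, the function names are hypothetical, and the fxp3 row is invented to show what a failing adapter might look like.

```python
ADAPTER_COLUMNS = ["Name", "Mtu", "Netmask", "Address",
                   "Ipackets", "Ierr", "Opackets", "Oerr", "Coll"]
COUNTERS = ("Ipackets", "Ierr", "Opackets", "Oerr", "Coll")


def parse_adapters(text):
    """Parse adapter rows; assumes nine whitespace-separated fields per row."""
    rows = []
    for line in text.strip().splitlines():
        fields = line.split()
        if len(fields) != len(ADAPTER_COLUMNS) or fields[0] == "Name":
            continue  # skip the header and anything that does not fit
        row = dict(zip(ADAPTER_COLUMNS, fields))
        for name in COUNTERS:
            row[name] = int(row[name])
        rows.append(row)
    return rows


def adapter_warnings(row):
    """Return a list of findings for one adapter row."""
    findings = []
    for name in ("Ierr", "Oerr", "Coll"):
        if row[name] > 0:
            findings.append(f"{row['Name']}: {name} is {row[name]}")
    if row["Ipackets"] == 0:
        findings.append(f"{row['Name']}: no packets received")
    if row["Opackets"] == 0:
        findings.append(f"{row['Name']}: no packets sent")
    return findings


def modified_route(flags):
    """True when the routing flags contain 'd' or 'm' (dynamic add or modify)."""
    return "d" in flags or "m" in flags


# lo0 and fxp1 are taken from Table 1-1; fxp3 is a hypothetical failing adapter.
SAMPLE = """
Name Mtu Netmask Address Ipackets Ierr Opackets Oerr Coll
lo0 1536 255.0.0.0 127.0.0.1 77 0 77 0 0
fxp1 1500 255.255.255.0 10.10.35.2 269947 0 97815 0 0
fxp3 1500 255.255.255.0 10.10.36.2 0 12 40 0 3
"""
for row in parse_adapters(SAMPLE):
    for finding in adapter_warnings(row):
        print(finding)
```

A finding is only a starting point: a nonzero counter says where to look, and the cable, port, and duplex checks described above are still needed to confirm the cause.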
1.6.2.10 Performance Issues
The following is a general list of barriers to peak performance:
Network Configuration:
Verify speed and duplex negotiation.
Verify that port aggregation is configured when multiple NICs are connected to a subnet.
Ensure that Jumbo frames are not configured.
Ensure that Spanning Tree Protocol is not configured.
Ensure that all configured NIS, DNS, SMTP, and other servers are reachable and resolvable. (Note: always configure by IP address rather than name where possible.)
Configuration:
Checkpoints: Checkpoints can be overused and can have a drastic effect on the performance of the system. Verify that customers understand the use of checkpoints and how the retention policy can play a significant role in system performance.
If using ADS, improper dynamic DNS configuration can adversely affect performance.
Other processes / High CPU Utilization
When performance is low, one possible reason is that the system is busy with other processes. One way to check this is to observe the CPU utilization. This is best viewed from the activity monitor screen in the telnet interface. The CPU utilization can be found in the lower right corner, listed as a percentage.
The rest of the activity monitor screen may also be helpful, as it may give an indication of the source of the demand on resources. The display is arranged in four columns. The left most column lists each volume, and for each volume, the current disk space in use as a percentage of the volume and I/O requests. Note that a volume utilization of over 75% can cause a significant slowdown. The second column shows the load on each resource, such as CPU, memory or network adapters. These numbers do not correspond to any defined performance parameters, so they are only useful for relative comparison to another point in time. The third and fourth columns list clients currently connected to the StorEdge, and how many network I/O requests are coming from each.
Having determined that the slow server response corresponds with high CPU utilization, the next step is to collect a system diagnostic while the CPU utilization is high (usually 90% or higher). The diagnostics provide a per-process breakdown of CPU and memory utilization, along with all associated log messages and configuration.
It is also possible to acquire this per-process utilization breakdown at the CLI with the “status” command. This can be useful when the CPU utilization spikes are very brief in duration, rendering them difficult to capture via a diagnostic. In this case, you would log the telnet or terminal session, and run the status command several times in succession while a performance problem is occurring. System diagnostics should also be captured to supplement this information.
Command Line performance utilities
StorEdge provides several built-in utilities designed to measure performance. These are best used to isolate a problem. For example, using aratewrite to write directly to the RAID set may help to determine whether a write performance problem is on a particular volume, or even the network.
Usage for these utilities is as follows:
ratewrite: write contents of a file, report performance. The file creation does not use a network connection. This can determine whether an issue is disk or network related.
usage: ratewrite FILENAME [+OFFSET] TOTALKB [BLOCKSIZE]
example:
support > ratewrite /vol1/testfile 1000000 4096
1024000000 bytes (976.5M) in 36.844 seconds 26.50MB/sec
rateread: read contents of a file, report performance. The file read does not use a network connection. This can determine whether an issue is disk or network related, and also whether the problem is in reading or writing data.
usage: rateread FILENAME [+OFFSET] TOTALKB [BLOCKSIZE]
example:
support > rateread /vol1/testfile 8192
1024000000 bytes (976.5M) in 0.877 seconds 1.086GB/sec
ratecopy: copy a file, test the performance of a file copy from source to target. Uses a network connection and can be used to determine whether any network issues exist.
usage: ratecopy SOURCE_FILENAME DEST_FILENAME [BLOCKSIZE]
example:
support > ratecopy /vol1/testfile vol1/testout
1024000000 bytes (976.5M) in 25.116 seconds 38.88MB/sec
aratewrite: write a file directly and asynchronously. Compare performance against ratewrite.
usage: aratewrite FILENAME [+OFFSET] TOTALKB [BLOCKSIZE] [MB_PER_COMMIT]
example:
support > aratewrite /vol1/testfile 1000000 4096
Writing 976MB in 4KB blocksize with 0MB per commit.
1024000000 bytes (976.5M) in 14.982 seconds 65.18MB/sec
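The throughput figures in these examples can be cross-checked by hand. The sketch below assumes that the utilities treat 1 KB as 1024 bytes and 1 MB as 1024 x 1024 bytes, which reproduces the ratewrite and aratewrite figures above to within rounding; the function name is hypothetical.

```python
def throughput_mb_per_sec(total_kb, seconds):
    """Return throughput in MB/sec, taking 1 KB = 1024 bytes and 1 MB = 1024 * 1024 bytes."""
    total_bytes = total_kb * 1024
    return total_bytes / seconds / (1024 * 1024)


# TOTALKB of 1000000 written in 36.844 seconds, as in the ratewrite example.
print(f"{throughput_mb_per_sec(1000000, 36.844):.2f} MB/sec")
# The same amount written in 14.982 seconds, as in the aratewrite example.
print(f"{throughput_mb_per_sec(1000000, 14.982):.2f} MB/sec")
```

Computing the figure independently is useful when comparing runs with different TOTALKB values, where the raw elapsed times are not directly comparable.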

1.6.3 Log Error Messages

1.6.4 SYSLOG

The syslog is an important tool for troubleshooting. It provides a place to begin isolating system issues. There are many levels of warnings that can be used to notify you via email that there is a problem.
1.6.4.1 Understanding Sun StorEdge 5310 NAS Log Messages
The Sun StorEdge 5310 NAS provides an Event Management subsystem that monitors the chassis and reports event information to:
The system log, which is only in memory
A syslog server
SNMP Traps (SNMP v1 and v2)
A local file on one of the created volumes
Email notification
Components of an Event Message
05/23/04 05:55:30 C sysmon[63]: Disk drive at enclosure 1 row 0 column 2 failed.
In this example, the fields are, from left to right: Time/date (05/23/04 05:55:30), Severity (C), Facility (sysmon), FID (63), and Message body (the remaining text).
Time/date - Time and date of the event
Severity - Severity can be one of those listed below
Facility - The system module that reported the message
FID - The kernel ID of the Facility
Message Body - The contents of the message
Severity Level Definitions (highest to lowest)
Emergency - Specifies emergency messages. These messages are not distributed to all users. Emergency priority messages are logged into a separate file for reviewing.
Alert - Specifies important messages that require immediate attention. These messages are distributed to all users.
Critical - Specifies critical messages not classified as errors, such as hardware problems. Critical and higher-priority messages are sent to the system console.
Error - Specifies any messages that represent error conditions, such as an unsuccessful disk write.
Warning - Specifies any messages for abnormal, but recoverable, conditions.
Notice - Specifies important informational messages. Messages without a priority designation are mapped into this priority message.
Information - Specifies informational messages. These messages are useful in analyzing the system.
Debug - Specifies debugging messages.
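When many log lines must be reviewed, it can help to split each event message into its components programmatically. The following is a minimal sketch that assumes every message follows the layout of the single example above; the pattern and function name are hypothetical, and the severity code is returned as-is rather than mapped to a level name.

```python
import re

# Layout assumed from the example: date, time, severity, facility[FID]: body
EVENT_PATTERN = re.compile(
    r"^(?P<date>\d{2}/\d{2}/\d{2}) (?P<time>\d{2}:\d{2}:\d{2}) "
    r"(?P<severity>\S+) (?P<facility>[^\[\s]+)\[(?P<fid>\d+)\]: (?P<body>.*)$"
)


def parse_event(line):
    """Split one event message into its components, or return None if it does not match."""
    match = EVENT_PATTERN.match(line.strip())
    return match.groupdict() if match else None


event = parse_event(
    "05/23/04 05:55:30 C sysmon[63]: Disk drive at enclosure 1 row 0 column 2 failed."
)
print(event["severity"], event["facility"], event["fid"], event["body"])
```

Lines that do not match the assumed layout return None, so they can be set aside for manual review rather than silently misparsed.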

1.6.5 Error Codes from the Sun StorEdge 5310 NAS LCD Display and syslog

This section details the specific error messages sent through e-mail, SNMP notification, the LCD panel, and the system log to notify the administrator in the event of a system error. SysMon, the monitoring thread in the Sun StorEdge 5310 NAS, monitors the status of RAID devices, UPSs, file systems, head units, enclosure subsystems, and environmental variables. Monitoring and error messages vary depending on model and configuration.
In the tables in this section, table columns with no entries have been deleted.

About SysMon Error Notification

SysMon, the monitoring thread in the Sun StorEdge 5310 NAS, captures events generated as a result of subsystem errors. It then takes the appropriate action of sending an e-mail, notifying the SNMP server, displaying the error on the LCD panel, writing an error message to the system log, or some combination of these actions. E-mail notification and the system log include the time of the event.

Sun StorEdge 5310 NAS Error Messages

The following sections show error messages for the Sun StorEdge 5310 NAS UPS, file system usage, and the PEMS.

UPS Subsystem Errors

Refer to Table A-3 for descriptions of UPS error conditions.
TABLE A-3 UPS Error Messages

Event: Power Failure
E-Mail Subject: AC Power Failure
E-Mail Text: AC power failure. System is running on UPS battery. Action: Restore system power. Severity = Error
SNMP Trap: EnvUpsOnBattery
LCD Panel: U20 on battery
Log: UPS: AC power failure. System is running on UPS battery.

Event: Power Restored
E-Mail Subject: AC power restored
E-Mail Text: AC power restored. System is running on AC power. Severity = Notice
SNMP Trap: EnvUpsOffBattery
LCD Panel: U21 power restored
Log: UPS: AC power restored.

Event: Low Battery
E-Mail Subject: UPS battery low
E-Mail Text: UPS battery is low. The system will shut down if AC power is not restored soon. Action: Restore AC power as soon as possible. Severity = Critical
SNMP Trap: EnvUpsLowBattery
LCD Panel: U22 low battery
Log: UPS: Low battery condition.

Event: Normal Battery
E-Mail Subject: UPS battery recharged
E-Mail Text: The UPS battery has been recharged. Severity = Notice
SNMP Trap: EnvUpsNormalBattery
LCD Panel: U22 battery normal
Log: UPS: Battery recharged to normal condition.

Event: Replace Battery
E-Mail Subject: Replace UPS Battery
E-Mail Text: The UPS battery is faulty. Action: Replace the battery. Severity = Notice
SNMP Trap: EnvUpsReplaceBattery
LCD Panel: U23 battery fault
Log: UPS: Battery requires replacement.

Event: UPS Alarms - Ambient temperature or humidity outside acceptable thresholds
E-Mail Subject: UPS abnormal temperature/humidity
E-Mail Text: Abnormal temperature/humidity detected in the system. Action: 1. Check UPS unit installation, OR 2. Contact technical support. Severity = Error
SNMP Trap: EnvUpsAbnormal
LCD Panel: U24 abnormal ambient
Log: UPS: Abnormal temperature and/or humidity detected.

Event: Write-back cache is disabled.
E-Mail Subject: Controller Cache Disabled
E-Mail Text: Either AC power or UPS is not charged completely. Action: 1 - If AC power has failed, restore system power. 2 - If after a long time UPS is not charged completely, check UPS. Severity = Warning
LCD Panel: Cache Disabled
Log: write-back cache for ctrl x disabled

Event: Write-back cache is enabled.
E-Mail Subject: Controller Cache Enabled
E-Mail Text: System AC power and UPS are reliable again. Write-back cache is enabled. Severity = Notice
LCD Panel: Cache Enabled
Log: write-back cache for ctlr n enabled

Event: The UPS is shutting down.
E-Mail Subject: UPS shutdown
E-Mail Text: The system is being shut down because there is no AC power and the UPS battery is depleted. Severity = Critical
Log: !UPS: Shutting down

Event: UPS Failure
E-Mail Subject: UPS failure
E-Mail Text: Communication with the UPS unit has failed. Action: 1. Check the serial cable connecting the UPS unit to one of the CPU enclosures, OR 2. Check the UPS unit and replace if necessary. Severity = Critical
SNMP Trap: EnvUpsFail
LCD Panel: U25 UPS failure
Log: UPS: Communication failure.

File System Errors

File system error messages occur when the file system usage exceeds a defined usage threshold. The default usage threshold is 95%.
TABLE A-4 File System Errors

Event: File System Full
E-Mail Subject: File system full
E-Mail Text: File system <name> is xx% full. Action: 1. Delete any unused or temporary files, OR 2. Extend the partition by using an unused partition, OR 3. Add additional disk drives and extend the partition after creating a new partition. (Severity=Error)
SNMP Trap: PartitionFull
LCD Panel: F40 FileSystemName full
Log: File system <name> usage capacity is xx%.

PEMS Events

Sun StorEdge 5310 NAS employs the PEMS board to monitor environmental systems and to send messages regarding fan, power supply, and temperature anomalies.
Note – Device locations are shown in the Sun StorEdge 5310 NAS Hardware Installation, Configuration, and User Guide included in your documentation CD.
Table A-5 describes the PEMS error messages for the Sun StorEdge 5310 NAS.
TABLE A-5 PEMS Error Messages

Event: CPU Fan Error
E-Mail Subject: Fan Failure
E-Mail Text: The CPU fan has failed. Fan speed = xx RPM. Action: The system will shut down in 10 seconds to protect the CPU from damage. You should replace the CPU fan before turning the system back on. Severity = Critical
SNMP Trap: envFanFail trap
LCD Panel: P11 CPU fan failed
Log: The CPU fan has failed! Better shut down.

Event: Fan Error
E-Mail Subject: Fan Failure
E-Mail Text: Blower fan xx has failed. Fan speed = xx RPM. Action: The fan must be replaced as soon as possible. If the temperature begins to rise, the situation could become critical. Severity = Error
SNMP Trap: envFanFail trap
LCD Panel: P11 Fan xx failed
Log: Blower fan xx has failed!

Event: Power Supply Module Failure
E-Mail Subject: Power supply failure
E-Mail Text: The power supply unit xx has failed. Action: The power supply unit must be replaced as soon as possible. Severity = Error
SNMP Trap: envPowerFail trap
LCD Panel: P12 Power xx failed
Log: Power supply unit xx has failed.

Event: Power Supply Module Temperature
E-Mail Subject: Power supply temperature critical
E-Mail Text: The power supply unit xx is overheating. Action: Replace the power supply to avoid any permanent damage. Severity = Critical
SNMP Trap: envPowerTempCritical trap
LCD Panel: P22 Power xx overheated
Log: Power supply unit xx is overheating.

Event: Temperature Error
E-Mail Subject: Temperature critical
E-Mail Text: Temperature in the system is critical. It is xxx Degrees Celsius. Action: 1. Check for any fan failures, OR 2. Check for blockage of the ventilation, OR 3. Move the system to a cooler place. Severity = Error
SNMP Trap: envTemperatueError trap
LCD Panel: P51 Temp error
Log: The temperature is critical.

Event: Primary Power Cord Failure
E-Mail Subject: Power cord failure
E-Mail Text: The primary power cord has failed or been disconnected. Action: 1. Check the power cord connections at both ends, OR 2. Replace the power cord. Severity = Error
SNMP Trap: envPrimaryPowerFail trap
LCD Panel: P31 Fail PWR cord 1
Log: The primary power cord has failed.

Event: Secondary Power Cord Failure
E-Mail Subject: Power cord failure
E-Mail Text: The secondary power cord has failed or been disconnected. Action: 1. Check the power cord connections at both ends, OR 2. Replace the power cord. Severity = Error
SNMP Trap: envSecondaryPowerFail trap
LCD Panel: P32 Fail PWR cord 2
Log: The secondary power cord has failed.

1.7 Maintenance Precautions

The sections that follow provide subassembly-level removal and installation guidelines. After completing all necessary removal and replacement procedures, verify that all components are working properly.

1.7.0.1 Tools Required

To service the Sun StorEdge 5310 NAS, you need:
Phillips screwdriver
Flat-head screwdriver

1.7.0.2 Electrostatic Discharge Information

Static electricity can cause damage to static-sensitive devices and/or microcircuitry. For this reason, it is important that proper packaging and grounding techniques be observed. To further ensure the prevention of electrostatic damage, observe these procedures:
Transport products in static-safe containers.
Cover work stations with approved static-dissipating material.
Wear a wrist strap, and always be properly grounded when touching static-sensitive equipment/parts.
Use only properly grounded tools and equipment.
Avoid touching pins, leads, or circuitry.
Note – The following section can be ignored if you are swapping out a fan, power
supply or hard drive.

1.7.0.3 Preparation Procedures

Complete the following steps before you begin the removal/installation procedures:
1. Shut the system down properly according to your operating system’s instructions.
2. Turn the Sun StorEdge 5310 NAS off.
3. Disconnect the power cord from the power source, then from the Sun StorEdge 5310 NAS server.
4. Shut down the storage enclosure and remove its power cords.
5. Disconnect all other external peripheral devices from the Sun StorEdge 5310 NAS server, if applicable.
6. Disconnect all optical fibre and network interface cables from the Sun StorEdge 5310 NAS server and the storage enclosure.
7. Remove the Sun StorEdge 5310 NAS and the storage enclosure from the rack.

1.8 Static Electricity Precautions

1.8.0.1 Grounding Procedure

You must maintain reliable grounding of this equipment. The Sun StorEdge 5310 NAS system (including head and optional Expansion Unit) must be connected to a dedicated 20A receptacle.

1.8.0.2 Static Electricity

The Sun StorEdge 5310 NAS server and Expansion Unit contain several components sensitive to static-electrical discharge. Surges of static electricity (caused by shuffling your feet across a floor and touching a metallic surface, for example) can cause damage to electrical components.
Static electricity can cause damage to static-sensitive devices and/or microcircuitry. For this reason, it is important that proper packaging and grounding techniques be observed. To further ensure the prevention of electrostatic damage, observe these procedures:
Transport products in static-safe containers.
Cover work stations with approved static-dissipating material.
Wear a wrist strap, and always be properly grounded when touching static-sensitive equipment/parts.
Use only properly grounded tools and equipment.
Avoid touching pins, leads, or circuitry.
To avoid damaging Sun StorEdge 5310 NAS and Expansion Unit internal components with static electricity, follow these instructions before performing any installation procedures.
1. Make sure both of the Sun StorEdge 5310 NAS (and optional Expansion Unit) AC power cables are plugged in, and that the unit is turned off.
2. Wear a wrist strap, and always be properly grounded when touching static-sensitive equipment/parts.
If a wrist strap is not available, touch any unpainted metal surface on the Sun StorEdge 5310 NAS (and optional Sun StorEdge 5310 NAS Expansion Unit) back panel to dissipate static electricity. Repeat this procedure several times during installation.
3. Avoid touching exposed circuitry, and handle components by their edges only.
Caution – Do not power on the Sun StorEdge 5310 NAS or the Sun StorEdge 5310 NAS Expansion Unit until after you have connected to the network.
The AC source must be electrically isolated by double or reinforced insulation from any hazardous AC or DC source. The AC source must be capable of providing up to 500 W of continuous power per feed pair.
Mains AC Power Disconnect—You are responsible for installing an AC power disconnect for the entire rack unit. This power source disconnect must be readily accessible, and it must be labeled as controlling power to the entire unit, not just to the server(s).
CHAPTER
2

NAS Head

This chapter addresses frequently asked questions for the Sun StorEdge 5310 NAS. The chapter contains these sections:
“Hardware” on page 2-1
“OS Operations” on page 2-36
“Updating the OS on the Sun StorEdge 5310 NAS” on page 2-40
“Sun StorEdge 5310 NAS Firmware” on page 2-40
“Common Problems Encountered on the Sun StorEdge 5310 NAS” on page 2-42
“Frequently Asked Questions” on page 2-92

2.1 Hardware

2.2 Contacting Technical Support

For technical support, call the phone numbers listed below, according to your location.
United States Tel: 1-800-USA-4SUN (1-800-872-4786)
UK Tel: +44 870-600-3222
France Tel: +33 1 34 03 5080
Germany Tel: +49 1805 20 2241
Italy Tel: +39 02 92595228, Toll Free 800 605228
Spain Tel: +011 3491 767 6000
See the following link for US, Europe, South America, Africa, and APAC local country telephone numbers:
http://www.sun.com/service/contacting/solution.html
For general support and documentation on the servers, see the following link:
http://www.sun.com/supporttraining/

2.2.1 Problems With Initial System Startup

Problems that occur at initial system startup are usually caused by incorrect installation or configuration. Hardware failure is a less frequent cause.
2.2.1.1 Checklist
Are all cables correctly connected and secured?
Is the power cord properly inserted and fully seated?
Are there any Baseboard Management Controller (BMC) beep codes? You may
have to listen carefully two or three times to hear them. See “POST Error Beep Codes” on page 2-27 for beep code details.
Is the BMC running? Try pressing the ID button on the front panel. If the blue ID
LED fails to illuminate, the BMC is not responding.
Are the cables going to the front panel board installed and seated properly? Check the front panel cable, the USB cable, and the 100-pin flex cable.
Are the processors fully seated in their sockets on the server board?
Are all add-in PCI boards fully seated in their slots on the server board?
Are all jumper and switch settings on add-in boards and peripheral devices
correct? To check these settings, refer to the manufacturer’s documentation that comes with them. If applicable, ensure that there are no conflicts—for example, two add-in boards sharing the same interrupt.
Are all DIMMs installed correctly?
Are all peripheral devices installed correctly?
If the system has a hard disk drive, is it properly formatted or configured?
Are all device drivers properly installed?
Are the configuration settings made in BIOS Setup correct?
Did you press the system power on/off switch on the front panel to turn the
server on (power on light should be lit)?
Is the system power cord properly connected to the system and plugged into a
NEMA 5-15R outlet for 100-120 V or a NEMA 6-15R outlet for 200-240V?
Is AC power available at the wall outlet?
Are there any POST LEDs illuminated? If so, check “Power-On Self-Test (POST)” on page 2-7.
Are there any POST beep codes? If so, check “POST Error Beep Codes” on page 2-27.

2.2.2 Resetting the Server

Quite often, a problem can be solved merely by resetting the server or by shutting it down and powering it back up. You can restart or shut down the Sun StorEdge 5310 NAS using software or hardware.
2.2.2.1 Shutdown Commands for Software Menu
To shut down the system using the menu:
1. Use the Web Administrator or Telnet to the Sun StorEdge 5310 NAS to shut down the server.
2. Via Web Admin, go to Managing the System and choose Shutdown Server.
3. Via Telnet, go to the main menu.
4. Press 0 for Server Shutdown.
This screen gives you the option to reboot or halt.
5. Choose one of the options and the server will shut down.
Note – There may be a delay of a few seconds before the server shuts down.
2.2.2.2 Shutdown Commands for Hardware LCD Display
To shut down the system using the LCD display:
1. Press the Select button on the LCD panel to access menus.
2. The LCD panel displays options A and B. Press the Down Arrow to select option “B. Shutdown Server” then press the Select button.
3. Press Select to select the “A. Power Off” option.
4. Press the Down Arrow to change “No” to “Yes”.
5. Press Select to confirm and begin shutting down.

2.2.3 Preparing the System for Diagnostic Testing

Caution – Turn off devices before disconnecting cables. Before disconnecting any
peripheral cables from the system, turn off the system and any external peripheral devices. Failure to do so can cause permanent damage to the system and/or the peripheral devices.
1. Turn off the system and all external peripheral devices. Disconnect all of them from the system, except the keyboard and video monitor.
2. Make sure the system power cord is plugged into a properly grounded AC outlet.
3. Make sure your video display monitor and keyboard are correctly connected to the system. Turn on the video monitor. Set its brightness and contrast controls to at least two thirds of their maximum ranges (see the documentation supplied with your video display monitor).
4. Turn on the system. If the power LED does not light, see “Power LED Does Not Light” on page 2-8.
5. If errors are encountered, power off the system, remove all add-in cards, and turn the power back on.
2.2.3.1 Specific Problems and Corrective Actions
This section provides possible solutions for the specific problems listed in Table 2-1.
TABLE 2-1 Index to Problems
Problems Reference
“Problems Starting Up” page 2-5
“Power LED Does Not Light” page 2-8
“System Cooling Fans Do Not Rotate Properly” page 2-8
“Cannot Connect to a Server” page 2-9
“Problems with Network” page 2-9
Try the solutions in the order given.
Problems Starting Up
If the server does not start up properly, use the information in this section to diagnose problems.
Server Does Not Power On
If the server does not power on, check the following:
Does the main server board have power? Open the chassis lid and check the 5V
Standby LED on the baseboard to see if it is illuminated. If your server is plugged in, this LED should be green. See Figure 2-5, “Fault and Status LEDs on the Server Board,” on page 2-21 for the location of this LED.
Check the power cord connection. The Sun StorEdge 5310 NAS allows the use of
two power supplies, and the system will not power on if one power cord is used and it is plugged into the wrong power connector.
Remove all add-in cards and see if the server boots using just the on-board
components. If the server boots successfully, add the cards back in one at a time with a reboot after each addition to see if you can isolate a suspect card.
Remove and reseat the memory modules. Ensure that you have properly populated the memory modules. On the main board, memory is populated in pairs. See “Memory” on page 7-6 for memory module installation and placement. Refer to the silkscreen on the main board for proper memory module placement. Try using memory modules from a known compatible server.
Check the internal cable connections to ensure that they are properly connected.
Remove the processor(s) and reseat as a last resort.
Caution – Removing and replacing the processors is not recommended and should be done only as a last resort. This procedure should be attempted only by Sun-qualified service personnel.
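The add-in card isolation step described above (remove all cards, then add them back one at a time with a reboot after each) can be sketched as a simple loop. This is only an illustration of the procedure: `boots_ok` is a hypothetical callback that stands in for manually rebooting the server and observing the result, and the card names used with it are made up.

```python
from typing import Callable, List, Optional


def find_suspect_card(cards: List[str],
                      boots_ok: Callable[[List[str]], bool]) -> Optional[str]:
    """Re-add cards one at a time; return the first card whose addition breaks boot.

    boots_ok(installed) is a hypothetical stand-in for rebooting the server
    with only the `installed` cards present and checking that it starts.
    """
    installed: List[str] = []
    if not boots_ok(installed):
        # The server fails even with on-board components only,
        # so this procedure cannot single out an add-in card.
        return None
    for card in cards:
        installed.append(card)
        if not boots_ok(installed):
            return card
    return None
```

A result of `None` means either that every card was re-added without a boot failure or that the server does not boot even with no add-in cards installed; in the second case, continue with the memory, cable, and processor checks.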
Front Panel is Unresponsive and Video is Disabled
If the front panel is unresponsive to any pushbuttons you press, and video is disabled, it could be that the front panel is locked. By default, front panel locking is disabled; however, it is possible to enable front panel locking through the BIOS setup. To do this, an administrative password must be set using Security > Set Admin Password.
When the password is set, the front panel, mouse, and keyboard are locked after a timeout expires. The video is also blanked. The purpose of this is to prevent unauthorized access to a server by someone who plugs in a keyboard and video monitor. Access is regained simply by using the keyboard to type the password.
Note – A corded PS/2 keyboard (not a wireless one) must be plugged into the
keyboard/mouse connector at the back of the server. When the front panel is locked, the lights on the keyboard flash, but the server is still fully functional.
Server Beeps at Power On or When Booting
The server indicates problems with “beep codes” during Power-On Self Test (POST) in the event there is no displayed video. A complete list of beep codes is given in “POST Error Beep Codes” on page 2-27.
Note – The RAID card also will beep when a disk drive has failed. Check the system
log to help isolate the problem.
The following beep codes identify system events during POST in case video fails to display.
TABLE 2-2 Bootup Beep Codes
Beeps    Reason
1        One short beep before boot (normal, not an error)
1-2      Search for option ROMs. One long beep and two short beeps on checksum failure.
1-2-2-3  BIOS ROM checksum
1-3-1-1  Test DRAM refresh
1-3-1-3  Test 8742 keyboard controller
1-3-3-1  Auto size DRAM. System BIOS stops execution here if the BIOS does not detect any usable memory DIMMs.
1-3-4-1  Base RAM failure. BIOS stops execution here if entire memory is bad.
2-1-2-3  Check ROM copyright notice.
2-2-3-1  Test for unexpected interrupts.
1-5-1-1  FRB failure (processor failure)
1-5-2-2  No processors installed
1-5-2-3  Processor configuration error (for example, mismatched VIDs).
1-5-2-4  Front-side bus select configuration error (for example, mismatched BSELs)
1-5-4-2  Power fault
1-5-4-3  Chipset control failure
1-5-4-4  Power control failure
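When beep codes are noted during remote support calls, a small lookup can help translate them. The following is a minimal sketch that copies the entries of Table 2-2 into a dictionary; the function name `describe_beep_code` is an illustrative choice and is not part of any Sun tool.

```python
# Beep patterns and reasons copied from Table 2-2.
BOOTUP_BEEP_CODES = {
    "1": "One short beep before boot (normal, not an error)",
    "1-2": "Search for option ROMs. One long beep and two short beeps on checksum failure.",
    "1-2-2-3": "BIOS ROM checksum",
    "1-3-1-1": "Test DRAM refresh",
    "1-3-1-3": "Test 8742 keyboard controller",
    "1-3-3-1": "Auto size DRAM (BIOS stops here if no usable memory DIMMs are detected)",
    "1-3-4-1": "Base RAM failure (BIOS stops here if entire memory is bad)",
    "2-1-2-3": "Check ROM copyright notice",
    "2-2-3-1": "Test for unexpected interrupts",
    "1-5-1-1": "FRB failure (processor failure)",
    "1-5-2-2": "No processors installed",
    "1-5-2-3": "Processor configuration error (for example, mismatched VIDs)",
    "1-5-2-4": "Front-side bus select configuration error (for example, mismatched BSELs)",
    "1-5-4-2": "Power fault",
    "1-5-4-3": "Chipset control failure",
    "1-5-4-4": "Power control failure",
}


def describe_beep_code(pattern: str) -> str:
    """Return the Table 2-2 reason for a beep pattern such as '1-5-2-2'."""
    key = "-".join(part.strip() for part in pattern.split("-"))
    return BOOTUP_BEEP_CODES.get(
        key, "Unknown beep code; see the complete POST error beep code list")
```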
Server Starts Booting Automatically at Power On
The server board saves the last known power state in the event of a power failure. If you remove power before powering down the system using the power switch on the front panel, the system might automatically attempt to return to its previous power state when power is restored.
You can configure how you would like your server system to react when power is restored in the BIOS set-up (Security menu). You can have the server remain off or return to the last known power state.
Please keep in mind that unplugging the system or flipping a switch on the
power strip both remove power.
Follow the correct power removal sequence (make sure the system has shut down
before removing the power cord).
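The two power-restore behaviors described above can be summarized in a few lines. This is a sketch only: the policy names `stay_off` and `last_state` are hypothetical labels for the two BIOS choices and do not correspond to actual BIOS setting names.

```python
def state_after_ac_restore(policy: str, last_state: str) -> str:
    """Return 'on' or 'off' for the server once AC power comes back.

    policy     -- 'stay_off' or 'last_state' (hypothetical labels for the
                  two BIOS options described in the text)
    last_state -- 'on' or 'off', the power state saved by the server board
    """
    if policy not in ("stay_off", "last_state"):
        raise ValueError(f"unknown policy: {policy!r}")
    if last_state not in ("on", "off"):
        raise ValueError(f"unknown power state: {last_state!r}")
    if policy == "stay_off":
        return "off"
    # 'last_state': the server returns to whatever state was saved.
    return last_state
```

With the `last_state` policy, a server that lost AC power while running starts booting as soon as power returns, which is the behavior this section describes.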
Power-On Self-Test (POST)
Each time you turn on the system, the BIOS begins execution of POST. POST discovers, configures, and tests the processors, memory, keyboard, and most installed peripheral devices. The time needed to test memory depends on the amount of memory installed. POST is stored in flash memory.
To execute and monitor POST:
1. Turn on your video monitor and system. After a few seconds, POST begins to run and displays a splash screen.
2. While the splash screen is displayed:
Press <F2> to enter the BIOS Setup
OR
Press <Esc> to view POST diagnostic messages and change the boot device
priority for this boot only.
OR
If the Service Partition is installed, press <F4> to run the System Setup Utility
3. If you do not press <F2> or <Esc> or <F4> and do NOT have a device with an operating system loaded, the boot process continues and the system beeps once. The following message is displayed:
Operating System not found
4. At this time, pressing any key causes the system to attempt a reboot. The system searches all removable devices in the order defined by the boot priority.
During POST, the server BIOS presents screen messages to indicate error conditions. POST also provides beep codes to give you audible clues regarding the performance and operation of the server when there is no video display that can present error messages. In addition, a set of four bi-color diagnostic LEDs is located on the back edge of the server main board. These LEDs are active during POST and indicate the state of the server. Each of the four LEDs can have one of four states: Off, Green, Red, or Amber. See “Power-On Self-Test (POST)” on page 2-7 for a complete description of the screen messages, beep codes, and diagnostic LEDs.
Verifying Proper Operation of Key System LEDs
As POST determines the system configuration, it tests for the presence of each mass storage device installed in the system. As each device is checked, its activity light should turn on briefly. Check to see if the disk drive activity light for each drive turns on briefly.
2.2.3.2 Power LED Does Not Light
Check the following:
Is the system operating normally? If so, the power LED is probably defective or
the cable from the front panel to the server board is loose.
Are there other problems with the system? If so, check the items listed under
“System Cooling Fans Do Not Rotate Properly” on page 2-8.
If all items are correct and problems persist, contact your service representative or authorized dealer for help.
2.2.3.3 System Cooling Fans Do Not Rotate Properly
If the system cooling fans are not operating properly, system components could be damaged.
Check the following:
Is AC power available at the wall outlet?
Is the system power cord properly connected to the system and the wall outlet?
Did you press the power button?
Is the power on light illuminated?
Have any of the fan motors stopped (use the server management subsystem to
check the fan status)?
Are the fan power connectors properly connected to the server board?
Is the cable from the front panel board connected to the server board?
Are the power supply cables properly connected to the server board?
Are there any shorted wires caused by pinched cables or power connector plugs
forced into power connector sockets the wrong way?
If the switches and connections are correct and AC power is available at the wall outlet, contact your service representative or authorized dealer for help.
2.2.3.4 Cannot Connect to a Server
Check the following:
Make sure the network cable is securely attached to the connector at the system
back panel. If the cable is attached but the problem persists, try a different cable.
Make sure the hub port is configured for the same duplex mode as the network
controller.
If you are directly connecting two servers (no hub), you will need a crossover
cable (see your hub documentation for more information on crossover cables).
Check the network controller LEDs that are visible through an opening at the
system back panel.
2.2.3.5 Problems with Network
If diagnostics pass, but the connection fails:
Make sure the network cable is securely attached.
The Activity LED does not light:
Make sure the network hub has power.
If the controller stopped working when an add-in adapter was installed:
Make sure the cable is connected to the port from the onboard network controller.
Try reseating the add-in adapter.
If the add-in adapter stopped working without apparent cause:
Try reseating the adapter first; then try a different slot if necessary.
2.2.3.6 Other Problems
If the preceding information does not fix the problem with your server, try the following:
Check for proper processor installation. Systems with a single processor must have the CPU installed in CPU socket 1. If two processors are installed, the processors must be of the same speed and voltage (and within one stepping). Do not attempt to overclock the processors or other components on this system. Overclocking is generally not possible and may damage components and void the warranty of your server board and your boxed or tray processor.
Memory must be of the approved type and be properly seated.
Verify that all chassis and power supply fans are properly installed and
functioning.
Approved heat sinks must be properly installed on the processors. Do not attempt
to run the processors without a heat sink for even a few moments.
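The processor placement rules in the list above can be expressed as a short check. This is a sketch under stated assumptions: the `Processor` fields, and in particular the representation of the stepping as an integer, are hypothetical and chosen only to make the "within one stepping" rule easy to show.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Processor:
    speed_mhz: int   # rated core speed
    voltage: float   # rated core voltage
    stepping: int    # hypothetical numeric stepping index


def check_processors(sockets: Dict[int, Processor]) -> List[str]:
    """Return a list of rule violations; `sockets` maps socket number to CPU."""
    problems: List[str] = []
    cpus = [sockets[number] for number in sorted(sockets)]
    if not cpus:
        problems.append("no processors installed")
    if len(cpus) == 1 and 1 not in sockets:
        problems.append("a single processor must be installed in CPU socket 1")
    if len(cpus) == 2:
        first, second = cpus
        if first.speed_mhz != second.speed_mhz:
            problems.append("processors must be of the same speed")
        if first.voltage != second.voltage:
            problems.append("processors must be of the same voltage")
        if abs(first.stepping - second.stepping) > 1:
            problems.append("processors must be within one stepping")
    return problems
```

An empty list means the configuration passes these three rules; it says nothing about heat sinks, memory, or fans, which must still be checked by hand.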
2.3 Troubleshooting the Server Using Built-In Tools
This chapter explains how to detect and isolate faulty components within the Sun StorEdge 5310 NAS. The chapter contains these sections:
“CIFS/SMB/Domain Issues” on page 2-92
“LEDs and Pushbuttons” on page 2-11
“Power-On Self Test (POST)” on page 2-24
“Contacting Technical Support” on page 2-1

2.4 Diagnosing System Errors

Use the following tools to help you isolate server problems:
“LEDs” on page 2-11
“Beep Codes” on page 2-11
“POST Screen Messages” on page 2-11

2.4.1 LEDs

You can use the diagnostic LED indications to isolate faults. See “LEDs and Pushbuttons” on page 2-11.

2.4.2 Beep Codes

A built-in server speaker indicates failures with audible beeps. See “POST Error Beep Codes” on page 2-27.

2.4.3 POST Screen Messages

For many failures, the BIOS sends error codes and messages to the screen. See “POST Screen Messages” on page 2-24.

2.5 LEDs and Pushbuttons

Note – This section addresses LEDs and Pushbuttons on the Sun StorEdge 5310
NAS. The LEDs on the Sun StorEdge 5210 Expansion Unit are different.
This section describes the LEDs and pushbuttons on the Sun StorEdge 5310 NAS.
TABLE 2-3 Server LEDs

ID
  Function: Helps identify the server from the front or rear
  Location: One LED on front panel and one at rear corner
  Color: Blue
  Status: On = ID

System status
  Function: Visible fault indicator
  Location: One LED on front panel and one at rear corner
  Color: Green or amber
  Status: Off = POST in progress or system stop
          Green steady on = no fault
          Green blinking = degraded
          Amber steady = critical or non-recoverable state
          Amber blinking = non-critical state

Disk activity
  Function: Indicates hard disk activity
  Location: Front panel and main board left side
  Color: Green
  Status: Blinking = HDD activity

Memory DIMM fault (1 - 6)
  Function: Identifies failing DIMM module
  Location: At the front of each DIMM location on main board
  Color: Amber
  Status: On = fault

POST LEDs (1 - 4)
  Function: Displays boot (Port-80) POST codes
  Location: Left rear of main board
  Color: Each LED can be off, green, red, or amber
  Status: See “POST Progress Code LED Indicators” on page 2-30 for POST code LED details.

Fan fault (1 - 4)
  Function: Identifies Sun StorEdge 5310 NAS fan failure
  Location: On Sun StorEdge 5310 NAS fan module board
  Color: Amber
  Status: On = fault

CPU 1 and 2 fault
  Function: Identifies CPU failure
  Location: Back corner of processor socket on main board
  Color: Amber
  Status: On = fault

5V standby
  Function: Identifies 5V standby power on state
  Location: Front left on main board
  Color: Green
  Status: Green = 5V standby power on

Main power LED
  Function: Identifies power state of the server
  Location: Front panel
  Color: Green
  Status: Off = power is off
          On = power is on

2.5.1 Front Panel LEDs and Pushbuttons

The front panel contains the pushbuttons and LEDs shown in Figure 2-1. Note that the illustration has the bezel removed.
FIGURE 2-1 Front Panel Pushbuttons and LEDs (callouts: NIC1 and NIC2 Activity LEDs, Power/Sleep Pushbutton, Power/Sleep LED, System Status LED, ID LED, ID Pushbutton, Hard Disk Status LED, Reset Pushbutton, NMI Pushbutton)
2.5.1.1 Front Panel LEDs
The front panel LEDs are summarized in Table 2-4.
TABLE 2-4 Front Panel LEDs

Power
  Color: Green
  Function: This LED is controlled by software. It turns steady when the server is powered up and is off when the system is off or in sleep mode.

NIC1 and NIC2
  Color: Green
  Function: These LEDs are on when a good network link has been established. They blink green to reflect network data activity.

System Status/Fault
  Color: Green/Amber
  Function: This LED can assume different states (green, amber, steady, blinking) to indicate critical, non-critical, or degraded server operation.
    Steady green: Indicates the system is operating normally.
    Blinking green: Indicates the system is operating in a degraded condition.
    Blinking amber: Indicates the system is in a non-critical condition.
    Steady amber: Indicates the system is in a critical or non-recoverable condition.
    Off: Indicates POST/system stop.
    See “Front-Panel System Status LED” on page 2-18 for more details regarding this LED.

Hard Disk Drive Activity
  Color: Green
  Function: The Drive Activity LED on the front panel is used to indicate drive activity from the onboard SCSI controller. The server Main Board also provides a header, giving access to this LED for add-in IDE or SCSI controllers.
    Blinking green (random): Hard disk activity
    Steady amber: Hard disk fault
    Off: No disk activity nor fault condition (or power is off).

System ID
  Color: Blue
  Function: The blue System Identification LED is used to help identify a system for servicing when it is installed within a high density rack or cabinet that is populated with several other similar systems. The System ID LED is illuminated when the system ID button, located on the front panel, is pressed. If activated by the front panel pushbutton, the LED remains on until the pushbutton is depressed again. The LED also illuminates when the server receives a remote System Identify command from a remote management console. In this case, the LED turns off after a timeout period. The timeout period is configurable, with a default of 15 seconds. An additional blue System ID LED on the Main Board is visible through the rear panel. It mirrors the operation of the front panel LED.
2.5.1.2 Front Panel Pushbuttons
The front panel pushbuttons are summarized in Table 2-5.
TABLE 2-5 Front Panel Pushbuttons
Switch Function
Power/Sleep  This pushbutton is used to toggle the system power on and off. This button is also used as a sleep button for operating systems that follow the ACPI specification. Linux, for example, configures the power button to the instant off mode. There is no ACPI support for the Solaris OS.
Reset  Depressing this pushbutton reboots and initializes the system.
NMI  Pushing this recessed pushbutton causes a non-maskable interrupt to occur. Note: NMI is not currently supported.
System ID  This pushbutton toggles the state of the front panel ID LED and the server Main Board ID LED. The Main Board ID LED is visible through the rear of the chassis and allows you to locate a particular server from behind a rack of servers.

2.5.2 Rear Panel LEDs

The rear panel contains the LEDs shown in Figure 2-2.
FIGURE 2-2 Rear Panel LEDs (callouts: NIC1 and NIC2 Network Activity LEDs, NIC1 and NIC2 Network Speed LEDs, Power Supply Status LEDs, System Status LED*, POST LEDs (4)*, ID LED*. *LEDs are on main board, visible through rear of chassis. Redundant power supplies shown.)

TABLE 2-6 Rear Panel LEDs

Network Connection/Network Activity
  Color: Green
  Function: This LED is on the left side of each NIC connector. Green = valid network connection. Blinking = transmit or receive activity.

Network Speed
  Color: Amber/Green
  Function: This LED is on the right side of the NIC connector. Off = 10 Mbps operation. Green = 100 Mbps operation. Amber = 1000 Mbps operation.

POST LEDs (four)
  Color: Multicolor (Red/Green/Amber)
  Function: To help diagnose power-on self test (POST) failures, a set of four bi-color diagnostic LEDs is located on the back edge of the server Main Board. These LEDs are visible through holes in the rear panel. Each of the four LEDs can have one of four states: Off, Green, Red, or Amber. For detailed information on these LEDs, see “POST Progress Code LED Indicators” on page 2-30.

System ID
  Color: Blue
  Function: This LED is located on the Main Board and is visible through holes in the rear panel. It can provide a mechanism for identifying one system out of a group of identical systems. This can be particularly useful if the server is used in a rack-mount chassis in a high-density, multiple-system application. The LED is activated by depressing the front panel System ID pushbutton or if the server receives a remote System Identify command from a remote management console. If activated by the front panel pushbutton, the LED remains on until the pushbutton is depressed again. When the LED illuminates due to a remote System Identify command, the LED turns off after a timeout period. An additional blue System ID LED is located on the front panel that mirrors the operation of the rear Main Board LED.

System Status/Fault
  Color: Green/Amber
  Function: This LED reflects the state of the System Status LED on the front panel.

Power Supply
  Color: Green/Amber
  Function: This is a bi-color LED that can be on, off, green, amber, or blinking, or combination thereof. See “Rear Panel Power Supply Status LED” on page 2-20 for more detailed information.
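The Network Speed LED colors in Table 2-6 map directly to link speeds. A minimal sketch of that mapping follows; the function name `link_speed_mbps` is an illustrative choice. Because an unlit speed LED means 10 Mbps operation according to the table, it is sensible to confirm on the connection LED that a valid link exists before interpreting the speed LED.

```python
# Speed LED color to link speed in Mbps, as listed in Table 2-6.
NETWORK_SPEED_BY_LED = {
    "off": 10,
    "green": 100,
    "amber": 1000,
}


def link_speed_mbps(led_color: str) -> int:
    """Return the link speed indicated by the NIC speed LED color."""
    return NETWORK_SPEED_BY_LED[led_color.strip().lower()]
```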

2.5.3 Front-Panel System Status LED

The front-panel system status LED is located as shown in Figure 2-3.
FIGURE 2-3 Location of Front-Panel System Status LED (callout: System Status LED)
The front-panel system status LED has the states indicated in Table 2-7.
TABLE 2-7 System Status LED States
System Status LED State System Condition
CONTINUOUS GREEN Indicates the system is operating normally.
BLINKING GREEN Indicates the system is operating in a degraded condition.
BLINKING AMBER Indicates the system is in a non-critical condition.
CONTINUOUS AMBER Indicates the system is in a critical or non-recoverable condition.
OFF Indicates POST/system stop.
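The states in Table 2-7 depend on two observations: the LED color and whether it is blinking. The sketch below encodes the table as a lookup keyed on both; the function name and the argument format are illustrative assumptions.

```python
# (color, blinking) -> system condition, as listed in Table 2-7.
SYSTEM_STATUS_LED_STATES = {
    ("green", False): "operating normally",
    ("green", True): "degraded condition",
    ("amber", True): "non-critical condition",
    ("amber", False): "critical or non-recoverable condition",
}


def system_condition(color: str, blinking: bool = False) -> str:
    """Translate the observed System Status LED state into a condition."""
    color = color.strip().lower()
    if color == "off":
        return "POST/system stop"
    return SYSTEM_STATUS_LED_STATES[(color, blinking)]
```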
Critical Condition
A critical condition or non-recoverable threshold crossing is indicated with a continuous amber status LED and is associated with the following events:
Temperature, voltage, or fan critical threshold crossing.
Power subsystem failure. The Baseboard¹ Management Controller (BMC) asserts this failure whenever it detects a power control fault (for example, the BMC detects that the system power is remaining on even though the BMC has deasserted the signal to turn off power to the system).
The system is unable to power up due to incorrectly installed processor(s), or
processor incompatibility.
A satellite controller such as the HSC, or another IPMI-capable device, such as an add-in server management PCI card, sends a critical or non-recoverable state to the BMC via the Set Fault Indication command.
Critical Event Logging errors, including System Memory Uncorrectable ECC error
and Fatal/Uncorrectable Bus errors, such as PCI SERR and PERR.
Non-Critical Condition
A non-critical condition is indicated with a blinking amber status LED and signifies that at least one of the following conditions is present:
Temperature, voltage, or fan non-critical threshold crossing.
Chassis intrusion.
Satellite controller sends a non-critical state, via the Set Fault Indication
command, to the BMC.
A Set Fault Indication command from the system BIOS. The BIOS may use the Set
Fault Indication command to indicate additional, non-critical status such as system memory or CPU configuration changes.
Degraded Condition
A degraded condition is indicated with a blinking green status LED and signifies that at least one of the following conditions is present:
Non-redundant power supply operation. This only applies when the BMC is
configured for a redundant power subsystem. The power unit configuration is configured via OEM SDR records.
A processor is disabled by FRB or BIOS.
BIOS has disabled or mapped out some of the system memory.
This Troubleshooting Guide gives information on how to isolate the server component responsible for any of the critical, non-critical, or degraded conditions listed above.
1. Baseboard refers to the server Main Board.

2.5.4 Rear Panel Power Supply Status LED

The rear-panel power supply status LEDs are located as shown in Figure 2-4.
FIGURE 2-4 Location of Rear-Panel Power Supply Status LEDs (callout: Power Supply Status LEDs, redundant power supplies)
The rear-panel power supply status LED has the states indicated in Table 2-8.
TABLE 2-8 Power Supply Status LED States
Power Supply LED State Power Supply Condition
OFF No AC power present to power supply
BLINKING GREEN AC power present, but only the standby outputs are on
GREEN Power supply DC outputs are on and OK
BLINKING AMBER PSAlert# signal asserted, power supply on
AMBER Power supply shutdown due to over current, over temperature, over voltage, or undervoltage
AMBER or OFF Power supply failed and AC fuse open or other critical failure
Note – If redundant power supplies are used in the Sun StorEdge 5310 NAS, the
power supply LEDs have the following meaning:
Both LEDs off = no power to power supplies or both power supplies bad. Both LEDs blinking green = power supplies receiving AC power, but server is off. Both LEDs solid green = server is fully powered on and power supplies are good. One LED solid green and one LED amber = AC power missing from one of the power supplies.
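The four redundant-supply combinations in the note above can be checked with a short helper. This is a sketch only: the state strings (`off`, `blinking green`, `green`, `amber`) are an assumed input format, and any combination not listed in the note is referred back to Table 2-8 rather than guessed.

```python
def redundant_psu_status(led_a: str, led_b: str) -> str:
    """Interpret the two power supply LEDs of a redundant configuration."""
    states = sorted([led_a.strip().lower(), led_b.strip().lower()])
    if states == ["off", "off"]:
        return "no power to power supplies or both power supplies bad"
    if states == ["blinking green", "blinking green"]:
        return "power supplies receiving AC power, but server is off"
    if states == ["green", "green"]:
        return "server is fully powered on and power supplies are good"
    if states == ["amber", "green"]:
        return "AC power missing from one of the power supplies"
    return "combination not covered by the note; check each LED against Table 2-8"
```

Sorting the two states makes the result independent of which supply is observed first.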

2.5.5 Server Main Board Fault LEDs

There are several fault and status LEDs built into the server board (see Figure 2-5). Some of these LEDs are visible only when the chassis cover is removed. The LEDs are explained in this section.
FIGURE 2-5 Fault and Status LEDs on the Server Board (callouts: System Status LED, ID LED, 5V System Standby LED, POST LEDs, DIMM Fault LEDs (6), CPU 1 Fault LED, CPU 2 Fault LED)
The fault LEDs are summarized below.
POST LEDs: To help diagnose POST failures, a set of four bi-color diagnostic LEDs is located on the back edge of the baseboard. Each of the four LEDs can have one of four states: Off, Green, Red, or Amber. During the POST process, each light sequence represents a specific Port-80 POST code. If a system should hang during POST, the diagnostic LEDs present the last test executed before the hang. When reading the lights, the LEDs should be observed from the back of the system. The most significant bit (MSB) is the first LED on the left, and the least significant bit (LSB) is the last LED on the right.
See “POST Progress Code LED Indicators” on page 2-30 for details regarding the POST LED display.
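As an illustration of how four bi-color LEDs can represent a two-digit Port-80 code, the sketch below assumes a common encoding for this type of server board: the red element of each LED carries one bit of the upper nibble, the green element carries one bit of the lower nibble, and amber means both bits are set. This encoding is an assumption and is not stated in this section; treat “POST Progress Code LED Indicators” as the authoritative reference. The LEDs are listed left to right as seen from the back of the system, MSB first.

```python
from typing import Sequence

VALID_STATES = ("off", "green", "red", "amber")


def decode_post_leds(leds: Sequence[str]) -> int:
    """Decode four LED states (left to right, MSB first) into a POST code.

    Assumed encoding: red = upper-nibble bit, green = lower-nibble bit,
    amber = both bits set, off = neither bit set.
    """
    if len(leds) != 4:
        raise ValueError("exactly four LED states are required")
    upper = 0
    lower = 0
    for state in leds:
        color = state.strip().lower()
        if color not in VALID_STATES:
            raise ValueError(f"unknown LED state: {state!r}")
        upper = (upper << 1) | (1 if color in ("red", "amber") else 0)
        lower = (lower << 1) | (1 if color in ("green", "amber") else 0)
    return (upper << 4) | lower
```

For example, under this assumed encoding, `decode_post_leds(["off", "green", "red", "amber"])` gives an upper nibble of binary 0011 and a lower nibble of binary 0101, that is, code 0x35.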
CPU Fault LEDs: A fault indicator LED is located next to each of the processor
sockets. If the server Baseboard Management Controller (BMC) detects a fault in any processor, the corresponding LED illuminates.
Memory Fault LEDs: A fault indicator LED is located next to each of the DIMM
sockets. If the BMC detects a fault in a given DIMM, the corresponding LED illuminates.
One LED for each DIMM is illuminated if that DIMM has an uncorrectable or multi-bit memory error. The LEDs maintain the same state across power switch, power down, or loss of AC power.
Fan Fault LEDs: Depending on the server model, the fan header may include a
fan fault LED. If the BMC detects a fan fault, the LED illuminates. If the fan fault LED is lit, the entire fan module must be replaced.
System Status LED: Indicates functional status of the server board. Glows green
when all systems are operating normally. Glows amber when one or more systems are in a fault status. This LED mirrors the function of the system status LED on the front panel.
See Table 2-7 on page 2-18 for a description of the LED states.
+5V Standby LED. This green LED is on when the server is plugged into AC
power, whether or not the server is actually powered on. AC power is applied to the system as soon as the AC cord is plugged into the power supply.
System ID LED. This blue LED can be illuminated to identify the server when it
is part of a large stack of servers. See “System ID LEDs” on page 2-23 for details.

2.5.6 System ID LEDs

A pair of blue LEDs, one at the rear of the server, and one on the front panel, can be used to easily identify the server when it is part of a large stack of servers. A single blue LED located at the back edge of the server board next to the backup battery is visible through the rear panel. The two LEDs mirror each other and can be illuminated by the Baseboard Management Controller (BMC) either by pressing a button on the chassis front panel or through server-management software. When the button is pressed on the front panel, both LEDs illuminate and stay illuminated until the button is pressed again. If the LED is illuminated through a remote System Identify command, the LED turns off after a timeout period. See Figure 2-5 on page 2-21 for the location of the rear Main Board LED. The front panel ID LED and the ID activation button are shown in Figure 2-6.
FIGURE 2-6 Location of Front-Panel ID Pushbutton and LED (callouts: ID LED, ID Pushbutton)
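The two ways of lighting the System ID LEDs behave differently: the front panel button toggles the LED, while a remote System Identify command lights it only until a timeout expires (15 seconds by default, as noted in Table 2-4). The following sketch models that behavior. The class and method names are hypothetical, and the way the two sources combine (the LED is lit if either source is active) is an assumption that this guide does not confirm.

```python
from typing import Optional


class SystemIdLed:
    """Toy model of the System ID LED; not an interface to the BMC."""

    def __init__(self, remote_timeout_s: float = 15.0) -> None:
        self.remote_timeout_s = remote_timeout_s
        self._button_on = False
        self._remote_expires_at: Optional[float] = None

    def press_button(self) -> None:
        # Each press of the front panel ID pushbutton toggles the LED.
        self._button_on = not self._button_on

    def remote_identify(self, now: float) -> None:
        # A remote System Identify command lights the LED until the timeout.
        self._remote_expires_at = now + self.remote_timeout_s

    def is_on(self, now: float) -> bool:
        remote_on = (self._remote_expires_at is not None
                     and now < self._remote_expires_at)
        return self._button_on or remote_on
```

Time is passed in explicitly as seconds so that the model can be exercised without waiting for a real timeout.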

2.6 Power-On Self Test (POST)

The BIOS indicates the current testing phase during POST by writing a hex code to the Enhanced Diagnostic LEDs, located on the rear of the server main board and visible through the back of the chassis.
If errors are encountered, error messages or codes are displayed on the video screen. If an error occurs before video initialization, it is reported through a series of audible beep codes. POST errors are logged in the System Event Log (SEL).
During the power-on self test (POST), the server may indicate a system fault by:
Displaying error codes and messages at the display screen
Beeping the speaker in a coded sequence
Illuminating the POST LEDs, visible from the rear panel, in a coded fashion

2.6.1 POST Screen Messages

During POST, if an error is detected, the BIOS displays an error code and message to the screen. The tables in this section describe the standard and extended POST error codes and their associated messages. The BIOS prompts the user to press a key in case of serious errors. Some of the error messages are preceded by the string “Error” to highlight the fact that the system may be malfunctioning. All POST errors and warnings are logged in the System Event Log (SEL) unless it is full.
Note – All POST errors are logged to the SEL, which is capable of holding
approximately 3200 entries. After the SEL is full, no further errors are logged. The SEL can be cleared using the SSU or the BIOS setup. The SEL is automatically cleared after running the PCT.
Table 2-9 and Table 2-10 contain the POST error messages and error codes.
TABLE 2-9 Standard POST Error Messages and Codes
Error Code Error Message Pause On Boot
100 Timer Channel 2 error Yes
101 Master Interrupt Controller Yes
102 Slave Interrupt Controller Yes
103 CMOS battery failure Yes
104 CMOS options not set Yes
105 CMOS checksum failure Yes
106 CMOS display error Yes
107 Insert key pressed Yes
108 Keyboard locked message Yes
109 Keyboard stuck key Yes
10A Keyboard interface error Yes
10B System memory size error Yes
10E External cache failure Yes
113 Hard disk 0 error Yes
114 Hard disk 1 error Yes
115 Hard disk 2 error Yes
116 Hard disk 3 error Yes
11B Date/time not set Yes
11E Cache memory bad Yes
120 CMOS clear Yes
121 Password clear Yes
140 PCI error Yes
141 PCI memory allocation error Yes
142 PCI IO allocation error Yes
143 PCI IRQ allocation error Yes
144 Shadow of PCI ROM failed Yes
145 PCI ROM not found Yes
146 Insufficient memory to shadow PCI ROM Yes
TABLE 2-10 Extended POST Error Messages and Codes
Error Code Error Message Pause On Boot
8100 Processor 1 failed BIST No
8101 Processor 2 failed BIST No
8110 Processor 1 internal error (IERR) No
8111 Processor 2 internal error (IERR) No
8120 Processor 1 thermal trip error No
8121 Processor 2 thermal trip error No
8130 Processor 1 disabled No
8131 Processor 2 disabled No
8140 Processor 1 failed FRB-3 timer No
8141 Processor 2 failed FRB-3 timer No
8150 Processor 1 failed initialization on last boot. No
8151 Processor 2 failed initialization on last boot. No
8160 Processor 01: unable to apply BIOS update Yes
8161 Processor 02: unable to apply BIOS update Yes
8170 Processor P1 :L2 cache failed Yes
8171 Processor P2 :L2 cache failed Yes
8180 BIOS does not support current stepping for Processor P1 Yes
8181 BIOS does not support current stepping for Processor P2 Yes
8190 Watchdog timer failed on last boot No
8191 4:1 core to bus ratio: processor cache disabled Yes
8192 L2 Cache size mismatch Yes
8193 CPUID, processor stepping are different Yes
8194 CPUID, processor family are different Yes
8195 Front side bus speed mismatch: System halted Yes, Halt
8196 Processor models are different Yes
8197 CPU speed mismatch Yes
8198 Failed to load processor microcode Yes
8300 Baseboard Management Controller (BMC) failed to function Yes
8301 Front panel controller failed to function Yes
8305 Hotswap controller failed to function Yes
8420 Intelligent System Monitoring chassis opened Yes
84F1 Intelligent System Monitoring forced shutdown Yes
84F2 Server Management Interface failed Yes
84F3 BMC in update mode Yes
84F4 Sensor Data Record (SDR) empty Yes
84FF System event log full No
8500 Bad or missing memory in slot 3A Yes
8501 Bad or missing memory in slot 2A Yes
8502 Bad or missing memory in slot 1A Yes
8504 Bad or missing memory in slot 3B Yes
8505 Bad or missing memory in slot 2B Yes
8506 Bad or missing memory in slot 1B Yes
8601 All memory marked as fail: forcing minimum back online Yes

2.6.2 POST Error Beep Codes

The tables in this section list the POST error beep codes. Prior to system video initialization, the BIOS and BMC use these beep codes to notify users of error conditions.
TABLE 2-11 BMC-Generated POST Beep Codes
Beep Code (1)  Description
1              One short beep before boot (normal, not an error)
1-2            Search for option ROMs. One long beep and two short beeps on checksum failure.
1-2-2-3        BIOS ROM checksum
1-3-1-1        Test DRAM refresh
1-3-1-3        Test 8742 keyboard controller
1-3-3-1        Auto size DRAM. System BIOS stops execution here if the BIOS does not detect any usable memory DIMMs.
1-3-4-1        Base RAM failure. BIOS stops execution here if entire memory is bad.
2-1-2-3        Check ROM copyright notice.
2-2-3-1        Test for unexpected interrupts.
1-5-1-1        FRB failure (processor failure)
1-5-2-2        No processors installed or processor socket 1 is empty
1-5-2-3        Processor configuration error (for example, mismatched VIDs)
1-5-2-4        Front-side bus select configuration error (for example, mismatched BSELs)
1-5-4-2        Power fault: DC power unexpectedly lost (for example, power good from the power supply was deasserted)
1-5-4-3        Chipset control failure
1-5-4-4        Power control failure (for example, power good from the power supply did not respond to power request)
(1) The code indicates the beep sequence; for example, 1-5-1-1 means a single beep, then a pause, then 5 beeps in a row, then a pause, then a single beep, then a pause, and then finally a single beep.
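The footnote's reading of a beep code can be sketched in a few lines of Python. This is an illustrative helper only, not part of the StorEdge software; the function name describe_beep_code is hypothetical.

```python
def describe_beep_code(code: str) -> str:
    """Spell out a hyphen-separated beep code as groups of beeps and pauses."""
    # "1-5-1-1" -> [1, 5, 1, 1]; each number is a group of consecutive beeps.
    groups = [int(part) for part in code.split("-")]
    if any(count < 1 for count in groups):
        raise ValueError(f"invalid beep code: {code!r}")
    phrases = [
        "1 beep" if count == 1 else f"{count} beeps in a row"
        for count in groups
    ]
    # Groups are separated by a pause.
    return ", pause, ".join(phrases)


print(describe_beep_code("1-5-1-1"))
```

The helper only restates the sequence; look up the meaning of the code in the table above.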
TABLE 2-12 BIOS-Generated Boot Block POST Beep Codes
Beep Code  Error Message                         Description
1          Refresh timer failure                 The memory refresh circuitry on the motherboard is faulty.
2          Parity error                          Parity can not be reset
3          Base memory failure                   Base memory test failure. See Table 2-13 on page 2-29 for additional error details.
4          System timer                          System timer is not operational
5          Processor failure                     Processor failure detected
6          Keyboard controller Gate A20 failure  The keyboard controller may be bad. The BIOS cannot switch to protected mode.
7          Processor exception interrupt error   The CPU generated an exception interrupt.
8          Display memory read/write error       The system video adapter is either missing or its memory is faulty. This is not a fatal error.
9          ROM checksum error                    System BIOS ROM checksum error
10         Shutdown register error               Shutdown CMOS register read/write error detected
11         Invalid BIOS                          General BIOS ROM error
TABLE 2-13 Memory 3-Beep and LED POST Error Codes
Beep   Debug Port 80h    Diagnostic LED Decoder            Meaning
Code   Error Indicator   (G = green, R = red, A = amber)
                         MSB            LSB
3      00h               Off  Off  Off  Off                No memory was found in the system
3      01h               Off  Off  Off  G                  Memory mixed type detected
3      02h               Off  Off  G    Off                EDO is not supported
3      03h               Off  Off  G    G                  First row memory test failure
3      04h               Off  G    Off  Off                Mismatched DIMMs in a row
3      05h               Off  G    Off  G                  Base memory test failure
3      06h               Off  G    G    Off                Failure on decompressing post module
3      07h               Off  G    G    G                  Generic memory error
       08h               G    Off  Off  Off
       09h               G    Off  Off  G
       0Ah               G    Off  G    Off
       0Bh               G    Off  G    G
       0Ch               G    G    Off  Off
       0Dh               G    G    Off  G
3      0Eh               G    G    G    Off                SMBUS protocol error
3      0Fh               G    G    G    G                  Generic memory error
2.6.2.1 BIOS Recovery Beep Codes
In rare cases, when the system BIOS has been corrupted, a BIOS recovery process must be followed to restore system operability. During recovery mode, the video controller is not initialized. One high-pitched beep announces the start of the recovery process. The entire process takes two to four minutes. A successful update ends with two high-pitched beeps. In the event of a failure, two short beeps are generated and a flash code sequence of 0E9h, 0EAh, 0EBh, 0ECh, and 0EFh appears at the Port 80 diagnostic LEDs (see Table 2-14 on page 2-30).
TABLE 2-14 BIOS Recovery Beep Codes
Beep Code                                Error Message      Port 80h LED Indicators  Description
1                                        Recovery started                            Start recovery process.
Series of long low-pitched single beeps  Recovery failed    EEh                      Unable to process valid BIOS recovery images. BIOS already passed control to OS and flash utility.
Two long high pitched beeps              Recovery complete  EFh                      BIOS recovery succeeded, ready for powerdown, reboot.

2.6.3 POST Progress Code LED Indicators

To help diagnose POST failures, a set of four bi-color diagnostic LEDs is located on the back edge of the server main board. Each of the four LEDs can have one of four states: Off, Green, Red, or Amber.
The LED diagnostics feature consists of a hardware decoder and four dual-color LEDs. During boot block POST and post-boot-block POST, the LEDs display all normal Port 80 codes representing the progress of the BIOS POST. Each POST code is represented by a combination of colors from the four LEDs. Each LED pairs a green element with a red element. The POST code is split into two nibbles, an upper and a lower nibble. Each bit in the upper nibble is represented by a red LED and each bit in the lower nibble is represented by a green LED. If the corresponding bit is set in both the upper and the lower nibble, both the red and green LEDs are lit, resulting in an amber color. Likewise, if both bits are clear, the red and green LEDs are off.
Figure 2-7 shows examples of how the POST LEDs are coded.
FIGURE 2-7 Examples of POST LED Coding
POST LEDs as viewed from the back of the server, with the high bits on the left and the low bits on the right. Red = upper nibble bits; green = lower nibble bits.
Example 1: LEDs are Red, Green, Off, Amber. POST Code = 95h (upper nibble = 1001 = 9h, lower nibble = 0101 = 5h).
Example 2: LEDs are Amber, Red, Green, Off. POST Code = CAh (upper nibble = 1100 = Ch, lower nibble = 1010 = Ah).
During the POST process, each light sequence represents a specific Port-80 POST code. If a system should hang during POST, the diagnostic LEDs present the last test executed before the hang. When you read the LEDs, observe them from the back of the system. The most significant bit (MSB) is the leftmost LED, and the least significant bit (LSB) is the rightmost LED.
Note – When comparing a diagnostic LED color sequence from the server Main
Board to those listed in the diagnostic LED decoder in the following tables, the LEDs on the Main Board should be referenced when viewed by looking into the system from the back. Reading the LEDs from left to right, the most-significant bit is located on the left.
TABLE 2-15 Boot Block POST Progress LED Code Table (Port 80h Codes)
POST Code  Diagnostic LED Decoder, MSB to LSB (G = green, R = red, A = amber)  Description
10h Off Off Off R The NMI is disabled. Start power-on delay. Initialization code checksum verified.
11h Off Off Off A Initialize the DMA controller, perform the keyboard controller BAT test, start memory refresh, and enter 4 GB flat mode.
12h Off Off G R Get start of initialization code and check BIOS header.
13h Off Off G A Memory sizing.
14h Off G Off R Test base 512K of memory. Return to real mode. Execute any OEM
patches and set up the stack.
15h Off G Off A Pass control to the uncompressed code in shadow RAM. The
initialization code is copied to segment 0 and control will be transferred to segment 0.
16h Off G G R Control is in segment 0. Verify the system BIOS checksum. If the system BIOS checksum is bad, go to checkpoint code E0h; otherwise, go to checkpoint code D7h.
17h Off G G A Pass control to the interface module.
18h G Off Off R Decompression of the main system BIOS failed.
19h G Off Off A Build the BIOS stack. Disable USB controller. Disable cache.
1Ah G Off G R Uncompress the POST code module. Pass control to the POST code
module.
1Bh A R Off R Decompress the main system BIOS runtime code.
1Ch A R Off A Pass control to the main system BIOS in shadow RAM.
E0h R R R Off Start of recovery BIOS. Initialize interrupt vectors, system timer, DMA
controller, and interrupt controller.
E8h A R R Off Initialize extra module if present.
EEh A A A Off Jump to boot sector.
TABLE 2-16 POST Progress LED Code Table (Port 80h Codes)
POST Code  Diagnostic LED Decoder, MSB to LSB (G = green, R = red, A = amber)  Description
20h Off Off R Off Uncompress various BIOS modules.
22h Off Off A Off Verify password checksum.
24h Off G R Off Verify CMOS checksum.
26h Off G A Off Read microcode updates from BIOS ROM.
28h G Off R Off Initializing the processors. Set up processor registers. Select least
featured processor as the BSP.
2Ah G Off A Off Go to Big Real mode.
2Ch G G R Off Decompress INT13 module.
2Eh G G A Off Keyboard controller test: the keyboard controller input buffer is
free. Next, the BAT command will be issued to the keyboard controller.
30h Off Off R R Swap keyboard and mouse ports, if needed.
32h Off Off A R Write command byte 8042: the initialization after the keyboard
controller BAT command test is done. The keyboard command byte will be written next.
34h Off G R R Keyboard Init: the keyboard controller command byte is written.
Next, the pin 23 and 24 blocking and unblocking commands will be issued.
36h Off G A R Disable and initialize the 8259 programmable interrupt controller.
38h G Off R R Detect configuration mode, such as CMOS clear.
3Ah G Off A R Chipset initialization before CMOS initialization.
3Ch G G R R Init system timer: the 8254 timer test is over. Starting the legacy
memory refresh test next.
3Eh G G A R Check refresh toggle: the memory refresh line is toggling.
Checking the 15 second on/off time next.
40h Off R Off Off Calculate CPU speed.
42h Off R G Off Init interrupt vectors: interrupt vector initialization is done.
44h Off A Off Off Enable USB controller in chipset.
46h Off A G Off Initialize SMM handler. Initialize USB emulation.
48h G R Off Off Validate NVRAM areas. Restore from backup if corrupted.
4Ah G R G Off Load defaults in CMOS RAM if bad checksum or CMOS clear
jumper is detected.
4Ch G A Off Off Validate date and time in RTC.
4Eh G A G Off Determine number of microcode patches present.
50h Off R Off R Load microcode to all CPUs.
52h Off R G R Scan SMBIOS GPNV areas.
54h Off A Off R Early extended memory tests.
56h Off A G R Disable DMA.
58h G R Off R Disable video controller.
5Ah G R G R 8254 timer test on channel 2.
5Ch G A Off R Enable 8042. Enable timer and keyboard IRQs. Set video mode
initialization before setting the video mode is complete. Configuring the monochrome mode and color mode settings next.
5Eh G A G R Initialize PCI devices and motherboard devices. Pass control to
video BIOS. Start serial console redirection.
60h Off R R Off Initialize memory test parameters.
62h Off R A Off Initialize AMI display manager module. Initialize support code
for headless system if no video controller is detected.
64h Off A R Off Start USB controllers in chipset.
66h Off A A Off Set up video parameters in BIOS data area.
68h G R R Off Activate ADM: the display mode is set. Displaying the power-on
message next.
6Ah G R A Off Initialize language module. Display splash logo.
6Ch G A R Off Display sign on message, BIOS ID, and processor information.
6Eh G A A Off Detect USB devices.
70h Off R R R Reset IDE Controllers.
72h Off R A R Displaying bus initialization error messages.
74h Off A R R Display setup message: the new cursor position has been read and
saved. Displaying the hit setup message next.
76h Off A A R Ensure timer keyboard interrupts are on.
78h G R R R Extended background memory test start.
7Ah G R A R Disable parity and NMI reporting.
7Ch G A R R Test 8237 DMA controller: the DMA page register test passed.
Performing the DMA controller 1 base register test next.
7Eh G A A R Initialize 8237 DMA controller: the DMA controller 2 base register
test passed. Programming DMA controllers 1 and 2 next.
80h R Off Off Off Enable mouse and keyboard: the keyboard test has started.
Clearing the output buffer and checking for stuck keys. Issuing the keyboard reset command next
82h R Off G Off Keyboard interface test: A keyboard reset error or stuck key was
found. Issuing the keyboard controller interface test command next.
84h R G Off Off Check stuck key enable keyboard: the keyboard controller
interface test is complete. Writing the command byte and initializing the circular buffer next.
86h R G G Off Disable parity NMI: the command byte was written and global
data initialization has completed. Checking for a locked key next.
88h A Off Off Off Display USB devices.
8Ah A Off G Off Verify RAM size: Checking for a memory size mismatch with
CMOS RAM data next.
8Ch A G Off Off Lock out PS/2 keyboard/mouse if unattended start is enabled.
8Eh A G G Off Initialize boot devices: the adapter ROM had control and has now
returned control to the BIOS POST. Performing any required processing after the option ROM returned control.
90h R Off Off R Display IDE mass storage devices.
92h R Off G R Display USB mass storage devices.
94h R G Off R Report the first set of POST errors to Error Manager.
96h R G G R Boot password check: the password was checked. Performing any
required programming before Setup next.
98h A Off Off R Float processor initialize: performing any required initialization
before the coprocessor test next.
9Ah A Off G R Enable Interrupts 0, 1, 2: checking the extended keyboard,
keyboard ID, and NUM Lock key next. Issuing the keyboard ID command next.
9Ch A G Off R Initialize FDD devices. Report second set of POST errors to error
messager.
9Eh A G G R Extended background memory test end.
A0h R Off R Off Prepare and run setup: Error manager displays and logs POST
errors. Waits for user input for certain errors. Execute setup.
A2h R Off A Off Set base expansion memory size.
A4h R G R Off Program chipset setup options, build ACPI Tables, and build
INT15h E820h table.
A6h R G A Off Set display mode.
A8h A Off R Off Build SMBIOS table and MP tables.
AAh A Off A Off Clear video screen.
ACh A G R Off Prepare USB controllers for operating system.
AEh A G A Off One beep to indicate end of POST. No beep if silent boot is enabled.
000h Off Off Off Off POST completed. Passing control to INT 19h boot loader next.

2.7 OS Operations

2.7.1 Filesystem Check (fsck) Procedure

The first step in filesystem repair is to ensure that you have a complete, tested backup. The filesystem check carries some risk. Directories, files and filenames may be lost. A tested backup means that the data has been restored from tape, and checked for validity.
After the backup, the next step is to schedule the file system check. The volume that you are running the filesystem check against will be unavailable for the duration of the process. In addition, if this is the volume containing the /etc directory, all other volumes will be offline for the duration of the process. In any case, there will be a heavy load on the filesystem that will affect all clients. It is difficult to determine how long the process will take, as there are several variables which cause this time to vary, such as system specifications, size of volume, workload, and how many errors are found. The check should be run as soon as possible, as the filesystem problems can potentially worsen when writing to a damaged volume.
As a general rule, allow five hours for each run, more if a large number of errors are expected. Also note that if any errors are found, multiple runs are always required. Because of the time involved, consideration should be given to recreating the volume and restoring from a backup. This decision should be made based on the severity of the problem. A read-only filesystem check may be helpful in making this determination, but this may add several hours to the process.
Next, run the fsck procedure. This is done at the StorEdge CLI. It is strongly recommended to log the output of the filesystem check session for escalation purposes. Therefore, you should access the CLI with a client that is capable of logging, such as a LAN connected client or a serial console. Using a dial-up or WAN connected client is not recommended, as this can extend the run time of the procedure.
At the CLI, enter “fsck <volumename>”. You will then be prompted whether repairs should be made if errors are found. Generally, the answer should be “y” for “yes”. The other potentially useful option is “n” for “no”. This will run a check against the volume without writing the repairs. As noted above, this can be used to make decisions about running the filesystem check.
If errors are reported by the filesystem check, repeat the filesystem check until it reports no errors. This may require several runs. When a run finds no errors, the following message is displayed:
sfs2ck vol1: no errors
It is also possible, but very rare, that the above message will never be seen. This can occur in extreme cases where the filesystem check is unable to completely repair a volume. In these cases, the volume should be deleted and restored from tape.
Another rare possibility is that the filesystem check can fail and either hang or reboot. In this case, proceed according to the instructions under the heading “System hang or reboot during normal operation” above, and escalate the issue immediately.
If repairs were made by the filesystem check, file and directory names are sometimes lost. These files are issued a name that begins with “Node”, followed by numbers related to the inode location in the filesystem. This number is generally not useful, other than to ensure a unique filename. These files and directories retain their original contents, to the extent possible. Manual inspection of these files is required to determine the original file type and filesystem location.
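If the filesystem check session was logged to a text file as recommended, the log can be scanned for the clean-run message. The following Python sketch is an assumption-laden illustration: it models the message on the single sample line given in this section ("sfs2ck vol1: no errors") and assumes other volumes report the same way with their own names. The function name clean_volumes is hypothetical.

```python
import re

# Pattern modeled on the one sample line in this guide.
# Assumption: the volume name is a single token followed by ": no errors".
CLEAN_RUN = re.compile(r"^sfs2ck\s+(\S+):\s+no errors$")


def clean_volumes(log_text):
    """Return the names of volumes for which a clean run was logged."""
    found = set()
    for line in log_text.splitlines():
        match = CLEAN_RUN.match(line.strip())
        if match:
            found.add(match.group(1))
    return found


session = "fsck vol1\nsfs2ck vol1: no errors\n"
print(clean_volumes(session))
```

If the volume you checked is not in the returned set, no clean run was logged and the filesystem check should be repeated.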

2.7.2 StorEdge Network Capture Utility

Sun StorEdge 5310 NAS includes a built-in network monitoring tool. This allows you to capture packets from the network and save them to a file. This can be a valuable troubleshooting tool.
Before you can configure network monitoring, you must load it at the StorEdge CLI.
To access the StorEdge CLI:
1. Connect to the StorEdge via Telnet or serial console, and type admin at the [menu]
prompt and enter the administrator password.
2. At the CLI, enter load netm.
3. Then type menu to configure capture and capture packets.
4. Press the spacebar until “Packet Capture” is displayed under “Extensions” at the lower right.
5. Select the letter corresponding to “Packet Capture”.
6. Select option “1”, Edit Fields.
The available options are as follows:
Capture File—Where to save the capture file, in the format
/volumename/directory/filename
Frame Size (B)—Size in bytes of each frame to capture. The default is normally
used.
IP Packet Filter—“No” captures all traffic; “Yes” allows you to filter what is received. A filter allows you to select the IP address or addresses from which to capture traffic. You can also filter on a particular TCP or UDP port.
Dump Enable—Select “Yes” to allow StorEdge to save the capture in the event of
a problem.
7. After configuring these options, select option “7”, “Start Capture”.
8. Reproduce the network event you wish to capture.
9. Select option “7”, “Stop Capture”.
10. Access the file via NFS or SMB and copy the file as needed.

2.7.3 Upgrades

2.7.4 Cacls - Access Control List

For issues with access to a file or directory, collect the output of the cacls command. This command is available from the CLI. At the CLI, enter “cacls <full pathname>”. The full pathname should begin with the volume name, as in this example: “cacls /vol1/testfile.txt”.
Cacls output contains the following information:
First, the basic mode information and UID/GID of the owner is displayed. Here is an example:
drwxrw---- 34 22 /vol1/data
In this case, the item is a directory with 760 permissions: read/write/execute (7) for the owner (UID 34), read/write (6) for members of the owner’s group (GID 22), and no permissions (0) for everyone else.
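The octal value can be derived mechanically from the mode string. The following Python sketch shows the arithmetic for the example line; the function name is hypothetical, and it handles only the plain r, w, x and - flags (special bits such as setuid or sticky are not interpreted).

```python
def mode_to_octal(mode: str) -> str:
    """Convert a 10-character mode string such as 'drwxrw----' to octal digits."""
    if len(mode) != 10:
        raise ValueError("expected a type character followed by nine flags")
    flags = mode[1:]  # drop the leading type character ('d', '-', ...)
    digits = []
    for start in range(0, 9, 3):  # owner, group, everyone else
        value = 0
        for flag, weight in zip(flags[start:start + 3], (4, 2, 1)):
            if flag != "-":
                value += weight  # r = 4, w = 2, x = 1
        digits.append(str(value))
    return "".join(digits)


print(mode_to_octal("drwxrw----"))
```

For the example directory, rwx gives 4+2+1 = 7 for the owner, rw- gives 4+2 = 6 for the group, and --- gives 0 for everyone else.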
Listed next are Creation time, FS Creation time, and FS mtime. These are timestamps associated with the file and the filesystem, generally only useful for troubleshooting timestamp issues.
Next is the Windows security descriptor. In its simplest form, it will read “No security descriptor”. This means that no Windows security is present, and that Windows will simulate security based on the above NFS permissions.
If a Windows security descriptor is present, the following information is displayed:
Security Descriptor: The type of security descriptor. This can be disregarded.
Owner: The user name or SID of the owner.
Primary Group: The group name or SID of the group owner.
Discretionary Access Control List (DACL): A list of users who have access to the file, by SID.
A SID is a number that uniquely identifies a user or group. The data to the right of the final dash identifies the user within the domain; the rest of the number indicates domain and type of account information. This user information is known as the RID (relative ID). The RID is the number used for user mapping. It can be cross-referenced with the StorEdge user or group mapping data to determine the user/group name and NFS UID/GID.
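Splitting a SID at its final dash, as described above, can be sketched in Python. The SID value used here is a made-up example in the usual Windows text form, and the function name split_sid is hypothetical.

```python
def split_sid(sid: str):
    """Split a SID string into (domain and account-type part, RID)."""
    prefix, _, rid = sid.rpartition("-")  # split at the final dash
    if not prefix or not rid.isdigit():
        raise ValueError(f"not a SID with a numeric RID: {sid!r}")
    return prefix, int(rid)


# Hypothetical SID for illustration only.
domain_part, rid = split_sid("S-1-5-21-1004336348-1177238915-682003330-1013")
print(domain_part, rid)
```

The RID returned here is the value to look up in the StorEdge user or group mapping data.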

2.7.5 Proc filesystem

The /proc filesystem is a virtual filesystem used to collect system data. The location of some of the more useful data is listed below. To collect the data, copy the file, or use the “cat” CLI command to dump it to the screen while logging the terminal session.
/proc/cifs/DOMAIN.USER.6789ABCD…
These are user access tokens. They may be useful in troubleshooting SMB issues.
These file names begin with the domain name, then the username, then some hexadecimal digits. The hexadecimal digits are a representation of the IP address, which can be used to discern between multiple logins for a user. If you do not see the user token that you need, it may be necessary to log the user off for thirty seconds, and then back on in order to capture the token.
/proc/cifs/pdc
The currently connected domain, domain controller, and the IP address of the domain controller.
/proc/cifs/ntdomain
A list of all trusted domains, their related SIDs, and the local machine and local domain SIDs.
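The user-token file names described above end in hexadecimal digits that represent the client IP address. The following Python sketch shows one way to decode such a name. It rests on assumptions this guide does not confirm: that the hexadecimal part is exactly eight digits, that it encodes an IPv4 address with the most significant byte first, and that the domain name contains no dot. The file name used is a hypothetical example and the function name is illustrative.

```python
import ipaddress


def parse_token_name(name: str):
    """Split 'DOMAIN.USER.HEXDIGITS' into (domain, user, dotted IP address)."""
    rest, _, hex_part = name.rpartition(".")
    domain, _, user = rest.partition(".")
    if not domain or not user:
        raise ValueError(f"unexpected token file name: {name!r}")
    if len(hex_part) != 8:
        # Assumption: eight hex digits, one IPv4 address.
        raise ValueError("expected eight hexadecimal digits")
    # Assumption: most significant byte first (C0A80A05 -> 192.168.10.5).
    address = ipaddress.IPv4Address(int(hex_part, 16))
    return domain, user, str(address)


# Hypothetical file name for illustration only.
print(parse_token_name("EXAMPLE.jsmith.C0A80A05"))
```

If the decoded address does not match the client you expect, check the byte-order assumption before drawing conclusions.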

2.7.6 FTP Server

To use the built in ftp server, you need to load the ftp daemon from the command line. The command is as follows:
load ftpd <CR>
This will allow you to ftp files to and from the Sun StorEdge 5310 NAS.

2.8 Updating the OS on the Sun StorEdge 5310 NAS

This section provides information on firmware and BIOS upgrades.
This section contains the following topics:
“Sun StorEdge 5310 NAS Firmware” on page 2-40

2.9 Sun StorEdge 5310 NAS Firmware

2.9.1 Operating System

2.9.1.1 Upgrading the StorEdge Operating System
The StorEdge software can be updated either through the StorEdge Web Admin or by copying the file directly. Before performing the upgrade, download the software and extract the upgrade file, which should have the extension “.img”. Store this file locally on the client from which the software upgrade will be done. The operating system upgrade requires a system reboot, which should be done immediately after copying the new OS to the system.
Important – After the reboot, the system may take as long as five minutes to
complete the software upgrade and return to service. There is no visual indication that this process is taking place. The StorEdge LCD displays “…booting…” during this process. If it is necessary to check the status of the upgrade, connect a display to the StorEdge.
To update the operating system via web interface:
1. To use the Web Admin, connect with a Web browser to http://<hostname or IP address of your StorEdge>.
2. Click “Grant” or “Yes” to accept any Java software authorization windows and you will reach the login screen.
3. Type the administrator password to access the administration interface.
4. Navigate to System Operations/ Update Software.
FIGURE 2-8 The Update Software Panel
5. Click on the Browse button and navigate to the directory containing the new OS image file.
6. Click to select the OS image file.
7. Click the “Open” button. The file name and path are now displayed in the Update Software window.
8. Click on the Update button.
9. Wait for the OS image file to upload to the StorEdge.
10. When the update process is complete, click Yes to reboot, or No to continue without rebooting. The update does not take effect until the system is rebooted.
To update the operating system via file copy:
1. Access the StorEdge via SMB or NFS.
2. Via SMB, access the share c$.
You must be a member of the local Administrators group to access this share. Via NFS, mount to /cvol. By default, this is only possible from a trusted host.
3. In either case, copy the operating system image to the root of /cvol.
4. Next, reboot the StorEdge via one of the administration interfaces.
The operating system upgrade will take place before the system comes up.
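The copy step of the file-copy procedure above can be scripted from a client that already has /cvol mounted over NFS (or c$ mapped over SMB). The following Python sketch is a minimal illustration under that assumption; the function name and the mount point passed to it are hypothetical, and the reboot must still be performed through one of the administration interfaces.

```python
import shutil
from pathlib import Path


def stage_os_image(image, cvol_root):
    """Copy an extracted .img upgrade file to the root of the mounted cvol."""
    image = Path(image)
    cvol_root = Path(cvol_root)
    if image.suffix.lower() != ".img":
        raise ValueError("the upgrade file should have the extension .img")
    if not image.is_file():
        raise FileNotFoundError(image)
    if not cvol_root.is_dir():
        raise NotADirectoryError(cvol_root)
    destination = cvol_root / image.name
    shutil.copy2(image, destination)
    # Basic sanity check that the whole file arrived.
    if destination.stat().st_size != image.stat().st_size:
        raise OSError("copied image size does not match the source")
    return destination
```

A call such as stage_os_image("upgrade.img", "/mnt/cvol") would place the image at the root of the volume, where both the file name and the mount point are placeholders for your own values.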

2.10 Common Problems Encountered on the Sun StorEdge 5310 NAS

This chapter describes common problems with the Sun StorEdge 5310 NAS.
It includes the following sections:
“CIFS/SMB/Domain” on page 2-43
“NFS Issues” on page 2-61
“Network Issues” on page 2-66
“File System Issues” on page 2-70
“Drive Failure Messages” on page 2-74
“File and Volume Operations” on page 2-76
“Administration Interfaces” on page 2-78
“StorEdge Features and Utilities” on page 2-82
“Hardware Warning Messages” on page 2-84
“Backup Issues” on page 2-88
“Direct Attached Tape Libraries” on page 2-90
“StorEdge File Replicator Issues” on page 2-152

2.11 CIFS/SMB/Domain

Changes to Windows group membership do not take effect. Changes to user mapping do not take effect.
Windows clients use a device called an access token to assign user data and group membership. This token is assigned when the client connects to the StorEdge. Any changes to this token are not implemented until the next time the user connects.
To cause any changes to take effect immediately, ensure that the user closes all sessions with the StorEdge.
The easiest way to do this is to log the user out of all connected workstations. It is necessary that the user remain disconnected for approximately 30 seconds because the token is cached for a short time.
To ensure that all users’ tokens are updated after making large-scale changes, reboot StorEdge. This action ensures that all sessions are disconnected.
Windows clients cannot connect by NetBIOS name. StorEdge not present in browse list / Network Neighborhood.
A master browser is a server that is configured to manage CIFS/SMB browse lists and respond to client requests for them. Windows server operating systems are configured to do this by default.
StorEdge is configured not to act as a master browser. This is done to dedicate all StorEdge resources to file sharing.
For the browsing to function correctly, each subnet or physical network segment must have a master browser. Therefore, if you wish to make the StorEdge available via browse lists, it should be located on the same segment and subnet as a Windows Server.
Note that configuring a WINS server improves the performance of browsing, and in some cases may compensate for the lack of a master browser on some segments. If possible, a WINS server should always be configured.
To determine which master browser, if any, StorEdge has located, generate a system diagnostic.
1. To access this functionality, access the StorEdge via Telnet.
2. Press enter at the [menu] prompt and enter the administrator password.
Chapter 2 NAS Head 2-43
3. Press the spacebar until “Diagnostics” is displayed under “Extensions” at the lower right.
4. Select the letter corresponding to “Diagnostics.”
“Please wait…” is then displayed in the upper left. After a short time, the system diagnostic is displayed.
5. Scroll through the diagnostics with the [spacebar] and [b] keys.
6. Under the heading “NETBIOS Cache” look for an entry with a <1D> tag.
<1D> is a segment master browser.
7. Verify that this <1D> entry matches your domain name and IP subnet.
8. If no browser is found, either move a server to the subnet that the StorEdge is on, or move StorEdge to a subnet with Windows servers.
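The check in steps 6 through 8 can be sketched in a few lines of Python. This is only an illustration, assuming the NetBIOS cache entries have been transcribed by hand into (name, tag, IP address) tuples; the function name, the sample entries, and the address 192.168.10.40/24 are hypothetical and are not StorEdge output.

```python
import ipaddress

def find_segment_master_browsers(entries, domain, local_interface):
    """Return the <1D> entries for `domain` whose IP address is on the local subnet.

    entries         -- iterable of (name, tag, ip) tuples copied from the diagnostic
    domain          -- domain name configured on the StorEdge
    local_interface -- StorEdge address with prefix length, e.g. "192.168.10.40/24"
    """
    network = ipaddress.ip_interface(local_interface).network
    matches = []
    for name, tag, address in entries:
        if tag.upper() != "1D":
            continue  # not a segment master browser entry
        if name.upper() != domain.upper():
            continue  # master browser for a different domain
        if ipaddress.ip_address(address) in network:
            matches.append((name, tag, address))
    return matches

# Hypothetical entries transcribed from under the "NETBIOS Cache" heading.
cache = [
    ("ENGINEERING", "1D", "192.168.10.5"),
    ("ENGINEERING", "1D", "10.0.0.7"),
    ("SALES", "1D", "192.168.10.9"),
    ("ENGINEERING", "1C", "192.168.10.2"),
]
found = find_segment_master_browsers(cache, "engineering", "192.168.10.40/24")
print(found)
```

An empty result corresponds to the situation in step 8: no master browser for the domain is visible on the StorEdge subnet.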
Cannot join Windows Domain.
To authenticate users from a Windows Domain, StorEdge must locate a Domain Controller, authenticate, and then add a computer account to the domain.
Users from the domain are not able to establish a connection to the StorEdge until this entire process has succeeded.
The first step towards resolving this issue is data collection. The two primary sources of data are the system log and the StorEdge NetBIOS cache. Note that this data collection must take place as soon as possible after the failed attempt to join the domain.
To check the system log, proceed as follows:
1. Access the StorEdge via Telnet.
2. Press enter at the [menu] prompt and enter the administrator password.
3. Select option “2”, Show Log.
The fourteen most recent syslog messages are displayed.
4. Look for messages related to the attempt to join the domain.
The first message typically contains the words “join domain”.
5. If no messages are found, select option “1”, Show Entire Log.
6. Page through the log with the space bar, scrolling to the approximate time and date that you made the most recent attempt to join the domain.
7. Look again for the messages related to joining the domain.
8. If no applicable messages are found, repeat the attempt to join the domain, and check the log again.
The system log is also available through the StorEdge Web Admin.
To access it, log in, and navigate to: Notification and Monitoring/View System Log.
You can scroll through the log, or save it as a file.
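If the log has been saved as a file, the search for domain-join messages can be automated. The following is a minimal sketch, assuming the saved log is plain text with one message per line; the timestamp layout of the sample lines is hypothetical, and only the quoted message texts come from this guide.

```python
def find_join_messages(log_lines, phrases=("join domain", "no master browsers found")):
    """Return the log lines that contain any of the given phrases (case-insensitive)."""
    needles = [phrase.lower() for phrase in phrases]
    return [
        line for line in log_lines
        if any(needle in line.lower() for needle in needles)
    ]

# Hypothetical saved log content; the real line layout may differ.
sample_log = [
    "12/01/04 10:15:02 Join domain [local]: locate failed",
    "12/01/04 10:15:03 export list reloaded",
    "12/01/04 10:15:04 No Master Browsers found for ENGINEERING",
]
hits = find_join_messages(sample_log)
for line in hits:
    print(line)
```

To use it on a real file, pass the result of `open(path).read().splitlines()` as `log_lines`.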
To check the NetBIOS cache, proceed as follows:
1. Access the StorEdge via Telnet.
2. Press enter at the [menu] prompt and enter the administrator password.
3. Press the spacebar until “Diagnostics” is displayed under “Extensions” at the lower right.
4. Select the letter corresponding to “Diagnostics”.
5. Wait a few seconds while the StorEdge builds the diagnostic.
6. When the diagnostic is ready, you can page through it here, with [space] and [b], or you can email it or save it to a file.
7. In either case, search through the file for the heading “NETBIOS Cache”. Note each of the NetBIOS tags.
Each NetBIOS tag is displayed in the form: Hostname<##>, or Domain<##>, with one or more IP addresses associated with it. <##> is a number expressing a particular NetBIOS service being advertised.
The tags you should be concerned with are as follows:
Hostname<00>: Local workstation service for hostname.
Hostname<20>: Local server service for hostname.
Domain<00>: Indicates inclusion in the domain or workgroup for the included IP address.
Note – Does not necessarily indicate domain membership.
Domain<1D>: Segment master browser(s) for the listed domain. This server provides browsing services for this domain only on this IP subnet.
Domain<1C>: Domain Controller for listed domain. Either a Primary (PDC) or Backup (BDC).
Domain<1B>: Primary Domain Controller for listed domain. By definition, the browse master for its own subnet, and the collector of all data from other browse servers.
Using these two information sources, you can begin to diagnose the problem. The following are the most common possible problems along with their indicating symptoms.
Wrong password / insufficient permissions: This is usually indicated by a logon failure or access denied message in the system log. The user account that is entered into the StorEdge Domain configuration screen must have the correct password, and must have the authority to create computer accounts. Typically, a user account that is a member of the Domain Admins global group is used.
No master browser on the subnet: CIFS/SMB relies on a hierarchical system of browser servers. Each IP subnet and network segment must have at least one such server, known as a “master browser” in order for systems on that subnet to locate network resources. StorEdge does not provide master browser services.
The first indication of this is the log message “No Master Browsers found for <domain>”. Check the NetBIOS cache for the <1D> or <1B> tag with an IP matching your subnet. Double check the domain name used against the one in the NetBIOS tag of the master browser. It may be necessary to move the StorEdge to the same subnet and segment as a master browser. All Windows server operating systems provide master browser services by default. Installing StorEdge on the same subnet as a Domain Controller is the best practice when possible.
If the problem persists after ensuring that StorEdge has a local master browser, check the solutions below under “Multiple subnets connected to StorEdge”.
Other browsing problems: The log message, “Join domain [local]: locate failed” indicates that a Domain Controller could not be found. Note that this message also appears in conjunction with the above “No Master Browser found” message. When that message is present, the above solutions should be followed first.
Start by looking at the NetBIOS cache. Look for <1B> or <1C> domain controller tags. If you see any of these, ensure that the domain name matches the one configured on StorEdge. If you see a <1D> segment master browser, but no <1B> or <1C> tags, check the NetBIOS cache on the master browser system. This is done with the “nbtstat -c” command at the Windows command prompt. The output is essentially the same as the StorEdge NetBIOS cache display. If no domain controllers are present in the master browser’s NetBIOS cache, then there is a network browsing issue that needs to be addressed.
One possible solution is to add a WINS server. WINS speeds Windows browsing and can compensate for browser deficiencies, so it should be used whenever possible. In order for WINS to function properly, the browsers, Domain Controller, and StorEdge must all be configured to use the WINS server for lookups. Another possible solution is to move the StorEdge to the same subnet as the Domain Controller, though this does not address the larger browsing problem.
If these solutions have been attempted to no avail, also see the following solution.
Multiple subnets connected to StorEdge: Care must be taken when StorEdge is connected to multiple subnets, particularly when the subnets are disjoint, i.e. not connected to one another. A common example of this is a direct connection to a backup or database server.
The problem created by the disjoint subnets is that StorEdge registers each of its IP addresses via NetBIOS broadcast and/or WINS. The Domain Controller may select one of the addresses on a disjoint subnet, and fail to communicate with StorEdge, resulting in a failure to join the domain. The solution to this is to prevent NetBIOS registration of those addresses not connected to the main network.
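Deciding which addresses should not be registered can be sketched as a simple membership test. This is an illustration only: it assumes the main network can be described by a single prefix, and the function name and the 10.1.1.0/24 network are hypothetical.

```python
import ipaddress

def addresses_to_exclude(storedge_addresses, main_network):
    """Return the StorEdge addresses that are not on the main network."""
    network = ipaddress.ip_network(main_network)
    return [
        address for address in storedge_addresses
        if ipaddress.ip_address(address) not in network
    ]

# Hypothetical configuration: one NIC on the main network, one on a
# direct link to a backup server.
addresses = ["10.1.1.20", "192.168.243.1"]
excluded = addresses_to_exclude(addresses, "10.1.1.0/24")
print(excluded)
```

The addresses returned are the candidates for the NetBIOS registration changes described in this section.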
In the case of the backup or database server, the solution is easy. The StorEdge “independent” NIC role was created expressly for this purpose. To configure the NIC role, proceed as follows:
1. Access the StorEdge via Telnet.
2. Press enter at the [menu] prompt and enter the administrator password.
3. Select option “A”, Host Name & Network.
4. Select option “1”, Edit fields.
5. Navigate through the fields with [Tab] or [Enter] until the “Role” field of the desired NIC is highlighted.
6. Select option “3”, Independent.
7. Select option “7”, Save Changes.
It is also possible to disable the NetBIOS registration without changing the NIC role. This can be done if you have a problem after attempting the above, or if you have a requirement to leave the role as primary. To make this configuration change, proceed as follows:
1. Connect to the StorEdge via Telnet, and type “admin” at the [menu] prompt and enter the administrator password.
2. At the CLI, enter “load smbtools”, and then “smbwins exclude addr=192.168.243.1”.
This action prevents these IP addresses from being registered via NetBIOS. However, the master browsers and WINS servers do not immediately remove these addresses. To accomplish this, proceed as follows:
3. Remove the entry for StorEdge from any WINS server databases.
4. Locate the master browser for the local subnet and any local Domain Controllers.
5. Enter “nbtstat -R” at the Windows CLI on each of these systems.
6. Reboot the StorEdge.
Note that the above changes do not take effect until after the reboot. This action removes the undesired entries in almost every case. The only case where the entries may persist is in a multiple server WINS environment using replication. In this case, consult the provider of the WINS server operating system for removal instructions.
Anonymous connections restricted by Domain Controller: In this case, the master browser and domain controller are both located, but the system log shows a number of RPC errors related to security, along with the name and IP address of the Domain Controller to which it is attempting to authenticate.
Windows 2000 and later operating systems can be configured to refuse anonymous connections, otherwise known as null sessions. Typically, this is done for security reasons. Restricting anonymous connections is not recommended unless all clients and servers in the domain are running Windows 2000 or newer. StorEdge and other non-Windows servers require a change to this policy.
This setting is accessed via the registry editor on the Windows domain controller. Using the Registry Editor, navigate to the key: “HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\LSA”. Locate the value RestrictAnonymous. If it is set to “2”, modify it to “0” or “1”. A setting of “0”
The Domain Controller must be rebooted for this change to take effect.
Connected to a DC across a WAN link: In rare cases, it is possible that StorEdge will join a domain using a distant Domain Controller across a slow link. The symptoms in this case will vary. You could see timeouts, authentication failures due to a firewall, or even success with poor performance. The primary indication will be log messages indicating any of the above problems, and referring to communications with a Domain Controller on a faraway subnet.
To resolve this issue, first check the NetBIOS cache as directed above to ensure that the local domain controllers are present. If not, proceed as above to correct any difficulty locating them. After verifying the presence of one or more nearby Domain Controllers (<1B> or <1C> NetBIOS tags), proceed as follows to force StorEdge to use a particular Domain Controller:
1. Connect to the StorEdge via Telnet, and type “admin” at the [menu] prompt and enter the administrator password.
2. At the CLI, enter “set smb.pdc <IP address>”, replacing <IP address> with the IP address of one of the above domain controllers. In spite of the variable name, it is acceptable to use either a PDC <1B>, or a BDC <1C>.
3. After setting the variable, retry the attempt to join the domain. Check the system log to ensure success.
Assuming that the difficulty connecting to the Domain Controller is temporary, and related to network load, it should not be necessary to save this variable with the savevars command. Doing so will limit the ability of StorEdge to find an alternate Domain Controller in the case that this one fails.
Cannot connect or authenticate to Windows 2003 Domain Controller.
By default, Windows 2003 is configured to require digitally signed communications from clients. This is also known as SMB packet signing. StorEdge does not support packet signing. Therefore, Windows 2003 must be configured to negotiate packet signing rather than assuming that it is present.
1. To configure this, you must access the Local Security Policy Editor on the Windows 2003 Server.
2. Next, navigate to Security Settings/Local Policies/Security Options.
3. Scroll down to “Microsoft network server: Digitally sign network communications (always)”
4. Double click the entry and click the “Disabled” button.
5. Click “OK”.
Changing this setting does not restrict the Windows 2003 server from using packet signing with those clients that support it.
Lost Connection with Windows Domain.
In some conditions, it is possible for StorEdge to lose connection to the Domain Controller. In this case, Windows users will be denied access to the StorEdge, and they will be prompted for a password.
Possible reasons for this include modification of administrative user password, network problems or failure of PDC.
The solution to each of these is the same. It is necessary to re-enter the user and password information in the domain setup screen. This is done as follows:
1. Access the StorEdge via Telnet.
2. Press enter at the [menu] prompt and enter the administrator password.
3. Press the spacebar until “CIFS/SMB Configuration” is displayed under “Extensions” at the lower right.
4. Select the letter corresponding to “CIFS/SMB Configuration”.
5. Select the letter corresponding to “Domain Configuration”.
6. Use the [Enter] or [Tab] key to navigate to the User name field.
7. Enter a user name for the listed domain with the rights to add a computer account.
8. Press [Enter] to move to the “Password” field.
9. Enter the password for this user.
10. Select option “7”, Save Changes.
If the attempt to join the domain is unsuccessful, proceed according to the instructions in the Troubleshooting Guide: “Cannot join Windows Domain”.
This functionality is also available through the StorEdge Web Admin. To use the Web Admin, connect with a Web browser to http://<hostname or IP address of your StorEdge>. Click “Grant” or “Yes” to accept any Java software authorization windows and you will reach the login screen. Type the administrator password to access the administration interface.
Navigate to Windows Configuration/Configure Domains and Workgroups. Enter a user name for the listed domain with the rights to add a computer account and the associated password.
If the attempt to join the domain is unsuccessful, proceed according to the instructions in the Troubleshooting Guide: “Cannot join Windows Domain”.
CIFS/SMB share changed to hidden is still visible on network. Renamed share, but old name is still displayed in browse list.
Share lists are sometimes cached by the client’s network redirector. This problem will clear itself within a short time, 30 minutes at the most.
Cannot set share security, all shares inherit the security of the directory object.
The StorEdge security implementation allows only for securing files and directories. The effective security of a CIFS/SMB share is always the security of the directory to which it points.
StorEdge has same files in 2 different shares.
This is caused by creating multiple share names that point to the same directory or volume. Shares always point to a directory. Root level shares will always contain all files on the volume, regardless of how many shares are created to this volume. View shares as pointers, with the understanding that many of these pointers may exist to a single location.
User maps are incorrect. User maps are not automatically created.
The requirements for successful user mapping are to import all NFS users to the StorEdge and define a mapping rule. These requirements must be met before any CIFS/SMB users have connected. If CIFS/SMB users connect before both of these are in place, the user will be mapped to a StorEdge-generated UID.
Once the mapping has been created, it will not be overwritten by subsequent connections with updated credential info.
Windows users are not mapped to the expected NFS group. Mappings are not created for most Windows groups.
Although Windows users can maintain membership in many groups, the StorEdge user and group mapping functionality only recognizes the Primary Group. By default, all Windows Users are assigned the primary group, “Domain Users”. The only exception to this is if they are a member of the “Domain Admins” group at the time the user account is created, in which case this group is assigned as the primary group.
In order for group mapping between CIFS/SMB and NFS to be effective, primary group assignments must be made selectively. It may be necessary to create some groups. Primary group assignment is done in Windows User Manager for Domains, usually from a Domain Controller. See your Windows documentation for details on how to configure this setting. It is important to use only Windows “Global Groups” for this purpose. Windows Local Groups are intended to be used only locally, on the Domain Controllers themselves.
After making modifications to Windows users’ primary groups, groups with no mappings will be mapped to NFS groups according to the mapping policy. The mapping will be automatically created as soon as a CIFS/SMB user connects with a primary group which is not in the StorEdge group.map file. Before this happens, make sure that group information is imported to StorEdge, either manually or via NFS, and the desired mapping policy is in place.
Another way to resolve this, for users with primary group assignments in the passwd file, is to use the “Map to Primary Group” policy.
Can’t copy greater than 4G file from Windows to StorEdge.
This problem may be seen on Windows 2000 and prior versions. If running Windows 2000, it can be fixed by applying the latest service pack. If running an older version, there is no fix available, though you may be able to work around the problem with the Windows backup utility or a similar third-party solution.
Can’t map drives via CIFS/SMB.
In order to map a drive or connect to a share, you must have read access to the directory to which the share points. If StorEdge is in domain mode, you must also be logged in to the domain. File and directory security can be checked at the StorEdge CLI.
1. To access the StorEdge CLI, connect to the StorEdge via Telnet, and type “admin” at the [menu] prompt and enter the administrator password.
2. At the CLI, enter “cacls <path>”. The path must include the volume name. If the path includes spaces, enclose the argument in double quotes, as in cacls “/vol1/my directory/my file”.
Cacls output contains the following information:
First, the basic mode information and UID/GID of the owner is displayed. Here is an example:
drwxrw---- 34 22 /vol1/data
In this case, we can see that the item is a directory, with 760 permissions: Read/write/execute (7) for the owner (UID 34), Read/write (6) for members of the owner’s group (GID 22), and no permissions (0) for everyone else.
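The octal digits can be derived mechanically from the mode string: read counts 4, write counts 2, and execute counts 1 within each group of three characters. The following sketch applies that arithmetic to the example line; the function names are hypothetical, and the four-column layout is assumed from the single example shown here.

```python
def mode_to_octal(mode):
    """Convert a 10-character mode string such as 'drwxrw----' to octal digits."""
    if len(mode) != 10:
        raise ValueError(f"expected 10 characters, got {mode!r}")
    digits = []
    # Skip the type character, then read owner, group, and other triplets.
    for start in (1, 4, 7):
        read, write, execute = mode[start:start + 3]
        value = (4 if read == "r" else 0) \
            + (2 if write == "w" else 0) \
            + (1 if execute == "x" else 0)
        digits.append(str(value))
    return "".join(digits)

def parse_mode_line(line):
    """Split 'mode uid gid path' into its parts; the path may contain spaces."""
    mode, uid, gid, path = line.split(maxsplit=3)
    return {
        "is_directory": mode.startswith("d"),
        "octal": mode_to_octal(mode),
        "uid": int(uid),
        "gid": int(gid),
        "path": path,
    }

info = parse_mode_line("drwxrw---- 34 22 /vol1/data")
print(info)
```

For the example line, the owner triplet rwx gives 7, the group triplet rw- gives 6, and the final triplet --- gives 0.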
Listed next are Creation time, FS Creation time, and FS mtime. These are timestamps associated with the file and the filesystem, generally only useful for troubleshooting timestamp issues.
Next is the Windows security descriptor. In its simplest form, it will read “No security descriptor”. This means that no Windows security is present, and that Windows will simulate security based on the above NFS permissions.
If a Windows security descriptor is present, the following information is displayed:
Security Descriptor: The type of security descriptor. This can be disregarded.
Owner: The user name or SID of the owner.
Primary Group: The group name or SID of the group owner.
Discretionary Access Control List (DACL): A list of users who have access to the file, by SID.
A SID is a number that uniquely identifies a user or group. The portion to the right of the final dash identifies the user within the domain and is known as the RID (relative ID); the rest of the number indicates the domain and the type of account. The RID is the number used for user mapping. It can be cross-referenced with the StorEdge user or group mapping data to determine the user/group name and NFS UID/GID.
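Separating the RID from the rest of a SID is a matter of splitting at the final dash. The sketch below shows this under the assumption that the SID is available as text in the usual S-… form; the function name and the sample SID are hypothetical.

```python
def split_sid(sid):
    """Split a SID string into (domain_part, rid) at the final dash."""
    domain_part, _, rid = sid.strip().rpartition("-")
    if not domain_part.upper().startswith("S-") or not rid.isdigit():
        raise ValueError(f"not a SID: {sid!r}")
    return domain_part, int(rid)

# Hypothetical SID as it might appear in a DACL entry.
domain_part, rid = split_sid("S-1-5-21-1004336348-1177238915-682003330-1105")
print(domain_part)
print(rid)
```

The resulting RID is the value to look up in the StorEdge user or group mapping data.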
From there, it is simply a matter of assigning appropriate rights to the user attempting to access the directory. Set security as desired using a Windows Domain Admin account.
Can’t set Windows security at the root of a volume or at the base of a share.
Windows security is set by right clicking on an object, and then selecting the security tab. If you wish to do this for the root of a volume, first map a drive to the share, then right click on the mapped drive within “My Computer”. You will then be able to access the security tab as normal.
Cannot see the security tab from Windows clients.
Current versions of Windows do not display the security tab unless you have the right to view or change security.
File and directory security can be checked at the StorEdge CLI.
1. To access the StorEdge CLI, connect to the StorEdge via Telnet, and type “admin” at the [menu] prompt and enter the administrator password.
2. At the CLI, enter “cacls <path>”. The path must include the volume name. If the path includes spaces, enclose the argument in double quotes, as in cacls “/vol1/my directory/my file”.
Cacls output contains the following information:
First, the basic mode information and UID/GID of the owner is displayed. Here is an example:
drwxrw---- 34 22 /vol1/data
In this case, we can see that the item is a directory, with 760 permissions: Read/write/execute (7) for the owner (UID 34), Read/write (6) for members of the owner’s group (GID 22), and no permissions (0) for everyone else.
Listed next are Creation time, FS Creation time, and FS mtime. These are timestamps associated with the file and the filesystem, generally only useful for troubleshooting timestamp issues.
Next is the Windows security descriptor. In its simplest form, it will read “No security descriptor”. This means that no Windows security is present, and that Windows will simulate security based on the above NFS permissions.
If a Windows security descriptor is present, the following information is displayed:
Security Descriptor: The type of security descriptor. This can be disregarded.
Owner: The user name or SID of the owner.
Primary Group: The group name or SID of the group owner.
Discretionary Access Control List (DACL): A list of users who have access to the file, by SID.
A SID is a number that uniquely identifies a user or group. The portion to the right of the final dash identifies the user within the domain and is known as the RID (relative ID); the rest of the number indicates the domain and the type of account. The RID is the number used for user mapping. It can be cross-referenced with the StorEdge user or group mapping data to determine the user/group name and NFS UID/GID.
From there, it is simply a matter of assigning appropriate rights to the user attempting to access the directory. Set security as desired using a Windows Domain Admin account.
Windows anti-virus, backup or file management software runs endlessly, following symbolic links.
By default, StorEdge follows symbolic links in Windows. Windows cannot differentiate between links and standard files. Therefore, if a symbolic link points to a location in the filesystem above its own location, Windows applications can get stuck in a loop following these links.
To correct this behavior, you can either manually exclude such links from the scan or backup, or you can set a variable to disable the following of symbolic links from CIFS/SMB clients. The variable affects all volumes and all CIFS/SMB clients.
This functionality is only available at the StorEdge CLI (command line interface).
1. To access the StorEdge CLI, connect to the StorEdge via Telnet, and type “admin” at the [menu] prompt and enter the administrator password.
2. At the CLI, enter “set smb.dir_symlink.disable yes”.