Sun Microsystems StorEdge™ 5310 NAS User Manual

Sun StorEdge™ 5310 NAS
Troubleshooting Guide
Sun Microsystems, Inc. www.sun.com
Part No. 817-7513-11 August 2004, Revision A
Submit comments about this document at: http://www.sun.com/hwdocs/feedback
Copyright 2004 Sun Microsystems, Inc., 4150 Network Circle, Santa Clara, California 95054, U.S.A. All rights reserved. Sun Microsystems, Inc. has intellectual property rights relating to technology that is described in this document. In particular, and without
limitation, these intellectual property rights may include one or more of the U.S. patents listed at http://www.sun.com/patents and one or more additional patents or pending patent applications in the U.S. and in other countries.
This document and the product to which it pertains are distributed under licenses restricting their use, copying, distribution, and decompilation. No part of the product or of this document may be reproduced in any form by any means without prior written authorization of Sun and its licensors, if any.
Third-party software, including font technology, is copyrighted and licensed from Sun suppliers. Parts of the product may be derived from Berkeley BSD systems, licensed from the University of California. UNIX is a registered trademark in
the U.S. and in other countries, exclusively licensed through X/Open Company, Ltd. Sun, Sun Microsystems, the Sun logo, AnswerBook2, docs.sun.com, Sun StorEdge, Java, and Solaris are trademarks or registered trademarks of
Sun Microsystems, Inc. in the U.S. and in other countries. Mozilla, Netscape, and Netscape Navigator are trademarks or registered trademarks of Netscape Communications Corporation in the United
States and other countries. All SPARC trademarks are used under license and are trademarks or registered trademarks of SPARC International, Inc. in the U.S. and in other
countries. Products bearing SPARC trademarks are based upon an architecture developed by Sun Microsystems, Inc. The OPEN LOOK and Sun™ Graphical User Interface was developed by Sun Microsystems, Inc. for its users and licensees. Sun acknowledges
the pioneering efforts of Xerox in researching and developing the concept of visual or graphical user interfaces for the computer industry. Sun holds a non-exclusive license from Xerox to the Xerox Graphical User Interface, which license also covers Sun’s licensees who implement OPEN LOOK GUIs and otherwise comply with Sun’s written license agreements.
U.S. Government Rights—Commercial use. Government users are subject to the Sun Microsystems, Inc. standard license agreement and applicable provisions of the FAR and its supplements.
DOCUMENTATION IS PROVIDED "AS IS" AND ALL EXPRESS OR IMPLIED CONDITIONS, REPRESENTATIONS AND WARRANTIES, INCLUDING ANY IMPLIED WARRANTY OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE OR NON-INFRINGEMENT, ARE DISCLAIMED, EXCEPT TO THE EXTENT THAT SUCH DISCLAIMERS ARE HELD TO BE LEGALLY INVALID.

Contents

1. Troubleshooting Overview 1
How to Use This Manual 1
Important Notices and Information on the Sun StorEdge 5310 NAS 2
Troubleshooting Tools 3
Troubleshooting Procedures 4
Troubleshooting Flow Charts 6
Diagnostic Information Sources 8
StorEdge Diagnostic Email 8
Data Collection for Escalations 10
Log Error Messages 19
SYSLOG 19
Error Codes from the Sun StorEdge 5310 NAS LCD Display and syslog 20
About SysMon Error Notification 21
Sun StorEdge 5310 NAS Error Messages 21
UPS Subsystem Errors 22
File System Errors 24
PEMS Events 24
Maintenance Precautions 26
Static Electricity Precautions 27
2. NAS Head 1
Hardware 1
Contacting Technical Support 1
Problems With Initial System Startup 2
Resetting the Server 3
Preparing the System for Diagnostic Testing 4
Troubleshooting the Server Using Built-In Tools 10
Diagnosing System Errors 10
LEDs 11
Beep Codes 11
POST Screen Messages 11
LEDs and Pushbuttons 11
Front Panel LEDs and Pushbuttons 12
Rear Panel LEDs 16
Front-Panel System Status LED 18
Rear Panel Power Supply Status LED 20
Server Main Board Fault LEDs 21
System ID LEDs 23
Power-On Self Test (POST) 24
POST Screen Messages 24
POST Error Beep Codes 27
POST Progress Code LED Indicators 30
OS Operations 36
Filesystem Check (fsck) Procedure 36
StorEdge Network Capture Utility 37
Upgrades 38
Cacls - Access Control List 38
Proc filesystem 39
iv Sun StorEdge 5310 NAS Troubleshooting Guide • December 2004
FTP Server 40
Updating the OS on the Sun StorEdge 5310 NAS 40
Sun StorEdge 5310 NAS Firmware 40
Operating System 40
Common Problems Encountered on the Sun StorEdge 5310 NAS 42
CIFS/SMB/Domain 43
NFS Issues 61
Network Issues 66
NIC speed and duplex negotiation issues. 67
File System Issues 70
Drive Failure Messages 74
File and Volume Operations 76
Administration Interfaces 78
StorEdge Features and Utilities 82
Hardware Warning Messages 84
Backup Issues 88
Direct Attached Tape Libraries 90
Frequently Asked Questions 92
CIFS/SMB/Domain Issues 92
NIS/NIS+ Issues 104
TCP/IP and Network Configuration 106
Quota Configuration 109
Checkpoint Configuration 115
Volume Creation and Expansion 120
Reserved Filesystems and Directories 123
NFS Issues 124
Administration Interfaces and Utilities 128
Backup and Migration Issues 142
Macintosh Connectivity 146
Miscellaneous Log Messages 147
Direct Attached Tape Libraries 148
SCSI ID Settings 148
StorEdge File Replicator 149
StorEdge File Replicator Issues 152
3. Storage Arrays 1
Fibre Channel FC 1
Array Overview 1
Using the Array 8
Troubleshooting and Recovery 22
Troubleshooting the Module 22
Recovering from an Overheated Power Supply 26
Setting the Tray ID Switch 29
Verifying the Link Rate Setting 30
Relocating a Command Module 31
Upgrade Requirements 31
Adding New Drives to Empty Slots 33
Replacing All Drives at the Same Time 36
Replacing One Drive at a Time 39
Relocation Considerations 43
Raid Storage Manager (RSM) 44
Updating Firmware and NVSRAM on the Array 95
Updating ESM Firmware 99
4. StorEdge File Replicator 1
Overview 1
Real-time Mirroring 3
Pseudo Real-time Mirroring 3
StorEdge File Replicator 3
Mirroring Variations 7
Operational State 9
Mirror Creation 10
Mirror Replication 11
Mirror Sequencing 12
Link Down and Idle Conditions 12
Cracked and Broken Mirrors 12
Cannot perform first-time synchronization of mirror system: 13
Filesystem errors, such as run check, directory broken, etc.: 13
Error messages, panics or hang condition when enabling mirror: 13
5. Clustering 1
Overview 1
6. Checkpoints/Snapshots 1
Overview 1
Volumes 1
Checkpoint Lifecycle 3
Object Checkpoint Restore 16
StorEdge cp Command 17
7. FRU/CRU Replacement Procedures 1
Tools and Supplies Needed 1
Determining a Faulty Component 2
Safety: Before You Remove the Cover 2
Removing and Replacing the Cover 2
Field Replaceable Unit (FRU) Procedures 4
NAS Head FRU Replacement Procedures 4
Opening the Front Bezel 5
Memory 6
Power Supply Unit 7
Fan Module 9
High Profile Riser PCI Cards 12
Gigabit Ethernet Card 13
Low Profile Riser PCI Cards 15
Qlogic HBA Removal and Replacement 16
LCD Display Module 17
Flash Disk Module 18
System FRU (Super FRU) 22
Array FRU replacement Procedures 23
Replacing a Controller 23
Replacing a Controller Battery 29
Replacing a Drive 36
Replacing a Fan 39
Replacing a Power Supply 41
Replacing an SFP Transceiver 44
Tables
TABLE 1-1 List of Adapters 16
TABLE 1-2 Routing Table 16
TABLE A-3 UPS Error Messages 22
TABLE A-4 File System Errors 24
TABLE A-5 PEMS Error Messages 24
TABLE 2-1 Index to Problems 4
TABLE 2-2 Bootup Beep Codes 6
TABLE 2-3 Server LEDs 11
TABLE 2-4 Front Panel LEDs 13
TABLE 2-5 Front Panel Pushbuttons 15
TABLE 2-6 Rear Panel LEDs 16
TABLE 2-7 System Status LED States 18
TABLE 2-8 Power Supply Status LED States 20
TABLE 2-9 Standard POST Error Messages and Codes 24
TABLE 2-10 Extended POST Error Messages and Codes 26
TABLE 2-11 BMC-Generated POST Beep Codes 27
TABLE 2-12 BIOS-Generated Boot Block POST Beep Codes 28
TABLE 2-13 Memory 3-Beep and LED POST Error Codes 29
TABLE 2-14 BIOS Recovery Beep Codes 30
TABLE 2-15 Boot Block POST Progress LED Code Table (Port 80h Codes) 31
TABLE 2-16 POST Progress LED Code Table (Port 80h Codes) 32
TABLE 2-17 Status LED Indicators 87
TABLE 2-18 Supported Tape Libraries and Tape Drives 148
TABLE 3-1 Lights on the Back of a Command Module 14
TABLE 3-2 Lights on the Front of a Command Module 23
TABLE 3-3 Lights on the Back of a Command Module 24
TABLE 3-4 Enterprise Management Window Menus 48
TABLE 3-5 Enterprise Management Window Toolbar Buttons 49
TABLE 3-6 Array Management Window Tabs 52
TABLE 3-7 Array Management Window Menus (1 of 2) 53
TABLE 3-8 Array Management Window Toolbar Buttons 54
TABLE 3-9 RAID Level Configurations 58
TABLE 3-10 Mappings View Tab 69
TABLE 3-11 Volume-to-LUN Terminology 69
TABLE 3-12 Storage Array Status Icon Quick Reference 86
TABLE 4-1 Standard Terms 2
Figures
FIGURE 2-1 Front Panel Pushbuttons and LEDs 13
FIGURE 2-2 Rear Panel LEDs 16
FIGURE 2-3 Location of Front-Panel System Status LED 18
FIGURE 2-4 Location of Rear-Panel Power Supply Status LEDs 20
FIGURE 2-5 Fault and Status LEDs on the Server Board 21
FIGURE 2-6 Location of Front-Panel ID Pushbutton and LED 23
FIGURE 2-7 Examples of POST LED Coding 31
FIGURE 2-8 The Update Software Panel 41
FIGURE 3-1 Controller 2
FIGURE 3-2 Label Locations on the Controller 3
FIGURE 3-3 Battery Charging/Charged and Cache Active Lights 4
FIGURE 3-4 Drives and Lights 4
FIGURE 3-5 Drive Numbering – Rackmount Module 5
FIGURE 3-6 Fans and Airflow 5
FIGURE 3-7 Power Supplies 6
FIGURE 3-8 SFP Transceiver and fibre Optic Cable 7
FIGURE 3-9 Tray ID Switch 8
FIGURE 3-10 Removing and Replacing a Deskside Module Back Cover 9
FIGURE 3-11 Power Supply Switches 10
FIGURE 3-12 Lights on the Back of a Command Module 13
FIGURE 3-13 Alarm Mute Button 20
FIGURE 3-14 Lights on the Front of a Command Module 23
FIGURE 3-15 Lights on the Back of a Command Module 24
FIGURE 3-16 Power Supply Switches 28
FIGURE 3-17 Setting the Tray ID Switch 30
FIGURE 3-18 Verifying the Link Rate Setting 31
FIGURE 3-19 Removing and Installing a Drive 35
FIGURE 3-20 Power Supply Switches 38
FIGURE 3-21 Removing and Installing a Drive 38
FIGURE 3-22 Removing and Installing a Drive 42
FIGURE 3-23 Enterprise Management Window 45
FIGURE 3-24 Array Management Window 45
FIGURE 3-25 Enterprise Management Window 46
FIGURE 3-26 Device Tree Example 47
FIGURE 3-27 Array Management Window 51
FIGURE 3-28 Unconfigured and Free Capacity Nodes 66
FIGURE 3-29 Mappings View Window 68
FIGURE 3-30 SANshare Storage Partitioning Example 73
FIGURE 3-31 Host Port Definitions Dialog 75
FIGURE 3-32 Heterogeneous Hosts Example 76
FIGURE 3-33 DVE Modification Operation in Progress 79
FIGURE 3-34 Persistent Reservations Dialog 83
FIGURE 3-35 Monitoring Storage Array Health Using the Enterprise Management Window 85
FIGURE 3-36 Event Monitor Configuration 87
FIGURE 3-37 Event Monitor Example 88
FIGURE 3-38 Problem Notification in the Array Management Window 91
FIGURE 3-39 Displaying the Recovery Guru Window 92
FIGURE 3-40 Recovery Guru Window Example 93
FIGURE 3-41 Status Changes During an Example Recovery Operation 94
FIGURE 3-42 Status Changes When The Example Recovery Operation is Completed 95
FIGURE 4-1 The lifecycle of a transaction in StorEdge File Replicator 4
FIGURE 4-2 Write ordering on the Mirror 5
FIGURE 4-3 Lost transaction handling on the Mirror 6
FIGURE 4-4 The Mirror Log and Primary Journal 7
FIGURE 6-1 Physical and Logical Volume Relationship 2
FIGURE 6-2 The Copy-On-Write Mechanism for Checkpoints 4
FIGURE 6-3 Mappings for Block n Before Modification 5
FIGURE 6-4 Mappings for Block n After Modification 6
FIGURE 6-5 Creating a hardlink when a volume is checkpointed and has active checkpoints 8
FIGURE 6-6 Mappings for Block n After Deleting ckpti-1 10
FIGURE 6-7 After Deleting ckpti+1. 10
FIGURE 6-8 Accessing .chkpnt in UNIX 13
FIGURE 6-9 Accessing ".chkpnt" in Windows Explorer 15
FIGURE 6-10 Viewing ".chkpnt" in Windows Explorer 16
FIGURE 6-11 Sharing Blocks Between Live and Checkpoint File Systems 17
FIGURE 6-12 Windows File Copy Error Message During a Checkpoint Restore Operation 19
FIGURE 6-13 Windows Excel Open Error Message During a Checkpoint Restore Operation 19
FIGURE 7-1 Removing the Cover 3
FIGURE 7-2 Sun StorEdge 5310 NAS Bezel Replacement 5
FIGURE 7-3 Sun StorEdge 5310 NAS Expansion Unit 6
FIGURE 7-4 Replacing the Power Supply 8
FIGURE 7-5 Removing the Fan Module 10
FIGURE 7-6 The Gigabit Ethernet Card in the Low Profile Riser Slot 14
FIGURE 7-7 Connecting the LCD Display 18
FIGURE 7-8 The Flash Disk 20
FIGURE 7-9 Removing an SFP Transceiver and fibre Optic Cable 25
FIGURE 7-10 Removing and Replacing a Controller 25
FIGURE 7-11 Removing the Controller Cover (Upside Down View) 26
FIGURE 7-12 Replacing the Controller Battery 27
FIGURE 7-13 Label Locations for the Controller 28
FIGURE 7-14 Controller Host Link, Drive Link, and Fault Lights 29
FIGURE 7-15 Removing the SFP Transceiver and fibre Optic Cable 31
FIGURE 7-16 Removing and Replacing a Controller 31
FIGURE 7-17 Removing the Controller Cover (Upside Down View) 33
FIGURE 7-18 Removing and Installing the Controller Battery 33
FIGURE 7-19 Label Locations on the Controller 34
FIGURE 7-20 Drive Link, Host Link, Battery, and Fault Lights 36
FIGURE 7-21 Replacing a Drive 38
FIGURE 7-22 Replacing a Fan 40
FIGURE 7-23 Replacing a Power Supply 43
FIGURE 7-24 Replacing an SFP Transceiver 45

Preface

This Troubleshooting Guide provides information on how to identify, isolate, and fix problems with the Sun StorEdge™ 5310 NAS. It also explains how to remove and replace certain key server components.
Topics in this chapter include:
“Who Should Use This Book” on page xvi
“How This Manual is Organized” on page xvi
“Typographic Conventions” on page xvi
“Related Documentation” on page xvii
“Ordering Sun Documents” on page xvii
“Shell Prompts in Command Examples” on page xviii
“Sun Welcomes Your Comments” on page xviii
Who Should Use This Book
The intended audience for this book is Sun field service personnel who are responsible for maintaining Sun StorEdge 5310 NAS.
How This Manual is Organized
This manual contains the following chapters:
Chapter 1, “Troubleshooting Overview” on page 1-1
Chapter 2, “NAS Head” on page 2-1
Chapter 3, “Storage Arrays” on page 3-1
Chapter 4, “StorEdge File Replicator” on page 4-1
Chapter 5, “Clustering” on page 5-1
Chapter 6, “Checkpoints/Snapshots” on page 6-1
Chapter 7, “FRU/CRU Replacement Procedures” on page 7-1
Typographic Conventions
The following table describes the typographic conventions used in this book.
TABLE P-1 Typographic Conventions
Typeface or Symbol | Meaning | Example
courier font | Names of commands; names of files; on-screen computer output | Use ls -a to list all files. Edit your .login file. machine_name% You have mail.
italics | Book titles, new words; terms to be emphasized; variables that you replace with a real value | Read Chapter 6 in the User’s Guide. These are called class options. You must be root to do this. To delete a file, type rm filename.
boldface courier font | What you type | machine_name% su
Related Documentation
These documents contain information related to the tasks described in this book:
Sun StorEdge 5310 NAS Quick Reference Manual
Sun StorEdge 5310 NAS Hardware Installation, Configuration, and User Guide
Sun StorEdge 5310 NAS Software Installation, Configuration, and User Guide
Sun StorEdge 5310 NAS Setup Poster
Ordering Sun Documents
The SunDocs℠ program provides more than 250 manuals from Sun Microsystems, Inc. If you are in the United States, Canada, Europe or Japan, you can purchase documentation sets or individual manuals by using this program.
For a list of documents and how to order them, see the catalog section of the SunExpress™ Internet site at http://store.sun.com.
Accessing Sun Documentation Online
The http://docs.sun.com Web site enables you to access the Sun technical documentation online. You can browse the docs.sun.com archive or search for a specific book title or subject.
Shell Prompts in Command Examples
The following table shows the default system prompt and superuser prompt for the C, Bourne, and Korn shells.
TABLE P-2 Shell Prompt
Shell | Prompt
Bourne shell and Korn shell prompt | machine_name$
Bourne shell and Korn shell superuser prompt | machine_name#
Sun Welcomes Your Comments
Sun is interested in improving its documentation and welcomes your comments and suggestions. You can email your comments to Sun at:
docfeedback@sun.com
Please include the part number (8xx-xxxx-xx) of your document in the subject line of your email.
CHAPTER
1

Troubleshooting Overview

This chapter provides an overview of diagnostic functions and tools needed for troubleshooting the Sun StorEdge 5310 NAS.
This chapter contains the following sections:
“How to Use This Manual” on page 1-1
“Important Notices and Information on the Sun StorEdge 5310 NAS” on page 1-2
“Diagnostic Information Sources” on page 1-8

1.1 How to Use This Manual

Before working through this manual, check the following to rule out common problems.
Are both of the power cords plugged in?
Are green LEDs displaying on the power sources? If not, check the power source.
Does the LCD display panel show the system name and CPU%? If not, check the power source.
Can you ping the system? If not, check the network cables and the IP address on the LCD display. If you are still having problems, check with your system administrator.
If the user can’t access shares, are the shares set up on the system? Check the shares section to make sure that the shares are set up with the proper name.
Is an NFS client having permission issues on a CIFS file, or vice versa? Check the FAQ on file permission issues.
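The quick checks above are meant to be worked through in order, stopping at the first one that fails. The following is a minimal sketch of that triage order in Python. The check labels, the observations dictionary, and the `first_failed_check` helper are hypothetical names invented for illustration; they are not part of any Sun tool.

```python
from typing import Optional

# Ordered quick checks from this section, each paired with the follow-up
# action. The keys are hypothetical labels invented for this sketch.
QUICK_CHECKS = [
    ("power_cords_plugged_in", "Plug in both power cords."),
    ("power_leds_green", "Check the power source."),
    ("lcd_shows_name_and_cpu", "Check the power source."),
    ("ping_ok", "Check the network cables and the IP address on the LCD display."),
    ("shares_configured", "Check that the shares are set up with the proper name."),
]


def first_failed_check(observations: dict) -> Optional[str]:
    """Return the follow-up action for the first check that did not pass.

    A check that is missing from `observations` is treated as failed, so an
    incomplete walk-through never reports a clean result. Returns None when
    every check passed.
    """
    for name, advice in QUICK_CHECKS:
        if not observations.get(name, False):
            return advice
    return None
```

Treating an unanswered check as a failure keeps the walk-through conservative: the remaining chapters are consulted only after every basic check has been confirmed.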

1.2 Important Notices and Information on the Sun StorEdge 5310 NAS

Caution – Do not plug a USB keyboard into the front USB connector. This will
cause the system to crash.
Caution – Do not power on the Sun StorEdge 5310 NAS until two minutes after the JBOD has been powered up, to ensure that the disk drives have finished spinning up.
Caution – The /dvol/etc folder contains configuration information and must be backed up so that all configuration information is available after a failure. It is recommended to back up the /dvol/etc folder to an existing LUN on the Sun StorEdge 5310 NAS.
Note – You must enable FTP from the CLI using the load ftpd command. Currently, enabling FTP from the web interface does not work.
Note – When configuring the Sun StorEdge 5310 NAS through a firewall, ensure
that the correct ports are not blocked. Refer to “StorEdge Web Admin does not work properly through a firewall.” on page 2-80 for more details.
Note – A line of tape must be removed before you can remove the fan tray.

1.3 Troubleshooting Tools

1.3.0.1 Storage Automated Diagnostic Environment (StorAde)

If you have the Storage Automated Diagnostic Environment installed in the host, check the internal status of the array with this tool. See the documentation for this tool for further information.
All that you need to use the Storage Automated Diagnostic Environment is web browser access to the host where it is installed.

1.3.0.2 Command Line Interface (CLI)

The CLI can be accessed through the MENU system or by using Telnet. It is useful for troubleshooting many types of issues. The CLI is also where you load tools like FTP. See the Diagnostic Tools and Procedures section for details.

1.3.0.3 Log Error Messages

Both the Sun StorEdge 5310 NAS and attached hosts create log files or error messages that record system conditions and events. These log files are the most useful immediate tools for troubleshooting.

1.3.0.4 Sun StorEdge 5310 NAS Generated Messages

A syslog daemon in the array writes system error message logs to a location determined by the site system administrator. Consult with the site system administrator to obtain access to this log.

1.3.0.5 Client Generated Messages

CIFS clients display messages on the monitor when they have attached shares on the Sun StorEdge 5310 NAS. These messages are useful in determining issues that arise.
NFS clients log messages in their /var/adm/messages file.
A variety of software logging tools monitor the various branches of the storage network. When an error is detected, the error’s severity level is categorized and classified. Errors are reported or logged according to severity level.
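When reviewing a client's /var/adm/messages file, it often helps to narrow the file down to the lines that mention the NAS. The following is a minimal Python sketch that does this with a plain case-insensitive substring match. It assumes nothing about the log line format, and the `lines_mentioning` helper and the host name in the usage note are hypothetical.

```python
from typing import Iterable, List


def lines_mentioning(lines: Iterable[str], keyword: str) -> List[str]:
    """Return the log lines that contain `keyword`, ignoring case.

    Trailing newlines are stripped so the result can be printed or
    pasted into an escalation report directly.
    """
    needle = keyword.lower()
    return [line.rstrip("\n") for line in lines if needle in line.lower()]
```

On an NFS client this could be used as `lines_mentioning(open("/var/adm/messages"), "nas01")`, where `nas01` stands in for the name of the Sun StorEdge 5310 NAS on your network.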

1.3.0.6 Log Message Severity Levels

Emergency—Specifies emergency messages. These messages are not distributed
to all users. Emergency priority messages are logged into a separate file for reviewing.
Alert—Specifies important messages that require immediate attention. These
messages are distributed to all users.
Critical—Specifies critical messages not classified as errors, such as hardware
problems. Critical and higher-priority messages are sent to the system console.
Error—Specifies any messages that represent error conditions, such as an
unsuccessful disk write.
Warning—Specifies any messages for abnormal, but recoverable, conditions.
Notice—Specifies important informational messages. Messages without a priority
designation are mapped into this priority message.
Information—Specifies informational messages. These messages are useful in
analyzing the system.
Debug—Specifies debugging messages.
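The levels above are listed from highest to lowest priority, which makes them easy to model as an ordered enumeration. The following Python sketch does so and encodes two rules stated in the list: Critical and higher-priority messages go to the system console, and messages without a priority are treated as Notice. The numeric values follow the conventional syslog numbering (0 for Emergency through 7 for Debug), which is an assumption here since the manual gives only the names; the function names are hypothetical.

```python
from enum import IntEnum
from typing import Optional


class Severity(IntEnum):
    """Severity levels in the order listed above; lower value = higher priority.

    The numbers follow conventional syslog numbering (an assumption for this
    sketch; the manual lists only the names).
    """
    EMERGENCY = 0
    ALERT = 1
    CRITICAL = 2
    ERROR = 3
    WARNING = 4
    NOTICE = 5
    INFORMATION = 6
    DEBUG = 7


def goes_to_console(level: Severity) -> bool:
    """Critical and higher-priority messages are sent to the system console."""
    return level <= Severity.CRITICAL


def effective_level(level: Optional[Severity] = None) -> Severity:
    """Messages without a priority designation are mapped to Notice."""
    return Severity.NOTICE if level is None else level
```

Comparing against `None` explicitly matters because `Severity.EMERGENCY` has the value 0 and would otherwise be mistaken for a missing priority.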

1.4 Troubleshooting Procedures

1.4.0.1 High-Level Troubleshooting Tasks

This section lists the high-level steps you can take to isolate and troubleshoot problems in the array. It offers a methodical approach, and lists the tools and resources available at each step.
1. Discover the error by checking one or more of the following messages or files:
Storage Automated Diagnostic Environment alerts or email messages, if available
“event log” from the Sun StorEdge 5310 NAS
/var/adm/messages file at the host system
CIFS client messages
2. Determine the extent of the problem by using one or more of the following methods:
Review the Storage Automated Diagnostic Environment topology view
Using the Storage Automated Diagnostic Environment revision checking
functionality, determine whether the package or patch is installed
3. Check the status of a Sun StorEdge 5310 NAS by using one or more of the following methods:
Review the status of the light-emitting diodes (LED) on the array
Run the commands that check and display the configuration
Manually open a telnet session to the array and check the system status
Review the Storage Automated Diagnostic Environment device monitoring
reports, if available
4. Test and isolate field-replaceable units (FRUs) using the following tools:
Storage Automated Diagnostic Environment diagnostic tests, if available (these
tests might require a loopback cable for isolation)
Use the Troubleshooting Guide procedures documentation to help isolate FRU
failures
Note – These tests isolate the problem to a FRU that must be replaced. Follow the
instructions in the Sun StorEdge 5310 NAS Troubleshooting Guide for proper FRU replacement procedures.
5. Replace the failed FRU.
6. Verify the fix using the following tools:
Storage Automated Diagnostic Environment GUI Topology View and Diagnostic
Tests, if available
/var/adm/messages on the data host
CIFS client access
Array LEDs
syslog file

1.4.0.2 Initial Troubleshooting Guidelines

To begin a problem analysis, check one or more of the following information sources for troubleshooting and perform one or more of the following checks:
The LEDs can help you quickly identify whether a problem is occurring. See the Hardware Troubleshooting section to help isolate the failed component.
Sun StorEdge 5310 NAS messages, found in the syslog file, that indicate a problem. See the Error Messages section for more information about array-generated messages.
Host-generated messages, found in the /var/adm/messages file. CIFS clients may display errors on their monitor or record them in the event log.

1.5 Troubleshooting Flow Charts

Use the flow charts below to diagnose problems.
Follow the steps below to diagnose hardware problems.
Follow the steps below to diagnose software problems.

1.6 Diagnostic Information Sources

1.6.1 StorEdge Diagnostic Email

The diagnostic email includes information about the StorEdge system configuration, disk subsystem, file system, network configuration, SMB shares, backup/restore information, /etc information, system log, environment data and administrator information. The diagnostics are a primary tool for checking configuration and troubleshooting.
Before you can send email diagnostics from the StorEdge, SMTP (email) must be configured. Please see the FAQ, “How do I set up SMTP (email)?”
To collect diagnostics, proceed as follows:
1. Access the StorEdge via Telnet or serial console.
2. Press Enter at the [menu] prompt and enter the administrator password.
3. Press the spacebar until “Diagnostics” is displayed under “Extensions” at the lower right.
4. Select the letter corresponding to “Diagnostics”.
5. Wait a few seconds while the StorEdge builds the diagnostic.
6. Select option “2”, Send Email
7. Select option “1”, Edit problem description
8. Enter a precise description of the problem
9. Press [Enter]
10. Select option “8”, Send Email
The diagnostic is sent.
If an email server is not configured or not available, it is also possible to save the diagnostics to a file on the StorEdge. To do this, proceed as above to access the “Diagnostics” menu.
1. Select option “1”, Save File.
2. Select option “1”, Edit path
3. Enter a valid path name in the path box. Format is /<volumename>/<directory>/<new filename>.
4. Press [Enter]
5. Select option “2”, save diagnostics file
The system responds that the diagnostic was saved.
6. Access the volume that you saved the file to with SMB or NFS.
7. Copy the file to a local workstation
Important – Saving the diagnostics file locally will not include the necessary
attachments. When escalating an issue with diagnostics, you must also include the contents of the /etc directory, and the contents of /cvol/log.
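The path entered in step 3 of the save-to-file procedure must follow the form /<volumename>/<directory>/<new filename>. The following Python sketch checks a candidate path against that form before it is typed into the menu. The `parse_diag_path` helper and the `DiagPath` type are hypothetical, and accepting nested directories is an assumption, since the manual shows only a single directory level.

```python
from typing import NamedTuple


class DiagPath(NamedTuple):
    volume: str
    directory: str
    filename: str


def parse_diag_path(path: str) -> DiagPath:
    """Split a path of the form /<volumename>/<directory>/<new filename>.

    Raises ValueError if the path is not absolute or if any part is missing.
    Nested directories are accepted and kept together in `directory`
    (an assumption; the manual shows a single directory level).
    """
    if not path.startswith("/"):
        raise ValueError("path must start with '/'")
    parts = path[1:].split("/")
    if len(parts) < 3 or any(part == "" for part in parts):
        raise ValueError("expected /<volumename>/<directory>/<new filename>")
    return DiagPath(parts[0], "/".join(parts[1:-1]), parts[-1])
```

A path that names only a volume and a file, or that ends in a slash, is rejected rather than guessed at, which mirrors the requirement that all three parts be supplied.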
This functionality is also available through the StorEdge Web Admin. To access these settings, log in, and click the envelope icon on the top taskbar. All of the options described above are available.

1.6.2 Data Collection for Escalations

1.6.2.1 Collecting Information from the Sun StorEdge 5310 NAS
The following are important considerations for data collection. Data collection is critical in cases that require escalation. Always collect enough data to resolve the worst-case scenario, so that every other scenario is covered as well. The worst case is that the issue has never been seen before and must be recreated in the lab. To do this, we need to know about the client systems, the workload, the network, and so on.
1.6.2.2 Accurately quantify the problem
First, the problem must be quantified. A negative behavior of some type has been identified; its scope and all available details must now be pinned down precisely in order to resolve the issue. For example, if the StorEdge has a performance issue, we must measure the performance exactly, identify which systems exhibit the problem, and determine under what circumstances the problem occurs.
1.6.2.3 Collect general data
The first part of the data collection is to collect information that will be useful in every case. Much of this is contained in the StorEdge system diagnostics. From the diagnostics, we can see the StorEdge OS version, internal settings, recent log activity, and more. It is very important to generate the diagnostics during or immediately after the manifestation of the problem. Otherwise, the log and statistics will not show any data on the failure. Always collect a diagnostic email when escalating issues.
Also collect any error messages generated by the problem, the steps already taken to resolve it, and the results obtained.
1.6.2.4 Collect specific data
Based on the above data, additional information may be required. This document will help you to tailor this data collection. Here are some examples:
Version(s) of software on client system(s)
Version(s) of software on server system(s)
Network topology
Steps and/or sequence of events leading to the failure
What was the user doing or attempting to do when the failure occurred?
Problem symptom (error codes, failed operation, crash)
Syslog data
Network traces
Diagnostic email
1.6.2.5 Check remote access capabilities
In some cases, it is useful for one of your escalation resources to directly access the system. This can be a way to greatly simplify advanced data collection. Please note that this step is not always necessary or useful, but it can be a very valuable tool at times. When you know that advanced investigation will be required, it’s always wise to ask if remote access via TCP/IP or dial-up is available.
1.6.2.6 Data Collection for Specific Issues
Software compatibility issues
Some applications do not function properly when StorEdge is used in place of a server running a native operating system. Most, but not all, of these issues can be resolved with data collection and troubleshooting. It may be necessary to upgrade the application, the client operating system, or the StorEdge operating system. Keep in mind that the problem may lie in any of these, or a combination of all three.
The first step is to do research. Check to see if a newer version of the application or the StorEdge operating system is available. Check the release notes to see if the compatibility issue is addressed. If either version is far out of date, perform an upgrade to see if the problem is resolved. Another useful step is to try the operation on other available network clients.
To escalate the issue, begin data collection by generating a system diagnostic with all attachments. If there is a specific symptom which can be identified, generate the system diagnostic as close as possible after this time, so that any effects can be observed in the logs and statistics.
The procedure for this can be found later in this document under Diagnostic Procedures. Next, it is necessary to collect as much data as possible on the client and application. At a minimum, the following information is required:
Client Operating System version, including any service packs or minor revisions
Software version, including any service packs, options or minor revisions
Client configuration information – mount options, NIC configuration, platform, etc.
Network information – topology, switch and router information, path from client to StorEdge
Server information – detailed information on any application or authentication servers, including all of the above details
An exact set of steps to reproduce the problem. This should be very detailed, including every menu selection and text entry
Details on any symptoms experienced by the client
The goal of this data set is to allow someone in a remote location to reproduce and resolve the issue without impacting the customer.
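One way to keep track of the minimum data set listed above is a simple checklist that reports what is still missing before the case is escalated. The sketch below is only an illustration; the class name, field names, and sample values are assumptions made for this example and do not correspond to any StorEdge interface.

```python
from dataclasses import dataclass, fields


@dataclass
class EscalationData:
    """One text field per required item from the list above (names are illustrative)."""
    client_os_version: str = ""
    software_version: str = ""
    client_configuration: str = ""
    network_information: str = ""
    server_information: str = ""
    reproduction_steps: str = ""
    client_symptoms: str = ""


def missing_items(data):
    """Return the names of required items that are still empty."""
    return [f.name for f in fields(data) if not getattr(data, f.name).strip()]


# Sample values are placeholders, not real customer data.
report = EscalationData(
    client_os_version="(example) client OS with service pack level",
    reproduction_steps="(example) 1. Open the application 2. Save to the share",
)
print(missing_items(report))
```

An empty list from `missing_items` means that every item in the minimum data set has at least some content; it does not check that the content is detailed enough to reproduce the issue remotely.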
The next step is to verify the problem and collect network traces. If possible, copy the data residing on the StorEdge to another server temporarily, and verify whether the application works as expected against that server. If it still exhibits the same symptom, the issue likely resides with the application.
Use a network capture utility to capture the network traffic generated by the failure condition between the client, the StorEdge and any other server involved in the issue. Define traffic filters so that only this traffic is captured.
Next, repeat the network capture using the server on which the application runs successfully. This will allow engineering to make a direct comparison of a successful operation and an unsuccessful operation.
StorEdge has a built-in network monitoring tool. Details on the operation of this tool can be found in the Diagnostic Procedures section of this document. However, in this case it would be best to use a network analysis tool on the client. The main reason for this is that the StorEdge tool will not be able to capture the data when an alternate server is used for comparison.
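The traffic filter mentioned above can be sketched as follows. This guide does not name a particular capture utility, so the example assumes one that accepts libpcap-style capture filter expressions, as tcpdump and Wireshark do. The function name is hypothetical, and the addresses are placeholders from the documentation address range.

```python
def capture_filter(hosts):
    """Build a libpcap-style filter matching only traffic between the listed hosts.

    A packet matches when both its source and its destination are in `hosts`.
    """
    if len(hosts) < 2:
        raise ValueError("need at least two hosts")
    src = " or ".join(f"src host {h}" for h in hosts)
    dst = " or ".join(f"dst host {h}" for h in hosts)
    return f"({src}) and ({dst})"


# Placeholder addresses: client, StorEdge, and an authentication server.
print(capture_filter(["192.0.2.10", "192.0.2.20", "192.0.2.30"]))
```

A filter of this form excludes broadcast traffic and traffic to hosts not listed, such as name servers. If name resolution or another service may be involved in the failure, add those hosts to the list. For the comparison capture, substitute the address of the alternate server for that of the StorEdge.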
1.6.2.7 Security Issues
When troubleshooting security problems, it is useful to experiment. Try other workstations, other operating systems and different user accounts, including a root or a Domain Admin account. These are very useful in locating the source of the problem.
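The experiments described above are easier to interpret when every combination is tried and recorded. The following sketch, using hypothetical client and account names, lists the combinations to try and then counts failures per client and per account, so that the factor common to the failures stands out.

```python
from collections import Counter
from itertools import product


def build_matrix(clients, accounts):
    """Return every (client, account) pair to try, in a stable order."""
    return list(product(clients, accounts))


def failures_by_factor(results):
    """Count failures per client and per account.

    `results` maps (client, account) to True if access worked, False if not.
    """
    by_client, by_account = Counter(), Counter()
    for (client, account), ok in results.items():
        if not ok:
            by_client[client] += 1
            by_account[account] += 1
    return by_client, by_account


# Hypothetical names and outcomes, for illustration only.
pairs = build_matrix(["client-a", "client-b"], ["jsmith", "root"])
results = {pair: pair[1] != "jsmith" for pair in pairs}
by_client, by_account = failures_by_factor(results)
print(by_client, by_account)
```

In this made-up result set the failures follow the user account rather than the workstation, which would direct the investigation toward that account's credentials or permissions.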
When escalating a security issue, collect the following data: