Brocade Fabric OS Troubleshooting and Diagnostics Guide

53-1000853-01 12 March 2008

Fabric OS

Troubleshooting and Diagnostics Guide

Supporting Fabric OS v6.1.0

Brocade, Fabric OS, File Lifecycle Manager, MyView, and StorageX are registered trademarks and the Brocade B-wing symbol, DCX, and SAN Health are trademarks of Brocade Communications Systems, Inc., in the United States and/or in other countries. All other brands, products, or service names are or may be trademarks or service marks of, and are used to identify, products or services of their respective owners.

Notice: This document is for informational purposes only and does not set forth any warranty, expressed or implied, concerning any equipment, equipment feature, or service offered or to be offered by Brocade. Brocade reserves the right to make changes to this document at any time, without notice, and assumes no responsibility for its use. This informational document describes features that may not be currently available. Contact a Brocade sales office for information on feature and product availability. Export of technical data contained in this document may require an export license from the United States government.

The authors and Brocade Communications Systems, Inc. shall have no liability or responsibility to any person or entity with respect to any loss, cost, liability, or damages arising from the information contained in this book or the computer programs that accompany it.

The product described by this document may contain “open source” software covered by the GNU General Public License or other open source license agreements. To find out which open source software is included in Brocade products, view the licensing terms applicable to the open source software, and obtain a copy of the programming source code, please visit http://www.brocade.com/support/oscd.

Brocade Communications Systems, Incorporated

Corporate Headquarters Brocade Communications Systems, Inc. 1745 Technology Drive San Jose, CA 95110 Tel: 1-408-333-8000 Fax: 1-408-333-8101 Email: info@brocade.com

European and Latin American Headquarters Brocade Communications Switzerland Sàrl Centre Swissair Tour A - 2ème étage 29, Route de l'Aéroport Case Postale 105 CH-1215 Genève 15 Switzerland Tel: +41 22 799 56 40 Fax: +41 22 799 56 41 Email: emea-info@brocade.com

Asia-Pacific Headquarters Brocade Communications Singapore Pte. Ltd. 9 Raffles Place #59-02 Republic Plaza 1 Singapore 048619 Tel: +65-6538-4700 Fax: +65-6538-0302 Email: apac-info@brocade.com

Document History

Title | Publication number | Summary of changes | Date
Fabric OS Troubleshooting and Diagnostics Guide | 53-1000853-01 | First released edition. | 12 March 2008

About This Document

In this chapter

How this document is organized

Supported hardware and software

What’s new in this document

Document conventions

Text formatting

Notes, cautions, and warnings

Key terms

Additional information

Brocade resources

Other industry resources

Getting technical help

Document feedback

Chapter 1 Introduction to Troubleshooting

In this chapter

About troubleshooting

Network time protocol

Most common problem areas

Questions for common symptoms

Gathering information for your switch support provider

Setting up your switch for FTP

Capturing a supportSave

Capturing a supportShow

Capturing output from a console

Capturing command output

Building a case for your switch support provider

Basic switch information

Detailed problem information

Gathering additional information

Chapter 2 General Issues

In this chapter

Licensing issues

Switch Message Logs

Checking fan components

Checking the switch temperature

Checking the power supply

Checking the temperature, fan, and power supply

Fibre Channel Routing

Checking for Fibre Channel connectivity problems

Third party applications

Chapter 3 Connections Issues

In this chapter

Port initialization and FCP auto discovery process

Link issues

Connection problems

Checking the logical connection

Checking the name server (NS)

Link failures

Determining a successful negotiation

Checking for a loop initialization failure

Checking for a point-to-point initialization failure

Correcting a port that has come up in the wrong mode

Marginal links

Troubleshooting a marginal link

Device login issues

Pinpointing problems with device logins

Media-related issues

Testing a port’s external transmit and receive path

Testing a switch’s internal components

Testing components to and from the HBA

Segmented fabrics

Reconciling fabric parameters individually

Downloading a correct configuration

Reconciling a domain ID conflict

Chapter 4 Configuration Issues

In this chapter

Configupload and download issues

Gathering additional information

Brocade configuration form

Chapter 5 FirmwareDownload Errors

In this chapter

Blade troubleshooting tips

Firmware download issues

Troubleshooting firmwareDownload

Gathering additional information

Brocade DCX error handling

USB error handling

Considerations for downgrading firmware

Preinstallation messages

Blade types

Firmware versions

IP settings

Platform

Port settings

Routing

Zoning

Chapter 6 Security Issues

In this chapter

Password issues

Password recovery options

Protocol and certificate management issues

Gathering additional information

SNMP issues

Gathering additional information

FIPS issues

Chapter 7 ISL Trunking Issues

In this chapter

Link issues

Buffer credit issues

Chapter 8 Zone Issues

In this chapter

Overview of corrective action

Verifying a fabric merge problem

Segmented fabrics

Zone conflicts

Correcting a fabric merge problem quickly

Editing zone configuration members

Reordering the zone member list

Checking for Fibre Channel connectivity problems

Checking for zoning problems

Gathering additional information

Chapter 9 FCIP Issues

In this chapter

FCIP tunnel issues

FCIP links

Gathering additional information

Port mirroring

Supported hardware

Port mirroring considerations

Port mirroring management

FTRACE concepts

Tracing Fibre Channel information

Chapter 10 FICON Fabric Issues

In this chapter

FICON issues

Troubleshooting FICON

General information to gather for all cases

Identifying ports

Single-switch topology checklist

Cascade mode topology checklist

Gathering additional information

Troubleshooting FICON CUP

Troubleshooting FICON NPIV

Chapter 11 iSCSI Issues

In this chapter

Connectivity

Zoning

Authentication

Chapter 12 Working With Diagnostic Features

In this chapter

About Fabric OS diagnostics

Diagnostic information

Power-on self test

Switch status

Viewing the overall status of the switch

Displaying switch information

Displaying the uptime for a switch

Port information

Viewing the status of a port

Displaying the port statistics

Displaying a summary of port errors for a switch

Equipment status

Displaying the status of the fans

Displaying the status of a power supply

Displaying temperature status

System message log

Displaying the system message log, with no page breaks

Displaying the system message log one message at a time

Clearing the system message log

Port log

Viewing the port log

Syslogd configuration

Configuring the host

Configuring the switch

Automatic trace dump transfers

Specifying a remote server

Enabling the automatic transfer of trace dumps

Setting up periodic checking of the remote server

Saving comprehensive diagnostic files to the server

Diagnostic tests not supported by M-EOS 9.6.2 and FOS 6.0

Appendix A Switch Type

Appendix B Hexadecimal

Index

About This Document

In this chapter

• How this document is organized

• Supported hardware and software

• What’s new in this document

• Document conventions

• Additional information

• Getting technical help

• Document feedback

How this document is organized

The document contains the following chapters:

• Chapter 1, “Introduction to Troubleshooting,” gives a brief overview of Fabric OS, explains the Fabric OS CLI Help feature, and provides typical connection and configuration procedures.

• Chapter 2, “General Issues,” provides information on licensing, hardware, and syslog issues.

• Chapter 3, “Connections Issues,” provides information and procedures on managing authentication and user accounts for the switch management channel.

• Chapter 4, “Configuration Issues,” provides information and procedures for configuring ACL policies for FC port and switch binding and managing the fabric-wide consistency policy.

• Chapter 5, “FirmwareDownload Errors,” provides procedures for maintaining and backing up your switch configurations.

• Chapter 6, “Security Issues,” provides procedures for basic password and user account management.

• Chapter 7, “ISL Trunking Issues,” describes the concepts and provides procedures for using administrative domains.

• Chapter 8, “Zone Issues,” provides preparations and procedures for performing firmware downloads, as well as troubleshooting information.

• Chapter 9, “FCIP Issues,” provides information and procedures specific to Brocade 48000 and Brocade DCX models. Because these models have CP blades and port blades, they require procedures that are not relevant to the Brocade fixed-port models.

• Chapter 10, “FICON Fabric Issues,” provides information and procedures specific to Brocade 48000 and Brocade DCX models. Because these models have CP blades and port blades, they require procedures that are not relevant to the Brocade fixed-port models.

• Chapter 11, “iSCSI Issues,” provides information and procedures specific to Brocade 48000 and Brocade DCX models, because these models have CP blades that support the iSCSI feature.

• Chapter 12, “Working With Diagnostic Features,” provides procedures for use of the Brocade Adaptive Networking suite of tools, including Traffic Isolation, QoS Ingress Rate Limiting, and QoS SID/DID Traffic Prioritization.

• The appendices provide special information to guide you in understanding switch output.

Supported hardware and software

In those instances in which procedures or parts of procedures documented here apply to some switches but not to others, this guide identifies exactly which switches are supported and which are not.

Although many different software and hardware configurations are tested and supported by Brocade Communications Systems, Inc. for 6.1.0, documenting all possible configurations and scenarios is beyond the scope of this document.

The following hardware platforms are supported by this release of Fabric OS:

• Brocade 200E switch

• Brocade 300 switch

• Brocade 4016 switch

• Brocade 4018 switch

• Brocade 4020 switch

• Brocade 4024 switch

• Brocade 4100 switch

• Brocade 4900 switch

• Brocade 5000 switch

• Brocade 5100 switch

• Brocade 5300 switch

• Brocade 7500 switch

• Brocade 7600 switch

• Brocade 48000 director

• Brocade DCX Backbone

What’s new in this document

This manual is a compilation of all the information originally distributed throughout the Fabric OS Administrator’s Guide chapters.

Document conventions

This section describes text formatting conventions and important notice formats used in this document.

Text formatting

The narrative-text formatting conventions that are used are as follows:

bold text
• Identifies command names
• Identifies the names of user-manipulated GUI elements
• Identifies keywords and operands
• Identifies text to enter at the GUI or CLI

italic text
• Provides emphasis
• Identifies variables
• Identifies paths and Internet addresses
• Identifies document titles

code text
• Identifies CLI output
• Identifies command syntax examples

For readability, command names in the narrative portions of this guide are presented in mixed lettercase: for example, switchShow. In actual examples, command lettercase is often all lowercase. Otherwise, this manual specifically notes those cases in which a command is case sensitive.

Notes, cautions, and warnings

The following notices and statements are used in this manual. They are listed below in order of increasing severity of potential hazards.

NOTE
A note provides a tip, guidance, or advice, emphasizes important information, or provides a reference to related information.

ATTENTION
An Attention statement indicates potential damage to hardware or data.

CAUTION
A Caution statement alerts you to situations that can be potentially hazardous to you.

DANGER
A Danger statement indicates conditions or situations that can be potentially lethal or extremely hazardous to you. Safety labels are also attached directly to products to warn of these conditions or situations.

Key terms

For definitions specific to Brocade and Fibre Channel, see the Brocade Glossary.

For definitions of SAN-specific terms, visit the Storage Networking Industry Association online dictionary at:

http://www.snia.org/education/dictionary

Additional information

This section lists additional Brocade and industry-specific documentation that you might find helpful.

Brocade resources

To get up-to-the-minute information, join Brocade Connect. It’s free! Go to http://www.brocade.com and click Brocade Connect to register at no cost for a user ID and password.

For practical discussions about SAN design, implementation, and maintenance, you can obtain Building SANs with Brocade Fabric Switches through:

http://www.amazon.com

For additional Brocade documentation, visit the Brocade SAN Info Center and click the Resource Library location:

http://www.brocade.com

Release notes are available on the Brocade Connect Web site and are also bundled with the Fabric OS firmware.

Other industry resources

• White papers, online demos, and data sheets are available through the Brocade Web site at http://www.brocade.com/products/software.jhtml.

• Best practice guides, white papers, data sheets, and other documentation are available through the Brocade Partner Web site.

For additional resource information, visit the Technical Committee T11 Web site. This Web site provides interface standards for high-performance and mass storage applications for Fibre Channel, storage management, and other applications:

http://www.t11.org

For information about the Fibre Channel industry, visit the Fibre Channel Industry Association Web site:

http://www.fibrechannel.org

Getting technical help

Contact your switch support supplier for hardware, firmware, and software support, including product repairs and part ordering. To expedite your call, have the following information available:

1. General Information

• Switch model

• Switch operating system version

• Error numbers and messages received

• supportSave command output

• Detailed description of the problem, including the switch or fabric behavior immediately following the problem, and specific questions

• Description of any troubleshooting steps already performed and the results

• Serial console and Telnet session logs

• syslog message logs

2. Switch Serial Number

The switch serial number and corresponding bar code are provided on the serial number label, as illustrated below:

FT00X0054E9

The serial number label is located as follows:

• Brocade 200E: On the nonport side of the chassis.

• Brocade 4016: On the top of the switch module.

• Brocade 4018: On the top of the blade.

• Brocade 4020 and 4024: On the bottom of the switch module.

• Brocade 4100, 4900, and 7500: On the switch ID pull-out tab located inside the chassis on the port side on the left.

• Brocade 5000: On the switch ID pull-out tab located on the bottom of the port side of the switch.

• Brocade 300, 5100, and 5300: On the switch ID pull-out tab located on the bottom of the port side of the switch.

• Brocade 7600: On the bottom of the chassis.

• Brocade 48000: Inside the chassis next to the power supply bays.

• Brocade DCX Backbone: On the bottom right on the port side of the chassis.

3. World Wide Name (WWN)

Use the wwn command to display the switch WWN.

If you cannot use the wwn command because the switch is inoperable, you can get the WWN from the same place as the serial number, except for the Brocade DCX. For the Brocade DCX, access the numbers on the WWN cards by removing the Brocade logo plate at the top of the nonport side of the chassis.

For the Brocade 4016, 4018, 4020, and 4024 embedded switches: Provide the license ID. Use the licenseIdShow command to display the license ID.

Document feedback

Quality is our first concern at Brocade and we have made every effort to ensure the accuracy and completeness of this document. However, if you find an error or an omission, or you think that a topic needs further development, we want to hear from you. Forward your feedback to:

documentation@brocade.com

Provide the title and version number of the document and as much detail as possible about your comment, including the topic heading and page number and your suggestions for improvement.

Chapter 1

Introduction to Troubleshooting

This chapter provides information on troubleshooting and the most common procedures to use to diagnose and recover from problems.

This book is a companion guide to be used in conjunction with the Fabric OS Administrator’s Guide. Although it provides many common troubleshooting tips and techniques, it does not teach troubleshooting methodology.

In this chapter

• About troubleshooting

• Most common problem areas

• Questions for common symptoms

• Gathering information for your switch support provider

• Building a case for your switch support provider

About troubleshooting

Troubleshooting should begin at the center of the SAN — the fabric. Because switches are located between the hosts and storage devices and have visibility into both sides of the storage network, starting with them can help narrow the search path. After eliminating the possibility of a fault within the fabric, see if the problem is on the storage side or the host side, and continue a more detailed diagnosis from there. Using this approach can quickly pinpoint and isolate problems.

For example, if a host cannot detect a storage device, run a switch command such as switchShow to determine whether the storage device is logically connected to the switch. If it is not, focus first on the switch that connects directly to the storage. Use your vendor-supplied storage diagnostic tools to better understand why the storage is not visible to the switch. If the switch can detect the storage but the host still cannot, then the problem lies between the host and the switch.
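The fabric-first approach in this example can be sketched as a small decision function. This is only an illustrative sketch: the function name next_focus and its two boolean inputs are hypothetical, and the answers would come from an administrator reading switchShow output and checking the host, not from any Brocade API.

```python
def next_focus(storage_logged_in_to_switch: bool, host_detects_storage: bool) -> str:
    """Suggest where to look next, following the fabric-first approach.

    Both inputs are hypothetical flags that an administrator fills in by hand,
    for example after reading switchShow output.
    """
    if not storage_logged_in_to_switch:
        # The switch cannot see the storage device: start at the switch that
        # connects directly to storage and use the storage vendor's tools.
        return "storage-side switch and storage device"
    if not host_detects_storage:
        # The switch sees the storage but the host does not: the problem
        # lies between the host and the switch.
        return "host and host-to-switch connection"
    return "no connectivity fault indicated"
```

For instance, next_focus(True, False) points at the host side, which matches the last case described above.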

Network time protocol

One of the most frustrating parts of troubleshooting is trying to synchronize a switch’s message logs and port logs with those of other switches in the fabric. If you do not have NTP set up on your switches, then correlating log files to track a problem is practically impossible.
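To illustrate why synchronized clocks matter, the sketch below merges log entries from two switches into one timeline by sorting on timestamp. The entry layout (timestamp, switch name, message), the switch names, and the sample messages are invented for illustration and do not reflect actual Fabric OS log formats.

```python
from datetime import datetime, timedelta

def merge_logs(*logs):
    """Merge per-switch log entries into one timeline ordered by timestamp.

    Each entry is a hypothetical (timestamp, switch_name, message) tuple.
    """
    merged = [entry for log in logs for entry in log]
    return sorted(merged, key=lambda entry: entry[0])

def apply_skew(log, skew):
    """Return a copy of a log as it would look if the switch clock were off by skew."""
    return [(ts + skew, name, msg) for ts, name, msg in log]

t0 = datetime(2008, 3, 12, 10, 0, 0)
switch_a = [(t0, "switchA", "link went offline")]
switch_b = [(t0 + timedelta(seconds=2), "switchB", "fabric reconfigured")]

# With synchronized clocks, the switchA event is listed before the switchB event.
in_sync = [name for _, name, _ in merge_logs(switch_a, switch_b)]

# If switchB's clock runs 5 seconds slow, its later event appears to come first.
skewed_b = apply_skew(switch_b, timedelta(seconds=-5))
skewed = [name for _, name, _ in merge_logs(switch_a, skewed_b)]
```

A few seconds of clock drift is enough to reverse the apparent order of events, which is why setting up NTP before a problem occurs makes later log correlation practical.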

Most common problem areas

Table 1 identifies the most common problem areas that arise within SANs and identifies tools to use to resolve them.

TABLE 1 Common troubleshooting problems and tools

Problem area: Fabric
Investigate:
• Missing devices
• Marginal links (unstable connections)
• Incorrect zoning configurations
• Incorrect switch configurations
Tools:
• Switch LEDs
• Switch commands (for example, switchShow or nsAllShow) for diagnostics
• Web or GUI-based monitoring and management software tools

Problem area: Storage Devices
Investigate:
• Physical issues between switch and devices
• Incorrect storage software configurations
Tools:
• Device LEDs
• Storage diagnostic tools
• Switch commands (for example, switchShow or nsAllShow) for diagnostics

Problem area: Hosts
Investigate:
• Downlevel HBA firmware
• Incorrect device driver installation
• Incorrect device driver configuration
Tools:
• Host operating system diagnostic tools
• Device driver diagnostic tools
• Switch commands (for example, switchShow or nsAllShow) for diagnostics
Also, make sure you use the latest HBA firmware recommended by the switch supplier or on the HBA supplier's web site.

Problem area: Storage Management Applications
Investigate:
• Incorrect installation and configuration of the storage devices that the software references. For example, if using a volume-management application, check for:
  • Incorrect volume installation
  • Incorrect volume configuration
Tools:
• Application-specific tools and resources

Questions for common symptoms

You first need to determine what the problem is. Some symptoms are obvious, such as a switch that rebooted without any user intervention; others are more obscure, such as storage that has intermittent connectivity to a particular host. Whatever the symptom is, you will need to gather information from the devices that are directly involved in it.

Table 2 lists common symptoms and possible areas to check. Note that an intermittent connectivity problem has many variables to look into, such as the type of connection between the two devices, how the connection is behaving, and the port type involved.

TABLE 2 Common symptoms

Symptom | Areas to check | Chapter
BadRootDev errors | Firmware versions on switch | Chapter 5, “FirmwareDownload Errors”
Blade is faulty | Firmware or application download; Hardware | Chapter 2, “General Issues”; Chapter 5, “FirmwareDownload Errors”
Blade is stuck in the “LOADING” state | Firmware or application download | Chapter 5, “FirmwareDownload Errors”
Configupload or download fails | FTP or SCP server or USB availability | Chapter 4, “Configuration Issues”
E_Port failed to come online | Correct licensing; Fabric parameters; Zoning | Chapter 2, “General Issues”; Chapter 3, “Connections Issues”; Chapter 8, “Zone Issues”
EX_Port does not form | Links | Chapter 3, “Connections Issues”
Fabric merge fails | Fabric segmentation | Chapter 2, “General Issues”
Fabric segments | Licensing; Zoning; Fabric parameters | Chapter 2, “General Issues”; Chapter 3, “Connections Issues”; Chapter 8, “Zone Issues”
FCIP tunnel bounces | FCIP tunnel | Chapter 9, “FCIP Issues”
FCIP tunnel does not come online | FCIP tunnel | Chapter 9, “FCIP Issues”
FCIP tunnel does not form | Licensing; Fabric parameters | Chapter 2, “General Issues”; Chapter 9, “FCIP Issues”
FCIP tunnel is sluggish | FCIP tunnel | Chapter 9, “FCIP Issues”
Feature is not working | Licensing | Chapter 2, “General Issues”
FICON switch does not talk to hosts | FICON settings | Chapter 10, “FICON Fabric Issues”
FirmwareDownload fails | FTP or SCP server or USB availability; Firmware version compatibility; Unsupported features enabled | Chapter 5, “FirmwareDownload Errors”
Intermittent connectivity | Links; Trunking; Buffer credits; FCIP tunnel | Chapter 3, “Connections Issues”; Chapter 7, “ISL Trunking Issues”; Chapter 9, “FCIP Issues”
LEDs are flashing | Links | Chapter 3, “Connections Issues”
LEDs are steady | Links | Chapter 3, “Connections Issues”
Marginal link | Links | Chapter 3, “Connections Issues”
No connectivity between host and storage | Cables; SCSI timeout errors; SCSI retry errors | Chapter 3, “Connections Issues”; Chapter 8, “Zone Issues”
No connectivity between switches | Licensing; Fabric parameters; Segmentation; Zoning, if applicable | Chapter 2, “General Issues”; Chapter 3, “Connections Issues”; Chapter 8, “Zone Issues”
No light on LEDs | Links | Chapter 3, “Connections Issues”
Performance problems | Links; FCIP tunnels | Chapter 3, “Connections Issues”; Chapter 9, “FCIP Issues”
SCSI retry errors | Buffer credits; FCIP tunnel bandwidth | Chapter 9, “FCIP Issues”
SCSI timeout errors | Links; HBA; Buffer credits; FCIP tunnel bandwidth | Chapter 3, “Connections Issues”; Chapter 7, “ISL Trunking Issues”; Chapter 9, “FCIP Issues”
Switch constantly reboots | FIPS | Chapter 6, “Security Issues”
Switch is unable to join fabric | Security policies; Zoning; Fabric parameters | Chapter 3, “Connections Issues”
Switch reboots during configup/download | Configuration file discrepancy | Chapter 4, “Configuration Issues”
Syslog messages | Hardware; SNMP management station | Chapter 2, “General Issues”; Chapter 6, “Security Issues”
Trunk bounces | Cables are on same port group; SFPs; Trunked ports | Chapter 7, “ISL Trunking Issues”
Trunk failed to form | Licensing; Cables are on same port group; SFPs; Trunked ports; Zoning | Chapter 2, “General Issues”; Chapter 3, “Connections Issues”; Chapter 7, “ISL Trunking Issues”; Chapter 8, “Zone Issues”
User forgot password | Password recovery | Chapter 6, “Security Issues”
User is unable to change switch settings | RBAC settings; Account settings | Chapter 6, “Security Issues”
Zone configuration mismatch | Effective configuration | Chapter 8, “Zone Issues”
Zone content mismatch | Effective configuration | Chapter 8, “Zone Issues”
Zone type mismatch | Effective configuration | Chapter 8, “Zone Issues”

Gathering information for your switch support provider

If you are troubleshooting a production system, you must gather data quickly. As soon as you observe a problem, perform the following tasks (on a dual-CP system, run the commands on both CPs). For more information about these commands and their operands, refer to the Fabric OS Command Reference.

1. Enter the supportSave command to save RASLOG, TRACE, supportShow, core file, FFDC data, and other support information.

NOTE

It is recommended that you first use the supportFtp command to set up the FTP parameters, and then run supportSave with the -n and -c options for automatic dump transfers; this saves you from having to enter, or know, all the FTP parameters needed to execute a supportSave operation.

• Enter the supportShow command to collect information for the local CP to a remote FTP location or using the USB memory device on supporting products. This command does not collect RASLOG, TRACE, core files, or FFDC data.

To capture the data from the supportShow command, run the command through a Telnet or SSH utility or a serial console connection.

2. Gather console output and logs.

For more details about these commands, see the Fabric OS Command Reference.

Setting up your switch for FTP

1. Connect to the switch and log in using an account assigned to the admin role.

2. Type the following command:

supportFtp -s [-h hostip][-u username][-p password][-d remotedirectory]

3. Respond to the prompts as follows:

-h hostip  Specifies the FTP host IP address. It must be an IP address. hostip should be less than 48 characters.

-u username  Enter the user name of your account on the server; for example, “JaneDoe”.

-d remotedirectory  Specifies the remote directory in which to store trace dump files. The supportFtp command cannot take a slash (/) as a directory name. The remote directory should be less than 48 characters.

-p password  Specifies the FTP user password. If the user name is anonymous, the password is not needed. password should be less than 48 characters.

Example of the supportFtp command

switch:admin> supportftp -s
Host IP Addr[1080::8:800:200C:417A]:
User Name[njoe]:
Password[********]:
Remote Dir[support]:
Auto file transfer parameters changed
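The operand rules above can be checked before the command is typed. The following Python sketch is a hypothetical helper (build_supportftp_command is not part of Fabric OS) that assembles a supportFtp -s command line and enforces only the limits stated in this section: values of less than 48 characters for hostip, remotedirectory, and password, no bare slash as the remote directory, and a password unless the user name is anonymous. The host address and directory in the usage line are placeholders.

```python
MAX_LEN = 48  # this section says each limited value should be less than 48 characters

def build_supportftp_command(host_ip, username, remote_dir, password=None):
    """Return a supportFtp -s command line, or raise ValueError on a bad value."""
    limited = {"hostip": host_ip, "remotedirectory": remote_dir}
    if password is not None:
        limited["password"] = password
    for name, value in limited.items():
        if not value or len(value) >= MAX_LEN:
            raise ValueError(f"{name} must be 1 to {MAX_LEN - 1} characters")
    if not username:
        raise ValueError("username is required")
    if remote_dir == "/":
        raise ValueError("remotedirectory cannot be a slash (/)")
    if password is None and username != "anonymous":
        raise ValueError("a password is needed unless the user name is anonymous")
    parts = ["supportFtp", "-s", "-h", host_ip, "-u", username]
    if password is not None:
        parts += ["-p", password]
    parts += ["-d", remote_dir]
    return " ".join(parts)

# Placeholder values for illustration.
print(build_supportftp_command("192.0.2.10", "anonymous", "support"))
```

The sketch only builds the text of the command; whether the FTP server accepts the values still has to be confirmed on the switch.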


Capturing a supportSave

1. Connect to the switch and log in using an account assigned to the admin role.

2. Type the supportSave command.

When invoked without operands, this command goes into interactive mode. The following operands are optional:

-n Does not prompt for confirmation. This operand is optional; if omitted, you are prompted for confirmation.

-c Uses the FTP parameters saved by the supportFtp command. This operand is optional; if omitted, specify the FTP parameters through command line options or interactively. To display the current FTP parameters, run supportFtp (on a dual-CP system, run supportFtp on the active CP).

Capturing a supportShow

1. Connect to the switch through a Telnet or SSH utility or a serial console connection.

2. Log in using an account assigned to the admin role.

3. Set the Telnet or SSH utility to capture output from the screen.

Some Telnet or SSH utilities require this step to be performed prior to opening up a session. Check with your Telnet or SSH utility vendor for instructions.

4. Type the supportShow command.

Capturing output from a console

Some information, such as boot information, is output only to the console. To capture it, you must connect directly to the switch through its management interface, using either a serial cable or an RJ-45 connection.

1. Connect directly to the switch using a terminal emulation utility such as HyperTerminal.

2. Log in to the switch using an account assigned to the admin role.

3. Set the utility to capture output from the screen.

Some utilities require this step to be performed prior to opening up a session. Check with your utility vendor for instructions.

4. Type the command or start the process to capture the required data on the console.

Capturing command output

1. Connect to the switch through a Telnet or SSH utility.

2. Log in using an account assigned to the admin role.

3. Set the Telnet or SSH utility to capture output from the screen.

Some Telnet or SSH utilities require this step to be performed prior to opening up a session. Check with your Telnet or SSH utility vendor for instructions.

4. Type the command or start the process to capture the required data on the console.


Building a case for your switch support provider

The following form should be filled out in its entirety and presented to your switch support provider when you are ready to contact them. Having this information immediately available will expedite the information gathering process that is necessary to begin determining the problem and finding a solution.

Basic switch information

1. What is the switch’s current Fabric OS level?

To determine the switch’s Fabric OS level, type the firmwareShow command and write down the information.

2. What is the switch model?

To determine the switch model, type the switchShow command and write down the value in the switchType field. Cross-reference this value with the chart located in Appendix A, “Switch Type”.

3. Is the switch operational? Yes or No

4. Impact assessment and urgency:

• Is the switch down? Yes or no.

• Is it a standalone switch? Yes or no.

• Are there VE, VEX or EX ports connected to the chassis? Yes or no.

• How large is the fabric? (Use the nsAllShow command.)

• Is it a secure fabric?

• Are there security policies turned on in the fabric? If so, what are they? Gather the output from the following commands:

- secPolicyShow
- fddCfg --showall
- ipFilter --show
- authUtil --show
- secAuthSecret --show
- fipsCfg --showall

• Is the fabric redundant? If yes, what is the MPIO software? (List vendor and version.)

5. If you have a redundant fabric, did a failover occur?

6. Was POST enabled on the switch?

7. Which CP blade was active? (Only applicable to the Brocade 24000 and 48000 directors, and the Brocade DCX Backbone.)

Detailed problem information

Obtain as many of the following items as possible before contacting the SAN technical support vendor.

Document the sequence of events by answering the following questions:


• What happened prior to the problem?

• Is the problem reproducible?

• If so, what are the steps to produce the problem?

• What configuration was in place when the problem occurred?

• A description of the problem with the switch or the fault with the fabric.

• The last actions or changes made to the system environment:

- settings
- supportSave output; you can save this information on a qualified and installed Brocade USB storage device only on the Brocade 300, 5100, 5300 and the Brocade DCX enterprise-class platform.
- supportShow output

• Host information:

- OS version and patch level
- HBA type
- HBA firmware version
- HBA driver version
- Configuration settings

• Storage information:

- Disk/tape type
- Disk/tape firmware level
- Controller type
- Controller firmware level
- Configuration settings
- Storage software (such as EMC Control Center, Veritas SPC, etc.)

8. What and when were the last actions or changes made to the system environment?

TABLE 3 Environmental changes

Type of Change Date when change occurred


Gathering additional information

Below are features that require you to gather additional information. The additional information is necessary in order for your switch support provider to effectively and efficiently troubleshoot your issue. Refer to the chapter specified for the commands whose data you need to capture.

• Configurations, see Chapter 3, “Connections Issues”.

• Firmwaredownload, see Chapter 5, “FirmwareDownload Errors”.

• Trunking, see Chapter 7, “ISL Trunking Issues”.

• Zoning, see Chapter 8, “Zone Issues”.

• FCIP tunnels, see Chapter 9, “FCIP Issues”.

• FICON, see Chapter 10, “FICON Fabric Issues”.


Chapter 2

General Issues

This chapter provides information on troubleshooting and the most common procedures to use to recover from licensing and common switch log errors.

In this chapter

• Licensing issues

• Switch Message Logs

• Fibre Channel Routing

• Third party applications

Licensing issues

Some features need licenses in order to work properly. To view a list of features and their associated licenses, refer to the Fabric OS Administrator’s Guide. Licenses are created using a switch’s World Wide Name so you cannot apply one license to different switches. Before calling your switch support provider, verify that you have the correct licenses installed.

Symptom A feature is not working.

Probable cause and recommended action

Refer to the Fabric OS Administrator’s Guide to determine if the appropriate licenses are installed on the local switch and any connecting switches.

Determining installed licenses

1. Connect to the switch and log in using an account assigned to the admin role.

2. Type the licenseShow command.

A list of the switch’s currently installed licenses is displayed.

Switch Message Logs

Switch message logs contain information on events that happen on the switch or in the fabric. They are an effective tool for understanding what is going on in your fabric or on your switch. Review these logs weekly to catch problems at an early stage, before minor problems become major issues.

Below are some common problems that can occur with or in your system message log.


Symptom Inaccurate information in the system message log

Probable cause and recommended action

In rare instances, events gathered by the track change feature can report inaccurate information to the system message log.

For example, a user enters a correct user name and password, but the login was rejected because the maximum number of users had been reached. However, when looking at the system message log, the login was reported as successful.

If the maximum number of switch users has been reached, the switch will still perform correctly in that it will reject the login of additional users, even if they enter the correct user name and password information.

However, in this limited example, the Track Change feature will report this event inaccurately to the system message log; it will appear that the login was successful. This scenario only occurs when the maximum number of users has been reached; otherwise, the login information displayed in the system message log should reflect reality.

See the Fabric OS Administrator’s Guide for information regarding enabling and disabling track changes (TC).

Symptom MQ errors are appearing in the switch log.

Probable cause and recommended action

An MQ error is a message queue error. Identify an MQ error message by looking for the two letters MQ followed by a number in the error message:

2004/08/24-10:04:42, [MQ-1004], 218,, ERROR, ras007, mqRead, queue = raslog-test-string0123456-raslog, queue ID = 1, type = 2

MQ errors can result in devices dropping from the switch’s Name Server or can prevent a switch from joining the fabric. MQ errors are rare and difficult to troubleshoot; resolve them by working with the switch supplier. When you encounter an MQ error, issue the supportSave command to capture debug information about the switch, then forward the supportSave data to the switch supplier for further investigation.
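When scanning a saved log for MQ errors, the identification rule above (the letters MQ followed by a number, in square brackets) can be applied mechanically. The Python sketch below is a minimal illustration; the function names are hypothetical, and the pattern assumes message IDs always appear in the bracketed MODULE-number form shown in the example line.

```python
import re

# Matches a bracketed message ID such as [MQ-1004]: a module name, a hyphen, a number.
MESSAGE_ID = re.compile(r"\[([A-Z][A-Z0-9]*)-(\d+)\]")

def message_id(line):
    """Return (module, number) from a log line, or None if no ID is present."""
    match = MESSAGE_ID.search(line)
    if match is None:
        return None
    return match.group(1), int(match.group(2))

def is_mq_error(line):
    """True if the line carries a message ID whose module is MQ."""
    found = message_id(line)
    return found is not None and found[0] == "MQ"

# The example line from this section.
sample = ("2004/08/24-10:04:42, [MQ-1004], 218,, ERROR, ras007, mqRead, "
          "queue = raslog-test-string0123456-raslog, queue ID = 1, type = 2")

print(message_id(sample), is_mq_error(sample))
```

Filtering a captured log with is_mq_error gives a quick list of lines to include when forwarding supportSave data to the switch supplier.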

Symptom I2C bus errors are appearing in the switch log.

Probable cause and recommended action

I2C bus errors generally indicate defective hardware or poorly seated devices or blades; the specific item is listed in the error message. See the Fabric OS Message Reference for information specific to the error that was received. Some Chip-Port (CPT) and Environmental Monitor (EM) messages contain I2C-related information.

If the I2C message does not indicate the specific hardware that may be failing, begin debugging the hardware, as this is the most likely cause. The next sections provide procedures for debugging the hardware.

Checking fan components

1. Log in to the switch as user.

2. Enter the fanShow command.


3. Check the fan status and speed output.

If any of the fan speeds display abnormal RPMs, replace the fan. You may first consider re-seating the fan (unplug it and plug it back in).

Checking the switch temperature

1. Log in to the switch as user.

2. Enter the tempShow command.

3. Check the temperature output.

Look for indications of high or low temperatures.

Checking the power supply

1. Log in to the switch as user.

2. Enter the psShow command.

3. Check the power supply status. Refer to the appropriate hardware reference manual for details regarding the power supply status.

If any of the power supplies show a status other than OK, consider replacing the power supply as soon as possible.

Checking the temperature, fan, and power supply

1. Log in to the switch as user.

2. Enter the sensorShow command. See the Fabric OS Command Reference for details regarding the sensor numbers.

3. Check the temperature output.

Look for indications of high or low temperatures.

4. Check the fan speed output.

If any of the fan speeds display abnormal RPMs, replace the fan FRU.

5. Check the power supply status.

If any power supplies show a status other than OK, consider replacing the power supply as soon as possible.

Fibre Channel Routing

The FC-FC Routing Service enables you to route the ECHO generated when an fcPing command is issued on a switch, providing fcPing capability between two devices in different fabrics across the FC router.


Checking for Fibre Channel connectivity problems

1. On the edge Fabric OS switch, make sure that the source and destination devices are properly configured in the LSAN zone before entering the fcPing command. This command performs the following functions:

• Checks the zoning configuration for the two ports specified.

• Generates an ELS (extended link service) ECHO request to the source port specified and validates the response.

• Generates an ELS ECHO request to the destination port specified and validates the response.

switch:admin> fcping 0x060f00 0x05f001
Source: 0x60f00
Destination: 0x5f001
Zone Check: Zoned

Pinging 0x60f00 with 12 bytes of data:
received reply from 0x60f00: 12 bytes time:501 usec
received reply from 0x60f00: 12 bytes time:437 usec
received reply from 0x60f00: 12 bytes time:506 usec
received reply from 0x60f00: 12 bytes time:430 usec
received reply from 0x60f00: 12 bytes time:462 usec
5 frames sent, 5 frames received, 0 frames rejected, 0 frames timeout
Round-trip min/avg/max = 430/467/506 usec

Pinging 0x5f001 with 12 bytes of data:
received reply from 0x5f001: 12 bytes time:2803 usec
received reply from 0x5f001: 12 bytes time:2701 usec
received reply from 0x5f001: 12 bytes time:3193 usec
received reply from 0x5f001: 12 bytes time:2738 usec
received reply from 0x5f001: 12 bytes time:2746 usec
5 frames sent, 5 frames received, 0 frames rejected, 0 frames timeout
Round-trip min/avg/max = 2701/2836/3193 usec

2. Regardless of the device’s zoning configuration, the fcPing command sends the ELS frame to the destination port. A destination device can take any one of the following actions:

• Send an ELS Accept to the ELS request.

• Send an ELS Reject to the ELS request.

• Ignore the ELS request.

There are some devices that do not support the ELS ECHO request. In these cases, the device will either not respond to the request or send an ELS reject. When a device does not respond to the ELS request, further debugging is required; however, do not assume that the device is not connected.

For details about the fcPing command, see the Fabric OS Command Reference.
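The three possible device responses (Accept, Reject, or no response) show up in the frame counters of each fcPing summary line. The Python sketch below reads those counters from captured output and attaches a cautious interpretation. The function name and the verdict wording are hypothetical, the pattern assumes the summary line has exactly the form shown in the example above, and the second sample summary (all timeouts) is invented for illustration.

```python
import re

# Matches the per-target summary line printed by fcPing in the example above.
SUMMARY = re.compile(
    r"(\d+) frames sent, (\d+) frames received, "
    r"(\d+) frames rejected, (\d+) frames timeout"
)

def summarize_fcping(output):
    """Return one (sent, received, rejected, timeout, verdict) tuple per summary line."""
    results = []
    for fields in SUMMARY.findall(output):
        sent, received, rejected, timeout = (int(f) for f in fields)
        if sent > 0 and received == sent:
            verdict = "replied to every ECHO request"
        elif rejected > 0:
            verdict = "sent ELS Reject; the device may not support ELS ECHO"
        else:
            verdict = "requests timed out; debug further before assuming a disconnect"
        results.append((sent, received, rejected, timeout, verdict))
    return results

sample_output = """\
5 frames sent, 5 frames received, 0 frames rejected, 0 frames timeout
Round-trip min/avg/max = 430/467/506 usec
5 frames sent, 0 frames received, 0 frames rejected, 5 frames timeout
"""

for result in summarize_fcping(sample_output):
    print(result)
```

A timeout verdict is deliberately worded as a prompt for more debugging, since a device that ignores ELS ECHO can still be connected.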


Third party applications

Symptom Replication application works for a while and then breaks.

Probable cause and recommended action

Some third party applications will work when they are first set up and then cease to work due to an incorrect parameter setting. Check each of the following parameters and your application vendor documentation to determine if these are set correctly:

• Port-based routing

Use the aptPolicy command to set this feature.

• In-order delivery

Use the iodSet command to turn this feature on and the iodReset command to turn this feature off.

• Load balancing

In most cases this should be set to off. Use the dlsReset command to turn off the function.


Chapter 3

Connections Issues

This chapter provides information on troubleshooting basic connectivity issues and the most common procedures to use to diagnose and recover from basic connection problems.

In this chapter

• Port initialization and FCP auto discovery process

• Link issues

• Connection problems

• Link failures

• Marginal links

• Device login issues

• Media-related issues

• Segmented fabrics

Port initialization and FCP auto discovery process

The steps in the port initialization process represent a protocol used to discover the type of connected device and establish the port type and port speed. The possible port types are as follows:

• U_Port: Universal FC port. The base Fibre Channel port type; all unidentified or uninitialized ports are listed as U_Ports.

• L_/FL_Port: Fabric Loop port. Connects public loop devices.

• G_Port: Generic port. Acts as a transition port for non-loop fabric-capable devices.

• E_Port: Expansion port. Assigned to ISL links.

• F_Port: Fabric port. Assigned to fabric-capable devices.

• EX_Port: A type of E_Port. It connects a Fibre Channel router to an edge fabric. From the point of view of a switch in an edge fabric, an EX_Port appears as a normal E_Port. It follows the same applicable Fibre Channel standards as other E_Ports. However, the router terminates EX_Ports rather than allowing different fabrics to merge, as would happen on a switch with regular E_Ports.

• VE_Port: A virtual E_Port. However, it terminates at the switch and does not propagate fabric services or routing topology information from one edge fabric to another.

• VEX_Port: A virtual EX_Port. It connects a Fibre Channel router to an edge fabric. From the point of view of a switch in an edge fabric, a VEX_Port appears as a normal VE_Port. It follows the same Fibre Channel protocol as other VE_Ports. However, the router terminates VEX_Ports rather than allowing different fabrics to merge, as would happen on a switch with regular VE_Ports.

Figure 1 shows the process behind port initialization. Understanding this process can help you determine where a problem resides. For example, if your switch cannot form an E_Port, either the process never reached that point or the port does not recognize the connected switch as an E_Port. Possible solutions are to look at licensing and port configuration: verify that the correct licensing is installed, that the port is not configured as a loop port or a G_Port, and that the port speed is not set.

FIGURE 1 Simple port initialization process

The FCP auto discovery process enables private storage devices that accept the process login (PRLI) to communicate in a fabric.

If device probing is enabled, the embedded port logs in (PLOGI) and attempts a PRLI into the device to retrieve information to enter into the Name Server. This enables private devices that do not perform a fabric login (FLOGI), but accept PRLI, to be entered in the Name Server and receive full fabric citizenship. Private hosts require the QuickLoop feature, which is not available in Fabric OS v4.0.0 and later.

A fabric-capable device registers information with the Name Server during a FLOGI. These devices typically register information with the Name Server before querying for a device list. The embedded port will still PLOGI and attempt PRLI with these devices.

To display the contents of a switch’s Name Server, use the nsShow or nsAllShow command. For more information about these Name Server commands, refer to the Fabric OS Command Reference.

Link issues

Symptom LEDs are flashing.

Probable cause and recommended action

Depending on the rate of the flash and the color of the LED, this could mean several things. To determine what is happening on either your port status LED or power status LED, refer to the hardware reference manual for that switch model. It contains a table that describes each LED’s purpose, explains its current behavior, and suggests resolutions.

Symptom LEDs are steady.

Probable cause and recommended action

The color of the LED is important in this instance. To determine what is happening on either your port status LED or power status LED, refer to the hardware reference manual for that switch model. It contains a table that describes each LED’s purpose, explains its current behavior, and suggests resolutions.

Symptom No light from the LEDs.

Probable cause and recommended action

If there is no light coming from the LED, then no signal is being detected. Check your cable and SFP to determine the physical fault.

Symptom EX_Port does not form.

Probable cause and recommended action

This usually happens when the fabric parameters are not set up correctly for the EX_Port.

A common parameter to check is the use of the World Wide Node Name (WWNN) instead of the port World Wide Name (pWWN). To fix this problem, use the pWWN instead.

Connection problems

If a host is unable to detect its target (for example, a storage or tape device), begin troubleshooting the problem at the switch. Determine whether the problem is on the target side or the host side, then continue to divide the suspected problem path in half until you can pinpoint the problem. One of the most common causes is zoning. Verify that the host and target are in the same zone. For more information on zoning, refer to Chapter 8, “Zone Issues”.

Checking the logical connection

1. Enter the switchShow command.

2. Review the output from the command and determine whether the device successfully logged in to the switch.

• A device that is logically connected to the switch is registered as an F_, L_, E_, EX_, VE_, VEX_, or N_Port.

• A device that is not logically connected to the switch will be registered as a G_ or U_Port. If NPIV is not on the switch, the N_Port is another possible port type.

3. If the missing device is logically connected, proceed to the next troubleshooting procedure (“Checking the name server (NS)”).

4. If the missing device is not logically connected, check the device and everything on that side of the data path. Also see “Link failures” for additional information.

Checking the path for the Host includes verifying the following:

• All aspects of the Host OS.

• The third-party vendor multi-pathing input/output (MPIO) software if it is being used.

• The driver settings and binaries are up to date.

• The device Basic Input Output System (BIOS) settings are correct.

• The HBA configuration is correct according to the manufacturer’s specifications.

• The SFPs in the HBA are compatible with the Host’s HBA.

• The cable going from the switch to the Host HBA is not damaged.

• The SFP on the switch is compatible with the switch.

• All switch settings related to the Host.

Checking the path includes the following for the Target:

• The driver settings and binaries are up to date.

• The device Basic Input Output System (BIOS) settings are correct.

• The HBA configuration is correct according to the manufacturer’s specifications.

• The SFPs in the HBA are compatible with the Target HBA.

• The cable going from the switch to the Target HBA is not damaged.

• All switch settings related to the Target.

See “Checking for a loop initialization failure” as the next potential trouble spot.
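The branch taken in steps 2 through 4 of “Checking the logical connection” depends only on the port type reported for the device. The Python sketch below encodes that decision. It is a hypothetical helper: it takes the port type as already read from the switchShow output (using the names from this guide) and does not parse the command output itself.

```python
# Port types that this guide lists for a logically connected device.
CONNECTED = {"F_Port", "L_Port", "E_Port", "EX_Port", "VE_Port", "VEX_Port", "N_Port"}

# Port types that this guide lists for a device that is not logically connected.
# The guide also mentions N_Port as a possibility when NPIV is not on the switch;
# this sketch follows the first bullet and treats N_Port as connected (an assumption).
NOT_CONNECTED = {"G_Port", "U_Port"}

def next_step(port_type):
    """Return the next troubleshooting step for the given port type."""
    if port_type in CONNECTED:
        return "check the name server (NS)"
    if port_type in NOT_CONNECTED:
        return "check the device and everything on that side of the data path"
    return "unrecognized port type; review the switchShow output"

for port in ("F_Port", "U_Port", "G_Port"):
    print(port, "->", next_step(port))
```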

Checking the name server (NS)

1. Enter the nsShow command on the switch to determine if the device is attached:

The Local Name Server has 9 entries {
 Type Pid    COS     PortName                NodeName                 TTL(sec)
*N    021a00;    2,3;20:00:00:e0:69:f0:07:c6;10:00:00:e0:69:f0:07:c6; 895
    Fabric Port Name: 20:0a:00:60:69:10:8d:fd
 NL   051edc;      3;21:00:00:20:37:d9:77:96;20:00:00:20:37:d9:77:96; na
    FC4s: FCP [SEAGATE ST318304FC 0005]
    Fabric Port Name: 20:0e:00:60:69:10:9b:5b
 NL   051ee0;      3;21:00:00:20:37:d9:73:0f;20:00:00:20:37:d9:73:0f; na
    FC4s: FCP [SEAGATE ST318304FC 0005]
    Fabric Port Name: 20:0e:00:60:69:10:9b:5b
 NL   051ee1;      3;21:00:00:20:37:d9:76:b3;20:00:00:20:37:d9:76:b3; na
    FC4s: FCP [SEAGATE ST318304FC 0005]
    Fabric Port Name: 20:0e:00:60:69:10:9b:5b
 NL   051ee2;      3;21:00:00:20:37:d9:77:5a;20:00:00:20:37:d9:77:5a; na
    FC4s: FCP [SEAGATE ST318304FC 0005]
    Fabric Port Name: 20:0e:00:60:69:10:9b:5b
 NL   051ee4;      3;21:00:00:20:37:d9:74:d7;20:00:00:20:37:d9:74:d7; na
    FC4s: FCP [SEAGATE ST318304FC 0005]
    Fabric Port Name: 20:0e:00:60:69:10:9b:5b
 NL   051ee8;      3;21:00:00:20:37:d9:6f:eb;20:00:00:20:37:d9:6f:eb; na
    FC4s: FCP [SEAGATE ST318304FC 0005]
    Fabric Port Name: 20:0e:00:60:69:10:9b:5b
 NL   051eef;      3;21:00:00:20:37:d9:77:45;20:00:00:20:37:d9:77:45; na
    FC4s: FCP [SEAGATE ST318304FC 0005]
    Fabric Port Name: 20:0e:00:60:69:10:9b:5b
 N    051f00;    2,3;50:06:04:82:bc:01:9a:0c;50:06:04:82:bc:01:9a:0c; na
    FC4s: FCP [EMC SYMMETRIX 5267]
    Fabric Port Name: 20:0f:00:60:69:10:9b:5b

2. Look for the device in the NS list, which lists the nodes connected to that switch. This allows you to determine if a particular node is accessible on the network.

• If the device is not present in the NS list, the problem is between the device and the switch. There may be a time-out communication problem between edge devices and the name server, or there may be a login issue. First check the edge device documentation to determine if there is a time-out setting or parameter that can be reconfigured. Also, check the port log for NS registration information and FCP probing failures (using the fcpProbeShow command). If these queries do not help solve the problem, contact the support organization for the product that appears to be inaccessible.

• If the device is listed in the NS, the problem is between the storage device and the host. There may be a zoning mismatch or a host/storage issue. Proceed to Chapter 8, “Zone Issues”.

3. Enter the portLoginShow command to check the port login status.

4. Enter the fcpProbeShow command to display the FCP probing information for the devices attached to the specified F_Port or L_Port. This information includes the number of successful logins and SCSI INQUIRY commands sent over this port and a list of the attached devices.

5. Check the port log to determine whether or not the device sent the FLOGI frame to the switch, and the switch probed the device.
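The name server check in step 2 can be scripted against captured nsShow output. The following is a minimal sketch in Python, assuming the output has been saved as text and that each entry line follows the layout of the example above (type, PID, COS, port WWN, node WWN, separated by semicolons). The function names are illustrative and are not part of Fabric OS.

```python
import re

# One Name Server entry line, for example:
#   "NL 051edc; 3;21:00:00:20:37:d9:77:96;20:00:00:20:37:d9:77:96; na"
ENTRY = re.compile(
    r"^\*?\s*(?P<type>NL|N)\s+(?P<pid>[0-9a-f]{6});\s*(?P<cos>[\d,]+);"
    r"(?P<port_wwn>(?:[0-9a-f]{2}:){7}[0-9a-f]{2});"
    r"(?P<node_wwn>(?:[0-9a-f]{2}:){7}[0-9a-f]{2});",
    re.IGNORECASE,
)


def parse_ns_entries(text):
    """Return one dict per device entry found in captured nsShow output."""
    entries = []
    for line in text.splitlines():
        match = ENTRY.match(line.strip())
        if match:
            entries.append(match.groupdict())
    return entries


def device_registered(text, port_wwn):
    """True if the given port WWN appears in the Name Server list."""
    wanted = port_wwn.lower()
    return any(e["port_wwn"].lower() == wanted for e in parse_ns_entries(text))


# Two entries copied from the example output above.
sample = """\
*N 021a00; 2,3;20:00:00:e0:69:f0:07:c6;10:00:00:e0:69:f0:07:c6; 895
    Fabric Port Name: 20:0a:00:60:69:10:8d:fd
NL 051edc; 3;21:00:00:20:37:d9:77:96;20:00:00:20:37:d9:77:96; na
    FC4s: FCP [SEAGATE ST318304FC 0005]
    Fabric Port Name: 20:0e:00:60:69:10:9b:5b
"""
```

If device_registered returns False for a device that should be attached, the problem is likely between the device and the switch, as described in step 2.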

Link failures

A link failure occurs when a server, storage, or switch device is connected to a switch, but the link between the server, storage, or switch and the switch does not come up. This prevents the devices from communicating to or through the switch.

If the switchShow command or LEDs indicate that the link has not come up properly, use one or more of the following procedures.

The port negotiates the link speed with the opposite side. The negotiation usually completes in one or two seconds; however, sometimes the speed negotiation fails.

NOTE

Skip this procedure if the port speed is set to a static speed through the portCfgSpeed command.

Determining a successful negotiation

1. Enter the portCfgShow command to display the port speed settings of all the ports.

2. Enter the switchShow command to determine if the port has module light.

3. Determine whether or not the port completes speed negotiation by entering the portCfgSpeed command. Then change the port speed to 1, 2, 4, or 8 Gbps, depending on what speed can be used by both devices. Setting a single fixed speed should correct the negotiation.

4. Enter the portLogShow or portLogDump command.

5. Check the events area of the output:

14:38:51.976 SPEE sn <Port#> NC 00000001,00000000,00000001
14:39:39.227 SPEE sn <Port#> NC 00000002,00000000,00000001

• sn indicates a speed negotiation.

• NC indicates negotiation complete.

If these fields do not appear, proceed to step 6.

6. Correct the negotiation by entering the portCfgSpeed [slotnumber/]portnumber, speed_level command if the fields in step 5 do not appear.

switch:admin> portcfgspeed
Usage: portCfgSpeed PortNumber Speed_Level
Speed_Level: 0 - Auto Negotiate
             1 - 1Gbps
             2 - 2Gbps
             4 - 4Gbps
             8 - 8Gbps
             ax - Auto Negotiate + enhanced retries
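The check in step 5 can be automated on a saved port log. The sketch below is written in Python and assumes the field order shown in the port log header (time, task, event, port, cmd, args). It looks for an sn event with the NC command on a given port. The function name and the sample lines are illustrative; the sample uses the same layout as the port log captures in this guide.

```python
def speed_negotiation_completed(log_text, port):
    """Return True if the port log shows an 'sn ... NC' event for the port."""
    for line in log_text.splitlines():
        fields = line.split()
        # Expected field order: time, task, event, port, cmd, args
        if len(fields) >= 5 and fields[1] == "SPEE" and fields[2] == "sn":
            if fields[3] == str(port) and fields[4] == "NC":
                return True
    return False


sample_log = """\
19:45:58.778 SPEE sn 0 WS 000000f0,00000000,00000000
19:45:58.787 SPEE sn 0 WS 00000001,00000000,00000000
19:45:59.327 SPEE sn 0 NC 00000002,00000000,00000001
"""
```

A False result for the port under test corresponds to the case where the sn and NC fields do not appear, which leads to step 6.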

Checking for a loop initialization failure

1. Verify the port is an L_Port.

a. Enter the switchShow command.

b. Check the comment field of the output to verify that the switch port indicates an L_Port. If a loop device is connected to the switch, the switch port must be initialized as an L_Port.

c. Check to ensure that the port state is online; otherwise, check for link failures.

2. Verify that loop initialization occurred if the port to which the loop device is attached does not negotiate as an L_Port.

a. Enter the portLogShow or portLogDump command.

b. Check argument number four for the loop initialization soft assigned (LISA) frame (0x11050100).

switch:admin> portlogdumpport 4
time task event port cmd args
-------------------------------------------------
11:40:02.078 PORT Rx3 23 20 22000000,00000000,ffffffff,11050100 Received LISA frame

The LISA frame indicates that the loop initialization is complete.

3. Skip point-to-point initialization by using the portCfgLport command.

The switch changes to point-to-point initialization after the LISA phase of the loop initialization. This behavior sometimes causes trouble with old HBAs.
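The LISA check in step 2 can also be run against a saved port log. This Python sketch assumes the args column is the sixth whitespace-separated field and that its four values are comma-separated, as in the example entry above. The function name is illustrative.

```python
LISA_WORD = "11050100"  # fourth argument of a LISA frame, per step 2b


def lisa_seen(log_text):
    """Return True if any port log entry carries the LISA value as argument four."""
    for line in log_text.splitlines():
        fields = line.split()
        if len(fields) < 6:
            continue
        args = fields[5].split(",")
        if len(args) >= 4 and args[3].lower() == LISA_WORD:
            return True
    return False


lisa_line = "11:40:02.078 PORT Rx3 23 20 22000000,00000000,ffffffff,11050100"
other_line = "19:45:58.728 PORT Tx3 0 12 22000000,00000000,ffffffff,11010000"
```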

Checking for a point-to-point initialization failure

1. Enter the switchShow command to confirm that the port is active and has a module that is synchronized.

If a fabric device or another switch is connected to the switch, the switch port must be online.

2. Enter the portLogShow or portLogDump command.

3. Verify that the event area for the port state entry is pstate. The command entry AC indicates that the port has completed point-to-point initialization.

switch:admin> portlogdumpport 4
time task event port cmd args
-------------------------------------------------
11:38:21.726 INTR pstate 4 AC

4. Skip over the loop initialization phase.

After becoming an active port, the port becomes an F_Port or an E_Port depending on the device on the opposite side. If the opposite device is a fabric device, the port becomes an F_Port. If the opposite device is another switch, the port becomes an E_Port.

If there is a problem with the fabric device, enter the portCfgGPort command to force the port to try to come up as point-to-point only.
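The pstate check in step 3 follows the same pattern as the other port log checks. A minimal Python sketch, assuming the column order time, task, event, port, cmd and an illustrative function name:

```python
def point_to_point_active(log_text, port):
    """Return True if the port log holds a 'pstate <port> AC' entry."""
    for line in log_text.splitlines():
        fields = line.split()
        # Columns: time, task, event, port, cmd
        if len(fields) >= 5 and fields[2] == "pstate":
            if fields[3] == str(port) and fields[4] == "AC":
                return True
    return False


pstate_line = "11:38:21.726 INTR pstate 4 AC"
```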

Correcting a port that has come up in the wrong mode

1. Enter the switchShow command.

2. Check the output from the switchShow command and follow the suggested actions in Table 4.

TABLE 4 SwitchShow output and suggested action

Output    Suggested action

Disabled  If the port is disabled (for example, due to persistent disable or security reasons), attempt to resolve the issue and then enter the portEnable or portCfgPersistentEnable command.

Bypassed  The port may be testing.

Loopback  The port may be testing.

E_Port    If the opposite side is not another switch, the link has come up in a wrong mode. Check the output from the portLogShow or portLogDump commands and identify the link initialization stage where the initialization procedure went wrong.

F_Port    If the opposite side of the link is a private loop device or a switch, the link has come up in a wrong mode. Check the output from the portLogShow or portLogDump commands.

G_Port    The port has not come up as an E_Port or F_Port. Check the output from the portLogShow or portLogDump commands and identify the link initialization stage where the initialization procedure went wrong.

L_Port    If the opposite side is not a loop device, the link has come up in a wrong mode. Check the output from the portLogShow or portLogDump commands and identify the link initialization stage where the initialization procedure went wrong.

NOTE

If you are unable to read a portlog dump, contact your switch support provider for assistance.

Marginal links

A marginal link involves the connection between the switch and the edge device. Isolating the exact cause of a marginal link involves analyzing and testing many of the components that make up the link (including the switch port, switch SFP, cable, edge device, and edge device SFP).

Troubleshooting a marginal link

1. Enter the portErrShow command.

2. Determine whether there is a relatively high number of errors (such as CRC errors or ENC_OUT errors), or if there is a steadily increasing number of errors, to confirm a marginal link.

3. If you suspect a marginal link, isolate the areas by moving the suspected marginal port cable to a different port on the switch. Reseating the SFPs may also cure marginal port problems.

If the problem stops or goes away, the switch port or the SFP is marginal (proceed to step 4).

If the problem does not stop or go away, see step 7.

4. Replace the SFP on the marginal port.

5. Run the portLoopbackTest on the marginal port. You will need an adapter to run the loopback test for the SFP. Otherwise, run the test on the marginal port using the loopback mode lb=5. See the Fabric OS Command Reference for additional information on this command.

TABLE 5 Loopback modes

Loopback mode Description

1 Port Loopback (loopback plugs)

2 External Serializer/Deserializer (SerDes) loopback

5 Internal (parallel) loopback (indicates no external equipment)

7 Back-end bypass and port loopback

8 Back-end bypass and SerDes loopback

9 Back-end bypass and internal loopback

6. Check the results of the loopback test and proceed as follows:

• If the loopback test failed, the port is bad. Replace the port blade or switch.


• If the loopback test did not fail, the SFP was bad.

7. Perform the following steps to rule out cabling issues:

a. Insert a new cable in the suspected marginal port.

b. Enter the portErrShow command to determine if a problem still exists.

• If the portErrShow output displays a normal number of generated errors, the issue is solved.

• If the portErrShow output still displays a high number of generated errors, follow the troubleshooting procedures for the Host or Storage device in the following section, “Device login issues”.

Device login issues

A correct login is when the port type matches the device type that is plugged in. The example below shows that the device connected to port 77 is a fabric point-to-point device and is correctly logged in as an F-Port.

brcd5300:admin> switchshow
switchName:brcd5300
switchType:64.3
switchState:Online
switchMode:Native
switchRole:Subordinate
switchDomain:1
switchId:fffc01
switchWwn:10:00:00:05:1e:40:ff:c4
zoning:OFF
switchBeacon:OFF
FC Router:OFF
FC Router BB Fabric ID:1

Area Port Media Speed State     Proto
=====================================
  0   0   --    N8   No_Module
  1   1   --    N8   No_Module
  2   2   --    N8   No_Module
  3   3   --    N8   No_Module
  4   4   --    N8   No_Module
  5   5   --    N8   No_Module
  6   6   --    N8   No_Module
  7   7   --    N8   No_Module
  8   8   --    N8   No_Module
  9   9   --    N8   No_Module
 10  10   --    N8   No_Module
 11  11   --    N8   No_Module
 12  12   --    N8   No_Module
 13  13   --    N8   No_Module
 14  14   --    N8   No_Module
 15  15   --    N8   No_Module
 16  16   --    N8   No_Module
 17  17   --    N8   No_Module
 18  18   --    N8   No_Module
 19  19   --    N8   No_Module
 20  20   --    N8   No_Module
 21  21   --    N8   No_Module
 22  22   --    N8   No_Module
 23  23   --    N8   No_Module
 24  24   --    N8   No_Module
 25  25   --    N8   No_Module
 26  26   --    N8   No_Module
 27  27   --    N8   No_Module
 28  28   --    N8   No_Module
 29  29   --    N8   No_Module
 30  30   --    N8   No_Module
 31  31   --    N8   No_Module
 32  32   --    N8   No_Module
 33  33   --    N8   No_Module
 34  34   --    N8   No_Module
 35  35   --    N8   No_Module
 36  36   --    N8   No_Module
 37  37   --    N8   No_Module
 38  38   --    N8   No_Module
 39  39   --    N8   No_Module
 40  40   --    N8   No_Module
 41  41   --    N8   No_Module
 42  42   --    N8   No_Module
 43  43   --    N8   No_Module
 44  44   --    N8   No_Module
 45  45   --    N8   No_Module
 46  46   --    N8   No_Module
 47  47   --    N8   No_Module
 48  48   --    N8   No_Module
 49  49   --    N8   No_Module
 50  50   --    N8   No_Module
 51  51   --    N8   No_Module
 52  52   --    N8   No_Module
 53  53   --    N8   No_Module
 54  54   --    N8   No_Module
 55  55   --    N8   No_Module
 56  56   --    N8   No_Module
 57  57   --    N8   No_Module
 58  58   --    N8   No_Module
 59  59   --    N8   No_Module
 60  60   --    N8   No_Module
 61  61   --    N8   No_Module
 62  62   --    N8   No_Module
 63  63   --    N8   No_Module
 64  64   id    N2   Online    E-Port 10:00:00:05:1e:34:d0:05 "1_d1" (Trunk master)
 65  65   --    N8   No_Module
 66  66   --    N8   No_Module
 67  67   id    AN   No_Sync
 68  68   id    N2   Online    L-Port 13 public
 69  69   --    N8   No_Module
 70  70   --    N8   No_Module
 71  71   id    N2   Online    L-Port 13 public
 72  72   --    N8   No_Module
 73  73   --    N8   No_Module
 74  74   --    N8   No_Module
 75  75   --    N8   No_Module
 76  76   id    N2   Online    E-Port 10:00:00:05:1e:34:d0:05 "1_d1" (upstream)(Trunk master)
 77  77   id    N4   Online    F-Port 10:00:00:06:2b:0f:6c:1f
 78  78   --    N8   No_Module
 79  79   id    N2   Online    E-Port 10:00:00:05:1e:34:d0:05 "1_d1" (Trunk master)

Pinpointing problems with device logins

1. Log in to the switch as admin.

2. Enter the switchShow command; then, check for correct logins.

3. Enter the portCfgShow command to see if the port is configured correctly.

In some cases, you may find that the port has been locked as an L_Port and the device attached is a fabric point-to-point device such as a host or switch. This would be an incorrect configuration for the device and therefore the device cannot log into the switch.

To correct this type of problem, remove the Lock L_Port configuration using the portCfgDefault command.

switch:admin> portcfgshow
Ports of Slot 0    0  1  2  3    4  5  6  7    8  9 10 11   12 13 14 15
-----------------+--+--+--+--+----+--+--+--+----+--+--+--+----+--+--+--
Speed             AN AN AN AN   AN AN AN AN   AN AN AN AN   AN AN AN AN
Trunk Port        ON ON ON ON   ON ON ON ON   ON ON ON ON   ON ON ON ON
Long Distance     .. .. .. ..   .. .. .. ..   .. .. .. ..   .. .. .. ..
VC Link Init      .. .. .. ..   .. .. .. ..   .. .. .. ..   .. .. .. ..
Locked L_Port     .. .. .. ..   .. .. .. ..   .. .. .. ..   .. .. .. ..
Locked G_Port     .. .. .. ..   .. .. .. ..   .. .. .. ..   .. .. .. ..
Disabled E_Port   .. .. .. ..   .. .. .. ..   .. .. .. ..   .. .. .. ..
ISL R_RDY Mode    .. .. .. ..   .. .. .. ..   .. .. .. ..   .. .. .. ..
RSCN Suppressed   .. .. .. ..   .. .. .. ..   .. .. .. ..   .. .. .. ..
Persistent Disable.. .. .. ..   .. .. .. ..   .. .. .. ..   .. .. .. ..
NPIV capability   ON ON ON ON   ON ON ON ON   ON ON ON ON   ON ON ON ON

where AN:AutoNegotiate, ..:OFF, ??:INVALID, SN:Software controlled AutoNegotiation.

4. Enter the portErrShow command; then, check for errors that can cause login problems. A steadily increasing number of errors can indicate a problem. Track errors by sampling the port errors every five or ten minutes.

• frames tx and rx are the number of frames being transmitted and received.

• crc_err: if this counter goes up, inspect the physical path. Check the cables to and from the switch, patch panel, and other devices. Check the SFP by swapping it with a known-working SFP.

• enc_out errors occur outside the frame and usually indicate a bad primitive. To determine if you are having a cable problem, take snapshots of the porterrshow output in increments of 5 to 10 minutes. If you notice the crc_err counter go up, you have a bad or damaged cable, or a bad or damaged device in the path.

• disc_c3 errors are discarded class 3 frames, which means that the switch is holding onto the frame longer than the hold time allows. One possible cause is ISL oversubscription.

switch:admin> porterrshow

        frames  enc  crc  too  too  bad  enc disc link loss loss frjt fbsy
       tx   rx   in  err shrt long  eof  out   c3 fail sync  sig
=====================================================================
  0:  39m  95m    0    0    0    0    0 364k   10   13 6.1k    8    0    0
  1: 2.6g 4.1g    0    0    0    0    0 150k   46    0    0    1    0    0
  2: 3.7g 2.0g    0    0    0    0    0   18  134    0    0    0    0    0
  3: 2.2g 2.9g    0    0    0    0    0   19  127    0    0    0    0    0
  4:  20m  19m    0    0    0    0    0 241k   10   10 4.3k    5    0    0
  5: 113m 2.4g    0    0    0    0    0  96k   12    0    0    1    0    0
  6: 331m 359m    0    0    0    0    0 249m 1.4k    1 2.3k    2    0    0
  7: 156m 171m    0    0    0    0    0  10m 1.4k   27   52    6    0    0
  8: 3.0g 1.1g    5    2    0    0    0 117k    1   38   62   28    0    0
  9: 2.0g 807m    0    0    0    0    0  12k    0   10   19    2    0    0
 10:    0    0    0    0    0    0    0 139k    0    0    0    1    0    0
 11:    0    0    0    0    0    0    0 129k    0    0    0    1    0    0
 12: 1.7g 3.1g    0    0    0    0    0  426 1.4k    0    0    0    0    0
 13: 697m 691m    0    0    0    0    0 6.0k 1.4k    0    0    0    0    0
 14: 2.3g 1.7g    0    0    0    0    0  107 1.0k    0    0    0    0    0
 15: 3.1g 893m    0    0    0    0    0   58  37k    0    0    0    0    0

NOTE

When two shared ports on an FC4-48 blade are receiving traffic and the primary port goes offline, all the frames that are out for delivery for the primary port are dropped, but the counters show them as dropped on the secondary port that shares the same area. Error counters increment unexpectedly for the secondary port, but the secondary port is operating properly.

If this occurs, clear the counters using the portStatsClear command on the secondary port after the primary port goes offline.

5. Enter the portFlagsShow command; then, check to see how a port has logged in and where a login failed (if a failure occurred).

switch:admin> portflagsshow
Port SNMP    Physical  Flags
------------------------------
  0  Offline In_Sync   PRESENT U_PORT LED
  1  Online  In_Sync   PRESENT ACTIVE F_PORT G_PORT U_PORT LOGICAL_ONLINE LOGIN NOELP LED ACCEPT
  2  Offline No_Light  PRESENT U_PORT LED
  3  Offline No_Module PRESENT U_PORT LED
  4  Offline No_Module PRESENT U_PORT LED
  5  Offline No_Light  PRESENT U_PORT LED
  6  Offline No_Module PRESENT U_PORT LED
  7  Offline No_Module PRESENT U_PORT LED
  8  Offline No_Light  PRESENT U_PORT LED
  9  Offline No_Light  PRESENT U_PORT LED
 10  Offline No_Module PRESENT U_PORT LED
 11  Offline No_Module PRESENT U_PORT LED
 12  Offline No_Module PRESENT U_PORT LED
 13  Offline No_Module PRESENT U_PORT LED
 14  Online  In_Sync   PRESENT ACTIVE F_PORT G_PORT U_PORT LOGICAL_ONLINE LOGIN NOELP LED ACCEPT
 15  Online  In_Sync   PRESENT ACTIVE E_PORT G_PORT U_PORT SEGMENTED LOGIN LED

6. Enter the portLogDumpPort portid command where the port ID is the port number; then, view the device-to-switch communication.

switch:admin> portlogdump 13
time task event port cmd args
-------------------------------------------------
Tue Apr 24 19:45:58 2007
19:45:58.728 PORT Tx3 0 12 22000000,00000000,ffffffff,11010000
19:45:58.778 SPEE sn 0 WS 000000f0,00000000,00000000
19:45:58.787 SPEE sn 0 WS 00000001,00000000,00000000
19:45:59.327 SPEE sn 0 NC 00000002,00000000,00000001
19:45:59.328 LOOP loopscn 0 LIP 8002
19:45:59.328 LOOP loopscn 0 LIP f7f7
19:45:59.328 PORT Tx3 0 12 22000000,00000000,ffffffff,11010000
19:45:59.378 SPEE sn 0 WS 000000f0,00000000,00000000
19:45:59.387 SPEE sn 0 WS 00000001,00000000,00000000
19:45:59.927 SPEE sn 0 NC 00000002,00000000,00000001
19:45:59.927 LOOP loopscn 0 LIP 8002
19:45:59.928 LOOP loopscn 0 LIP f7f7
19:45:59.928 PORT Tx3 0 12 22000000,00000000,ffffffff,11010000

NOTE

See “Port log” on page 101 for overview information about portLogDump.
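Step 4 of this procedure recommends sampling the port error counters every five or ten minutes and watching for a steady increase. The following Python sketch shows one way to compare successive samples of a single counter. It assumes the k, m, and g suffixes in the porterrshow output stand for thousands, millions, and billions, so the converted values are approximate. The function names are illustrative.

```python
SUFFIX = {"k": 1_000, "m": 1_000_000, "g": 1_000_000_000}


def to_count(value):
    """Convert a porterrshow cell such as '6.1k' or '13' to an approximate integer."""
    value = value.strip().lower()
    if value and value[-1] in SUFFIX:
        return round(float(value[:-1]) * SUFFIX[value[-1]])
    return int(value)


def steadily_increasing(samples):
    """True if every sample is strictly larger than the one before it."""
    counts = [to_count(s) for s in samples]
    return len(counts) >= 2 and all(b > a for a, b in zip(counts, counts[1:]))
```

For example, passing three successive crc_err readings for the same port to steadily_increasing indicates whether the counter rose between every pair of samples. Because of the rounded suffixes, small increases in very large counters may not be visible.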

Media-related issues

This section provides procedures that help pinpoint any media-related issues, such as bad cables and SFPs, in the fabric. The tests listed in Table 6 are a combination of structural and functional tests that can be used to provide an overview of the hardware components and help identify media-related issues.

• Structural tests perform basic testing of the switch circuit. If a structural test fails, replace the main board or port blade.

• Functional tests verify the intended operational behavior of the switch by running frames through ports or bypass circuitry.

TABLE 6 Component test descriptions

Test name  Operands / Checks

portTest   Operands: [-ports itemlist] [-iteration count] [-userdelay time] [-timeout time] [-pattern pattern] [-patsize size] [-seed seed] [-listtype porttype]
           Checks: Used to isolate problems to a single replaceable element and isolate problems to near-end terminal equipment, far-end terminal equipment, or transmission line. Diagnostics can be executed every day or on demand.

spinFab    Operands: [-nmegs count] [-ports itemlist] [-setfail mode]
           Checks: Tests switch-to-switch ISL cabling and trunk group operations.

The following procedures are for checking switch-specific components.


Testing a port’s external transmit and receive path

1. Connect to the switch and log in as admin.

2. Connect the port you want to test to any other switch port with the cable you want to test.

3. Enter the portLoopbackTest -lb_mode 2 command.

Testing a switch’s internal components

1. Connect to the switch and log in as admin.

2. Connect the port you want to test to any other switch port with the cable you want to test.

3. Enter the portLoopbackTest -lb_mode 5 command where 5 is the operand that causes the test to run on the internal switch components (this is a partial list; see the Fabric OS Command Reference for additional command information):

[-nframes count]—Specify the number of frames to send.

[-lb_mode mode]—Select the loopback point for the test.

[-spd_mode mode]—Select the speed mode for the test.

[-ports itemlist]—Specify a list of user ports to test.
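When the same loopback test is run repeatedly, it can help to assemble the command line in a script. The Python sketch below uses only the operands listed above and the loopback modes from Table 5. The helper name is hypothetical, the values passed for ports, nframes, and spd_mode are placeholders, and the result is just a string to paste into the switch CLI.

```python
# Loopback modes as listed in Table 5 of this guide.
LOOPBACK_MODES = {1, 2, 5, 7, 8, 9}


def loopback_command(lb_mode, ports=None, nframes=None, spd_mode=None):
    """Assemble a portLoopbackTest command line from the operands listed above."""
    if lb_mode not in LOOPBACK_MODES:
        raise ValueError(f"unsupported loopback mode: {lb_mode}")
    parts = ["portLoopbackTest", "-lb_mode", str(lb_mode)]
    # Optional operands are appended only when the caller supplies them.
    for flag, value in (("-nframes", nframes), ("-spd_mode", spd_mode), ("-ports", ports)):
        if value is not None:
            parts += [flag, str(value)]
    return " ".join(parts)
```

Consult the Fabric OS Command Reference for the accepted values of each operand before running the generated command.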

Testing components to and from the HBA

1. Connect to the switch and log in as admin.

2. Enter the portTest command (see the Fabric OS Command Reference for information on the command options).

See Table 7 on page 30 for a list of additional tests that can be used to determine the switch components that are not functioning properly. See the Fabric OS Command Reference for additional command information.

TABLE 7 Switch component tests

Test              Function

portLoopbackTest  Performs a functional test of port N to N path. Verifies the functional components of the switch.

turboRamTest      Verifies that the on chip SRAM located in the 2 Gbps ASIC is using the Turbo-Ram BIST circuitry. This allows the BIST controller to perform the SRAM write and read operations at a much faster rate.

Related Switch Test Option:

itemList          Restricts the items to be tested to a smaller set of parameter values that you pass to the switch.


Segmented fabrics

Fabric segmentation is generally caused by one of the following conditions:

• Incompatible fabric parameters (see “Reconciling fabric parameters individually,” next).

• Incorrect PID setting (see Fabric OS Administrator’s Guide).

• Incompatible zoning configuration (see Chapter 8, “Zone Issues”).

• Domain ID conflict (see “Reconciling fabric parameters individually” on page 31).

• Incompatible security policies.

• Incorrect fabric mode.

• Incorrect policy distribution.

There are a number of settings that control the overall behavior and operation of the fabric. Some of these values, such as the domain ID, are assigned automatically by the fabric and can differ from one switch to another in the fabric. Other parameters, such as the BB credit, can be changed for specific applications or operating environments, but must be the same among all switches to allow the formation of a fabric.

The following fabric parameters must be identical on each switch for a fabric to merge:

• R_A_TOV

• E_D_TOV

• Data field size

• Sequence level switching

• Disable device probing

• Suppress class F traffic

• Per-frame route priority

• Long distance fabric (not necessary on Bloom-based, Condor, or GoldenEye fabrics. For more information regarding these ASIC types, refer to Appendix A, “Switch Type”.)

• BB credit

• PID format


Reconciling fabric parameters individually

1. Log in to one of the segmented switches as admin.

2. Enter the configShow fabric.ops command.

3. Log in to another switch in the same fabric as admin.

4. Enter the configShow fabric.ops command.

5. Compare the two switch configurations line by line and look for differences. Do this by comparing the two Telnet windows or by printing the configShow fabric.ops output. Also, verify that the fabric parameter settings (see the above list) are the same for both switches.

6. Connect to the segmented switch after the discrepancy is identified.

7. Disable the switch by entering the switchDisable command.

8. Enter the configure command to edit the fabric parameters for the segmented switch.


See the Fabric OS Command Reference for more detailed information.

9. Enable the switch by entering the switchEnable command.

Alternatively, you can reconcile fabric parameters by entering the configUpload command for each switch.
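The line-by-line comparison in step 5 is easy to get wrong by eye. The following Python sketch compares two captured configShow fabric.ops outputs. It assumes each setting appears on its own line as key:value and that the keys begin with fabric.ops. The key names and values in the sample are illustrative and should be checked against real output.

```python
def parse_config(text):
    """Parse 'key:value' lines from captured configShow output into a dict."""
    settings = {}
    for line in text.splitlines():
        key, sep, value = line.strip().partition(":")
        if sep and key.startswith("fabric.ops."):
            settings[key] = value.strip()
    return settings


def differing_parameters(text_a, text_b):
    """Return {key: (value_a, value_b)} for every fabric.ops key that differs."""
    a, b = parse_config(text_a), parse_config(text_b)
    return {
        key: (a.get(key), b.get(key))
        for key in sorted(set(a) | set(b))
        if a.get(key) != b.get(key)
    }


# Hypothetical captures from two switches.
switch_a = "fabric.ops.E_D_TOV:2000\nfabric.ops.R_A_TOV:10000\nfabric.ops.BBCredit:16"
switch_b = "fabric.ops.E_D_TOV:2000\nfabric.ops.R_A_TOV:10000\nfabric.ops.BBCredit:8"
```

Any key reported by differing_parameters is a candidate for correction with the configure command in step 8.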

Downloading a correct configuration

You can restore a segmented fabric by downloading a previously saved correct backup configuration to the switch. Downloading in this manner reconciles any discrepancy in the fabric parameters and allows the segmented switch to rejoin the main fabric. For details on uploading and downloading configurations, see Fabric OS Administrator’s Guide.

Reconciling a domain ID conflict

If a domain ID conflict appears, the conflict is only reported at the point where the two fabrics are physically connected. However, there may be several conflicting domain IDs, which appear as soon as the initial conflict is resolved.

Typically, the fabric automatically resolves domain conflicts during fabric merges or builds unless Insistent Domain ID (IDID) is configured. If IDID is enabled, switches that cannot be programmed with a unique domain ID are segmented out. Check each switch that has IDID configured and make sure their domain IDs are unique within the configuration.

Repeat the following procedure until all domain ID conflicts are resolved.

1. Enter the fabricShow command on a switch from one of the fabrics.

2. In a separate Telnet window, enter the fabricShow command on a switch from the second fabric.

3. Compare the fabricShow output from the two fabrics. Note the number of domain ID conflicts; there may be several duplicate domain IDs that must be changed. Determine which switches have domain overlap and change the domain IDs for each of those switches.

4. Choose the fabric on which to change the duplicate domain ID; connect to the conflicting switch in that fabric.

5. Enter the switchDisable command.

6. Enter the switchEnable command.

This will enable the joining switch to obtain a new domain ID as part of the process of coming online. The fabric principal switch will allocate the next available domain ID to the new switch during this process.

7. Repeat step 4 through step 6 if additional switches have conflicting domain IDs.
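The comparison in step 3 amounts to finding the domain IDs that appear in both fabrics. A minimal Python sketch, assuming the domain IDs and switch names have already been copied out of each fabricShow listing into a dictionary. The switch names and IDs below are hypothetical.

```python
def conflicting_domains(fabric_a, fabric_b):
    """Return the sorted domain IDs that appear in both fabrics."""
    return sorted(set(fabric_a) & set(fabric_b))


# Hypothetical domain ID -> switch name maps taken from two fabricShow listings.
fabric_a = {1: "sw_a1", 2: "sw_a2", 5: "sw_a5"}
fabric_b = {2: "sw_b2", 3: "sw_b3", 5: "sw_b5"}
```

Each ID returned identifies one switch pair that needs steps 4 through 6.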

Chapter

Configuration Issues

It is important to maintain consistent configuration settings on all switches in the same fabric because inconsistent parameters (such as inconsistent PID formats) can cause fabric segmentation. As part of standard configuration maintenance procedures, it is recommended that you back up all important configuration data for every switch on a host computer server for emergency reference.

NOTE

For information about AD-enabled switches using Fabric OS v5.2.0 or later, see the Fabric OS Administrator’s Guide.

In this chapter

• Configupload and download issues

• Brocade configuration form

Configupload and download issues

Symptom The configuration upload fails.

Probable cause and recommended action

If the configuration upload fails, it may be because of one or more of the following reasons:

• The FTP or SCP server’s host name is not known to the switch.

Verify with your network administrator that the switch has access to the FTP server.

• The USB path is not correct.

If your platform supports a USB memory device, verify that it is connected and running. Verify that the path name is correct by using the usbStorage -l command.

Example of usbStorage -l command

switch:admin> usbstorage -l
firmwarekey\      0B    2007 Aug 15 15:13
support\          106MB 2007 Aug 24 05:36
  support1034\    105MB 2007 Aug 23 06:11
config\           0B    2007 Aug 15 15:13
firmware\         380MB 2007 Aug 15 15:13
  FW_v6.0.0\      380MB 2007 Aug 15 15:13
Available space on usbstorage 74%

• The FTP or SCP server’s IP address cannot be contacted.

Verify that you can connect to the FTP server. Use your local PC to connect to the FTP server or ping the FTP server.


Example of a successful ping

C:\>ping 192.163.163.50
Pinging 192.163.163.50 with 32 bytes of data:
Reply from 192.163.163.50: bytes=32 time=5ms TTL=61
Ping statistics for 192.163.163.50:
    Packets: Sent = 4, Received = 4, Lost = 0 (0%loss),
Approximate round trip times in milli-seconds:
    Minimum = 4ms, Maximum = 5ms, Average = 4ms

If the ping is successful from your computer, but you cannot reach the server from inside your data center, the firewall may be blocking FTP connections from inside the data center. Contact your network administrator to determine if this is the cause and to resolve it by opening the port for both inbound and outbound UDP and TCP traffic.

Example of a failed ping

C:\> ping 192.163.163.50
Pinging 192.163.163.50 with 32 bytes of data:
Request timed out.
Request timed out.
Request timed out.
Request timed out.
Ping statistics for 192.163.163.50:
    Packets: Sent = 4, Received = 0, Lost = 4 (100% loss),

If the ping fails, verify the following:

- The ports are open on the firewall.

- The FTP server is up and running.

Example of a failed login to the FTP server

The output should be similar to the following on an unsuccessful login:

C:\>ftp 192.163.163.50
Connected to 192.163.163.50
220 Welcome to Education Services FTP service.
User (10.255.252.50:(none)): upd20
331 Please specify the password.
Password:
530 Login incorrect.
Login failed.

If your login to the FTP or SCP server has failed, verify the username and password are correct.

• You do not have configuration upload permission on the switch.

There may be some restrictions if you are using Admin Domains or Role-Based Access Control. For more information on these types of restrictions, refer to the Fabric OS Administrator’s Guide.

• You do not have permission to write to the directory on the FTP or SCP server.

Implement one change at a time, then issue the command again. By implementing one change at a time, you will be able to determine what works and what does not work. Knowing which change corrected the problems will help you to avoid this problem in future endeavors.
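When the reachability check is repeated after each change, the ping statistics can be read by a script instead of by eye. This Python sketch extracts the loss percentage from ping output in the format shown in the examples above. The function name is illustrative, and other ping implementations may print their statistics differently.

```python
import re


def ping_loss_percent(ping_output):
    """Extract the packet-loss percentage from ping statistics, or None if absent."""
    match = re.search(r"\((\d+)%\s*loss\)", ping_output)
    return int(match.group(1)) if match else None


# Statistics lines copied from the successful and failed ping examples.
good = "Packets: Sent = 4, Received = 4, Lost = 0 (0%loss),"
bad = "Packets: Sent = 4, Received = 0, Lost = 4 (100% loss),"
```

A result of 100 corresponds to the failed ping example, in which case the firewall ports and the FTP server status should be verified.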


Symptom The configuration download fails.

Probable cause and recommended action

Check the following:

• The FTP or SCP server’s host name is known to the switch.

Verify with your network administrator that the switch has access to the FTP server.

• The USB path is correct.

If your platform supports a USB memory device, verify that it is connected and running. Verify that the path name is correct. It should be the relative path from /usb/usbstorage/brocade/configdownload, or use the absolute path.

• The FTP or SCP server’s IP address can be contacted.

Verify that you can connect to the FTP server. Use your local PC to connect to the FTP server or ping the FTP server.

• There was no reason to disable the switch.

Note, however, that you must disable the switch for some configuration downloads. For more information on how to perform a configuration download without disabling a switch, refer to the Fabric OS Administrator’s Guide.


• You have permission on the host to perform configuration download.

There may be some restrictions if you are using Admin Domains or Role-Based Access Control. For more information on these types of restrictions, refer to the Fabric OS Administrator’s Guide.

• The configuration file you are trying to download exists on the host.

• The configuration file you are trying to download is a switch configuration file.

• If you selected the (default) FTP protocol, the FTP server is running on the host.

• The configuration file uses correct syntax.

• The username and password are correct.

Symptom The switch reboots during the configuration download.

Probable cause and recommended action

Issue the command again.

Gathering additional information

Be sure to capture the output from the commands you are issuing both from the switch and from your computer when you are analyzing the problem.

Send this and all logs to your switch support provider.

Messages captured in the logs

Configuration download generates both RASLog and Audit log messages resulting from execution of the configDownload command.

The following messages are written to the logs:

• configDownload completed successfully … (RASLog and Audit log)


• configUpload completed successfully … (RASLog)

• configDownload not permitted … (Audit log)

• configUpload not permitted … (RASLog)

• (Warning) Downloading configuration without disabling the switch was unsuccessful. (Audit log)
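
When you review captured logs, the messages listed above can be picked out with a simple text search. The sketch below is a hypothetical Python helper, not a Brocade tool. It assumes nothing about the real log line layout (timestamps, message IDs) and simply looks for the message text given in this section, reporting which log the section says records it.

```python
# Message text from the list above, mapped to the logs that record it.
KNOWN_MESSAGES = {
    "configDownload completed successfully": ("RASLog", "Audit log"),
    "configUpload completed successfully": ("RASLog",),
    "configDownload not permitted": ("Audit log",),
    "configUpload not permitted": ("RASLog",),
    "Downloading configuration without disabling the switch was unsuccessful": (
        "Audit log",
    ),
}


def find_config_messages(log_text):
    """Return (message, logs) pairs for each known message found in log_text.

    Matching is a case-insensitive substring test on each line.
    """
    hits = []
    for line in log_text.splitlines():
        lowered = line.lower()
        for message, logs in KNOWN_MESSAGES.items():
            if message.lower() in lowered:
                hits.append((message, logs))
    return hits
```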

Brocade configuration form

Use this form as a hard copy reference for your configuration information.

The hardware reference manuals for the Brocade 48000 and DCX modular switches include FC port setting tables. You can use these tables to record configuration information for the various blades.

TABLE 8 Brocade configuration and connection

Brocade configuration settings

IP address

Gateway address

Chassis configuration option

Management connections

Serial cable tag

Ethernet cable tag

Configuration information

Domain ID

Switch name

Ethernet IP address

Ethernet subnet mask

Total number of local devices (nsShow)

Total number of devices in fabric (nsAllShow)

Total number of switches in the fabric (fabricShow)


Chapter

ATTENTION

FirmwareDownload Errors

This chapter contains procedures to troubleshoot and fix common firmware download issues relating to a switch or an enterprise-class switch.

In this chapter

• Blade troubleshooting tips

• Firmware download issues

• Troubleshooting firmwareDownload

• Brocade DCX error handling

• USB error handling

• Considerations for downgrading firmware

Blade troubleshooting tips

This chapter refers to the following specific types of blades inserted into either the Brocade 48000 director or Brocade DCX:

• FC blades or port blades contain only Fibre Channel ports: Brocade FC4-16/32/48, FC10-6, and FC8-16/32/48.

• AP blades contain extra processors and specialized ports: Brocade FR4-18i, FC4-16IP, and FA4-18.

• CP blades have a control processor (CP) used to control the entire switch. They can be inserted only into slots 5 and 6 on the Brocade 48000, and slots 6 and 7 on the Brocade DCX.

• CR8 core blades provide ICL functionality between two DCX directors. They can be inserted only into slots 5 and 8 on the Brocade DCX.

Typically, issues detected during firmware download to AP blades do not require recovery actions on your part.

If you experience frequent failovers between CPs that have different versions of firmware, then you may notice multiple blade firmware downloads and a longer startup time.

Brocade 48000 with FR4-18i blades: If you are running v5.1.0 firmware, then you cannot downgrade to earlier versions without removing the blades.

Brocade 48000 with FC4-48 or FC4-16IP blades: If you are running Fabric OS v5.2.0, then you cannot downgrade to earlier versions without removing the blades.

Do not remove blades until the EX_Ports are removed first. The firmwareDownload command will indicate when the blades are safe to remove.


Brocade 48000 with FA4-18 or FC10-6 blades: If you are running Fabric OS v5.3.0, then you cannot downgrade to earlier versions without removing the blades.

Brocade 48000 with FC8-16 blades: If you are running Fabric OS v6.0, then you cannot downgrade to earlier versions without removing the blades.

Brocade DCX Director with FC8-16/32/48 blades: If you are running Fabric OS v6.1.0, then you cannot downgrade to pre-Fabric OS v6.0.0 versions as they are not supported on this director.

Symptom The blade is faulty (issue slotShow to confirm).

Probable cause and recommended action

If the port or application blade is faulty, enter the slotPowerOff and slotPowerOn commands for the port or application blade. If the port or application blade still appears to be faulty, remove it and re-insert it into the chassis.

Symptom The AP blade is stuck in the “LOADING” state (issue slotShow to confirm).

Probable cause and recommended action

If the blade remains in the loading state for a significant period of time, the firmware download will time out. Remove the blade and re-insert it. When it boots up, autoleveling will be triggered and the firmware download will be attempted again.

Firmware download issues

The following symptoms describe common firmware download issues and their recommended actions.

Symptom Server is inaccessible or firmware path is invalid.

Probable cause and recommended action

• The FTP or SCP server’s host name is not known to the switch.

Verify with your network administrator that the switch has access to the FTP server.

Verify the path to the FTP or SCP server is accessible from the switch. For more information on checking your FTP or SCP server, see Chapter 4, “Configuration Issues”.

• The USB path is not correct.

If your platform supports a USB memory device, verify that it is connected and running. Verify that the path name is correct by using the usbStorage -l command.

Example of usbStorage -l command

switch:admin> usbstorage -l
firmwarekey\ 0B 2007 Aug 15 15:13
support\ 106MB 2007 Aug 24 05:36
  support1034\ 105MB 2007 Aug 23 06:11
config\ 0B 2007 Aug 15 15:13
firmware\ 380MB 2007 Aug 15 15:13
  FW_v6.0.0\ 380MB 2007 Aug 15 15:13
Available space on usbstorage 74%
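
If you save the usbStorage -l output to a text file, a short script can confirm that the expected firmware directory is present. The following Python sketch is illustrative only. The entry layout (name ending in a backslash, size, date) is inferred from the single example above, the KB and GB units are an assumption (only B and MB appear in the example), and the function name is hypothetical.

```python
import re

# One entry looks like: firmware\ 380MB 2007 Aug 15 15:13
ENTRY = re.compile(
    r"(\S+)\\\s+(\d+)(KB|MB|GB|B)\s+(\d{4} \w{3} \d{1,2} \d{2}:\d{2})"
)
SPACE = re.compile(r"Available space on usbstorage (\d+)%")


def parse_usbstorage_listing(text):
    """Return (entries, available_percent) parsed from usbstorage -l text.

    entries is a list of dicts; available_percent is None when the
    summary line is missing.
    """
    entries = [
        {"name": name, "size": int(size), "unit": unit, "modified": stamp}
        for name, size, unit, stamp in ENTRY.findall(text)
    ]
    match = SPACE.search(text)
    available = int(match.group(1)) if match else None
    return entries, available
```

For example, checking that any returned entry has the name FW_v6.0.0 tells you the firmware directory from the listing above is on the device before you start firmwareDownload.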

Example of error message

Stealth200E:admin> firmwaredownload
Server Name or IP Address: 192.126.168.115
User Name: jdoe
File Name: /users/home/jdoe/firmware/v6.1.0
Network Protocol(1-auto-select, 2-FTP, 3-SCP) [1]: 2
Password:
Checking system settings for firmwaredownload...
Protocol selected: FTP
Trying address-->AF_INET IP: 192.126.168.115, flags : 2
Firmware access timeout.
The server is inaccessible or firmware path is invalid. Please make sure the server name or IP address, the user/password and the firmware path are valid. If SCP is selected, SSH server must support password authentication. If USB device was used for firmwaredownload, make sure it is plugged in and enabled on the Active CP.

Symptom Cannot download the requested firmware.

Probable cause and recommended action


The firmware you are trying to download on the switch is incompatible. Check the firmware version against the switch type. If the firmware is incompatible, retrieve the correct firmware version and try again.

Example of error message

SW3900:admin> firmwaredownload
Server Name or IP Address: 192.168.126.115
User Name: jdoe
File Name: /users/home/jdoe/firmware/v6.1.0
Network Protocol(1-auto-select, 2-FTP, 3-SCP) [1]: 2
Password: <hidden>
Checking system settings for firmwaredownload...
Protocol selected: FTP
Trying address-->AF_INET IP: 192.168.126.115, flags : 2
Cannot download the requested firmware because the firmware doesn't support this platform. Please enter another firmware path.
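
The two error messages shown so far can be told apart automatically in a saved session log. This Python sketch is a hypothetical helper, not a Fabric OS feature. It matches only the two phrases that appear in the examples above and maps each to the symptom heading used in this chapter.

```python
# Phrase from the captured output -> symptom heading in this chapter.
FAILURE_MARKERS = [
    ("Firmware access timeout",
     "Server is inaccessible or firmware path is invalid."),
    ("doesn't support this platform",
     "Cannot download the requested firmware."),
]


def diagnose_firmwaredownload(output):
    """Return the symptom headings whose marker phrase occurs in output."""
    return [symptom for marker, symptom in FAILURE_MARKERS if marker in output]
```

An empty result means only that neither phrase was found; consult the Fabric OS Message Reference for other messages.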

Symptom Cannot download on a switch with Interop turned on.

Probable cause and recommended action

On single-CP systems, an Interop fabric does not support Coordinated HotCode Load (HCL).

Perform a firmwareDownload -o command. The -o operand bypasses the Coordinated HotCode Load check. On single-CP systems in interop fabrics, the HCL protocol is used to ensure data traffic is not disrupted during firmware upgrades. This option allows firmwareDownload to continue even if HCL is not supported in the fabric or the protocol fails. Using this option may cause traffic disruption for some switches in the fabric.


Symptom You receive a BadRootDev error message.

Probable cause and recommended action

This error can occur when you perform a firmwareDownload on a 3900 or 4100 (a single-bladed/4.x switch). During the firmware download process, the boot environment variables are set incorrectly, which causes a BadRootDev error message to appear.

You can find the information contained in the examples below in the output from the supportShow command.

Example of normal boot environment variable

/sbin/bootenv:
AutoLoad=yes
ENET_MAC=006069900E8C
InitTest=MEM()
LoadIdentifiers=Fabric Operating System;Fabric Operating System
OSBooted=MEM()0xF0000000
OSLoadOptions=quiet;quiet
OSLoader=MEM()0xF0000000;MEM()0xF0800000
OSRootPartition=hda2;hda1
SkipWatchdog=yes

The following example shows the boot environment variables after a firmware upgrade has occurred without a proper reboot. This environment variable is temporarily placed there by the firmware upgrade process. To fix this issue, reboot the switch. If the problem persists, contact your switch support provider.

Example of switch needing a reboot

/sbin/bootenv:
AutoLoad=yes
AutoLoadTimeout=0
ENET_MAC=006069906014
LoadIdentifiers=IDE w/ XFS;IDE w/ XFS & NFS Root
OSLoadOptions=quiet;quiet
OSLoader=MEM()0xF0800000;MEM()0xF0000000
OSRootPartition=hda2;hda1
SkipWatchdog=yes
SoftUpgrade=commit
Upgrade=/dev/hda2

Example of BadRootDev environment variable

This environment variable will need to be removed before continuing:

/sbin/bootenv:
AutoLoad=yes
BadRootDev=hda2
ENET_MAC=00051e3411ad
InitTest=MEM()
LoadIdentifiers=Fabric Operating System;Fabric Operating System
OSBooted=MEM()0xF0000000
OSLoadOptions=quiet;quiet
OSLoader=MEM()0xF0000000;MEM()0xF0800000
OSRootPartition=hda1;hda2
SkipWatchdog=yes
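
The three boot environment examples above differ only in a few variables, so the bootenv section of a saved supportShow output can be classified mechanically. The Python sketch below is an illustration under stated assumptions: it treats BadRootDev as the error case and SoftUpgrade or Upgrade as the sign of a pending reboot, based solely on the examples in this section. The function name and the state labels are hypothetical.

```python
import re

# Variables that distinguish the three examples in this section.
FLAG = re.compile(r"\b(BadRootDev|SoftUpgrade|Upgrade)=(\S+)")


def classify_bootenv(text):
    """Return (state, flags) for a block of /sbin/bootenv output.

    state is "bad_root_dev", "needs_reboot", or "normal"; flags holds
    the distinguishing variables that were found.
    """
    flags = dict(FLAG.findall(text))
    if "BadRootDev" in flags:
        return "bad_root_dev", flags
    if "SoftUpgrade" in flags or "Upgrade" in flags:
        return "needs_reboot", flags
    return "normal", flags
```

The pattern works whether the variables appear one per line or run together on a single line, because each match ends at the first whitespace after the value.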


Troubleshooting firmwareDownload

NOTE

A network diagnostic script and preinstallation check are part of the firmwareDownload procedure. They perform troubleshooting and automatically check for any blocking conditions. If the firmware download fails, see the Fabric OS Message Reference for details about error messages. Also see “Considerations for downgrading firmware”.

If a firmware download fails in a director, the firmwareDownload command synchronizes the firmware on the two partitions of each CP by starting a firmware commit operation. Wait at least 15 minutes for this commit operation to complete before attempting another firmware download.

If the firmware download fails in a director or enterprise-class platform, the CPs may end up with different versions of firmware and be unable to achieve HA synchronization. In such cases, issue the firmwareDownload -s command on the standby CP; the single mode (-s) option allows you to upgrade the firmware on the standby CP to match the firmware version running on the active CP. Then re-issue the firmwareDownload command to download the desired firmware version to both CPs. For example, if CP0 is running v5.2.0 on the primary and secondary partitions, and CP1 is running v5.0.1 on the primary and secondary partitions, then synchronize them by issuing the firmwareDownload command.

See the Fabric OS Message Reference for detailed information about .plist-related error messages.


For more information on any of the commands in the Recommended Action section, see the Fabric OS Command Reference.

Some of the messages include error codes (as shown in the example below). These error codes are for internal use only and you can disregard them.

Example: Port configuration with EX ports enabled along with trunking for port(s) 63, use the portCfgEXPort, portCfgVEXPort, and portCfgTrunkPort commands to remedy this. Verify blade is ENABLED. (error 3)

Gathering additional information

You should follow these best practices for firmware download before you start the procedure:

• Keep all session logs.

• Enter the supportSave or the supportShow command before and after entering the firmwareDownload command.

• If a problem persists, package together all of the information (the Telnet session logs and serial console logs, output from the supportSave command) for your switch support provider. Make sure you identify what information was gathered before and after issuing the firmwareDownload command.

Brocade DCX error handling

If an error occurs in the middle of a firmware download, the command will make sure the two partitions of a CF card have the same version of firmware. However, the main CPU and co-CPU may have different versions of firmware. Table 9 describes error handling at different steps of the firmware download process and the required action you need to perform.


TABLE 9 Brocade DCX CP Error Handling

Scenario 1 (during step 1)
Scenario: During downloading to the main CPU of the standby CP, an error occurs and the main CPU reboots.
Error handling:
1. When the main CPU boots up, firmwareDownload is aborted.
2. The firmwareCommit command will be initiated on the main CPU and the original firmware is restored on that CPU.
3. Both CPUs on both CPs will have the original firmware.
Required action: Restart firmwareDownload after the repair is done.

Scenario 2 (during step 2)
Scenario: During downloading to the co-CPU of the standby CP, an error occurs and the co-CPU reboots.
Error handling:
1. When the co-CPU boots up, firmwareDownload is aborted.
2. The firmwareCommit command will be initiated on both CPUs on the standby CP.
3. Both CPUs on both CPs will have the original firmware.
Required action: Restart firmwareDownload after the repair is done.

Scenario 3 (during step 1 or 2)
Scenario: During downloading to any of the CPUs on the standby CP, the downloading takes too long and exceeds the 30 minute timeout for the firmwareDownload process.
Error handling:
1. The firmwareCommit command will be initiated on both CPUs on the standby CP and the original firmware is restored on both CPUs.
2. Both CPUs on both CPs will have the original firmware.
Required action: Restart firmwareDownload after the repair is done.

Scenario 4 (during step 1 or 2)
Scenario: The active CP fails over during the downloading to any of the CPUs on the standby CP.
Error handling:
1. The firmwareCommit command will be initiated on both CPUs on the new active CP and the original firmware is restored on both CPUs.
2. Both CPUs on both CPs will have the original firmware.
Required action: Restart firmwareDownload after the repair is done.

Scenario 5 (during step 4)
Scenario: The standby CP failed to reboot or is unable to synchronize with the active CP.
Error handling:
1. Active CP will wait for 10 minutes and abort firmwareDownload.
2. If the standby CP boots up, firmwareCommit will start on both CPUs on the standby CP.
3. Both CPUs on the standby CP will have the new firmware, and both CPUs on the active CP will have the old firmware.
Required action: Determine why the CPs fail to gain HA sync and remedy it before restarting firmwareDownload.

Scenario 6 (during step 6)
Scenario: When downloading to the main CPU of the standby CP, an error occurs and the main CPU reboots.
Error handling:
1. When the main CPU on the standby CP boots up, firmwareDownload is aborted.
2. The firmwareCommit command will be initiated on both CPUs on both CPs.
3. Both CPUs on the standby CP will have the old firmware and both CPUs on the active CP will have the new firmware.
Required action: Restart firmwareDownload after the repair is done.

Scenario 7 (during step 7)
Scenario: When downloading to the co-CPU of the standby CP, an error occurs and the co-CPU reboots.
Error handling:
1. When the co-CPU on the standby CP boots up, firmwareDownload is aborted.
2. The firmwareCommit command will be initiated on both CPUs on both CPs.
3. Both CPUs on the standby CP will have the old firmware and both CPUs on the active CP will have the new firmware.
Required action: Restart firmwareDownload after the repair is done.

Scenario 8 (during step 6 or 7)
Scenario: When downloading to any of the CPUs on the standby CP, downloading takes too long and exceeds the 30 minute timeout for the firmwareDownload process.
Error handling:
1. The firmwareCommit command will be initiated on both CPUs on both CPs.
2. Both CPUs on the standby CP will have the old firmware and both CPUs on the active CP will have the new firmware.
Required action: Restart firmwareDownload after the repair is done.

Scenario 9 (during step 6 or 7)
Scenario: The active CP fails over during the downloading to any of the CPUs on the standby CP.
Error handling:
1. The standby CP will become the new active CP.
2. The firmwareCommit command will be initiated on both CPUs on the active CP and they will have the old firmware.
3. The firmwareCommit command will be initiated on both CPUs on the standby CP and they will have the new firmware.
Required action: Restart firmwareDownload after the repair is done.

Scenario 10 (during step 8)
Scenario: The standby CP failed to reboot or is unable to synchronize with the active CP.
Error handling:
1. Active CP will wait for 10 minutes and abort firmwareDownload.
2. The firmwareCommit command will be initiated on both CPUs and they will have the new firmware.
3. If the standby CP boots up, firmwareCommit will start on both CPUs and they will have new firmware.
Required action: Determine why the CPs fail to gain HA sync and remedy it before restarting the firmwareDownload command.

Scenario 11 (during step 10)
Scenario: Commit fails.
Error handling: The affected CPUs will have different versions of firmware on their partitions. An error message is logged.
Required action: Run the firmwareCommit -f command on the switch or each CP that fails. If the problem persists, contact your switch support provider.


USB error handling

The following table outlines how the USB device handles errors under specific scenarios and details what actions you should take after the error occurs.

TABLE 10 USB error handling

Scenario under which download fails: An access error occurs during firmwaredownload due to the removal of the USB device, a USB device hardware failure, etc.
Error handling: Firmwaredownload will time out and commit will be started to repair the partitions of the CPUs that are affected. See Table 9 for details.
Action: None.

Scenario under which download fails: USB device is not enabled.
Error handling: Firmwaredownload will fail with an error message.
Action: Enable the USB device using the usbStorage -e command and retry firmwaredownload.

Considerations for downgrading firmware

To avoid failure of a firmware downgrade, verify that the firmware you are downgrading to supports all the blades in the chassis and all the features you are currently using. If not, you will need to disable or remove those features that are not supported. Also, check for any one of the following conditions:

• If an FC8-32 and/or FC8-48 port blade is inserted in a Brocade 48000 director or Brocade DCX enterprise-class platform, power off and remove the blade prior to downgrading the firmware.

• If an EX_Port is configured and enabled on any one of the FC8-port blades, reconfigure the port back to default prior to downgrading the firmware.

• If port mirroring is configured and enabled on any one of the FC8-port blades, reconfigure the port back to default prior to downgrading the firmware.

• If Access Gateway ADS policy is enabled, disable ADS policy prior to downgrading the firmware.

• If F_Port Trunking is enabled, disable it prior to downgrading.

Preinstallation messages

The messages in this section are displayed if an exception case is encountered during firmware download. The following example shows feature-related messages that you may see if you were downgrading from v5.2.0 to v5.1.0:

The following items need to be addressed before downloading the specified firmware:

Port mirror connections detected. Please use portmirror --delete to remove these mirror connections.

AD feature is in use. Please clear it using the ad --clear command.

Port configuration with EX ports enabled along with trunking for port(s) 58, use the portcfgexport, portcfgvexport, and/or portcfgtrunkport commands to disable the port configuration. Verify that the blade is ENABLED. (error 3)


This example shows hardware-related messages for the same downgrade example:

director:admin> firmwaredownload
Type of Firmware (FOS, SAS, or any application) [FOS]:
Server Name or IP Address: 192.168.32.10
Network Protocol (1-auto-select, 2-FTP, 3-SCP) [1]:
User Name: userfoo
File Name: /home/userfoo/dist/v5.3.0
Password:
Verifying the input parameters ...
Checking system settings for firmwaredownload...

The following items need to be addressed before downloading the specified firmware:

AP BLADE type 31 is inserted. Please use slotshow to find out which slot it is in and remove it.
SW BLADE type 36 is inserted. Please use slotshow to find out which slot it is in and remove it.
Firmwaredownload command failed.

director:admin>

Blade types

These messages pertain to any blade in a chassis that may need to be removed or powered off before a firmwaredownload begins.

Message AP Blade type 24 is inserted. Please use slotshow to find out which slot it is in and remove it.

Probable cause and recommended action

The firmware download operation was attempting to download Fabric OS v5.0.0 with one or more Brocade FR4-18i port blades (blade ID 24) in the system. Brocade FR4-18i port blades are not supported on firmware v5.0.0 or earlier, so the firmware download operation is aborted.

Use the slotShow command to display which slots the Brocade FR4-18i port blades occupy, and physically remove the blades from the chassis. Retry the firmware download operation.

Message AP Blade type 31 is inserted. Please use slotshow to find out which slot it is in and remove it.

Probable cause and recommended action

The firmware download operation was attempting to downgrade a system to Fabric OS v5.1.0 or earlier with one or more Brocade FC4-16IP port blades (blade ID 31) in the system. Brocade FC4-16IP port blades are not supported on firmware v5.1.0 or earlier, so the firmware download operation failed.

Use the slotShow command to display which slots the Brocade FC4-16IP port blades occupy. Physically remove the blades from the chassis, or use the micro-switch to turn the blade off. Retry the firmware download operation.


Message AP Blade type 33 is inserted. Please use slotshow to find out which slot it is in and remove it.

Cannot downgrade due to the presence of AP BLADE type 33. Remove or power off these blades before proceeding.

Probable cause and recommended action

The firmware download operation was attempting to download Fabric OS v5.0.0 with one or more Brocade FA4-18 port blades (blade ID 33) in the system. Brocade FA4-18 port blades are not supported on firmware v5.0.0 or earlier, so the firmware download operation is aborted.

Use the slotShow command to display which slots the Brocade FA4-18 port blades occupy, and physically remove the blades from the chassis. Retry the firmware download operation.

Message SW Blade type 36 is inserted. Please use slotshow to find out which slot it is in and remove it.

Probable cause and recommended action

The firmware download operation was attempting to downgrade a system to Fabric OS v5.1.0 or earlier with one or more Brocade FC4-48 port blades (blade ID 36) in the system. Brocade FC4-48 port blades are not supported on firmware v5.1.0 or earlier, so the firmware download operation failed.

Use the slotShow command to display which slots the Brocade FC4-48 port blades occupy. Physically remove the blades from the chassis, or use the micro-switch to turn the blade off. Retry the firmware download operation.

Message SW Blade type 37 is inserted. Please use slotshow to find out which slot it is in and remove it.

Probable cause and recommended action

The firmware download operation was attempting to downgrade a system to Fabric OS v5.3.0 or earlier with one or more Brocade FC8-16 port blades (blade ID 37) in the system. Brocade FC8-16 port blades are not supported on firmware v5.3.0 or earlier, so the firmware download operation failed.

Use the slotShow command to display which slots the Brocade FC8-16 port blades occupy. Physically remove the blades from the chassis, or use the micro-switch to turn the blade off. Retry the firmware download operation.

Message SW Blade type 39 is inserted. Please use slotshow to find out which slot it is in and remove it.

Probable cause and recommended action

The firmware download operation was attempting to downgrade a system to Fabric OS v5.2.0 or earlier with one or more Brocade FC10-6 port blades (blade ID 39) in the system. Brocade FC10-6 port blades are not supported on firmware v5.2.0 or earlier, so the firmware download operation failed.

Use the slotShow command to display which slots the Brocade FC10-6 port blades occupy. Physically remove the blades from the chassis, or use the micro-switch to turn the blade off. Retry the firmware download operation.


Message SW Blade type 51 is inserted. Please use slotshow to find out which slot it is in and remove it.

Probable cause and recommended action

The firmware download operation was attempting to downgrade a system to Fabric OS v5.3.0 or earlier with one or more Brocade FC8-48 port blades (blade ID 51) in the system. Brocade FC8-48 port blades are not supported on firmware v5.3.0 or earlier, so the firmware download operation failed.

Use the slotShow command to display which slots the Brocade FC8-48 port blades occupy. Physically remove the blades from the chassis, or use the micro-switch to turn the blade off. Retry the firmware download operation.

Message SW Blade type 55 is inserted. Please use slotshow to find out which slot it is in and remove it.

Probable cause and recommended action

The firmware download operation was attempting to downgrade a system to Fabric OS v5.3.0 or earlier with one or more Brocade FC8-32 port blades (blade ID 55) in the system. Brocade FC8-32 port blades are not supported on firmware v5.3.0 or earlier, so the firmware download operation failed.

Use the slotShow command to display which slots the Brocade FC8-32 port blades occupy. Physically remove the blades from the chassis, or use the micro-switch to turn the blade off. Retry the firmware download operation.
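
The blade type numbers in these messages can be translated to blade models with a small lookup. The Python sketch below is a hypothetical helper built only from the blade IDs named in this section. Blade ID 33 is taken to be the FA4-18, following the chassisConfig option 5 listing later in this chapter. Any ID not listed here is reported as unknown rather than guessed.

```python
import re

# Blade IDs and models as named in the messages of this section.
BLADE_MODELS = {
    24: "FR4-18i",
    31: "FC4-16IP",
    33: "FA4-18",  # assumption: per the option 5 blade list
    36: "FC4-48",
    37: "FC8-16",
    39: "FC10-6",
    51: "FC8-48",
    55: "FC8-32",
}

MESSAGE = re.compile(r"\b(AP|SW) BLADE type (\d+) is inserted", re.IGNORECASE)


def blades_to_remove(output):
    """Return (kind, blade_id, model) for each blade message in output."""
    found = []
    for kind, blade_id in MESSAGE.findall(output):
        number = int(blade_id)
        found.append((kind.upper(), number, BLADE_MODELS.get(number, "unknown")))
    return found
```

You still need the slotShow command to find which slots the reported blades occupy; the lookup only names the model.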

Firmware versions

These messages refer to differences between the current firmware and the firmware you are applying to the switch.

Message Cannot upgrade directly to v6.1.0. Upgrade your switch to v6.0.0 first before upgrading to the requested version.

Probable cause and recommended action

If the switch is running v5.3.0 or earlier, you will not be allowed to upgrade directly to v6.1.0 because of the “two-version” rule.

Upgrade your switch to Fabric OS version v6.0.0 before upgrading to v6.1.0.

Message Cannot upgrade directly to v6.0. Upgrade your switch to v5.2.1_NI1 or v5.3.0 first before upgrading to the requested version.

Probable cause and recommended action

If the switch is running v5.2.1 or earlier, you will not be allowed to upgrade directly to v6.0 because of the “two-version” rule.

Upgrade your switch to Fabric OS version v5.2.1_NI1 or v5.3.0 before upgrading to v6.0.

Message Cannot upgrade directly to v5.3.0. Upgrade your switch to v5.1 or v5.2 first before upgrading to the requested version.

Probable cause and recommended action

If the switch is running v5.0.0 or earlier, you will not be allowed to upgrade directly to v5.3.0 because of the “two-version” rule.

Upgrade your switch to Fabric OS version v5.1.0 or v5.2.0 before upgrading to v5.3.0.
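
The three “two-version” messages above amount to a small rule table, which can be checked before you start a download. The following Python sketch encodes only those three rules and is not how firmwareDownload itself decides. It compares versions at the major.minor level, so it assumes any v5.0.x release counts as “v5.0.0 or earlier”, and it treats the intermediate versions named in each message as explicitly allowed. The function and table names are hypothetical.

```python
import re

# target (major, minor) -> (highest blocked (major, minor), install first)
RULES = {
    (6, 1): ((5, 3), ["v6.0.0"]),
    (6, 0): ((5, 2), ["v5.2.1_NI1", "v5.3.0"]),
    (5, 3): ((5, 0), ["v5.1.0", "v5.2.0"]),
}


def _major_minor(version):
    """Extract (major, minor) from strings such as 'v5.2.1_NI1' or 'v6.0'."""
    match = re.match(r"v?(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version: {version!r}")
    return int(match.group(1)), int(match.group(2))


def versions_to_install_first(current, target):
    """Return intermediate versions needed before upgrading to target.

    An empty list means none of the three rules blocks a direct upgrade.
    None means this section gives no rule for the target version.
    """
    rule = RULES.get(_major_minor(target))
    if rule is None:
        return None
    blocked_up_to, install_first = rule
    if current in install_first:
        return []  # already on a version the message names as acceptable
    if _major_minor(current) <= blocked_up_to:
        return list(install_first)
    return []
```

For instance, a switch on v5.2.1 targeting v6.0.0 gets the list v5.2.1_NI1 and v5.3.0, while a switch already on v5.2.1_NI1 gets an empty list.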

Message Firmwaredownload of blade application firmware failed. Reissue firmwareDownload to recover.

Probable cause and recommended action

The firmware download operation was attempting to upgrade the SAS image while the blade was operational.

Retry the firmwareDownload command.

IP settings

These messages refer to any IP settings that need to be fixed prior to downgrading the firmware.

Message Cannot downgrade due to the presence of IPv6 addresses on the switch. Please reconfigure these addresses before proceeding. (Firmwaredownload will tell you which addresses are configured with IPv6 and commands used to remedy.)

Probable cause and recommended action

If the switch is running v5.3.0 or later and any IPv6 addresses are configured (for example, the switch IP address, syslog IP addresses, or RADIUS server), you cannot downgrade to a version that does not support IPv6.

Use the ipAddrSet command to change the IPv6 addresses to IPv4 addresses.

Message Default IP filter policies are not active on the switch. Please make the default IPv4 filter policies active before downgrading.

Probable cause and recommended action

If the switch is running v5.3.0 or later, and if any of the user-created IP filter policies are active, you cannot downgrade to v5.2.0 or earlier.

Message Cannot downgrade to v5.3.0 or earlier because one or more FCIP tunnels have IPv6 interface or a route address configured on the switch.

Probable cause and recommended action

You cannot downgrade to v5.3.x or earlier due to one or more FCIP tunnels having an Ethernet interface or a route address configured with an IPv6 address.

Use the ipAddrSet command to reconfigure to IPv4 addresses first.

Message Cannot downgrade to v5.2.0 or earlier because user created IP filter policies are active.

Probable cause and recommended action

The default IP filter policies are not active on the switch. Make the default IPv4 filter policies active before downgrading.


Platform

These messages refer to switch features or fabric-wide settings that need to be removed or disabled before downgrading the firmware.

Message Only platform option 5 is supported by version 6.1.0. Use chassisconfig to reset the option before downloading the firmware.

Probable cause and recommended action

The firmware download operation was attempting to upgrade a system to Fabric OS v6.1.0. The chassisConfig option was set to 1, 2, 3 or 4, which are not supported in v6.1.0, so the firmware download operation was aborted.

Execute the chassisConfig command with a supported option (5 for Brocade 48000 and the Brocade DCX enterprise-class platform on v6.1.0), and then retry the firmware download operation.

The supported options are:

option 5 One 384-port switch with the following configuration:

FC4-16 (blade ID 17), FC4-32 (blade ID 18), FR4-18i (blade ID 24), FC4-16IP (blade ID 31), FA4-18 (blade ID 33), FC4-48 (blade ID 36), FC10-6 (blade ID 39) on slots 1–4 and 7–10; CP4 (blade ID 16) on slots 5–6

Message Only platform option 5 is supported by version 6.0. Use chassisconfig to reset the option before downloading the firmware.

Probable cause and recommended action

The firmware download operation was attempting to upgrade a system to Fabric OS v6.0.0. The chassisConfig option was set to 2, 3 or 4, which are not supported in v6.0.0, so the firmware download operation was aborted.

Execute the chassisConfig command with a supported option (1 or 5 for Brocade 48000 for Fabric OS v5.3.0; and 5 for Brocade 48000 and the Brocade DCX enterprise-class platform on v6.0.0), and then retry the firmware download operation.

The supported options are:

option 1 One 128-port switch with the following configuration:

FC2-16 (blade ID 4), FC4-16 (blade ID 17) on slots 1–4 and 7–10; CP2 (blade ID 5), CP4 (blade ID 16) on slots 5–6

option 5 One 384-port switch with the following configuration:

FC4-16 (blade ID 17), FC4-32 (blade ID 18), FR4-18i (blade ID 24), FR4-18i (blade ID 31), FA4-18 (blade ID 33), 36, FC10-6 (blade ID 39) on slots 1–4 and 7–10; CP4 (blade ID 16) on slots 5–6

Message Cannot upgrade to firmware v6.0.0. This firmware does not support Brocade 24000 platform.

Probable cause and recommended action

The Brocade 24000 does not support firmware v6.0.0. Download firmware v5.3.x on this platform.


Message The active security DB size is greater than 256 KB, you will not be allowed to downgrade to below v6.0.0.

Probable cause and recommended action

You cannot downgrade because the active security database size is greater than 256 KB. Reduce the size before downgrading.

Message Cannot downgrade to v5.3.0 or earlier because FIPS mode is enabled.

Probable cause and recommended action

You cannot downgrade because FIPS mode is enabled. Disable FIPS on the switch before continuing.

Message Cannot downgrade to v5.3.0 or earlier because LDAP is configured on the switch.

Probable cause and recommended action

LDAP is configured on the switch. Delete the LDAP configuration before continuing.

Message Cannot downgrade to v5.3.0 or earlier because one or more F_Ports have Preferred port set.

Probable cause and recommended action

You cannot downgrade to v5.3.x or earlier because one or more F_Ports have a Preferred port set.

Use the ag --prefdel command to remedy this before proceeding.

Message Cannot downgrade to v5.3.0 or earlier because one or more Access Gateway policies are enabled.

Probable cause and recommended action

You cannot downgrade to v5.3.x or earlier because one or more Access Gateway policies are enabled.

Use the ag --policydisable command to remove the policies.

Message Cannot downgrade to v5.3.0 or earlier because FCS is configured for fabric wide distribution.

Probable cause and recommended action

You cannot downgrade to v5.3.0 or earlier because FCS is configured for fabric-wide distribution.

Use the fddCfg command to disable it before proceeding.

Message Cannot downgrade to v5.2.0 or earlier because the switch is currently configured with PEAP/MSCHAPv2 for RADIUS authentication.

Probable cause and recommended action

The switch is currently configured with PEAP/MSCHAPv2 for RADIUS authentication.

Use the aaaConfig command to remove that entry before proceeding.
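The messages above each pair a blocking feature with a command. A minimal sketch of that pairing as a lookup table follows. The helper name `remedy_for` and the keyword matching are illustrative assumptions and are not part of Fabric OS; the command names are taken only from the recommended actions in this section.

```python
# Hypothetical helper (not a Fabric OS API): maps a keyword found in a
# downgrade-blocking message to the command this guide recommends.
REMEDIES = [
    ("Preferred port", "ag --prefdel"),
    ("Access Gateway policies", "ag --policydisable"),
    ("FCS is configured", "fddCfg"),
    ("PEAP/MSCHAPv2", "aaaConfig"),
]


def remedy_for(message):
    """Return the recommended command for a message, or None if unknown."""
    lowered = message.lower()
    for keyword, command in REMEDIES:
        if keyword.lower() in lowered:
            return command
    return None
```

A table like this is only a reading aid; each command still needs the arguments described in the Fabric OS Command Reference.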


Port settings

These messages refer to port settings that need to be fixed before downgrading the switch’s firmware.

Message Cannot downgrade to v5.2.0 or lower due to GE port(s) has MTU size configured between 1261 to 1499 bytes. Please use portcfg command to reconfigure the MTU size and try again.

Probable cause and recommended action

If a GE port has its MTU size configured between 1261 and 1499 bytes, you will not be allowed to downgrade to v5.2.0 or earlier.

Use the portCfg command to reconfigure the MTU size and try again.

Message Cannot downgrade to v5.2.0 or lower because GE ports has IPSec and Fastwrite enabled. Please use portcfg command to disable Fastwrite and try again.

Probable cause and recommended action

If a GE port has IPSec and Fastwrite enabled, you cannot downgrade to v5.2.0 or earlier.

Use the portCfg command to disable IPSec and try again.

Message Cannot downgrade to v5.2.0 or lower because GE port(s) has DHCP enabled. Please use portcfg command to disable it and try again.

Probable cause and recommended action

If a GE port has DHCP enabled, you will not be allowed to downgrade to v5.2.0 or earlier.

Use the portCfg command to disable it and try again.

Message Cannot downgrade to v5.3.0 or earlier because one or more GE ports have VLAN Tagging configured.

Probable cause and recommended action

You cannot downgrade to v5.3.x or earlier because one or more GE ports have VLAN Tagging configured.

Use the portCfg command to reconfigure the FCIP tunnel or delete the VLAN Tagging entries.

Message Cannot downgrade due to presence of port mirror connections. Use portmirror --delete to remove these mirror connections before proceeding.

Probable cause and recommended action

The firmware download operation was attempting to downgrade a system to Fabric OS v5.1.0 or earlier with Port Mirroring enabled. Port Mirroring is not supported on firmware v5.1.0 or earlier, so the firmware download operation failed.

Remove the mirror connections using the portMirror --delete command. Retry the firmware download operation.

Message The command failed due to presence of long-distance ports in L0.5 mode. Please remove these settings before proceeding.

Probable cause and recommended action

The firmware download operation was attempting to upgrade a system to Fabric OS v6.0.0 with long-distance ports in L0.5, L1, or L2 modes. Long-distance ports in these modes are not supported in firmware v6.0.0 or later, so the firmware upgrade operation failed.

L0 Specify L0 to configure the port to be a regular switch port. A total of 20 full-size frame buffers are reserved for data traffic, regardless of the port’s operating speed; therefore, the maximum supported link distance is 10 km, 5 km, or 2.5 km for the port at speeds of 1 Gbps, 2 Gbps, or 4 Gbps, respectively.

LE Specify LE for E_Ports at distances beyond 5 km and up to 10 km. A total of 5, 10, or 20 full-size frame buffers are reserved for port speeds of 1 Gbps, 2 Gbps, or 4 Gbps, respectively. LE does not require an Extended Fabrics license.

LD Specify LD for automatic long-distance configuration. The buffer credits for the given E_Port are automatically configured, based on the actual link distance. Up to a total of 250 full-size frame buffers are reserved, depending upon the distance measured during E_Port initialization. If the desired distance is provided, it is used as the upper limit to the measured distance. For Bloom1-based systems, the number of frame buffers is limited to 63.

LS Specify LS mode to configure a long-distance link with a fixed buffer allocation. Up to a total of 250 full-size frame buffers are reserved for data traffic, depending on the desired distance value provided with the portCfgLongDistance command. For Bloom1-based systems, the number of frame buffers is limited to 63.

Message The command failed due to presence of long-distance ports in LS mode. Please remove these settings before proceeding.

Probable cause and recommended action

The firmware download operation was attempting to downgrade a system to Fabric OS v5.0.0 or earlier with long-distance ports in LS mode. Long-distance ports in LS mode are not supported in firmware v5.0.0 or earlier, so the firmware download operation failed.

Change the long distance port setting to a supported distance setting using the portCfgLongDistance command and then retry the firmware download operation. The supported settings are:

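L0 Specify L0 to configure the port to be a regular switch port. A total of 20 full-size frame buffers are reserved for data traffic, regardless of the port’s operating speed; therefore, the maximum supported link distance is 10 km, 5 km, or 2.5 km for the port at speeds of 1 Gbps, 2 Gbps, or 4 Gbps, respectively.

L0.5 Specify L0.5 (portCfgShow displays the two-letter code as LM) long distance to support a long distance link of up to 25 km. A total of 12, 25, or 50 full-size frame buffers are reserved for data traffic for the port at speeds of 1 Gbps, 2 Gbps, or 4 Gbps, respectively.

L1 Specify L1 long distance to support a long distance link up to 50 km. A total of 25, 50, or 100 full-size frame buffers are reserved for data traffic for the port at speeds of 1 Gbps, 2 Gbps, or 4 Gbps, respectively.

L2 Specify L2 long distance to support a long distance link up to 100 km. A total of 50, 100, or 200 full-size frame buffers are reserved for data traffic for the port at speeds of 1 Gbps, 2 Gbps, or 4 Gbps, respectively.

LE Specify LE for E_Ports at distances beyond 5 km and up to 10 km. A total of 5, 10, or 20 full-size frame buffers are reserved for port speeds of 1 Gbps, 2 Gbps, or 4 Gbps, respectively. LE does not require an Extended Fabrics license.

LD Specify LD for automatic long-distance configuration. The buffer credits for the given E_Port are automatically configured, based on the actual link distance. Up to a total of 250 full-size frame buffers are reserved, depending upon the distance measured during E_Port initialization. If the desired distance is provided, it is used as the upper limit to the measured distance. For Bloom1-based systems, the number of frame buffers is limited to 63.

LS Specify LS mode to configure a long-distance link with a fixed buffer allocation. Up to a total of 250 full-size frame buffers are reserved for data traffic, depending on the desired distance value provided with the portCfgLongDistance command. For Bloom1-based systems, the number of frame buffers is limited to 63.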
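The fixed buffer reservations quoted in the mode descriptions in this section can be collected into a small lookup, which may help when comparing modes before changing a port. This is a sketch only: the table name and function are hypothetical, and the values are copied from the text rather than read from a switch.

```python
# Frame buffers reserved per long-distance mode at 1, 2, and 4 Gbps,
# copied from the mode descriptions in this section.
RESERVED_BUFFERS = {
    "L0":   {1: 20, 2: 20, 4: 20},
    "LE":   {1: 5, 2: 10, 4: 20},
    "L0.5": {1: 12, 2: 25, 4: 50},
    "L1":   {1: 25, 2: 50, 4: 100},
    "L2":   {1: 50, 2: 100, 4: 200},
}


def reserved_buffers(mode, speed_gbps):
    """Look up the fixed reservation for a mode and port speed."""
    if mode in ("LD", "LS"):
        # LD and LS reserve up to 250 buffers depending on distance,
        # so they have no fixed per-speed value.
        raise ValueError("LD and LS reservations depend on link distance")
    return RESERVED_BUFFERS[mode][speed_gbps]
```

For example, `reserved_buffers("L1", 2)` returns 50, matching the L1 description.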

Message Cannot downgrade to v5.3.0 or earlier because QoS is enabled.

Probable cause and recommended action

QoS is enabled on the local switch. Disable QoS on all the ports before downgrading.

Use the portCfg qos [Slot/]Port[-Range] –disable command to disable QoS.

Message Cannot downgrade to v5.3.0 or earlier because FCIP tunnel(s) have VC QoS mapping enabled.

Probable cause and recommended action

You cannot downgrade to v5.3.x or earlier due to the FCIP tunnels having VC QoS map configured.

Use the portCfg command to delete it.

Message An SNMP trap port is set to non-default, you will not be allowed to downgrade to below v6.0.0.

Probable cause and recommended action

The SNMP trap port was set to non-default. Remove the SNMP trap port setting before downgrading.

Message Cannot downgrade to v5.3.0 or earlier because FICON is configured on the switch.

Probable cause and recommended action

You cannot downgrade to v5.3.0 or earlier because FICON is configured.

Use the portCfg command to remove it.

Message Cannot downgrade to v5.3.0 or earlier because FTRACE is configured on the switch.

Probable cause and recommended action

You cannot downgrade to v5.3.0 or earlier because FTRACE is configured.

Use the portCfg command to remove it.

Message Cannot downgrade to v5.3.0 or earlier because one or more EX/VEX ports are online on the FR4-18i.

Probable cause and recommended action

You cannot downgrade to v5.3.x or earlier because one or more EX_Ports or VEX_Ports are online.

Use the portDisable command to disable these ports before proceeding.

Routing

These error messages refer to routing policies.

Message Cannot downgrade to v5.1.0 because Device Based routing policy is not supported by v5.1.0. Use aptPolicy to change the routing policy before proceeding.

Probable cause and recommended action

The firmware download operation was attempting to downgrade a system to Fabric OS v5.1.0 with device-based routing policy selected. Device-based routing policy is not supported in firmware v5.1.0 or earlier, so the firmware download operation was aborted.

Disable the switch and change the routing policy selection to one of the following supported selections on firmware v5.1.0 using the aptPolicy command, and then retry the firmware download operation. The supported selections are:

policy 1 Port-based routing policy

With this policy, the path chosen for an incoming frame is based on:

1. Incoming port on which the frame was received

2. Destination domain for the frame

The chosen path remains the same if the dynamic load sharing (DLS) feature is not enabled. If DLS is enabled, then a different path may be chosen on a fabric event. Refer to the dlsSet command for the definition of a fabric event.

This policy may provide better ISL utilization when there is little or no oversubscription of the ISLs.

NOTE: Static routes are supported only with this policy.

policy 3 Exchange-based routing policy

With this policy, the path chosen for an incoming frame is based on:

1. Incoming port on which the frame was received

2. FC address of the Source ID (SID) for this frame

3. FC address of the Destination ID (DID) for this frame

4. FC Originator Exchange ID (OXID) for this frame

This policy allows for optimal utilization of the available paths because I/O traffic between different (SID, DID, OXID) combinations can use different paths. All frames received on an incoming port with the same (SID, DID, OXID) parameters take the same path unless there is a fabric event. Refer to the dlsSet command for the definition of a fabric event.

This policy does not support static routes. DLS is always enabled, and the DLS setting cannot be changed with this policy.
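The difference between the two policies is which frame fields feed the path choice. The sketch below illustrates that idea only. CRC32 is a stand-in hash chosen for the example (an assumption); this guide does not describe the selection function the switch actually uses, and the function names are hypothetical.

```python
import zlib


def _pick(paths, *fields):
    # CRC32 is an illustrative stand-in; the real selection function
    # used by the switch is not documented in this guide.
    key = ",".join(str(field) for field in fields).encode()
    return paths[zlib.crc32(key) % len(paths)]


def port_based_path(paths, in_port, dest_domain):
    """Policy 1: choice depends on incoming port and destination domain."""
    return _pick(paths, in_port, dest_domain)


def exchange_based_path(paths, in_port, sid, did, oxid):
    """Policy 3: choice depends on incoming port, SID, DID, and OXID."""
    return _pick(paths, in_port, sid, did, oxid)
```

With the port-based function, every exchange between the same port and domain lands on one path. With the exchange-based function, frames of one exchange stay together while different exchanges may be spread across paths.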

Zoning

These messages refer to any zone settings that need to be fixed prior to downgrading the switch’s firmware.


Message Cannot downgrade due to the presence of broadcast zone(s). Remove or disable them before proceeding.

Probable cause and recommended action

If the switch is running v5.3.0 or later and a “broadcast zone” is configured, you cannot downgrade the switch to v5.2.0 or earlier, because a broadcast zone has a special meaning in v5.3.0 but is treated as a regular zone in v5.2.0 or earlier.

Use the zoneRemove command to remove the zone or zoneDelete command to delete the zone.

Message Cannot downgrade due to LSAN count is set to 3000, please disable it before proceeding.

Probable cause and recommended action

If a switch is running v5.3.0 or later and the LSAN count is at 3000, you cannot downgrade to v5.2.0 or earlier.

Use the fcrLsanMatrix command to disable the LSAN.

Message Cannot downgrade due to LSAN zone binding is enabled. Please disable it before proceeding.

Probable cause and recommended action

If the switch is running v5.3.0 or later and LSAN zone binding is enabled, you cannot downgrade to v5.2.0 or earlier.

Use the fcrLsanMatrix command to disable the LSAN.

Message Cannot upgrade due to the presence of an existing zone named “broadcast”. Rename this zone before proceeding.

Probable cause and recommended action

If the switch is running v5.1.0 or v5.2.0 and an existing zone is named “broadcast”, you cannot upgrade the switch to the v5.3.0 firmware, because a broadcast zone has a special meaning in v5.3.0.

Use the zoneDelete command to delete the zone.

Message The command failed due to the current zone size is not supported by the new firmware. Reduce the size of the configuration before proceeding.

Probable cause and recommended action

The firmware download operation was attempting to downgrade a system to Fabric OS v5.1.0 or earlier and the current zone size is not supported by the firmware version to be downloaded, so the firmware download operation failed.

Reduce the zone database size to 256 KB. Verify that the zone size is below the 256 KB limit using the cfgSize command. Retry the firmware download operation.


Chapter

Security Issues

This chapter provides troubleshooting information and procedures on security for the switch management channel.

In this chapter

• Password issues

• Protocol and certificate management issues

• SNMP issues

• FIPS issues

Password issues

The following section describes various ways to recover forgotten passwords.

Symptom User forgot password.

Probable cause and recommended action

If you know the root password, you can use this procedure to recover the password for the default accounts of user, admin, and factory.

Recovering passwords

1. Open a CLI connection (serial or Telnet) to the switch.

2. Log in as root.

3. Enter the command for the type of password that was lost:

passwd user
passwd admin
passwd factory

4. Enter the requested information at the prompts.

Symptom Unable to log in as root.

Probable cause and recommended action

To recover your root password, contact your switch service provider.


Symptom Unable to log into the boot PROM.

Probable cause and recommended action

To recover a lost boot PROM password, contact your switch service provider. You must have previously set a recovery string to recover the boot PROM password.

This does not work on lost or forgotten passwords in the account database.

Password recovery options

The following table describes the options available when one or more types of passwords are lost.

TABLE 11 Password recovery options

Topic: If all the passwords are forgotten, what is the password recovery mechanism? Are these procedures non-disruptive recovery procedures?
Solution: Contact your switch service provider. A non-disruptive procedure is available.

Topic: If a user has only the root password, what is the password recovery mechanism?
Solution: Use the passwd command to set other passwords. Use the passwdDefault command to set all passwords to default.

Topic: How to recover boot PROM password?
Solution: Contact your switch service provider and provide the recovery string. Refer to the Fabric OS Administrator’s Guide for more information on setting the boot PROM password.

Topic: How do I recover a user, admin, or factory password?
Solution: Contact your switch service provider. Refer to “Password issues” on page 57 for more information.

Symptom User is unable to modify switch settings.

Probable cause and recommended action

The most common error when managing user accounts is not setting up the default Admin Domain and access control list or role-based access control (RBAC).

Errors such as a user not being able to run a command or modify switch settings are usually related to what role the user has been assigned.

Protocol and certificate management issues

This section provides information and procedures for troubleshooting standard Fabric OS security features such as protocol and certificate management.

NOTE: Secure Fabric OS is not supported in Fabric OS v6.1.0.

Symptom Troubleshooting certificates

Probable cause and recommended action

If you receive messages in the browser or in a pop-up window when logging in to the target switch using HTTPS, refer to Table 12 for recommended actions you can take to correct the problem.

TABLE 12 SSL messages and actions

Message: The page cannot be displayed
Action: The SSL certificate is not installed correctly or HTTPS is not enabled correctly. Make sure that the certificate has not expired, that HTTPS is enabled, and that certificate file names are configured correctly.

Message: The security certificate was issued by a company you have not chosen to trust….
Action: The certificate is not installed in the browser. Install it as described in the Fabric OS Administrator’s Guide.

Message: The security certificate has expired or is not yet valid
Action: Either the certificate file is corrupted or it needs to be updated. Click View Certificate to verify the certificate content. If it is corrupted or out of date, obtain and install a new certificate.

Message: The name on the security certificate is invalid or does not match the name of the site file
Action: The certificate is not installed correctly in the Java Plug-in. Install it as described in the Fabric OS Administrator’s Guide.

Message: This page contains both secure and nonsecure items. Do you want to display the nonsecure items?
Action: Click No in this pop-up window. The session opens with a closed lock icon on the lower-right corner of the browser, indicating an encrypted connection.

Gathering additional information

For security-related issues, use the following guidelines to gather additional data for your switch support provider.

• Perform a supportSave -n command.

• If you are not sure about the problem area, collect a supportSave -n from all switches in the fabric.

• If you think the problem may be related to E_Port authentication, collect a supportSave -n from both switches of the affected E_Port.

• If you think this is a policy-related, FCS switch, or other security server-related issue, use supportSave -n to collect data from the Primary FCS switch and all affected switches.

• If the problem is login-related, also include the following information:

- Does the login problem appear on a serial, CP IP, or switch IP address connection?

- Is it CP0 or CP1?

- Is the CP active or standby?

- Is it the first login after firmwareDownload and reboot?

SNMP issues

This section describes symptoms with associated causes and recommended actions for SNMP-related issues.

Symptom SNMP management station server is unable to receive traps from fabric.

Probable cause and recommended action

There are several causes related to this generic issue. You will need to verify the following:

• There are no port filters in the firewalls between the fabric and the SNMP management station.

• If your SNMP management station is a dual-homed server, check that the routing tables are set up correctly for your network.

If you continue to have problems, collect the data in the next section and contact your switch support provider.

Gathering additional information

In addition to supportSave -n, gather the following command output:

• agtCfgShow

• ipAddrShow

• the MIB browser snapshot with the problem (like Adventnet screen snapshot) for a MIB variable

FIPS issues

This section describes symptoms with associated causes and recommended actions for problems related to FIPS.

Symptom When FIPS is turned on the switch constantly reboots.

Probable cause and recommended action

When FIPS is turned on, the switch runs conditional tests each time it is rebooted. These tests exercise the random number generators to verify their randomness. The conditional tests are executed each time before a random number provided by the random number generator is used.

The results of all self-tests, both power-up and conditional, are recorded in the system log or are output to the local console. This includes logging both passing and failing results. If the tests fail on your switch, it will constantly reboot. Because boot PROM access is disabled, you will not be able to exit out of the reboot. You will need to send the switch back to your switch service provider for repair.

Chapter

ISL Trunking Issues

This chapter describes symptoms and solutions to trunking problems as well as recommended actions to take to correct trunking problems.

In this chapter

• Link issues

• Buffer credit issues

Link issues

This section describes trunking link issues that can come up and recommended actions to take to correct the problems.

Symptom A link that is part of an ISL trunk failed.

Probable cause and recommended action

Use the trunkDebug command to troubleshoot the problem, as shown in the following procedure.

1. Connect to the switch and log in as admin.

2. Enter the following command:

trunkDebug port, port

port Specifies the number of a port in an ISL Trunking group.

Example of an unformed E_Port

This example shows that ports 126 and 127 are not configured as E_Ports:

brcdDCXbb:admin> trunkdebug 126, 127
port 126 is not E/EX port
port 127 is not E/EX port

Example of a formed E_Port

brcdDCXbb1:admin> trunkdebug 100, 101
port 100 and 101 connect to the switch 10:00:00:05:1e:34:02:45

The trunkDebug command displays the possible reason that two ports cannot be trunked. Possible reasons are:

• The switch does not support trunking.

• A trunking license is required.

• Trunking is not supported in switch interoperability mode.


• Port trunking is disabled.

• The port is not an E_Port.

• The port is not 2 Gbps, 4 Gbps, or 8 Gbps.

• The port connects to a switch other than the one you want it to.

To correct this issue, connect additional ISLs to the switch with which you want to communicate.

• The ports are not the same speed or they are set to an invalid speed.

Manually set port speeds to a speed supported on both sides of the trunk.

• The ports are not set to the same long distance mode.

Set the long distance mode to the same setting on all ports on both sides of the trunk.

• Local or remote ports are not in the same port group.

Move all ISLs to same port group. The port groups begin at port 0 and are in groups of 4 or 8, depending on the switch model. Until this is done, the ISLs will not trunk.

• The difference in the cable length among trunked links is greater than the allowed difference.
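The port group condition in the list above lends itself to a quick check before recabling. The following sketch assumes that port groups are contiguous blocks starting at port 0, as the text states, and that the caller knows whether the switch model uses groups of 4 or 8. The function name is hypothetical.

```python
def same_port_group(ports, group_size):
    """Return True if all ports fall in one port group.

    Assumes groups are contiguous blocks starting at port 0
    (0-3, 4-7, ... for size 4; 0-7, 8-15, ... for size 8).
    """
    if group_size not in (4, 8):
        raise ValueError("group size is 4 or 8, depending on the switch model")
    # Integer division gives the index of the group a port belongs to.
    return len({port // group_size for port in ports}) == 1
```

For instance, ports 3 and 4 fail the check with a group size of 4 but pass with a group size of 8.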

Buffer credit issues

The following section describes a trunk going on- and offline or hosts not being able to talk to a storage device.

Symptom Trunk goes offline and online (bounces).

Probable cause and recommended action

A port disabled at one end because of buffer underallocation causes all the disabled ports at the other end to become enabled. Some of these enabled ports become disabled due to a lack of buffers, which in turn triggers ports to be enabled once again at the other end.

While the system is stabilizing the buffer allocation, it warns that ports are disabled due to lack of buffers, but it does not send a message to the console when buffers are enabled. The system requires a few passes to stabilize the buffer allocation. Ultimately, the ports for which buffers are available come up and stabilize. You should wait for stabilization, and then proceed with correcting the buffer allocation situation.

Getting out of buffer-limited mode

Buffer-limited mode occurs on LD_Ports.

1. Change the LD port speed to a lower speed (of non-buffer limited ports).

2. Change the LD port’s estimated distance to a shorter distance (of non-buffer limited ports).

3. Change LD back to L0 (of non-buffer limited ports).

4. If you are in buffer-limited mode on the LD port, then increase the estimated distance.

5. Enable any of these changes on the buffer-limited port or switch by issuing the commands portDisable and portEnable.


Chapter

Zone Issues

This chapter describes troubleshooting techniques and recommended actions for common zoning problems.

In this chapter

• Overview of corrective action

• Segmented fabrics

• Zone conflicts

• Gathering additional information

Overview of corrective action

The following overview provides a basic starting point for you to troubleshoot your zoning problem.

1. Verify that you have a zone problem.

2. Determine the nature of the zone conflict.

3. Take the appropriate steps to correct zone conflict.

To correct a merge conflict without disrupting the fabric, first verify that it was a fabric merge problem, then edit zone configuration members, and then reorder the zone member list if necessary.

The newly changed zone configuration will not be effective until you issue the cfgEnable command. This should be done during a maintenance window because this may cause disruption in large fabrics.

Verifying a fabric merge problem

1. Enter the switchShow command to validate that the segmentation is due to a zone issue.

2. Review “Segmented fabrics,” next, to view the different types of zone discrepancies and determine what might be causing the conflict.

Segmented fabrics

This section discusses fabric segmentation. Fabric segmentation occurs when two or more switches are joined together by ISLs and do not communicate to each other. Each switch appears as a separate fabric when you use the fabricShow command.


Symptom Zone conflict appears in logs and fabric is segmented.

Probable cause and recommended action

This issue is usually caused by incompatible zoning configurations. Verify the following:

• The effective cfg (zone set) on each end of the segmented ISL must be identical.

• Any zone object with the same name must have the same entries in the same sequence.

Symptom Fabric segmentation is caused by an “incompatible zone database”.

Probable cause and recommended action

If fabric segmentation is caused by an “incompatible zone database,” check the following:

• Whether the merge of the two fabrics resulted in the merged zone database exceeding the zone database size limitation.

Different Fabric OS versions support different zone database sizes. For example, versions earlier than Fabric OS v5.2.0 support 256 KB, and Fabric OS v5.2.0 and later support 1 MB.

• Whether any port number greater than 255 is configured in a port zone.

A switch running a version earlier than Fabric OS v5.2.0 will not merge with a newer switch that has a port index greater than 255.
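The size check described above can be sketched as follows. The limits come from the text; treating 1 KB as 1024 bytes, representing versions as tuples, and the function names are assumptions made for the example.

```python
KB = 1024  # assumption: the guide does not define KB precisely


def zone_db_limit(version):
    """Zone database limit in bytes for a (major, minor, patch) version."""
    # 1 MB for v5.2.0 and later, 256 KB for earlier versions.
    return 1024 * KB if version >= (5, 2, 0) else 256 * KB


def merge_fits(merged_size_bytes, versions):
    """True if the merged database fits the smallest limit in the fabric."""
    return merged_size_bytes <= min(zone_db_limit(v) for v in versions)
```

A 300 KB merged database therefore fits a fabric of v5.2.0 and v6.1.0 switches, but not one that still contains a v5.1.0 switch.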

Symptom Fabric segmentation is caused by a “configuration mismatch”.

Probable cause and recommended action

Occurs when zoning is enabled in both fabrics and the zone configurations are different in each fabric.

Symptom Fabric segmentation is caused by a “type mismatch”.

Probable cause and recommended action

Occurs when the name of a zone object in one fabric is also used for a different type of zone object in the other fabric. A zone object is any device in a zone.

Symptom Fabric segmentation is caused by a “content mismatch”.

Probable cause and recommended action

Occurs when the definition in one fabric is different from the definition of a zone object with the same name in the other fabric.
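The type and content mismatch cases above can be told apart mechanically once both zoning definitions are available. The sketch below assumes each fabric's definitions have already been parsed into a dictionary of name to (object type, ordered member list); that data shape and the function name are hypothetical, not a Fabric OS format.

```python
def classify_mismatches(fabric_a, fabric_b):
    """Compare zone objects that exist in both fabrics under the same name.

    Each fabric is a dict: name -> (object_type, [members in order]).
    Returns {name: "type mismatch" | "content mismatch"}.
    """
    conflicts = {}
    for name in fabric_a.keys() & fabric_b.keys():
        type_a, members_a = fabric_a[name]
        type_b, members_b = fabric_b[name]
        if type_a != type_b:
            conflicts[name] = "type mismatch"
        elif list(members_a) != list(members_b):
            # Order matters: the same entries in a different
            # sequence still count as a conflict.
            conflicts[name] = "content mismatch"
    return conflicts
```

Objects that exist in only one fabric are ignored here, since the mismatch descriptions above concern names used in both fabrics.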

Zone conflicts

Zone conflicts can be resolved by saving a configuration file with the configUpload command, examining the zoning information in the file, and performing a cut and paste operation so that the configuration information matches in the fabrics being merged.

After examining the configuration file, you can choose to resolve zone conflicts by using the cfgDisable command followed by the cfgClear command on the incorrectly configured segmented fabric, followed by the portDisable and portEnable commands on one of the ISL ports that connects the fabrics. This will cause a merge, making the fabric consistent with the correct configuration.


ATTENTION

Be careful using the cfgClear command because it deletes the defined configuration.

Table 13 summarizes commands that are useful for debugging zoning issues.

TABLE 13 Commands for debugging zoning

Command Function

aliCreate Use to create a zone alias.

aliDelete Use to delete a zone alias.

cfgCreate Use to create a zone configuration.

cfgShow Displays zoning configuration.

cfgDisable Disables the active (effective) configuration

cfgEnable Use to enable and activate (make effective) the specified configuration.

cfgSave Use to save the specified configuration.

cfgTransAbort Use to abort the current zoning transaction without committing it.

cfgTransShow Use to display the ID of the current zoning transaction

defZone Sets the default zone access mode to No Access, initializes a zoning transaction (if one is not

already in progress), and creates the reserved zoning objects.

licenseShow Displays current license keys and associated (licensed) products.

switchShow Displays currently enabled configuration and any E_Port segmentations due to zone conflicts.

zoneAdd Use to add a member to an existing zone.

zoneCreate Use to create a zone. Before a zone becomes active, the cfgSave and cfgEnable commands must be used.

zoneHelp Displays help information for zone commands.

zoneShow Displays zone information.

For more information about setting up zoning on your switch, refer to the Fabric OS Administrator’s Guide. Also, see the Fabric OS Command Reference for details about zoning commands.

You can correct zone conflicts by using the cfgClear command to clear the zoning database.

The cfgClear command is a disruptive procedure.

Correcting a fabric merge problem quickly

1. Determine which switches have the incorrect zoning configuration; then, log in to the switches as admin.

2. Enter the switchDisable command on all problem switches.

3. Enter the cfgDisable command on each switch.

4. Enter the cfgClear command on each switch.


ATTENTION


The cfgClear command clears the zoning database on the switch where the command is run.

5. Enter the switchEnable command on each switch once the zoning configuration has been cleared.

This forces the zones to merge and populates the switches with the correct zoning database. The fabrics will then merge.

Editing zone configuration members

1. Log in to one of the switches in a segmented fabric as admin.

2. Enter the cfgShow command and print the output.

3. Start another Telnet session and connect to the next fabric as an admin.

4. Enter the cfgShow command and print the output.

5. Compare the two fabric zone configurations line by line and look for an incompatible configuration.

6. Connect to one of the fabrics.

7. Run the zone configuration editing commands to edit the fabric zone configuration for the segmented switch (see Table 13 on page 65 for specific commands).

If the zoneset members between two switches are not listed in the same order in both configurations, the configurations are considered a mismatch; this results in the switches being segmented in the fabric.

For example:

[cfg1 = z1; z2] is different from [cfg1 = z2; z1], even though the members of the configuration are the same.

One simple approach to making sure that the zoneset members are in the same order is to keep the members in alphabetical order.

Reordering the zone member list

1. Obtain the output from the cfgShow for both switches.

2. Compare the order in which the zone members are listed. Members must be listed in the same order.

3. Rearrange zone members so the configuration for both switches is the same. Arrange zone members in alphabetical order, if possible.
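The ordering rule above can be checked mechanically before the fabrics are merged. The following Python sketch is an illustration only, not a Brocade tool. It assumes that each fabric's cfgShow output has already been transcribed by hand into a dictionary that maps a configuration name to its members in the order listed; the function name compare_cfg_order and this data layout are hypothetical.

```python
def compare_cfg_order(fabric_a, fabric_b):
    """Classify every configuration name that exists in both fabrics.

    fabric_a, fabric_b: dict of cfg name -> list of member names, in the
    order listed by cfgShow (hypothetical layout, transcribed by hand).
    Returns a dict of cfg name -> "match", "order mismatch",
    or "content mismatch".
    """
    report = {}
    for name in sorted(set(fabric_a) & set(fabric_b)):
        members_a, members_b = fabric_a[name], fabric_b[name]
        if members_a == members_b:
            report[name] = "match"
        elif sorted(members_a) == sorted(members_b):
            # Same members, different order: still treated as a mismatch.
            report[name] = "order mismatch"
        else:
            report[name] = "content mismatch"
    return report


def alphabetical(members):
    """Return the members in alphabetical order, as the guide suggests."""
    return sorted(members)


if __name__ == "__main__":
    fabric_a = {"cfg1": ["z1", "z2"]}
    fabric_b = {"cfg1": ["z2", "z1"]}
    print(compare_cfg_order(fabric_a, fabric_b))
    print("; ".join(alphabetical(fabric_b["cfg1"])))
```

A result of "order mismatch" corresponds to the [cfg1 = z1; z2] versus [cfg1 = z2; z1] case described above; reordering both lists alphabetically is one way to remove it.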

Checking for Fibre Channel connectivity problems

Enter the fcPing command (refer to the Fabric OS Command Reference for more information on this command), which checks the zoning configuration for the two ports specified by:

• Generating an ELS (Extended Link Service) ECHO request to the source port specified and validating the response.

• Generating an ELS ECHO request to the destination port specified and validating the response.

Regardless of the device’s zoning, the fcPing command sends the ELS frame to the destination port. A device can take any of the following actions:

• Send an ELS Accept to the ELS request.

• Send an ELS Reject to the ELS request.

• Ignore the ELS request.

Following is sample output from the fcPing command in which one device accepts the request and another device rejects the request:

switch:admin> fcping 10:00:00:00:c9:29:0e:c4 21:00:00:20:37:25:ad:05
Source: 10:00:00:00:c9:29:0e:c4
Destination: 21:00:00:20:37:25:ad:05
Zone Check: Not Zoned

Pinging 10:00:00:00:c9:29:0e:c4 [0x20800] with 12 bytes of data:
received reply from 10:00:00:00:c9:29:0e:c4: 12 bytes time:1162 usec
received reply from 10:00:00:00:c9:29:0e:c4: 12 bytes time:1013 usec
received reply from 10:00:00:00:c9:29:0e:c4: 12 bytes time:1442 usec
received reply from 10:00:00:00:c9:29:0e:c4: 12 bytes time:1052 usec
received reply from 10:00:00:00:c9:29:0e:c4: 12 bytes time:1012 usec
5 frames sent, 5 frames received, 0 frames rejected, 0 frames timeout
Round-trip min/avg/max = 1012/1136/1442 usec

Pinging 21:00:00:20:37:25:ad:05 [0x211e8] with 12 bytes of data:
Request rejected
Request rejected
Request rejected
Request rejected
Request rejected
5 frames sent, 0 frames received, 5 frames rejected, 0 frames timeout
Round-trip min/avg/max = 0/0/0 usec
switch:admin>

Following is sample output from the fcPing command in which one device accepts the request and another device does not respond to the request:

switch:admin> fcping 0x020800 22:00:00:04:cf:75:63:85
Source: 0x20800
Destination: 22:00:00:04:cf:75:63:85
Zone Check: Zoned

Pinging 0x020800 with 12 bytes of data:
received reply from 0x020800: 12 bytes time:1159 usec
received reply from 0x020800: 12 bytes time:1006 usec
received reply from 0x020800: 12 bytes time:1008 usec
received reply from 0x020800: 12 bytes time:1038 usec
received reply from 0x020800: 12 bytes time:1010 usec
5 frames sent, 5 frames received, 0 frames rejected, 0 frames timeout
Round-trip min/avg/max = 1006/1044/1159 usec

Pinging 22:00:00:04:cf:75:63:85 [0x217d9] with 12 bytes of data:
Request timed out
Request timed out
Request timed out
Request timed out
Request timed out
5 frames sent, 0 frames received, 0 frames rejected, 5 frames timeout
Round-trip min/avg/max = 0/0/0 usec
switch:admin>

For details about the fcPing command, see the Fabric OS Command Reference.
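When fcPing output is captured to a file, the three device responses described above (accept, reject, ignore) can be read from the summary line of each ping block. The Python sketch below is a minimal illustration that assumes the summary line has exactly the wording shown in the sample output; the function name classify_fcping is hypothetical, and other Fabric OS releases may format the line differently.

```python
import re

# Matches the summary line shown in the sample fcPing output above.
SUMMARY_RE = re.compile(
    r"(\d+) frames sent, (\d+) frames received, "
    r"(\d+) frames rejected, (\d+) frames timeout"
)


def classify_fcping(output):
    """Return one label per summary line found in captured fcPing output."""
    labels = []
    for groups in SUMMARY_RE.findall(output):
        sent, received, rejected, timeout = (int(g) for g in groups)
        if sent and received == sent:
            labels.append("accepted")      # device sent an ELS Accept
        elif sent and rejected == sent:
            labels.append("rejected")      # device sent an ELS Reject
        elif sent and timeout == sent:
            labels.append("no response")   # device ignored the request
        else:
            labels.append("mixed")
    return labels


if __name__ == "__main__":
    captured = (
        "5 frames sent, 5 frames received, 0 frames rejected, 0 frames timeout\n"
        "5 frames sent, 0 frames received, 0 frames rejected, 5 frames timeout\n"
    )
    print(classify_fcping(captured))
```

The first label describes the source device and the second the destination, because fcPing pings them in that order.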

Checking for zoning problems

1. Enter the cfgActvShow command to determine if zoning is enabled.

• If zoning is enabled, it is possible that the problem is being caused by zoning enforcement (for example, two devices in different zones cannot detect each other).

• If zoning is disabled, check the default zone mode by entering the defZone --show command. If it is no access, change it to all access. To modify default zone mode from no access to all access, enter the defZone --all command, and then the cfgSave command.

2. Confirm that the specific edge devices that must communicate with each other are in the same zone.

• If they are not in the same zone and zoning is enabled, proceed to step 3.

• If they are in the same zone, perform the following tasks:

- Enter the portCamShow command on the host port to verify that the target is present.

- Enter the portCamShow command on the target.

- Enter the nsZoneMember command with the port ID for the zoned devices on the host and target to determine whether the name server is aware that these devices are zoned together.

3. Resolve zoning conflicts by putting the devices into the same zoning configuration.

4. Enter the defZone --show command to display the current state of the zone access mode and the access level. The defZone command sets the default zone access mode to No Access.

switch:admin> defzone --show
Default Zone Access Mode
committed - No Access
transaction - No Transaction

See “Zone conflicts” on page 64 for additional information.

Gathering additional information

Collect the data from a supportSave -n command. Then collect the data from the cfgTransShow command. For the port that has the problem, collect the data from the filterPortShow <port> command.


Chapter

FCIP Issues

This chapter describes the FCIP concepts, configuration procedures, and tools and procedures for monitoring network performance. Commands described in this chapter require Admin or root user access. See the Fabric OS Command Reference for detailed information on command syntax.

In this chapter

•FCIP tunnel issues . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69

•FCIP links . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71

•Port mirroring . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72

•FTRACE concepts. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76

FCIP tunnel issues

The following are the most common FCIP tunnel issues, with recommended actions for fixing each one.

Symptom FCIP tunnel does not come Online.

Probable cause and recommended action

Perform the following steps.

1. Confirm GE port is online.

portshow ge1
Eth Mac Address: 00.05.1e.37.93.06
Port State: 1 Online
Port Phys: 6 In_Sync
Port Flags: 0x3 PRESENT ACTIVE
Port Speed: 1G

2. Confirm IP configuration is correct on both tunnel endpoints.

portshow ipif ge1

Port: ge1
Interface  IP Address  NetMask        MTU
---------------------------------------------------------
0          11.1.1.1    255.255.255.0  1500

3. Issue the ping command to the remote tunnel endpoint from both endpoints.

-s is the source IP address
-d is the destination IP address

portcmd --ping ge1 -s 11.1.1.1 -d 11.1.1.2
Pinging 11.1.1.2 from ip interface 11.1.1.1 on 0/ge1 with 64 bytes of data
Reply from 11.1.1.2: bytes=64 rtt=0ms ttl=64
Reply from 11.1.1.2: bytes=64 rtt=0ms ttl=64
Reply from 11.1.1.2: bytes=64 rtt=0ms ttl=64
Reply from 11.1.1.2: bytes=64 rtt=0ms ttl=64

Ping Statistics for 11.1.1.2:
Packets: Sent = 4, Received = 4, Loss = 0 ( 0 percent loss)
Min RTT = 0ms, Max RTT = 0ms Average = 0ms

If you are able to ping, then you have IP connectivity and your tunnel should come up. If not, continue to the next step.

4. Issue the traceroute command to the remote tunnel endpoint from both endpoints.

portcmd --traceroute ge1 -s 11.1.1.1 -d 11.1.1.2
Traceroute to 11.1.1.2 from IP interface 11.1.1.1 on 0/1, 64 hops max
1 11.1.1.2 0 ms 0 ms 0 ms
Traceroute complete.

5. Confirm FCIP tunnel is configured correctly.

All settings except remote and local IP and WWN must match the opposite endpoint or the tunnel may not come up. Remote and local IP and WWN should be opposite each other.

portshow fciptunnel ge1 all

Port: ge1
------------------------------------------
Tunnel ID 0
Remote IP Addr 11.1.1.2
Local IP Addr 11.1.1.1
Remote WWN Not Configured
Local WWN 10:00:00:05:1e:37:93:21
Compression off
Fastwrite off
Tape Pipelining off
Committed Rate 20000 Kbps (0.020000 Gbps)
SACK on
Min Retransmit Time 100
Keepalive Timeout 10
Max Retransmissions 8
Status : Active
Uptime 7 days, 20 hours, 2 minutes, 6 seconds

6. Get a GE Ethernet sniffer trace.

If all possible blocking factors on the network between the two endpoints (for example, a firewall or proxy server) are ruled out, simulate a connection attempt using a ping command from source to destination, and then take an Ethernet trace between the two endpoints. The trace can be examined to further troubleshoot the FCIP connectivity.
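The comparison in step 5 (all settings must match except the IP addresses and WWNs, which are opposite on each end) can be sketched in Python as follows. This is an illustration under stated assumptions: the settings from portshow fciptunnel on each endpoint are assumed to have been copied by hand into dictionaries keyed by the field names in the sample output, run-time fields such as Status and Uptime are assumed to have been left out, and the function name tunnel_mismatches is hypothetical.

```python
# Fields that are expected to differ between the two tunnel endpoints.
ENDPOINT_SPECIFIC = {"Local IP Addr", "Remote IP Addr", "Local WWN", "Remote WWN"}


def tunnel_mismatches(local, remote):
    """Return a list of human-readable problems between two endpoints.

    local, remote: dict of field name -> value, transcribed by hand from
    'portshow fciptunnel' on each side (hypothetical layout).
    """
    problems = []
    # Every shared setting must be identical on both ends.
    for key in sorted(set(local) | set(remote)):
        if key in ENDPOINT_SPECIFIC:
            continue
        if local.get(key) != remote.get(key):
            problems.append(
                f"{key}: local={local.get(key)!r} remote={remote.get(key)!r}"
            )
    # IP addresses must be opposite each other.
    if local.get("Local IP Addr") != remote.get("Remote IP Addr"):
        problems.append("Local IP Addr does not match the peer's Remote IP Addr")
    if local.get("Remote IP Addr") != remote.get("Local IP Addr"):
        problems.append("Remote IP Addr does not match the peer's Local IP Addr")
    return problems


if __name__ == "__main__":
    side_a = {"Local IP Addr": "11.1.1.1", "Remote IP Addr": "11.1.1.2",
              "Compression": "off", "Committed Rate": "20000 Kbps"}
    side_b = {"Local IP Addr": "11.1.1.2", "Remote IP Addr": "11.1.1.1",
              "Compression": "on", "Committed Rate": "20000 Kbps"}
    for line in tunnel_mismatches(side_a, side_b):
        print(line)
```

An empty list means that no mismatch was found in the fields supplied; it does not prove that the tunnel is configured correctly.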

Symptom FCIP tunnel goes online and offline.

Probable cause and recommended action

A bouncing tunnel is one of the most common problems. This issue is usually due to over committing of available bandwidth (trying to push 1 Gbps through a pipe that can only sustain 0.5 Gbps).


• Too much data is sent over the link.

• Management data gets lost, queued too long, and timeouts expire.

• Data exceeds timeouts multiple times.

• Verify what link bandwidth is available.

• Confirm the IP path is being used exclusively for FCIP traffic.

• Confirm that traffic shaping is configured to limit the bandwidth to what is available (portshow fciptunnel).

1. If you are committing a rate, set it a little below the available bandwidth to allow for bursting.

2. Type the portShow fciptunnel <GB Port Number> all -perf -params command.

Examine data from both routers. This data is not in the supportshow output; it shows retransmissions and the input and output rates on the tunnels.

Gather this information for both data and management TCP connections.

3. Run the following commands on both sides of the tunnel:

• portCmd --ipperf <slot/GBPort> -s <local IP> -d <remote IP> -R

• portCmd --ipperf <slot/GBPort> -s <local IP> -d <remote IP> -S

4. Confirm the throughput using the portCmd --ipperf command.

This command must be run on both sides of the tunnel, simultaneously.

On local side:

portcmd --ipperf <slot/GBPort> -s <local IP> -d <remote IP> -R

On remote side:

portcmd --ipperf <slot/GBPort> -s <local IP> -d <remote IP> -S

5. Repeat each step in the opposite direction to get throughput.

FCIP links

The following list contains information for troubleshooting FCIP links:

• When deleting FCIP links, you must delete them in the exact reverse order they were created. That is, delete first the tunnels, then the IP interfaces, and finally the port configuration. The IP route information is removed automatically at this point.

• IP addresses are retained by slot in the system. If FR4-18i blades are moved to different slots without first deleting configurations, errors can be seen when trying to reuse these IP addresses.

• The portCmd ping command only verifies physical connectivity. This command does not verify that you have configured the ports correctly for FCIP tunnels.

• One port can be included in multiple tunnels, but each tunnel must have at least one port that is unique to that tunnel.

• Ports at both ends of the tunnel must be configured correctly for an FCIP tunnel to work correctly. These ports can be either VE_Ports or VEX_Ports. A VEX_Port must be connected to a VE_Port.

• When configuring routing over an FCIP link for a fabric, the edge fabric will use VE_Ports and the backbone fabric will use VEX_Ports for a single tunnel.

• If an FCIP tunnel fails with the “Disabled (Fabric ID Oversubscribed)” message, the solution is to reconfigure the VEX_Port to the same Fabric ID as all of the other ports connecting to the edge fabric.

• Due to an IPSec RASLog limitation, you may not be able to determine an incorrect configuration that causes an IPSec tunnel to not become active. This misconfiguration can occur on either end of the tunnel. As a result, you must correctly match the encryption method, authentication algorithm, and other configurations on each end of the tunnel.

Gathering additional information

The following commands should be executed and their data collected before a supportsave is run, because a supportsave can take upwards of 10 minutes to execute and some of the information is time critical.

• traceDump -n

• portTrace --show all

• portTrace --status

In addition, if it is a port/tunnel specific issue, run and collect the data from the following commands:

• slotShow

• portShow [slot number/]<geport number>

If possible, run and collect the data from the following commands:

• portShow ipif [slot number/]<geport number>

Displays IP interface configuration for each GbE port (IP address, gateway and MTU)

• portShow arp [slot number/]<geport number>

• portShow iproute [slot number/]<geport number>

• portShow fciptunnel [slot number/]<geport number> <all | tunnel ID>

Displays complete configuration of one or all of the FCIP tunnels

• portCmd <--ping | --traceroute | --ipperf>

Ping and traceroute utility

Performance to determine path characteristics between FCIP endpoints

And finally gather the data from the supportsave -n command.

See the Fabric OS Administrator’s Guide or the Fabric OS Command Reference for complete details on these commands.

Port mirroring

Port mirroring lets you configure a switch port as a mirror port that receives a copy of the traffic passing between a specific source port and destination port through any switch port. This is only supported between F_Ports. This is a useful way to troubleshoot without bringing down the host and destination links to insert an inline analyzer.


Port mirroring captures traffic between two devices. It mirrors only the frames containing the SID/DID to the mirror port. Because of the way it handles mirroring, a single mirror port can mirror multiple mirror connections. This also means that the port cannot exceed the maximum bandwidth of the mirror port. Attempts to mirror more traffic than available bandwidth result in the port mirror throttling the SID/DID traffic so that traffic does not exceed the maximum available bandwidth.

Use port mirroring to detect missing frames, which may occur with zoning issues or hold timeouts, capture protocol errors, and capture ULP traffic (SCSI/FICON). This feature cannot be used on embedded switch traffic.

Port mirroring is only available using the FOS v5.2.0 or later CLI and is not available through Web Tools. For a complete list of port mirroring commands, see the Fabric OS Command Reference.

To ensure proper failover in HA configurations, both the active and the standby control processors (CP) must have firmware version 5.2.0 or later installed and running. If the OS on the standby CP does not support mirroring, failing over to the standby CP could cause the HA failover to fail.

Supported hardware

Port mirroring is supported on the following platforms:

• Brocade 300

• Brocade 4100

• Brocade 4900

• Brocade 5000

• Brocade 5100

• Brocade 5300

• Brocade 7500

• Brocade 7600

• Brocade 48000 with chassis option 5

• Brocade DCX Backbone

Port mirroring can be used on the following blades within a chassis:

• FC4-32 32-port blade

• FC4-16 16-port blade

• FC4-48 48-port blade

• FC8-16 16-port blade

• FC8-32 32-port blade

• FC8-48 48-port blade

• FA4-18 application blade

• FR4-18i routing and FCIP blade

• FC4-16IP iSCSI blade on Fibre Channel ports only


The FC4-48 implements port pairing, meaning that two ports share the same area. Port pairing uses a single area to map to two physical ports. A frame destined to the secondary port is routed to the primary port. The primary port's filtering zone engine is used to redirect the frame to the secondary port. Port mirroring uses the port filter zone engine to redirect the frames to the mirror port. If two F_Ports share the same area, both ports cannot be part of a mirror connection. One of the two ports can be part of the connection as long as the other port is offline. Supported port configurations are shown in Table 14.

TABLE 14 Port combinations for port mirroring

Primary port Secondary port Supported

F_Port F_Port No

F_Port Offline Yes

Offline F_Port Yes

F_Port E_Port Yes

E_Port F_Port Yes

E_Port E_Port No
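Table 14 can be expressed as a small lookup when checking FC4-48 port pairs in a script. The Python sketch below simply encodes the six rows of the table; the function name mirroring_supported is hypothetical, and any combination that the table does not list raises an error instead of guessing.

```python
# (primary port state, secondary port state) -> supported, from Table 14.
PORT_COMBINATIONS = {
    ("F_Port", "F_Port"): False,
    ("F_Port", "Offline"): True,
    ("Offline", "F_Port"): True,
    ("F_Port", "E_Port"): True,
    ("E_Port", "F_Port"): True,
    ("E_Port", "E_Port"): False,
}


def mirroring_supported(primary, secondary):
    """Return True if Table 14 lists the port combination as supported."""
    try:
        return PORT_COMBINATIONS[(primary, secondary)]
    except KeyError:
        raise ValueError(
            f"combination not listed in Table 14: {primary}/{secondary}"
        ) from None


if __name__ == "__main__":
    print(mirroring_supported("F_Port", "Offline"))
    print(mirroring_supported("F_Port", "F_Port"))
```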

If IOD is enabled, adding or deleting a port mirror connection causes a frame drop. Port mirroring reroutes a given connection to the mirror port, where the mirror traffic takes an extra route to the mirror port. When the extra route is removed, the frames between the two ports go directly to the destination port. Since the frames at the mirror port could be queued at the destination port behind those frames that went directly to the destination port, port mirroring drops those frames from the mirror port when a connection is disabled. If IOD has been disabled, port mirroring does not drop any frames but displays an IOD error.

• A port cannot be mirrored to multiple locations. If you define multiple mirror connections for the same F_Port, all the connections must share the same mirror port.

• Local switches cannot be mirrored because FICON CUP frames to a local switch are treated as well-known addresses or embedded frame traffic.

• Using firmware download to downgrade to previous Fabric OS releases that do not support port mirroring requires that you remove all port mirroring connections.

Port mirroring considerations

Before creating port mirror connections, consider the following limitations:

• A mirror port can be any port on the same switch as the source identifier port.

• Only one domain can be mirrored per chip; after a domain is defined, only mirror ports on the defined domain can be used.

For example, in a three-domain fabric containing switches 4100A, 4100B, and 4100C, a mirror connection that is created between 4100A and 4100B only allows 4100A to add mirror connections for those ports on 4100B. To mirror traffic between 4100A and 4100C, add a mirror connection on 4100C. The first connection defines the restriction on the domain, which can be either the local domain or a remote domain.

• A switch that is capable of port mirroring can support a maximum of four mirror connections.

Each Field Description Block (FDB) defines an offset to search. Each offset can have up to four values that can be defined for a filter. If any of the four values match, the filter will match.

• Mirror port bandwidth limits mirror connections.


NOTE

The bandwidth of the mirror port is unidirectional. The host (SID) talks to multiple storage devices (DIDs) and does not send full line rate to a single target. A mirror port configured at 2 Gbps can only support up to 2 Gbps of traffic. A normal 2 Gbps F_Port is bidirectional and can support up to 4 Gbps of traffic (2 Gbps to transmit and 2 Gbps to receive). If the mirror port bandwidth is exceeded, the receiver port is not returned any credits and the devices in the mirror connection see degraded performance.

• Deleting a port mirroring connection with In-Order Delivery (IOD) enabled causes frame drop between two endpoints.

• Using the firmware download procedure to downgrade to previous Fabric OS releases that do not support port mirroring requires that you remove all the port mirroring connections. If you downgrade to a previous version of Fabric OS, you cannot proceed until the mirroring connections are removed.
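Two of the limits above, the maximum of four mirror connections per switch and the unidirectional bandwidth of the mirror port, lend themselves to a quick planning check. The Python sketch below is an illustration only. It assumes that the traffic rate of each mirrored connection in each direction is already known from other measurements, and that both directions of a connection share the one-way bandwidth of the mirror port, as the NOTE describes. The function name check_mirror_plan and its parameters are hypothetical.

```python
MAX_MIRROR_CONNECTIONS = 4  # limit stated in the considerations above


def check_mirror_plan(mirror_port_gbps, flows):
    """Return a list of warnings for a planned set of mirror connections.

    mirror_port_gbps: configured speed of the mirror port, in Gbps.
    flows: list of (sid_to_did_gbps, did_to_sid_gbps) tuples, one per
           mirror connection (assumed measurements, not switch output).
    """
    warnings = []
    if len(flows) > MAX_MIRROR_CONNECTIONS:
        warnings.append(
            f"{len(flows)} connections exceed the maximum of "
            f"{MAX_MIRROR_CONNECTIONS}"
        )
    # Both directions of every connection leave through the mirror port.
    total = sum(forward + reverse for forward, reverse in flows)
    if total > mirror_port_gbps:
        warnings.append(
            f"mirrored traffic of {total} Gbps exceeds the mirror port "
            f"bandwidth of {mirror_port_gbps} Gbps"
        )
    return warnings


if __name__ == "__main__":
    # One F_Port running at full rate in both directions on a 2 Gbps mirror.
    for warning in check_mirror_plan(2.0, [(2.0, 2.0)]):
        print(warning)
```

A bandwidth warning corresponds to the condition in the NOTE in which credits are withheld and the devices in the mirror connection see degraded performance.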

Port mirroring management

The method for adding a port mirror connection between two local switch ports and between a local switch port and a remote switch port is the same. First you must configure a port to be a mirror port before you can perform a portMirror --add, or portMirror --delete.

Configuring a port to be a mirror port

• Type portCfg mirrorport [slot number/]<port number> --enable.

The enable command enables the port as mirror port. The disable command disables the mirror port configuration.

Adding a port mirror connection

1. Log in to the switch as admin.

2. Type portMirror --add slotnumber/portnumber SourceID DestID.

The lower 8 bits of the address are ignored (for example, the ALPA for loop devices).

The configuration database keeps information about the number of port mirror connections configured on a switch, the number of chunks of port mirroring data that are stored, and the chunk number. When removing a mirror connection, always use this method to ensure that the data is cleared. Deleting a connection removes the information from the database.

Deleting a port mirror connection

1. Log in to the switch as admin.

2. Type portMirror --del SourceID DestID.

For example, to delete the port mirror connection on mirror port 2, you might type:

portMirror --del 0x011400 0x240400


Displaying port mirror connections

1. Log in to the switch as admin.

2. Type portMirror --show.

You should see output similar to the following:

switch:admin> portmirror --show
Number of mirror connection(s) configured: 4
Mirror_Port  SID       DID       State
----------------------------------------
18           0x070400  0x0718e2  Enabled
18           0x070400  0x0718e3  Enabled
18           0x070400  0x0718ef  Enabled
18           0x070400  0x0718e0  Enabled

FTRACE concepts

FTRACE is a support tool that can be used in a manner similar to that of a channel protocol analyzer. FTRACE enables troubleshooting of problems using a Telnet session rather than sending an analyzer or technical support personnel to the site. FTRACE records events that occur on the FC interface, including user-defined messages and events. FTRACE includes the ability to freeze traces on certain events, and to retain the trace information for future examination.

Tracing Fibre Channel information

Frame trace (FTRACE) records user-defined messages and events on the Brocade FR4-18i and the 7500. The portCfg command uses the ftrace option to capture trace information on a per FCIP tunnel basis. You can configure up to eight FCIP tunnels on a single physical GE port. FTRACE is subject to the same FCIP tunnel limitations, such as tunnel disruption, port or switch disable or enable, and reboot requirements.

Tracing every FICON event affects performance. To avoid this, the default trace mask is set to 0x80000C7b. The mask is a filter where a bit specifies which frames and events will be captured and displayed. For troubleshooting, you should set the trace mask to 0-0xFFFFFFFF. The following table describes the configurable FTRACE parameters.

TABLE 15 FTRACE configurable parameters

Parameter Default Range Syntax

Auto check Out False T/F Boolean

Buffers 0 0-8 Integer

Display Mask 0xFFFFFFFF 0-0xFFFFFFFF Integer

Enable False T/F Boolean

Post Percentage 5 0-100 Integer

Trace Mask 0x80000C7B 0-0xFFFFFFFF Integer

Trigger Mask 0x00000003 0-0xFFFFFFFF Integer
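Because each mask is a 32-bit filter in which individual bits select what is captured, it can help to see which bit positions a given value turns on. The Python sketch below lists the set bits of a mask and tests whether a particular bit is enabled. It is an illustration only: this guide does not document what each bit position means, so the bit numbers carry no event names here, and the function names are hypothetical.

```python
DEFAULT_TRACE_MASK = 0x80000C7B  # default trace mask quoted in this section
ALL_EVENTS_MASK = 0xFFFFFFFF     # upper end of the configurable range


def set_bits(mask):
    """Return the positions (0 = least significant) of the bits set in mask."""
    if not 0 <= mask <= 0xFFFFFFFF:
        raise ValueError("mask must fit in 32 bits")
    return [bit for bit in range(32) if (mask >> bit) & 1]


def bit_enabled(mask, bit):
    """Return True if the given bit position is set in mask."""
    return bool((mask >> bit) & 1)


if __name__ == "__main__":
    print(set_bits(DEFAULT_TRACE_MASK))
    print(len(set_bits(ALL_EVENTS_MASK)))
```

Comparing the bits set in the default mask with a mask of FFFFFFFF shows how many additional categories a full trace would capture, which is why tracing everything affects performance.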


NOTE

After information is captured, you can use the portshow command to display FTRACE information on a GE port for a tunnel. You can save trace events for future analysis.

Displaying the trace for a tunnel

1. Log on to the switch as admin.

2. Enter the portShow -ftrace command with the following options:

portshow -ftrace ge0 -stats

This displays the trace stats for the GE port 0 for tunnel 1.

The configuration file includes key FCIP FTRACE configuration values. Configurations are stored on a slot basis and not on blades, such as the FR4-18i. If the FR4-18i is swapped, the configuration stays the same for the new FR4-18i installed in that slot.

When performing a configDownload, the FCIP configuration is applied to the switch only on a slot power OFF or ON, for example slots containing the FR4-18i. The Brocade 7500, which is not slot based, requires a reboot. See the Fabric OS Command Reference for more information on any of these commands.

FTRACE is a support tool used primarily by your switch support provider. FTRACE includes the ability to freeze traces on certain events, and to retain the trace information for future examination. The syntax for the portCfg ftrace command is as follows:

portCfg ftrace [slot/]ge0|ge1 tunnel_Id cfg [-a 1|0] [-b value] [-e 1|0] [-i value] [-p value] [-r value] [-s value] [-t value] [-z value]

Where:

slot The number of a slot in a 48000 or DCX director chassis that contains an FR4-18i blade. This parameter does not apply to the stand-alone 7500.

ge0|ge1 The Ethernet port used by the tunnel (ge0 or ge1).

tunnel_Id The tunnel number (0 - 7).

cfg Creates an FTRACE configuration.

-a 1|0 Enables or disables ACO.

-b value Number of buffers (range 0 to 8).

-e 1|0 Enables or disables FTRACE.

-i value Display mask value (range 0 to FFFFFFFF). Default is FFFFFFFF.

-p value Post trigger percentage value (range 0-100). Default is 5.

-r value Number of records (range 0 through 1,677,721). Default is 200000.

-s value Trigger mask value (range 00000000 to FFFFFFFF). Default is 00000003.

-t value Trace mask value (range 00000000 to FFFFFFFF). Default is 80000C7B.

-z value Trace record size (range 80 to 240 bytes). Default is 80 bytes.


The following example configures FTRACE with ACO disabled, and FTRACE enabled with a trigger mask value of 00000003, and a trace mask value of ffffffff.

portcfg ftrace ge0 3 cfg -a 0 -e 1 -p 5 -s 00000003 -t ffffffff
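When the same FTRACE settings are applied to several tunnels, it can be convenient to generate the command line and validate the values first. The Python sketch below builds a portcfg ftrace ... cfg string from the syntax and ranges listed above and covers only the -a, -e, -p, -s, and -t options. The function name build_ftrace_cmd and its keyword arguments are hypothetical, and the sketch only produces text; it does not contact a switch.

```python
def build_ftrace_cmd(ge_port, tunnel_id, slot=None, aco=None, enable=None,
                     post_pct=None, trigger_mask=None, trace_mask=None):
    """Build a 'portcfg ftrace ... cfg' command string after range checks."""
    if ge_port not in ("ge0", "ge1"):
        raise ValueError("ge_port must be 'ge0' or 'ge1'")
    if not 0 <= tunnel_id <= 7:
        raise ValueError("tunnel_id must be in the range 0-7")
    port = f"{slot}/{ge_port}" if slot is not None else ge_port
    parts = ["portcfg", "ftrace", port, str(tunnel_id), "cfg"]
    if aco is not None:
        parts += ["-a", "1" if aco else "0"]
    if enable is not None:
        parts += ["-e", "1" if enable else "0"]
    if post_pct is not None:
        if not 0 <= post_pct <= 100:
            raise ValueError("post_pct must be in the range 0-100")
        parts += ["-p", str(post_pct)]
    for flag, mask in (("-s", trigger_mask), ("-t", trace_mask)):
        if mask is not None:
            if not 0 <= mask <= 0xFFFFFFFF:
                raise ValueError(f"{flag} mask must fit in 32 bits")
            parts += [flag, f"{mask:08x}"]  # eight hex digits, no 0x prefix
    return " ".join(parts)


if __name__ == "__main__":
    print(build_ftrace_cmd("ge0", 3, aco=False, enable=True, post_pct=5,
                           trigger_mask=0x00000003, trace_mask=0xFFFFFFFF))
```

With the arguments shown in the usage lines, the function reproduces the example command above. Always confirm the generated line against the Fabric OS Command Reference for your release before entering it.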

Configuring FTRACE for a tunnel

Use the following syntax to configure a trace:

portcfg -ftrace [slot-number] ge port number [tunnel -id] [cfg|del] <opt args>

Enabling a trace

1. Log on to the switch as admin.

2. Enter the portCfg -ftrace command with the following options:

portcfg -ftrace ge0 cfg -a 0 -e 1

This disables Auto Checkout and enables trace for GigE 0, tunnel 1.

Deleting a configuration for a tunnel

1. Log on to the switch as admin.

2. Enter the portCfg -ftrace command with the following options:

portcfg -ftrace ge1 1 del

This deletes the configuration for tunnel 1.

Displaying FTRACE for a tunnel

The portShow command uses the ftrace option to display a trace for a tunnel.

Use the following syntax to display a trace:

portshow -ftrace [slot-number] ge port number [tunnel -id] [cfg|del] <opt args>


Chapter

FICON Fabric Issues

This chapter discusses FICON issues, recommended actions, and additional information you should gather to fix your issue. Any information you need to verify that FICON has been set up correctly can be found in the Fabric OS Administrator’s Guide.

In this chapter

•FICON issues . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79

•Troubleshooting FICON . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80

FICON issues

Symptom The Control Unit Port cannot access the switch.

Probable cause and recommended action

A two byte CHPID (channel path identifier) link is defined using a Domain and Port ID that must remain consistent. Any change in the physical link such as domain or port ID will prevent storage Control Unit access.

Use the configure command to verify and set the Insistent Domain ID parameter.

FICON:admin> configure

Configure...

Fabric parameters (yes, y, no, n): [no] y

Domain: (1..239) [97]
R_A_TOV: (4000..120000) [10000]
E_D_TOV: (1000..5000) [2000]
WAN_TOV: (0..30000) [0]
MAX_HOPS: (7..19) [7]
Data field size: (256..2112) [2112]
Sequence Level Switching: (0..1) [0]
Disable Device Probing: (0..1) [0]
Suppress Class F Traffic: (0..1) [0]
Per-frame Route Priority: (0..1) [0]
Long Distance Fabric: (0..1) [0]
BB credit: (1..27) [16]

Insistent Domain ID Mode (yes, y, no, n): [yes] <== this should be set to ‘y’

[truncated output]

FICON:admin> configure


Symptom Packets are being dropped between two FICON units.

Probable cause and recommended action

When planning cabling, the following criteria must be considered.

• Distance considerations

• Fibre Optics Sub Assembly (FOSA) type (SW or LW)

• Cable specifications (SM or MM)

• Patch Panel Connections between FOSA ports (link loss .3-5 dB per)

• Maximum allowable link budget (dB) loss

From a cabling point of view, the most important factor of a Fibre Channel link is the selection of the Fibre Optical Sub Assembly (FOSA) and matching cable type, to support the required distance. Both ends of the optical link must have the matching FOSA (SFP) types.

Troubleshooting FICON

This section provides information gathering and troubleshooting techniques necessary to fix your problem.

General information to gather for all cases

The following information needs to be gathered for all FICON setups:

• The standard support commands (portLogDump, supportSave, supportShow) or the Fabric Manager Event Log.

By default, the FICON group in the supportShow output is disabled. To enable the capture of FICON data in the supportShow output, enter the supportShowCfgEnable ficon command. After you get confirmation that the configuration has been updated, the following will be collected and appear in the output for the supportShow command:

- ficonCupShow fmsmode

- ficonCupShow modereg

- ficonDbg dump rnid

- ficonDbg log

- ficonShow ilir

- ficonShow lirr

- ficonShow rlir

- ficonShow rnid

- ficonShow switchrnid

- ficuCmd dump -A

• Check to make sure supportshow is configured for FICON.

• Execute supportSave to capture the supportShow output, the errDumpAll output, and any RAS logs. Execute it on only one logical switch in each chassis, because data is collected for both logical switches. A known defect causes the supportShow data to be invalid if the command is run simultaneously on both logical switches.


• The supportShow data is valid only if it is captured within about 30 minutes of the failure.

• Provide the IOCDS mainframe file.

This will define how all mainframe ports are configured.

• Type of mainframe involved. Need make, model, and driver levels in use.

• Type of actual storage array installed. Many arrays emulate a certain type of IBM array, and we need to know the exact make, model, and firmware of the array in use.

• Read Brocade Release Notes for specific version of Fabric OS being installed.

The following sources provide useful problem-solving information:

• The standard support commands (portLogDump, supportSave, supportShow) or the Fabric Manager Event Log.

• Other detailed information for protocol-specific problems:

- Display port data structures using the ptDataShow command.

- Display port registers using the ptRegShow command.

Identifying ports

The ficonShow rlir command displays, among other information, a tag field for the switch port. You can use this tag to identify the port on which a FICON link incident occurred. The tag field is a concatenation of the switch domain ID and port number, in hexadecimal format. The following example shows a link incident for the switch port at domain ID 120 (78 in hex), port 93 (5d in hex), which gives the tag 785d:

switch:admin> ficonshow rlir
{
{Fmt  Type PID    Port Incident Count TS Format   Time Stamp
 0x18 F    785d00 93   1              Time server Thu Apr 22 09:13:32 2004
Port Status: Link not operational
Link Failure Type: Loss of signal or synchronization
Registered Port WWN     Registered Node WWN     Flag Node Parameters
50:05:07:64:01:40:16:03 50:05:07:64:00:c1:69:ca 0x10 0x200115
Type number: 002064
Model number: 103
Manufacturer: IBM
Plant of Manufacture: 02
Sequence Number: 0000000169CA
tag: 155d
Switch Port WWN         Switch Node WWN         Flag Node Parameters
20:5d:00:60:69:80:45:7c 10:00:00:60:69:80:45:7c 0x00 0x200a5d
Type number: SLKWRM
Model number: 24K
Manufacturer: BRD
Plant of Manufacture: CA
Sequence Number: 000000000078
tag: 785d
}
}
The Local RLIR database has 1 entry.
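The tag decoding described above can be sketched in a few lines of Python. This is a minimal illustration, not a Brocade tool. The function names are hypothetical, and the sketch assumes the tag is always four hexadecimal digits, with the first two holding the domain ID and the last two holding the port number, as in the example.

```python
def decode_tag(tag):
    """Split a four-hex-digit RLIR tag into (domain_id, port_number)."""
    if len(tag) != 4:
        raise ValueError("expected four hexadecimal digits, got %r" % tag)
    value = int(tag, 16)
    # High byte is the domain ID, low byte is the port number.
    return value >> 8, value & 0xFF


def encode_tag(domain_id, port_number):
    """Build the tag string for a domain ID and port number (each 0..255)."""
    return "%02x%02x" % (domain_id, port_number)


print(decode_tag("785d"))   # (120, 93)
print(encode_tag(120, 93))  # 785d
```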


Single-switch topology checklist

Verify the following items in your FICON environment to ensure that the feature functions properly:

• Brocade switch Fabric OS v4.1.2 or later release.

• Management tool - Suggested: Brocade Fabric Manager (FM) v4.1.0 or later.

• No license is required to enable FICON support.

• There is no special mode setting for FICON.

There is no requirement to have a secure fabric in a single switch topology.

Brocade Advanced features software package (Advanced Zoning, Trunking, Fabric Watch, Extended Fabric) license activation is required.

Cascade mode topology checklist

Verify the following items in your FICON environment to ensure that the feature functions properly:

• Brocade switch Fabric OS 5.1.0 or later release.

• Management tool - Suggested: Brocade Fabric Manager (FM) v5.4.0 or later.

• No license is required to enable FICON support.

• There is no special mode setting for FICON. However, it is recommended that the dynamic load-sharing feature be disabled with in-order frame delivery (IOD) enabled (default).

• When configuring the fabric for intermix mode of operations, separate zones for FICON and FCP devices are recommended.

• The Mainframe Channel device connectivity rule of maximum one hop is applicable to both FCP and FICON devices.

• Insistent Domain ID Flag must be set to keep the Domain ID of a Fabric switch persistent.

• CHPID Link Path must be defined using the two-byte format.

• FICON Channel connectivity to storage CU port must not exceed one hop.

The Switch Connection Control (SCC) security policy must be active.

Brocade Advanced features software package (Advanced Zoning, Trunking, Fabric Watch, Extended fabric) license activation is required.

Gathering additional information

• Is this case logged during an initial install or has this environment been working previously?

• What was changed immediately prior to issue occurring?

• Is the switch properly configured for FICON environment?

Also refer to Fabric OS Administrator’s Guide and the most recent version of the Fabric OS Release Notes for notes on FICON setup and configuration.


• Is this a single-switch or cascaded environment?

• If this is a cascaded FICON installation, you must have security policies enabled.

• Is IDID (Insistent Domain ID) set? This parameter must be set for cascaded (multiple-switch) FICON configurations. It is a best practice to set this parameter in all FICON configurations.

• Is the FICON group enabled for supportshow?

Check at the top of the supportShow output. If the FICON group is not enabled, enter the supportShowCfgEnable ficon command and rerun the test that was failing.

If this setting is not set to port-based routing on Condor-based switches in a FICON fabric, you will experience excessive interface control checks (IFCCs) on the mainframe whenever a blade or CP is hot-plugged or unplugged.

• Dynamic Load Sharing (DLS) MUST be disabled with the dlsReset command.

If DLS is enabled, traffic on existing ISL ports might be affected when one or more new ISLs are added between the same two switches. Specifically, adding a new ISL might result in dropped frames as routes are adjusted to take advantage of the added bandwidth. Disabling DLS prevents these dropped frames. (In the supportShow output, search for “route.stickyRoutes” and check for a value of “1”.)

• IOD MUST be enabled with the iodSet command to ensure in-order delivery.

In the supportShow output, search for route.delayReroute and check for a value of 1. This indicates that the feature is turned on.
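Checking the two routing keys in a saved supportShow capture can be scripted. The Python sketch below is a hypothetical helper, not part of Fabric OS. It assumes the keys appear one per line in the form key:value (for example, route.stickyRoutes:1); adjust the pattern if your capture is formatted differently.

```python
import re

# Values this guide recommends for a FICON fabric: DLS off, IOD on.
EXPECTED = {"route.stickyRoutes": 1, "route.delayReroute": 1}

# Assumed line format: "<key>:<integer>", optionally padded with spaces.
KEY_PATTERN = re.compile(
    r"^\s*(route\.(?:stickyRoutes|delayReroute))\s*:\s*(\d+)", re.MULTILINE)


def check_routing_keys(text):
    """Return {key: (value_or_None, matches_expected)} for both keys."""
    found = {key: int(value) for key, value in KEY_PATTERN.findall(text)}
    return {key: (found.get(key), found.get(key) == want)
            for key, want in EXPECTED.items()}


# Made-up sample text, standing in for a saved supportShow capture.
sample = "route.delayReroute:1\nroute.stickyRoutes:0\n"
for key, (value, ok) in check_routing_keys(sample).items():
    print(key, value, "OK" if ok else "CHECK")
```

A key reported as CHECK is either missing from the capture or set to a value other than 1; confirm the setting on the switch before changing it.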

Troubleshooting FICON CUP

This section provides additional information you need to verify and data to gather for a FICON CUP environment.

• Capture all data from the General section above.

• Make sure FICON CUP license is installed.

• Check the state of the CUP port by running the ficonCupShow fmsmode command. If it is disabled, type the ficonCupSet fmsmode enable command to enable it.

• CUP is supported only on Fabric OS v4.4.0 or later.

• Add FICON_CUP license

• Ensure no device is plugged into port 254 on the Brocade 48000 director.

• Run switchShow and make sure the port shows Disabled (FMS Mode). If it does not, type the portDisable 10/14 command and then the portEnable 10/14 command.

Symptom Unable to “vary online” FICON CUP port on the switch.

Probable cause and recommended action

The only known fix is to run haFailover or haReboot on the switch; there is no known firmware solution.


Symptom Mainframe RMF utility fails to capture performance data

Probable cause and recommended action

On Fabric OS v6.0.0, Brocade SilkWorm switches do not fully implement all of the CUP commands needed to collect all of the performance data on the switch. Upgrade your switch to Fabric OS v6.1.0, where the performance data is captured.

Symptom The switch panics during a firmware upgrade from v5.0.1d to v5.1.0c.

Probable cause and recommended action

Switches may have the following settings:

route.delayReroute:0==> iodReset

route.stickyRoutes:0==> dlsSet

Aptpolicy:1 ==> port-based routing

Fabric OS v5.1.x introduced a new flag during route calculation. After an upgrade from Fabric OS v5.0.x to v5.1.x, a new active CP running Fabric OS v5.1.x uses the old method of calculating the route rather than using the new flag. This causes a route miscalculation and the switch panics. This issue is fixed in Fabric OS v5.2.1 and v5.3.0.

The following will appear in the message logs:

MSR: 00029030 EE: 1 PR: 0 FP: 0 ME: 1 IR/DR: 1/1
TASK = c0f28000[658] 'RTEK_TH' Last syscall: -1
ASSERT - Failed expression: !old_path, file = rte_path.c, line = 1366, kernel mode

Follow the routing guidelines in Table 16 to avoid this issue. Avoid combining port-based routing (external = 1, internal = 2) with DLS on in a setup that has FICON fmsmode enabled. DLS off is recommended for a FICON environment.

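TABLE 16 Tested scenarios for panic or no panic

Routing type                                IOD setting   DLS setting   Result
Port-based (external = 1, internal = 2)     Off           On            Panic
Port-based (external = 1, internal = 2)     On            On            Panic
Port-based (external = 1, internal = 2)     On            Off           No panic
Port-based (external = 1, internal = 2)     Off           Off           No panic
Device-based (external = 3, internal = 3)   Off           n/a           No panic
Device-based (external = 3, internal = 3)   On            n/a           No panic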
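The tested scenarios in Table 16 reduce to one rule: a panic was observed only when port-based routing was combined with DLS on, regardless of the IOD setting. The Python sketch below expresses that rule as a hypothetical pre-upgrade check. It is not a Brocade utility, and it assumes the value meanings given in the settings list for this symptom (Aptpolicy 1 is port-based routing, and route.stickyRoutes 0 corresponds to dlsSet, that is, DLS on).

```python
PORT_BASED = 1  # assumed from "Aptpolicy:1 ==> port-based routing"


def panic_risk(aptpolicy, sticky_routes):
    """Return True for the combinations that Table 16 reports as "Panic".

    aptpolicy     -- external routing policy value (1 = port-based)
    sticky_routes -- route.stickyRoutes value (0 = DLS on, 1 = DLS off)
    """
    dls_on = sticky_routes == 0
    return aptpolicy == PORT_BASED and dls_on


print(panic_risk(aptpolicy=1, sticky_routes=0))  # True: port-based with DLS on
print(panic_risk(aptpolicy=1, sticky_routes=1))  # False: port-based with DLS off
print(panic_risk(aptpolicy=3, sticky_routes=0))  # False: device-based routing
```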

You have to configure the switches according to the Fabric OS Administrator’s Guide.


Symptom After a firmware upgrade from v5.0.x to v5.2.1a, the Brocade 24000 crashed when the new firmware became active.

Probable cause and recommended action

RNID processing enhancement in firmware.

The following message will appear in the switch log:

2007/05/13-07:06:56, [KSWD-1003], 78412/24993, FFDC, WARNING, SilkWorm24000, kSWD: Detected unexpected termination of: ''[4]ficud:0'RfP=712,RgP=712,DfP=0,died=1,rt=87261,dt=36843,to=50000,aJc=35761,aJp=19101,abiJc=1942030008,abiJp=1942013308,aSeq=3,kSeq=0,kJc=0,kJp=0,J=50418,rs=2', swd.c, line: 462, comp:ficud, ltime:2038/01/18-20:31:28

This problem occurred due to the RNID-processing enhancement in the Fabric OS code. The mainframe configuration was sending a large number of RNIDs to the switch, and the switch could not process all of them in a timely manner.

To fix this issue, upgrade to v5.2.2 or v5.3.0.

Troubleshooting FICON NPIV

Capture all pertinent data described in “General information to gather for all cases” and “Gathering additional information.”

NPIV licenses must be installed on v5.0.x. There is no license requirement for Fabric OS v5.1.0 and above.



Contents

About This Document

In this chapter

How this document is organized

Supported hardware and software

What’s new in this document

Document conventions

Text formatting

Notes, cautions, and warnings

Key terms

Additional information

Brocade resources

Other industry resources

Getting technical help

Document feedback

Introduction to Troubleshooting

In this chapter

About troubleshooting

Network time protocol

Most common problem areas

Questions for common symptoms

Gathering information for your switch support provider

Setting up your switch for FTP

Capturing a supportSave

Capturing a supportShow

Capturing output from a console

Capturing command output

Building a case for your switch support provider

Basic switch information

Detailed problem information

Gathering additional information

General Issues

In this chapter

Licensing issues

Switch Message Logs

Checking fan components

Fibre Channel Routing

Checking the switch temperature

Checking the power supply

Checking the temperature, fan, and power supply

Checking for Fibre Channel connectivity problems

Third party applications

Connections Issues

In this chapter

Port initialization and FCP auto discovery process

Link issues

Connection problems

Checking the logical connection

Checking the name server (NS)

Link failures

Determining a successful negotiation

Checking for a loop initialization failure

Checking for a point-to-point initialization failure

Correcting a port that has come up in the wrong mode

Marginal links

Troubleshooting a marginal link

Device login issues

Pinpointing problems with device logins

Media-related issues

Testing a port’s external transmit and receive path

Testing a switch’s internal components

Testing components to and from the HBA

Segmented fabrics

Reconciling fabric parameters individually

Downloading a correct configuration

Reconciling a domain ID conflict

Configuration Issues

In this chapter

Configupload and download issues

Gathering additional information

Brocade configuration form

FirmwareDownload Errors

In this chapter

Blade troubleshooting tips

Firmware download issues

Troubleshooting firmwareDownload