Alcatel-Lucent 6648, 8800, 6624, 6800 User Manual

Part No. 031496-00, Rev. C September 2005
OmniSwitch 6624/6648/6800/7700/7800/8800
Troubleshooting Guide
www.alcatel.com
OmniSwitch Troubleshooting Guide September 2005 i
hardware, including chassis and associated components, and Release 5.1 software.
The specifications described in this guide are subject to change without notice.
Copyright © 2006 by Alcatel Internetworking, Inc. All rights reserved. This document may not be reproduced in whole or in part without the express written permission of Alcatel Internetworking, Inc.
Alcatel® and the Alcatel logo are registered trademarks of Alcatel. Xylan®, OmniSwitch®, OmniStack®, and Alcatel OmniVista® are registered trademarks of Alcatel Internetworking, Inc.
OmniAccess™, Omni Switch/Router™, PolicyView™, RouterView™, SwitchManager™, VoiceView™, WebView™, X-Cell™, X-Vision™, and the Xylan logo are trademarks of Alcatel Internetworking, Inc.
This OmniSwitch product contains components which may be covered by one or more of the following U.S. Patents:
U.S. Patent No. 6,339,830
U.S. Patent No. 6,070,243
U.S. Patent No. 6,061,368
U.S. Patent No. 5,394,402
U.S. Patent No. 6,047,024
U.S. Patent No. 6,314,106
U.S. Patent No. 6,542,507
26801 West Agoura Road
Calabasas, CA 91301
(818) 880-3500 FAX (818) 880-3505
info@ind.alcatel.com
US Customer Support—(800) 995-2696
International Customer Support—(818) 878-4507
Internet—http://eservice.ind.alcatel.com

Contents

About This Guide .........................................................................................................xv
Supported Platforms ......................................................................................................... xv
Who Should Read this Manual? ...................................................................................... xvi
When Should I Read this Manual? .................................................................................. xvi
What is in this Manual? ................................................................................................... xvi
What is Not in this Manual? ...........................................................................................xvii
How is the Information Organized? ...............................................................................xvii
Related Documentation ..................................................................................................xvii
Before Calling Alcatel’s Technical Assistance Center .................................................... xx
Chapter 1 Troubleshooting the Switch System ......................................................................1-1
In This Chapter ................................................................................................................1-1
Introduction .....................................................................................................................1-2
Troubleshooting System for OS-6624/6648 and OS-7/8XXX .......................................1-3
Advanced Troubleshooting .............................................................................................1-9
Dshell Troubleshooting .................................................................................................1-11
Troubleshooting NIs on OmniSwitch 7700/7800/8800 .........................................1-21
OmniSwitch 6624/6648 Dshell Troubleshooting ...................................................1-23
Accessing Dshell on Idle Switches ..................................................................1-25
Using AlcatelDebug.cfg ................................................................................................1-26
Troubleshooting IPC on OS-6/7/8XXX Series of Switches .........................................1-27
Debugging IPC .......................................................................................................1-27
OmniSwitch 6624/6648 Example ..........................................................................1-34
Port Numbering Conversion Overview .........................................................................1-36
ifindex to gport .......................................................................................................1-36
gport to ifindex .......................................................................................................1-36
Converting from lport .............................................................................................1-36
OmniSwitch 7700/7800/8800 (Falcon/Eagle) Example ..................................1-36
OmniSwitch 6624/6648 (Hawk) Example ......................................................1-37
Chapter 2 Troubleshooting Switched Ethernet Connectivity ..............................................2-1
In This Chapter ................................................................................................................2-1
Overview of Troubleshooting Approach ........................................................................2-2
Verify Physical Layer Connectivity ................................................................................2-3
Verify Current Running Configuration ...........................................................................2-5
Verify Source Learning ...................................................................................................2-6
Verify Switch Health .......................................................................................................2-7
Verify ARP ......................................................................................................................2-7
Using the Log File ...........................................................................................................2-8
Checking the 7700/7800 Nantucket Fabric ..............................................................2-8
Checking the 7700/7800 Nantucket Fabric for Interrupts, Data Counts and Error Counts ............................................................................................................2-9
Checking the Traffic Queue on the NI .....................................................................2-9
Check for Catalina (MAC) or Port Lockup ............................................................2-10
Chapter 3 Troubleshooting Source Learning .........................................................................3-1
In This Chapter ................................................................................................................3-1
Introduction .....................................................................................................................3-2
Troubleshooting a Source Learning Problem .................................................................3-3
Advanced Troubleshooting .............................................................................................3-5
Dshell Troubleshooting ...................................................................................................3-7
OS-6600 .................................................................................................................3-10
Chapter 4 Troubleshooting Spanning Tree ............................................................................4-1
In This Chapter ................................................................................................................4-1
Introduction .....................................................................................................................4-1
Troubleshooting Spanning Tree ......................................................................................4-2
Dshell ..............................................................................................................................4-5
Generic Troubleshooting in Dshell ...............................................................................4-10
Event Trace (stpni_traceprint) ................................................................................4-10
PORTATCH ....................................................................................................4-11
PORTDELE .....................................................................................................4-11
ADDVLAN .....................................................................................................4-11
MODVLADM .................................................................................................4-12
MODVLSTP ....................................................................................................4-12
ADDQTAG .....................................................................................................4-12
DELQTAG ......................................................................................................4-12
MDEFVLAN ...................................................................................................4-13
PORTAGGR ....................................................................................................4-13
PORTDISG ......................................................................................................4-13
AGGR_UP .......................................................................................................4-13
AGGRDOWN .................................................................................................4-13
PORTJOIN ......................................................................................................4-14
PORTLEAV ....................................................................................................4-14
BRGPARAM ...................................................................................................4-14
PTSTPMOD ....................................................................................................4-15
PORTMOD ......................................................................................................4-15
PORTVLBK ....................................................................................................4-15
PVLANBLK ....................................................................................................4-15
GMBPDU ........................................................................................................4-16
GMIGBPDU ....................................................................................................4-16
GM2FIXED .....................................................................................................4-17
VMADDVPA ..................................................................................................4-17
VMDELVPA ...................................................................................................4-17
VMDEFVPA ...................................................................................................4-17
TOPOCHGT ....................................................................................................4-18
LINK_UP ........................................................................................................4-18
LINKDOWN ...................................................................................................4-18
NI_UP ..............................................................................................................4-18
NI_DOWN ......................................................................................................4-18
Physical and Logical Port Dumps ..........................................................................4-19
Logical Ports (stpni_debugLport) ....................................................................4-19
Physical Port (stpni_debugPport) ....................................................................4-20
Physical and Logical Port Trace Display (stpni_debugport) ...........................4-22
Socket Handler Traces ...........................................................................................4-22
stpNISock_globals ...........................................................................................4-22
stpNISock_warningprint ..................................................................................4-23
stpNISock_traceprint .......................................................................................4-23
Inter-NI Trace (stpNISock_intraceprint) .........................................................4-24
Time-out Trace (stpNISock_totraceprint) .......................................................4-24
Board Up (stpNISock_boardupprint) ..............................................................4-24
stpNISock_printon ...........................................................................................4-24
StpNISock_printoff .........................................................................................4-24
CMM Spanning Tree Traces ..................................................................................4-25
Trace Menu ......................................................................................................4-25
stpCMM_traceprint .........................................................................................4-25
Writing a PR for Spanning Tree ....................................................................................4-26
Exception in Spanning Tree (NI and CMM case) ..................................................4-26
Port Does Not Forward ..........................................................................................4-26
Spanning Tree Unchanged When Port State Has Changed ....................................4-27
Other Cases ............................................................................................................4-27
Chapter 5 Troubleshooting BOOTP/DHCP/UDP Relay ........................................................5-1
In This Chapter ................................................................................................................5-1
Starting the Troubleshooting Procedure ..........................................................................5-1
Use a Network Diagram ...........................................................................................5-2
Use the OSI Model to Guide Your Troubleshooting ...............................................5-2
UDP Relay Configuration Problems ........................................................................5-2
Incorrect Server IP Address ...............................................................................5-2
Forward Delay Timer ........................................................................................5-3
Displaying DHCP Statistics ..............................................................................5-3
UDP Relay and Group Mobility ...............................................................................5-4
Advanced Troubleshooting for UDP Relay ....................................................................5-5
Dshell ..............................................................................................................................5-6
Chapter 6 Troubleshooting DNS ................................................................................................6-1
In This Chapter ................................................................................................................6-1
Introduction .....................................................................................................................6-1
Troubleshooting a DNS Failure ......................................................................................6-2
Starting the Troubleshooting Procedure ...................................................................6-2
Layer 7 DNS or Name Resolution Issue ..................................................................6-2
DNS Configuration Considerations ................................................................................6-3
Chapter 7 Troubleshooting Link Aggregation .......................................................................7-1
In This Chapter ................................................................................................................7-1
Link Aggregation Limits and Guidelines ........................................................................7-2
OmniSwitch 6624/6648 Restrictions .......................................................................7-2
Troubleshooting a Link Aggregation Failure ..................................................................7-3
Verify the Configuration ..........................................................................................7-3
Source Learning .......................................................................................................7-5
Link Aggregation Affecting Other Traffic ...............................................................7-5
Problems Creating a Group ......................................................................................7-5
Problems Deleting a Group ......................................................................................7-5
LACP 802.3AD ........................................................................................................7-6
Advanced Link Aggregation Troubleshooting ................................................................7-7
6800 Link Aggregation Debug Functions .....................................................................7-10
la_ni_agg_prt ..........................................................................................................7-10
la_ni_port_prt .........................................................................................................7-10
la_ni_port_up_prt ...................................................................................................7-11
la_ni_port_stats_prt ................................................................................................7-11
la_ni_info ...............................................................................................................7-11
lagg_ni_Sock_help .................................................................................................7-11
la_ni_trace_freeze ..................................................................................................7-12
la_ni_trace_unfreeze ..............................................................................................7-12
la_ni_kite_help .......................................................................................................7-12
Chapter 8 Troubleshooting 802.1Q ..........................................................................................8-1
In This Chapter ................................................................................................................8-1
Troubleshooting 802.1Q .................................................................................................8-2
Default VLAN Traffic ..............................................................................................8-3
Tagged Packet on an Untagged Port ........................................................................8-3
802.1Q with VLAN ID of 0 ..............................................................................8-4
802.1Q and 64 Byte Packets ..............................................................................8-4
Advanced Troubleshooting .............................................................................................8-5
Dshell Commands ...........................................................................................................8-7
Chapter 9 Troubleshooting Group Mobility ............................................................................9-1
In This Chapter ................................................................................................................9-1
Troubleshooting a VLAN Mobility Failure ....................................................................9-2
Binding Rules ...........................................................................................................9-3
Port Rules .................................................................................................................9-3
Precedence ................................................................................................................9-4
Advanced Troubleshooting .............................................................................................9-5
Dshell ..............................................................................................................................9-6
NI Debug Dshell .......................................................................................................9-6
6800 Group Mobility Troubleshooting ...........................................................................9-7
show vlan rules .........................................................................................................9-7
gmHelp .....................................................................................................................9-7
gmcKiteDebug .........................................................................................................9-8
gmcShowPorts ..........................................................................................................9-8
gmcShowRules .........................................................................................................9-8
gmnKiteDebug .........................................................................................................9-9
gmnKiteShowRules ..................................................................................................9-9
gmnMacVlanShowBuffer ........................................................................................9-9
Chapter 10 Troubleshooting QoS ...............................................................................................10-1
In This Chapter ..............................................................................................................10-1
QoS Behavior ................................................................................................................10-2
Default ....................................................................................................................10-2
QoS Queues and Ports ............................................................................................10-2
Troubleshooting QoS ....................................................................................................10-3
Information Gathering on Symptoms and Recent Changes ...................................10-3
Starting the Troubleshooting Procedure .................................................................10-3
QoS Activation .......................................................................................................10-3
QoS Apply ..............................................................................................................10-4
Invalid Policies .......................................................................................................10-4
Rules Order ............................................................................................................10-4
Viewing QoS Settings ............................................................................................10-5
Viewing QoS Policy Rules .....................................................................................10-5
Validation ...............................................................................................................10-6
Example 1 ........................................................................................................10-6
Example 2 ........................................................................................................10-6
Example 3 ........................................................................................................10-7
Correction ...............................................................................................................10-8
Reflexive Rules ......................................................................................................10-8
QoS Log .................................................................................................................10-9
QoS Statistics .......................................................................................................10-11
Debug QoS ...........................................................................................................10-11
Debug QoS Internal ..............................................................................................10-12
OmniSwitch 6624/6648 Dshell Troubleshooting .................................................10-13
qosIxHelp ......................................................................................................10-13
qosDBState ....................................................................................................10-13
QoS Dump .....................................................................................................10-13
Example QoS Rules ....................................................................................................10-15
Chapter 11 Troubleshooting ARP ...............................................................................................11-1
In This Chapter ..............................................................................................................11-1
ARP Protocol Failure ....................................................................................................11-2
Common Error Conditions ............................................................................................11-5
Advanced ARP Troubleshooting ..................................................................................11-6
Dshell Troubleshooting .................................................................................................11-8
Viewing the ARP Table on OmniSwitch 6624/6648 Switches ............................11-10
Chapter 12 Troubleshooting IP Routing ...................................................................................12-1
In This Chapter ..............................................................................................................12-2
Introduction ...................................................................................................................12-3
IP Routing Protocol Failure ..........................................................................................12-3
Troubleshooting via the CLI .........................................................................................12-3
Troubleshooting with Debug CLI ...............................................................................12-11
RIP Troubleshooting ...................................................................................................12-13
OSPF Troubleshooting ................................................................................................12-19
BGP Troubleshooting ..................................................................................................12-27
Dshell Troubleshooting Advanced IP Routing ...........................................................12-29
ipdbg=x .................................................................................................................12-29
ifShow ..................................................................................................................12-29
iprmShowRoutes ..................................................................................................12-30
iprmCountRoutes .................................................................................................12-30
ipni_ifShow ..........................................................................................................12-30
Iprm_routeShow ...................................................................................................12-31
Ipni_routeCount ...................................................................................................12-31
ospfDbgDumpEnv ................................................................................................12-31
Chapter 13 Troubleshooting Virtual Router Redundancy Protocol (VRRP) ....................13-1
In This Chapter ..............................................................................................................13-1
Overview .......................................................................................................................13-2
Protocol Information .....................................................................................................13-3
IP Field Descriptions ..............................................................................................13-3
VRRP Field Descriptions .......................................................................................13-3
VRRP States ...........................................................................................................13-3
OmniSwitch 7700/7800/8800 Implementation .............................................................13-4
VRRP Security .......................................................................................................13-4
OmniSwitch VRRP Limitations .............................................................................13-4
CMM Failover ...............................................................................................................13-5
OmniSwitch VRRP Troubleshooting ............................................................................13-9
ARP Table ...................................................................................................................13-10
Dshell Troubleshooting ...............................................................................................13-11
Chapter 14 Troubleshooting IP Multicast Switching (IPMS) ...............................................14-1
In This Chapter ..............................................................................................................14-1
Troubleshooting a Device that Cannot Join an IP Multicast Stream ............................14-2
Troubleshooting a Device that Drops Out of an IP Multicast Stream ..........................14-3
Troubleshooting IPMS in Debug CLI ...........................................................................14-7
Dshell Troubleshooting .................................................................................................14-9
Chapter 15 Troubleshooting DVMRP ........................................................................................15-1
In This Chapter ..............................................................................................................15-1
Introduction ...................................................................................................................15-2
DVMRP Troubleshooting .............................................................................................15-2
DVMRP Global and Interface Commands .............................................................15-2
DVMRP Debug Commands ...................................................................................15-4
Chapter 16 Troubleshooting PIM-SM ........................................................................................16-1
In This Chapter ..............................................................................................................16-1
Introduction ...................................................................................................................16-2
Definition of Terms .......................................................................................................16-2
Protocol Overview ........................................................................................................16-3
DR Election ............................................................................................................16-3
Simplified Hello Message Format ...................................................................16-3
Debugging Hello Messages .............................................................................16-4
Related CLI Command ....................................................................................16-5
BSR Election .................................................................................................................16-6
Simplified Packet Format .......................................................................................16-7
Debugging BSR/Bootstrap .....................................................................................16-7
Election of a New BSR ....................................................................................16-8
Related CLI Command ....................................................................................16-9
C-RP Advertisements ..................................................................................................16-10
Simplified RP-Advertisement Packet Format ......................................................16-10
Debugging C-RP-Adv ..........................................................................................16-11
Related CLI Command ..................................................................................16-12
RP-SET .......................................................................................................................16-13
Simplified Bootstrap RP-SET Packet Taken on a 192.168.12/24 Network .........16-14
Debugging RP-SET ..............................................................................................16-16
On Non BSR You Should See .......................................................................16-16
Related CLI Command ..................................................................................16-17
Join/Prune ....................................................................................................................16-18
Simplified Join Packet ..........................................................................................16-18
Simplified PRUNE Packet ...................................................................................16-20
Debugging JOIN/PRUNE Event ..........................................................................16-20
Register .......................................................................................................................16-21
Simplified REGISTER Packet Format .................................................................16-22
Shared Tree .................................................................................................................16-23
Related CLI Command ..................................................................................16-24
Source-Based Tree ......................................................................................................16-25
Related CLI Command ..................................................................................16-26
Troubleshooting Examples: Limitations .....................................................................16-27
Incorrect BSR ID ..................................................................................................16-27
Multicast Group Status is Shown as Disabled .....................................................16-27
PIM-SM Limitations ............................................................................................16-28
Upstream Neighbor/Next Hop Debug Commands ...............................................16-28
Chapter 17 Troubleshooting Server Load Balancing ...........................................................17-1
In This Chapter ..............................................................................................................17-1
Introduction ...................................................................................................................17-2
Server Load Balance Failure .........................................................................................17-2
What is an SLB Failure? ........................................................................................17-2
Description of a Complete Failure of Service ........................................................17-2
Description of a Partial Failure of Service .............................................................17-2
Troubleshooting Commands .........................................................................................17-3
Troubleshooting a Complete Failure .............................................................................17-4
Troubleshooting a Partial Failure ..................................................................................17-5
The Troubleshooting Procedure ....................................................................................17-5
Chapter 18 Troubleshooting Authenticated VLANs ..............................................................18-1
In This Chapter ..............................................................................................................18-1
Introduction ...................................................................................................................18-1
Troubleshooting AVLAN .............................................................................................18-2
DHCP Request Failure ...........................................................................................18-2
Authentication Failure ............................................................................................18-3
Problem Communicating Using Multiple Protocols Simultaneously ....................18-4
Useful Notes on Client Issues ................................................................................18-5
Troubleshooting Using Debug Systrace ........................................................................18-5
Telnet Authentication and De-authentication ........................................................18-5
Get the IP Address from Default VLAN .........................................................18-5
Initiate the Telnet Authentication ....................................................................18-6
Release/Renew IP ............................................................................................18-7
De-Authenticating ...........................................................................................18-7
Release/Renew to Go Back to Default VLAN ................................................18-7
HTTP/S Authentication ..........................................................................................18-8
Start of Authentication using https://x.x.x.253 ................................................18-8
De-Authenticate using https://x.x.x.253 ..........................................................18-9
AVClient ..............................................................................................................18-10
AVClient Authentication Start ......................................................................18-10
AVClient logout: ...........................................................................................18-11
Dshell Troubleshooting ...............................................................................................18-12
Authentication Dispatcher (AD) Debugging Help ...............................................18-12
The Authenticated VLAN adDebugShowContext Function ................................18-13
Chapter 19 Troubleshooting 802.1X .........................................................................................19-1
In This Chapter ..............................................................................................................19-1
Troubleshooting with the CLI .......................................................................................19-2
Troubleshooting Using Debug CLI ...............................................................................19-4
Dshell Troubleshooting .................................................................................................19-7
Appendix A OS6600/OS7700/OS8800 Architecture Overview ............................................A-1
In This Chapter ...............................................................................................................A-1
The MAC ASIC .............................................................................................................A-2
Catalina ....................................................................................................................A-2
Firenze .....................................................................................................................A-4
The Coronado ASIC ................................................................................................A-5
Functional Description ............................................................................................ A-6
Coronado: The “Brain” of the System ..............................................................A-7
Coronado Specifications ...................................................................................A-7
Software Module Interaction ............................................................................A-8
Queue Driver Interaction ................................................................................................A-8
Ethernet Driver ........................................................................................................A-8
Queue Dispatcher ....................................................................................................A-8
NI Supervision .........................................................................................................A-9
Source Learning ......................................................................................................A-9
L3 Manager/IPMS ................................................................................................... A-9
QoS Manager ........................................................................................................... A-9
Destination MAC Learning .............................................................................. A-9
L3 Pseudo CAM Learning ................................................................................A-9
QoS Policy Change ........................................................................................... A-9
QoS Policy Deleted ........................................................................................A-10
L2 destination MAC Aged/Deleted ................................................................A-10
L3 PseudoCAM Entry Aged/Deleted ............................................................. A-10
Request to Free Queues Sent to QoS Manager ..............................................A-10
Link Goes Up/Down ....................................................................................... A-10
Link Aggregation .........................................................................................................A-11
Coronado Tables .......................................................................................................... A-11
Layer 2 Tables .......................................................................................................A-11
Layer 3 Tables .......................................................................................................A-11
Source Learning ...........................................................................................................A-12
Hardware Routing Engine (HRE) ................................................................................A-13
QoS/Policy Manager ....................................................................................................A-15
Coronado Egress Logic ................................................................................................ A-15
The Fabric Architecture ...............................................................................................A-16
Nantucket ASIC ...........................................................................................................A-17
Additional Nantucket Specifications .....................................................................A-17
Functional Description: ..................................................................................A-18
Data Flow .......................................................................................................A-18
Calendar Manager Module ....................................................................................A-19
Data Port Output Module ......................................................................................A-19
Nantucket Redundancy .........................................................................................A-19
Roma ............................................................................................................................A-22
Functional Description .......................................................................................... A-23
Initialization ....................................................................................................A-24
NI Slot Insertion ............................................................................................. A-25
Setup Calendars and Flow Control for New NI .............................................A-25
NI Slot Extraction ...........................................................................................A-25
CMM Takeover and Hot Swap ............................................................................. A-25
Framing Error ........................................................................................................A-26
Chassis Management Module (CMM) .........................................................................A-26
OS7000 CMM .......................................................................................................A-27
OS8800 CMM .......................................................................................................A-27
Functional Description of CMM ...........................................................................A-28
CMM Software Startup Process ..................................................................... A-28
AOS ................................................................................................................A-29
MiniBoot ........................................................................................................A-30
AOS Start ........................................................................................................A-30
Chassis Manager Component of System Services ................................................A-30
CMM Reload of NI Module ..................................................................................A-30
Overall System Architecture .................................................................................A-32
Packet Walk .................................................................................................................A-34
Packet Walk Principles ..........................................................................................A-34
Data Flow Overview .............................................................................................A-34
Specific Packet Flows ..................................................................................................A-35
Unknown L2 Source, Known L2 Destination .......................................................A-35
The Catalina ASIC .........................................................................................A-35
The Coronado ASIC .......................................................................................A-35
The Nantucket ASIC ......................................................................................A-35
The Coronado ASIC .......................................................................................A-35
The Catalina ASIC .........................................................................................A-35
Unknown Destination ................................................................................................... A-36
Known L2 Source, Unknown L2 Destination .......................................................A-36
The Catalina ASIC .........................................................................................A-36
The Coronado ASIC .......................................................................................A-36
The Nantucket ASIC ......................................................................................A-36
The Coronado ASIC .......................................................................................A-37
The Catalina ASIC .........................................................................................A-37
Traffic is Being Passed; the Switch is Attempting to Put a Correct L2 DA Entry on the NI .....................................................................................................A-37
The Coronado ASIC .......................................................................................A-37
Unknown L3 DA ...................................................................................................A-38
The Coronado ASIC .......................................................................................A-38
Hardware Buses on OmniSwitch 7700/7800/8800 Switches .......................................A-41
Xybus ....................................................................................................................A-41
Fbus .......................................................................................................................A-41
Bbus .......................................................................................................................A-41
Bus Mapping on OmniSwitch 7700/7800/8800 Switches ...........................................A-42
Xybus Mapping .....................................................................................................A-42
Fbus Mapping ........................................................................................................A-42
Falcon (OmniSwitch 7700/7800) Fbus Mapping ...........................................A-42
Eagle (OmniSwitch 8800) Fbus Mapping ......................................................A-42
OS6624/6648 Architecture ...........................................................................................A-43
Hardware Architectural Overview ........................................................................A-44
Layer 2 Forwarding ...............................................................................................A-46
Address Resolution Protocol .......................................................................... A-46
Address Learning ............................................................................................A-47
Location of Address Tables ............................................................................A-47
Address Look-up Methodology ...................................................................... A-48
L2 Data Structures ..........................................................................................A-48
3-Protocol Entry .............................................................................................A-49
Layer 3 Forwarding ...............................................................................................A-50
VLANs ..................................................................................................................A-51
Port Based VLANs ......................................................................................... A-51
Protocol Based VLANs ..................................................................................A-51
Address Based VLANs ...................................................................................A-51
Tag Net ID Entry ............................................................................................A-52
Priority ............................................................................................................A-52
802.1p Priority ................................................................................................A-52
Rules-Based Priority .......................................................................................A-53
QOS Flow .......................................................................................................A-53
Bandwidth Management and QoS ..................................................................A-53
CMM Functionality for OS6600 ..................................................................................A-54
OS6600 IPC Communication .......................................................................................A-58
OS6600 BOOT Sequence ............................................................................................A-59
Appendix B Debug Commands ..................................................................................................... B-1
Appendix C Technical Support Commands ............................................................................... C-1
Appendix D Modifying Files with VI Editor ................................................................................D-1
In This Chapter ...............................................................................................................D-1
Useful VI Commands .....................................................................................................D-2
Sample VI Session ......................................................................................................... D-3
Index ...................................................................................................................... Index-1

About This Guide

This OmniSwitch Troubleshooting Guide describes how to use Command Line Interface (CLI) and Dshell commands available on the OmniSwitch 6600 Family, OmniSwitch 6800 Series, OmniSwitch 7700/7800, and the OmniSwitch 8800 to troubleshoot switch and network problems.

Supported Platforms

The information in this guide applies to the following products:
OmniSwitch 6624 (OmniSwitch 6600-24)
OmniSwitch 6648 (OmniSwitch 6600-48)
OmniSwitch 6600-P24
OmniSwitch 6600-U24
OmniSwitch 6602-24
OmniSwitch 6602-48
OmniSwitch 6800
OmniSwitch 7700
OmniSwitch 7800
OmniSwitch 8800
Note. All references to OmniSwitch 6624 and 6648 switches also apply to the OmniSwitch 6600-P24, OmniSwitch 6600-U24, OmniSwitch 6602-24, and OmniSwitch 6602-48 unless specified otherwise.
Unsupported Platforms
The information in this guide does not apply to the following products:
OmniSwitch (original version with no numeric model name)
Omni Switch/Router
OmniStack
OmniAccess
Note. Troubleshooting documentation for legacy products (e.g., Omni Switch/Router) can be downloaded at http://support.ind.alcatel.com/releasefiles/indexpage.cfm.

Who Should Read this Manual?

The principal audience for this user guide is Service and Support personnel who need to troubleshoot switch problems in a live network. In addition, network administrators and IT support personnel who need to configure and maintain switches and routers can use this guide to troubleshoot a problem upon advice from Alcatel Service and Support personnel.
However, this guide is not intended for novice or first-time users of Alcatel OmniSwitches. Misuse or failure to follow procedures in this guide correctly can cause lengthy network down time and/or permanent damage to hardware. Exercise caution when distributing this document.

When Should I Read this Manual?

Always read the appropriate section or sections of this guide before you log into a switch to troubleshoot problems. Once you are familiar with the commands and procedures in the appropriate sections you can use this document as reference material when you troubleshoot a problem.

What is in this Manual?

The principal sections (i.e., the numbered chapters) use CLI and Dshell commands to analyze and troubleshoot switch problems. Each section documents a specific switch feature (e.g., hardware, server load balancing, routing).
Note. Dshell commands should only be used by Alcatel personnel or under the direction of Alcatel. Misuse or failure to follow procedures that use Dshell commands in this guide correctly can cause lengthy network down time and/or permanent damage to hardware.
Appendix A provides an architecture overview for the OmniSwitch 6600 Family, OmniSwitch 7700/7800, and the OmniSwitch 8800.
Appendices B and C provide the following for debug and technical support CLI commands:
Command description.
Syntax.
Description of keywords and variables included in the syntax.
Default values.
Usage guidelines, which include tips on when and how to use the command.
Examples of command lines using the command.
Related commands.
Release history, which indicates the release when the command was introduced.
Appendix D provides a list of useful VI editor commands and a sample VI session that modifies the boot.params file.

What is Not in this Manual?

This guide is intended for troubleshooting switches in live networks. It does not provide step-by-step instructions on how to set up particular features on the switch or a comprehensive reference to all CLI commands available in the OmniSwitch. For detailed syntax on non-debug CLI commands and comprehensive information on how to configure particular software features in the switch, consult the user guides, which are listed in “Related Documentation” on page xvii.

How is the Information Organized?

Each chapter in this guide includes troubleshooting guidelines related to a single software feature, such as server load balancing or link aggregation.

Related Documentation

The following are the titles and descriptions of all the Release 5.1 and later OmniSwitch user guides:
OmniSwitch 6600 Family Getting Started Guide
Describes the hardware and software procedures for getting an OmniSwitch 6624 or 6648 up and running. Also provides information on fundamental aspects of OmniSwitch software and stacking architecture.
OmniSwitch 6800 Series Getting Started Guide
Describes the hardware and software procedures for getting an OmniSwitch 6800 up and running. Also provides information on fundamental aspects of OmniSwitch software and stacking architecture.
OmniSwitch 7700/7800 Getting Started Guide
Describes the hardware and software procedures for getting an OmniSwitch 7700 or 7800 up and running. Also provides information on fundamental aspects of OmniSwitch software architecture.
OmniSwitch 8800 Getting Started Guide
Describes the hardware and software procedures for getting an OmniSwitch 8800 up and running. Also provides information on fundamental aspects of OmniSwitch software architecture.
OmniSwitch 6600 Family Hardware Users Guide
Complete technical specifications and procedures for all OmniSwitch 6624 and 6648 chassis, power supplies, fans, uplink modules, and stacking modules.
OmniSwitch 6800 Series Hardware Users Guide
Complete technical specifications and procedures for all OmniSwitch 6800 chassis, power supplies, fans, uplink modules, and stacking modules.
OmniSwitch 7700/7800 Hardware Users Guide
Complete technical specifications and procedures for all OmniSwitch 7700 and 7800 chassis, power supplies, fans, and Network Interface (NI) modules.
OmniSwitch 8800 Hardware Users Guide
Complete technical specifications and procedures for all OmniSwitch 8800 chassis, power supplies, fans, and Network Interface (NI) modules.
OmniSwitch CLI Reference Guide
Complete reference to all CLI commands supported on the OmniSwitch 6624/6648, 7700/7800, and 8800. Includes syntax definitions, default values, examples, usage guidelines and CLI-to-MIB variable mappings.
OmniSwitch 6600 Family Switch Management Guide
Includes procedures for readying an individual switch for integration into a network. Topics include the software directory architecture, image rollback protections, authenticated switch access, managing switch files, system configuration, using SNMP, and using web management software (WebView).
OmniSwitch 6800 Series Switch Management Guide
Includes procedures for readying an individual switch for integration into a network. Topics include the software directory architecture, image rollback protections, authenticated switch access, managing switch files, system configuration, using SNMP, and using web management software (WebView).
OmniSwitch 7700/7800/8800 Switch Management Guide
Includes procedures for readying an individual switch for integration into a network. Topics include the software directory architecture, image rollback protections, authenticated switch access, managing switch files, system configuration, using SNMP, and using web management software (WebView).
OmniSwitch 6600 Family Network Configuration Guide
Includes network configuration procedures and descriptive information on all the major software features and protocols included in the base software package. Chapters cover Layer 2 information (Ethernet and VLAN configuration), Layer 3 information (RIP and static routes), security options (authenticated VLANs), Quality of Service (QoS), and link aggregation.
OmniSwitch 6800 Series Network Configuration Guide
Includes network configuration procedures and descriptive information on all the major software features and protocols included in the base software package. Chapters cover Layer 2 information (Ethernet and VLAN configuration), Layer 3 information (RIP and static routes), security options (authenticated VLANs), Quality of Service (QoS), and link aggregation.
OmniSwitch 7700/7800/8800 Network Configuration Guide
Includes network configuration procedures and descriptive information on all the major software features and protocols included in the base software package. Chapters cover Layer 2 information (Ethernet and VLAN configuration), Layer 3 information (routing protocols, such as RIP and IPX), security options (authenticated VLANs), Quality of Service (QoS), link aggregation, and server load balancing.
OmniSwitch 6600 Family Advanced Routing Configuration Guide
Includes network configuration procedures and descriptive information on the software features included in the advanced routing software package (OSPF).
OmniSwitch 6800 Series Advanced Routing Configuration Guide
Includes network configuration procedures and descriptive information on the software features and protocols included in the advanced routing software package (OSPF, DVMRP, PIM-SM).
OmniSwitch 7700/7800/8800 Advanced Routing Configuration Guide
Includes network configuration procedures and descriptive information on all the software features and protocols included in the advanced routing software package. Chapters cover multicast routing (DVMRP and PIM-SM) and OSPF.
Technical Tips, Field Notices
Includes information published by Alcatel’s Service and Support group.
Release Notes
Includes critical Open Problem Reports, feature exceptions, and other important information on the features supported in the current release and any limitations to their support.
These user guides are included on the Alcatel Enterprise User Manual CD that ships with every switch. You can also download these guides at http://www.ind.alcatel.com/library/manuals/index.cfm?cnt=index.

Before Calling Alcatel’s Technical Assistance Center

Before calling Alcatel’s Technical Assistance Center (TAC), make sure that you have read through the appropriate section (or sections) and have completed the actions suggested for your system’s problem.
Additionally, do the following and document the results so that the Alcatel TAC can better assist you:
Have a network diagram ready. Make sure that relevant information is listed, such as all IP addresses
and their associated network masks.
Have any information that you gathered while troubleshooting the issue to this point available to
provide to the TAC engineer.
If the problem appears to be with only a few switches (fewer than four), capture the output from the
show tech-support CLI command on these switches. (See Appendix C, “Technical Support
Commands,” for more information on show tech-support CLI commands.)
When calling Alcatel TAC to troubleshoot or report a problem, the following information can help speed resolution:
All the dump files that were created, if any
Output of switch log or the switch log files swlog1.log and swlog2.log
Configuration file boot.cfg
A capture of the show microcode command
A capture of the show module long command
A capture of the show tech-support command from CLI.
If a CMM failover to the redundant CMM occurred because of this failure, include this information from both CMMs.
Dial-in or Telnet access can also be helpful for effective problem resolution.
1 Troubleshooting the Switch System
In order to troubleshoot the system, a basic understanding of the operation of Chassis Management Modules (CMMs) and their interaction with Network Interface (NI) modules is required. Some of these concepts are covered in this chapter. The following background reading is also highly recommended:
The “Diagnosing Switch Problems” chapter in the appropriate OmniSwitch Switch Management Guide.
The “Using Switch Logging” chapter in the appropriate OmniSwitch Network Configuration Guide.

In This Chapter

“Introduction” on page 1-2
“Troubleshooting System for OS-6624/6648 and OS-7/8XXX” on page 1-3
“Advanced Troubleshooting” on page 1-9
“Dshell Troubleshooting” on page 1-11
“Using AlcatelDebug.cfg” on page 1-26
“Troubleshooting IPC on OS-6/7/8XXX Series of Switches” on page 1-27
“Port Numbering Conversion Overview” on page 1-36

Introduction

The CMM is the management module of the switch. It is responsible for all critical operations of the switch, including monitoring. The CMM also synchronizes all of the NIs for different operations. The operation of the CMM is the same in OS-6/7/8XXX switches. The only difference is that the OS-6/7XXX has the switching fabric built into the module, whereas the OS-8800 has the fabric at the back of the chassis.
Each NI has its own built-in CPU, which acts independently of the CMM. The NI CPU has to interact with the CMM CPU for certain operations. If the two get out of sync, problems specific to that NI can result.
In order to troubleshoot the system, an understanding of the CMM and NI operation is essential.

Troubleshooting System for OS-6624/6648 and OS-7/8XXX

If the switch is having problems, the first place to look is the CMM. All tasks are supervised by the CMM, and any incoherency between the CMM and an NI can cause problems to appear.
1 The first step in troubleshooting the switch is to look at its overall health.
OmniSwitch 7700/7800/8800
Verify that all of the modules in the chassis are up and operational, using the command:
-> show module status
       Operational                Firmware
Slot   Status        Admin-Status Rev       MAC
------+-------------+------------+---------+-----------------
CMM-A  UP            POWER ON     36        00:d0:95:6b:09:40
NI-1   UP            POWER ON     5         00:d0:95:6b:22:5c
NI-3   UP            POWER ON     5         00:d0:95:6b:23:2e
NI-5   UP            POWER ON     5         00:d0:95:6b:3a:20
OmniSwitch 6624/6648
As on the chassis-based switches, the first place to look is the CMM. In an OS-6600 stack of eight units, the CMM role of each switch is one of the following:
Primary
Secondary
Idle
The switch with the lowest ID will become the primary CMM.
The first step in troubleshooting the switch is to look at its overall health.
Verify that all of the modules in the chassis are up and operational, using the command:
-> show module status
       Operational                Firmware
Slot   Status        Admin-Status Rev       MAC
------+-------------+------------+---------+-----------------
CMM-1  UP            POWER ON     N/A       00:d0:95:84:4b:d2
CMM-2  SECONDARY     POWER ON     N/A       00:d0:95:84:4b:d2
NI-1   UP            POWER ON     N/A       00:d0:95:84:4b:d4
NI-2   UP            POWER ON     N/A       00:d0:95:84:3d:26
NI-3   UP            POWER ON     N/A       00:d0:95:86:50:f4
NI-4   UP            POWER ON     N/A       00:d0:95:84:49:be
NI-5   UP            POWER ON     N/A       00:d0:95:84:39:be
NI-6   UP            POWER ON     N/A       00:d0:95:84:4a:90
NI-7   UP            POWER ON     N/A       00:d0:95:84:39:f4
NI-8   UP            POWER ON     N/A       00:d0:95:84:3c:44
The output shows all eight switches of an OmniSwitch 6600 stack. Notice that the switch with ID 1 is the primary CMM and the switch with ID 2 is the secondary. Every switch also shows up as an NI, because each switch has its own CPU and also acts as an NI.
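When the module list is long, the check described here can be automated on a captured copy of the output. The following Python sketch is only an illustration: it assumes the row layout shown in this chapter, the function names are hypothetical, and the DOWN status used in the usage note is an assumed value, not one taken from this guide.

```python
import re

# Rows abridged from the OmniSwitch 6624/6648 sample in this chapter.
SAMPLE = """\
CMM-1 UP POWER ON N/A 00:d0:95:84:4b:d2
CMM-2 SECONDARY POWER ON N/A 00:d0:95:84:4b:d2
NI-1 UP POWER ON N/A 00:d0:95:84:4b:d4
NI-2 UP POWER ON N/A 00:d0:95:84:3d:26
"""

# Columns: slot, operational status, admin status, firmware rev, MAC.
ROW = re.compile(
    r"^(?P<slot>(?:CMM|NI)-\w+)\s+(?P<oper>\S+)\s+(?P<admin>POWER \S+)\s+"
    r"(?P<rev>\S+)\s+(?P<mac>(?:[0-9a-f]{2}:){5}[0-9a-f]{2})\s*$"
)


def parse_module_status(text):
    """Return one dict per module row; header and separator lines are skipped."""
    rows = []
    for line in text.splitlines():
        match = ROW.match(line.strip())
        if match:
            rows.append(match.groupdict())
    return rows


def unhealthy_modules(rows, ok_states=("UP", "SECONDARY")):
    """List slots whose operational or admin status is not the expected one."""
    return [
        row["slot"]
        for row in rows
        if row["oper"] not in ok_states or row["admin"] != "POWER ON"
    ]
```

With the sample above, unhealthy_modules(parse_module_status(SAMPLE)) returns an empty list. A row such as "NI-3 DOWN POWER ON N/A 00:d0:95:86:50:f4" (hypothetical status) would be reported as NI-3.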
To verify the stacking topology, use the following command:
-> show stack topology
                Link A    Link A    Link A      Link B    Link B    Link B
NI  Role        State     RemoteNI  RemoteLink  State     RemoteNI  RemoteLink
---+-----------+---------+---------+-----------+---------+---------+----------
 1  PRIMARY     ACTIVE    8         51          ACTIVE    2         52
 2  SECONDARY   ACTIVE    3         27          ACTIVE    1         52
 3  IDLE        ACTIVE    2         51          ACTIVE    4         52
 4  IDLE        ACTIVE    5         51          ACTIVE    3         28
 5  IDLE        ACTIVE    4         51          ACTIVE    6         52
 6  IDLE        ACTIVE    7         51          ACTIVE    5         52
 7  IDLE        ACTIVE    6         51          ACTIVE    8         52
 8  IDLE        ACTIVE    1         51          ACTIVE    7         2
The above command shows the stacking topology. Switch 1 is the primary and is connected to Switch 8 on port 51 and to Switch 2 on port 52. The output also shows the state of the CPUs of all the switches in the stack.
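A captured topology table can be checked for consistency: every stacking link should be ACTIVE, and each neighbor should list the local switch in return. The Python sketch below assumes the eight-column row layout of the sample above. The function names are hypothetical, and so is the DOWN link state used in the usage note.

```python
# Rows copied from the show stack topology sample in this chapter.
SAMPLE = """\
1 PRIMARY ACTIVE 8 51 ACTIVE 2 52
2 SECONDARY ACTIVE 3 27 ACTIVE 1 52
3 IDLE ACTIVE 2 51 ACTIVE 4 52
4 IDLE ACTIVE 5 51 ACTIVE 3 28
5 IDLE ACTIVE 4 51 ACTIVE 6 52
6 IDLE ACTIVE 7 51 ACTIVE 5 52
7 IDLE ACTIVE 6 51 ACTIVE 8 52
8 IDLE ACTIVE 1 51 ACTIVE 7 2
"""


def parse_topology(text):
    """Map NI number -> role and two (state, remote NI, remote link) tuples."""
    topo = {}
    for line in text.splitlines():
        f = line.split()
        if len(f) != 8 or not f[0].isdigit():
            continue  # header, separator, or unexpected line
        topo[int(f[0])] = {
            "role": f[1],
            "links": [(f[2], int(f[3]), int(f[4])),
                      (f[5], int(f[6]), int(f[7]))],
        }
    return topo


def topology_problems(topo):
    """Report inactive links, one-sided neighbor entries, and a missing primary."""
    problems = []
    for ni, info in sorted(topo.items()):
        for state, remote, _link in info["links"]:
            if state != "ACTIVE":
                problems.append(f"NI {ni}: link to NI {remote} is {state}")
            elif remote not in topo or ni not in [r for _s, r, _l in topo[remote]["links"]]:
                problems.append(f"NI {ni}: NI {remote} does not list NI {ni} as a neighbor")
    roles = [info["role"] for info in topo.values()]
    if roles.count("PRIMARY") != 1:
        problems.append(f"expected exactly one PRIMARY, found {roles.count('PRIMARY')}")
    return problems
```

For the sample stack the function returns no problems. If the Link B state of switch 3 read DOWN (a hypothetical value), the result would be a single entry naming the link from NI 3 to NI 4.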
2 Verify the power supply (or supplies).
OmniSwitch 7700/7800/8800
The OmniSwitch 7/8XXX has a built-in mechanism that powers off modules if the available power is insufficient. Switching off a power supply in a chassis that does not have a redundant power supply will therefore power off modules. Make sure that power is not a factor in the problem.
Check the power supply status, using the command:
-> show power supply 1
Module in slot PS-1
  Model Name:          OSR-PS-06,
  Description:         OSR-PS-06,
  Part Number:         901750-10,
  Hardware Revision:   ,
  Serial Number:       B42N101P2,
  Manufacture Date:    OCT 18 2001,
  Firmware Version:    ,
  Admin Status:        POWER ON,
  Operational Status:  UP
Make sure that all the known good power supplies are operational.
OmniSwitch 6624/6648
Check the power supply status, using the command:
-> show power supply
Power Supplies in chassis 1
PS     Operational Status
-----+-------------------
PS-1   UP
PS-2   NOT PRESENT

Power Supplies in chassis 2
PS     Operational Status
-----+-------------------
PS-1   UP
PS-2   NOT PRESENT

Power Supplies in chassis 3
PS     Operational Status
-----+-------------------
PS-1   UP
PS-2   NOT PRESENT

Power Supplies in chassis 4
PS     Operational Status
-----+-------------------
PS-1   UP
PS-2   NOT PRESENT

Power Supplies in chassis 5
PS     Operational Status
-----+-------------------
PS-1   UP
PS-2   NOT PRESENT

Power Supplies in chassis 6
PS     Operational Status
-----+-------------------
PS-1   UP
PS-2   NOT PRESENT

Power Supplies in chassis 7
PS     Operational Status
-----+-------------------
PS-1   UP
PS-2   NOT PRESENT

Power Supplies in chassis 8
PS     Operational Status
-----+-------------------
PS-1   UP
PS-2   NOT PRESENT
Make sure that all the known good power supplies are operational.
3 Verify the CPU utilization.
OmniSwitch 6624/6648 and 7700/7800/8800
The CPU utilization of CMM can be viewed by using the command:
-> show health
* - current value exceeds threshold
Device                              1 Min   1 Hr   1 Hr
Resources             Limit   Curr    Avg    Avg    Max
----------------+-------+-----+------+-----+-----+-------
Receive                  80     00     00     00      0
Transmit/Receive         80     00     00     00     00
Memory                   80     43     43     43     43
Cpu                      80     02     06     05     07
Temperature Cmm          50     38     37     37     37
Temperature Cmm Cpu      50     32     32     31     32
The above command shows the receive, transmit/receive, memory, CPU, CMM temperature, and CMM CPU temperature statistics as the current value, the 1 minute average, the 1 hour average, and the 1 hour maximum. All values should be within the threshold (Limit). Any value above the threshold indicates abnormal behavior. The 1 hour maximum may be high if the switch was booted within the last hour, but it should be fairly steady during normal operation.
Due to the distributed architecture, every NI has its own CPU to perform some operations locally. A particular NI can therefore be at high CPU utilization while the other NIs and the CMM CPU are within their thresholds.
If none of the CMM values exceeds its threshold, the next step is to try to isolate the problem to a particular NI (or to a switch within an OmniSwitch 6624/6648 stack) with the show health slot_number CLI command:
-> show health 5
* - current value exceeds threshold
Slot 05                             1 Min   1 Hr   1 Hr
Resources             Limit   Curr    Avg    Avg    Max
----------------+-------+-----+------+-----+-----+-------
Receive                  80     01     01     01     01
Transmit/Receive         80     01     01     01     01
Memory                   80     39     39     39     39
Cpu                      80     21     22     21     24
The principle for the health of an NI is the same as for CMM.
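The threshold comparison can be applied mechanically to a saved show health capture. The Python sketch below is an illustration only. It assumes each resource row ends in five integers (Limit, Curr, 1 Min Avg, 1 Hr Avg, 1 Hr Max) and that any "*" marker can be discarded before parsing. The field and function names are hypothetical.

```python
# Rows copied from the CMM show health sample in this chapter.
SAMPLE = """\
Receive 80 00 00 00 0
Transmit/Receive 80 00 00 00 00
Memory 80 43 43 43 43
Cpu 80 02 06 05 07
Temperature Cmm 50 38 37 37 37
Temperature Cmm Cpu 50 32 32 31 32
"""

FIELDS = ("limit", "curr", "min1_avg", "hr1_avg", "hr1_max")


def parse_health(text):
    """Map resource name -> dict of the five numeric columns."""
    rows = {}
    for line in text.splitlines():
        parts = line.replace("*", " ").split()
        # A data row is a name (one or more words) followed by five integers.
        if len(parts) < 6 or not all(p.isdigit() for p in parts[-5:]):
            continue
        rows[" ".join(parts[:-5])] = dict(zip(FIELDS, map(int, parts[-5:])))
    return rows


def over_threshold(rows):
    """Return (resource, column, value) for every value above its limit."""
    return [
        (name, field, row[field])
        for name, row in rows.items()
        for field in FIELDS[1:]
        if row[field] > row["limit"]
    ]
```

For the sample above nothing is reported. A row such as "Cpu 80 95 60 40 97" (invented values) would be reported for both the current value and the 1 hour maximum.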
The one-minute average is calculated from 12 samples. Each sample is the average CPU utilization over 5 seconds. These values are stored in a table. The current minute (1 Min Avg or “min”) displays the average of the last 12 samples.
Every 60 seconds, the average of the 12 samples is recorded as the value for that minute. These per-minute values are stored as a set of 60 samples, which represents one hour.
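The sampling scheme just described can be modeled in a few lines of Python. This is a sketch of the arithmetic only, not the switch implementation. The class and method names are hypothetical, and deriving the 1 hour average and maximum from the 60 stored per-minute values is an assumption that the text does not state explicitly.

```python
from collections import deque


class CpuAverager:
    """Toy model: 12 five-second samples per minute, 60 minute values per hour."""

    def __init__(self):
        self.samples = deque(maxlen=12)   # last 12 five-second samples
        self.minutes = deque(maxlen=60)   # last 60 per-minute averages
        self._since_minute = 0

    def add_sample(self, pct):
        """Record one 5-second utilization sample (in percent)."""
        self.samples.append(pct)
        self._since_minute += 1
        if self._since_minute == 12:      # 12 x 5 s = 60 s elapsed
            self.minutes.append(self.one_min_avg())
            self._since_minute = 0

    def one_min_avg(self):
        """Average of the last 12 samples (or fewer right after start-up)."""
        return sum(self.samples) / len(self.samples) if self.samples else 0.0

    def one_hr_avg(self):
        """Assumed: mean of the stored per-minute values."""
        return sum(self.minutes) / len(self.minutes) if self.minutes else 0.0

    def one_hr_max(self):
        """Assumed: largest of the stored per-minute values."""
        return max(self.minutes) if self.minutes else 0.0
```

Feeding one minute of samples at 10% followed by one minute at 30% gives a 1 minute average of 30, an hourly average of 20, and an hourly maximum of 30. This also illustrates why the hourly figures can look unusual shortly after a reboot, when only a few minute values exist.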
One of the above checks will most likely help to localize the problem to a particular NI or to the CMM. For more details, see the section “Monitoring Switch Health” in the chapter titled “Diagnosing Switch Problems” in the appropriate OmniSwitch Network Configuration Guide.
4 Check the switch log.
OmniSwitch 6624/6648 and 7700/7800/8800
One of the most important things to check is the switch log. The switch log contains error messages, depending on the log level settings and on the applications configured to generate error messages. The default switch log settings can be viewed using the command:
-> show swlog
Switch Logging is:
- INITIALIZED
- RUNNING

Log Device(s)
----------------
flash
console

Only Applications not at the level ‘info’ (6) are shown

Application ID    Level
----------------------------
CHASSIS (64)      debug3 (9)
By default, the log devices are flash and console. This can be changed, and specific log servers can be used to log the messages. Refer to the Switch Management Guide for further details. The application trace level is set to ‘info’. Any error messages or informational messages are logged in the switch log.
The switch log should be viewed to see whether the switch generated any error messages. The command to use is:
-> show log swlog
Displaying file contents for 'swlog2.log'
FILEID: fileName[swlog2.log], endPtr[32]
configSize[64000], currentSize[64000], mode[2]
Displaying file contents for 'swlog1.log'
FILEID: fileName[swlog1.log], endPtr[395]
configSize[64000], currentSize[64000], mode[1]

Time Stamp               Application    Level   Log Message
------------------------+--------------+-------+--------------------------------
MON AUG 21 23:09:57 2023 HSM-CHASSIS    info    == HSM == GBIC extraction detected on NI slot 1, GBIC port 2
MON AUG 21 23:28:33 2023 HSM-CHASSIS    info    == HSM == GBIC Insertion detected on NI slot 1, GBIC port 1
MON AUG 21 23:28:33 2023 HSM-CHASSIS    info    == HSM == GBIC Insertion detected on NI slot 1, GBIC port 2
MON AUG 21 23:28:39 2023 HSM-CHASSIS    info    == HSM == GBIC extraction detected on NI slot 5, GBIC port 2
MON AUG 21 23:30:39 2023 HSM-CHASSIS    info    == HSM == GBIC Insertion detected on NI slot 5, GBIC port 2
MON AUG 21 23:30:41 2023 HSM-CHASSIS    info    == HSM == GBIC extraction detected on NI slot 1, GBIC port 1
MON AUG 21 23:30:45 2023 HSM-CHASSIS    info    == HSM == GBIC extraction detected on NI slot 1, GBIC port 2
TUE AUG 22 00:05:45 2023 CSM-CHASSIS    info    == CSM == !!!ACTIVATING!!!
TUE AUG 22 00:05:45 2023 CSM-CHASSIS    info    == CSM == !!! REBOOT !!!
TUE AUG 22 00:05:53 2023 SYSTEM         alarm   System rebooted via ssReboot(), restart type=0x2 (COLD)
The log messages are kept in two log files in flash: swlog1.log and swlog2.log. In the above example, the log messages show that some GBICs were extracted and inserted at particular times and that the switch was then rebooted. This information helps to relate the time of the problem to the events happening at the switch. It also gives an idea of whether the source of the problem was external or internal to the switch.
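Relating the time of a problem to the surrounding log events can be done with a short script on a saved copy of the log. The Python sketch below is an illustration under stated assumptions: each entry is on a single line in the layout shown above, month abbreviations are English, and the function names and the 40-minute window are hypothetical choices.

```python
from datetime import datetime, timedelta

# Entries abridged from the show log swlog sample in this chapter.
SAMPLE = """\
MON AUG 21 23:09:57 2023 HSM-CHASSIS info == HSM == GBIC extraction detected on NI slot 1, GBIC port 2
MON AUG 21 23:30:45 2023 HSM-CHASSIS info == HSM == GBIC extraction detected on NI slot 1, GBIC port 2
TUE AUG 22 00:05:45 2023 CSM-CHASSIS info == CSM == !!!ACTIVATING!!!
TUE AUG 22 00:05:45 2023 CSM-CHASSIS info == CSM == !!! REBOOT !!!
TUE AUG 22 00:05:53 2023 SYSTEM alarm System rebooted via ssReboot(), restart type=0x2 (COLD)
"""


def parse_swlog(text):
    """Return a list of {time, app, level, message} dicts; other lines are skipped."""
    entries = []
    for line in text.splitlines():
        parts = line.split(None, 7)
        if len(parts) < 8:
            continue
        _dow, mon, day, clock, year, app, level, message = parts
        try:
            stamp = datetime.strptime(
                f"{mon.title()} {day} {clock} {year}", "%b %d %H:%M:%S %Y"
            )
        except ValueError:
            continue  # not a log entry (header, file banner, and so on)
        entries.append({"time": stamp, "app": app, "level": level, "message": message})
    return entries


def events_before(entries, marker="REBOOT", window_minutes=40):
    """Entries logged within the window before the first message containing marker."""
    for idx, entry in enumerate(entries):
        if marker in entry["message"]:
            cutoff = entry["time"] - timedelta(minutes=window_minutes)
            return [e for e in entries[:idx] if e["time"] >= cutoff]
    return []
```

For the abridged sample, events_before(parse_swlog(SAMPLE)) returns the 23:30:45 GBIC extraction and the ACTIVATING message, which are the events logged in the 40 minutes before the reboot marker.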
If the log messages do not show enough information, the log level can be raised for specific applications or for all the applications running in the switch. For setting up different log levels in the switch log, refer to the “Using Switch Logging” chapter in the appropriate OmniSwitch Network Configuration Guide.
If the switch is running in a redundant configuration, make sure that the two CMMs are completely synchronized. This can be done using the command:
-> show running-directory
Running CMM            : PRIMARY,
Running configuration  : WORKING,
Certify/Restore Status : CERTIFIED,
Synchronization Status : SYNCHRONIZED
If the two CMMs are not synchronized and the problem leads to the failure of the primary CMM, all of the modules will be re-initialized. If the two CMMs are properly synchronized and the primary CMM fails, the takeover is transparent to the end user. For complete redundancy, keep the two CMMs synchronized.
Look for any post-mortem dump files that may have been created because of the problem with the switch. Post-mortem dump files have an extension of *.dmp and are created in the /flash directory of the CMM (be sure to check the secondary CMM, if running in redundant mode). System dump files are normally named “cs_system.dmp”, memory-related dump files are normally named “MemMon000.dmp”, and NI-related dump files are named “SloXSliYver1.dmp”, where X is the slot number and Y is the slice number.
The creation of a dump file indicates a problem with the switch. System related dump files can be viewed through CLI but other dump files cannot. For system related dump files use the command:
-> show log pmd cs_system.pmd
Capture the output of this command. In addition, if there are any dump files created in the switch, they should be downloaded through FTP to forward them to technical support. Technical Support can have them analyzed to find the source of the problem.

Advanced Troubleshooting

One level of switch logging is stored in the two log files located in the /flash directory. A second, lower-level debug facility can be enabled and used for diagnosing problems. This debug is known as “systrace”, meaning system trace. The information in this trace is stored in NVRAM on the CMM, so it is valid until the switch is powered off. A soft reboot of the switch retains the trace information, but powering off the switch loses all of it. Systrace is less CMM intensive than switch logging, so it can be used to collect background information about the different tasks running in the switch.
The command to look at the default settings for systrace is:
-> debug systrace show
sysTrace is:
- INITIALIZED
- RUNNING
- configured to TRACE CALLERS
- configured to NOT WATCH on stdout
All Applications have their trace level set to level ’info’ (6)
Systrace is set to the “info” level for all applications. Any application with a trace level other than 6 is displayed in the above command output. Notice that systrace is initialized by default and is running in the background. By default it is configured not to display messages on the console. The purpose of systrace is to track all the system processes called and their callers.
Application log levels can be changed and specific applications can also be set for the logging purposes. The commands are similar to switch log.
-> debug systrace appid ?
WEB VRRP VLAN TRAP TELNET SYSTEM STP SSL SSH
SNMP SMNI SLB SESSION RSVP RMON QOS QDRIVER QDISPATCHER PSM PRB-CHASSIS PORT-MGR POLICY PMM NOSNMP NI-SUPERVISION NI-INTERFACE NAN-DRIVER MODULE MIPGW LINKAGG LDAP IPX IPMS IPC-MON IPC-LINK IPC-DIAG IP-HELPER IP INTERFACE HSM-CHASSIS HEALTH GMAP GM FTP EPILOGUE EIPC DRC DISTRIB DIAG CVM-CHASSIS CSM-CHASSIS CONFIG CMS-CHASSIS CMM-INTERFACE CLI CHASSIS CCM-CHASSIS BRIDGE AMAP ALL AAA 802.1Q <num>
(System Service & File Mgmt Command Set)
The applications and the log levels are the same as the switch log applications. Refer to the “Switch Logging Commands Overview” section in the “Using Switch Logging” chapter in the appropriate OmniSwitch Network Configuration Guide.
Systrace can be enabled using the command:
-> debug systrace enable
To look at the systrace log file use the following command:
swnygb02 > debug systrace show log
TStamp(us) AppId  Level  Task       Comment
----------+------+------+----------+---------------------------------------
3349119104 CSM-CH info   tCsCSMtask ***HELLO FSM TRACE***
3349118980 CSM-CH info   tCsCSMtask csCsmHelloReceptio - -> Event = CS_CSM_HELLO_SM_IPCUP_TIMEOUT
3349118948 CSM-CH info   tCsCSMtask csCsmHelloReceptio - -> CS_TIMEOUT
3345200526 CSM-CH info   tCsCSMtask ***HELLO FSM TRACE***
3342928783 CSM-CH info   tCsCSMtask ***HELLO FSM TRACE***
3342928661 CSM-CH info   tCsCSMtask csCsmHelloReceptio - -> Event = CS_CSM_HELLO_SM_IPCUP_TIMEOUT
3342928628 CSM-CH info   tCsCSMtask csCsmHelloReceptio - -> CS_TIMEOUT
3336738410 CSM-CH info   tCsCSMtask ***HELLO FSM TRACE***
3336738287 CSM-CH info   tCsCSMtask csCsmHelloReceptio - -> Event = CS_CSM_HELLO_SM_IPCUP_TIMEOUT
3336738256 CSM-CH info   tCsCSMtask csCsmHelloReceptio - -> CS_TIMEOUT
3334849145 CSM-CH info   tCsCSMtask ***HELLO FSM TRACE***
3330548020 CSM-CH info   tCsCSMtask ***HELLO FSM TRACE***
3330547902 CSM-CH info   tCsCSMtask csCsmHelloReceptio - -> Event = CS_CSM_HELLO_SM_IPCUP_TIMEOUT
3330547869 CSM-CH info   tCsCSMtask csCsmHelloReceptio - -> CS_TIMEOUT
3324495309 CSM-CH info   tCsCSMtask ***HELLO FSM TRACE***
3324357940 CSM-CH info   tCsCSMtask ***HELLO FSM TRACE***
3324357816 CSM-CH info   tCsCSMtask csCsmHelloReceptio - -> Event = CS_CSM_HELLO_SM_IPCUP_TIMEOUT
3324357782 CSM-CH info   tCsCSMtask csCsmHelloReceptio - -> CS_TIMEOUT
3318167293 CSM-CH info   tCsCSMtask ***HELLO FSM TRACE***
3318167171 CSM-CH info   tCsCSMtask csCsmHelloReceptio - -> Event = CS_CSM_HELLO_SM_IPCUP_TIMEOUT
3318167139 CSM-CH info   tCsCSMtask csCsmHelloReceptio - -> CS_TIMEOUT
This information is useful to analyze the different processes taking place in the switch.
Another useful command in case of a problem is:
-> show tech-support
This command captures all of the information from the chassis, including the hardware information, the configuration, the active software release, and statistics such as the number of buffers in use at the time the command is run. The output of the command is saved in /flash as “tech_support.log”. Other variations of this command are:
-> show tech-support layer2
This command collects Layer 2 data only.
-> show tech-support layer3
This command collects Layer 3 data only.

Dshell Troubleshooting

To further diagnose which task is consuming the CPU on the CMM, use the following Dshell commands:
Note. Dshell commands should only be used by Alcatel personnel or under the direction of Alcatel. Misuse or failure to follow procedures that use Dshell commands in this guide correctly can cause lengthy network down time and/or permanent damage to hardware.
Working: [Kernel]->spyReport
NAME         ENTRY      TID     PRI total % (ticks) delta % (ticks)
------------ ---------- ------- --- --------------- ------------
tExcTask     excTask    7545100   0   0% (     179)   0% (    0)
tLogTask     logTask    753f800   0   0% (       0)   0% (    0)
tShell       shell      41b1600   1   0% (      25)   1% (    1)
tWdbTask                73ae6a0   3   0% (       0)   0% (    0)
IPC_tick     IPC_tick   6862660   4   6% (   11855)   0% (    0)
tSpyTask     spyComTask 41aab10   5   0% (       0)   0% (    0)
tAioIoTask1  aioIoTask  7528580  50   0% (       0)   0% (    0)
tAioIoTask0  aioIoTask  75212d0  50   0% (       0)   0% (    0)
tNetTask     netTask    741c820  50   1% (    2047)  10% (    7)
tIpedrMsg    ipedrKerne 53043b0  50   0% (      25)   0% (    0)
tAioWait     aioWaitTas 752f830  51   0% (       0)   0% (    0)
bbussIntMoni tBbusIntMo 6864a00  70   0% (       0)   0% (    0)
ipc_monitor  ipc_monito 67ff4a0  70   0% (      10)   0% (    0)
tL2Stat      esmStatMsg 57626b0  70   0% (     701)   0% (    0)
Gateway      mipGateway 67e1770  80   0% (       2)   0% (    0)
EIpc         eipcMgr_ma 678ac10  80   0% (       0)   0% (    0)
EsmDrv       esmDrv     579f990  80   0% (     124)   0% (    0)
tMemMon      memMonTask 7230d60  90   0% (       0)   0% (    0)
tCS_PTB      csPtbMain  67ee930  93   0% (      19)   0% (    0)
tCS_CCM      csCcmMain  722d590  93   0% (      18)   0% (    0)
tCS_PRB      csPrbMain  72299f0  93   0% (     132)   0% (    0)
tCS_CMS      csCmsMain  7227720  93   0% (       0)   0% (    0)
tCS_HSM      Letext     7225420  93   0% (     438)   0% (    0)
tCsCSMtask   Letext     67f3c10  94   0% (     207)   0% (    0)
tNanISR      nanProcInt 4ae2660  95   0% (       0)   0% (    0)
SwLogging    swLogTask  724b900 100   0% (       4)   0% (    0)
DSTwatcher   dstWatcher 7214f00 100   0% (       0)   0% (    0)
tWhirlpool   batch_entr 71fb210 100   0% (       4)   0% (    0)
ipc_tests    ipc_tests_ 67fd1f0 100   0% (       3)   0% (    0)
PortMgr      pmMain     67df2f0 100   0% (       6)   0% (    0)
PsMgr        psm_main   67d9030 100   0% (       0)   0% (    0)
VlanMgr      Letext     678f1d0 100   0% (     292)   0% (    0)
TrapMgr      trap_task  6774fc0 100   0% (      10)   0% (    0)
PartMgr      partm_eup_ 675fcf0 100   0% (       0)   0% (    0)
SNMPagt      snmp_task  6980d70 100   0% (      93)   0% (    0)
SesMgr       sesmgr_mai 697b8a0 100   0% (       1)   0% (    0)
SsApp        tssAppMain 59126b0 100   0% (      11)   0% (    0)
Ftpd         cmmFtpd    58f3a80 100   0% (      38)   0% (    0)
NanDrvr      nanDriver  58da2b0 100   0% (       0)   0% (    0)
Health       healthMonM 58d7b70 100   0% (     414)   0% (    0)
L3Hre        l3hre_cmm_ 58709d0 100   0% (       7)   0% (    0)
DbgNiGw      dbgGw_main 585f170 100   0% (       0)   0% (    0)
SrcLrn       slCmmMain  5798ab0 100   0% (      91)   0% (    0)
GrpMob       gmcControl 5793320 100   0% (      80)   0% (    0)
Stp          stpCMM_mai 56b3eb0 100   0% (      82)   0% (    0)
8021q        main_8021q 5841290 100   0% (       0)   0% (    0)
LnkAgg       la_cmm_mai 543dbc0 100   0% (      57)   0% (    0)
tSlcMsgHdl   slcMsgProc 54397f0 100   0% (      70)   0% (    0)
AmapMgr      xmap_main_ 53f0140 100   0% (     924)   0% (    0)
GmapMgr      gmap_main_ 535c750 100   0% (     164)   0% (    0)
PMirMon      pmmMain    5340e20 100   0% (       3)   0% (    0)
Ipedr        ipedrMain  5327730 100   3% (    6664)  26% (   17)
AAA          aaa_main   5324110 100   0% (     152)   0% (    0)
stpTick      stpcmm_tim 5316800 100   0% (      27)   0% (    0)
tIpedrPkt    ipedrPktDu 52f3d90 100   0% (       0)   0% (    0)
AVLAN        aaaAvlanMa 5256670 100   0% (       7)   0% (    0)
onex         onex_main  52513e0 100   0% (      15)   0% (    0)
Ipmem        ipmem_main 522fad0 100   0% (     398)   0% (    0)
la_cmm_tick  la_cmm_tic 522b370 100   0% (      19)   0% (    0)
ipmfm        ipmfm_main 51e7cc0 100   0% (      23)   0% (    0)
ipmpm        ipmpm_main 58b99e0 100   0% (      24)   0% (    0)
Ipx          ipxMain    4fbd180 100   0% (      41)   0% (    0)
Vrrp         vrrpMain   4fba330 100   0% (    1318)   9% (    6)
UdpRly       udpRlyMain 4f926d0 100   0% (     804)   0% (    0)
Qos          qos_main   4f70ac0 100   0% (      36)   0% (    0)
PolMgr       pyPolicyMa 4e7cdb0 100   0% (       3)   0% (    0)
SlbCtrl      slbcMain   4e787f0 100   0% (       4)   0% (    0)
WebView      tEmWeb     4e74000 100   0% (       2)   0% (    0)
SNMP GTW     snmp_udp_g 4add0a0 100   0% (       1)   0% (    0)
SNMP TIMER   snmp_timer 4ada6e0 100   0% (       2)   0% (    0)
GmapTimer    gmap_proc_ 4ad7080 100   0% (       2)   0% (    0)
DrcTm        tmMain     4acf630 100   0% (      28)   0% (    0)
tDrcIprm     iprmMain   499bba0 100   0% (     336)   0% (    0)
tOspf        ospfMain   4898d60 100   1% (    3419)  16% (   11)
tPimsm       pimsmMain  46d39e0 100   0% (     371)   0% (    0)
tDrcIpmrm    ipmrmMain  46132e0 100   0% (      66)   0% (    0)
cliConsole   clishell_m 44b3b00 100   0% (       0)   0% (    0)
tWebTimer    web_timer  4b7e520 107   0% (       2)   0% (    0)
tssApp_SNMP_ tssAppChil 58fbbf0 110   0% (       0)   0% (    0)
tssApp_3_4   tssAppChil 4251500 110   0% (       0)   0% (    0)
CfgMgr       confMain   67ec480 120   0% (     455)   0% (    0)
tCS_CCM2     csCcmChild 4ae03b0 130   0% (       0)   0% (    0)
Sshd         cmmsshd    5b38d50 150   0% (       0)   0% (    0)
Telnetd      cmmtelnetd 590d420 150   0% (      13)   0% (    0)
Rmon         rmonMain   5873ff0 150   0% (      86)   0% (    0)
tCS_CVM      csCvmMain  72194a0 200   0% (       0)   0% (    0)
SmNiMgr      smNiTask   586e060 200   0% (       0)   0% (    0)
tIpxTimer    ipxTimer   4f83f90 200   0% (       8)   0% (    0)
tIpxGapper   ipxGapper  4f7b1c0 200   0% (       0)   0% (    0)
SesMon_3     Letext     429ef90 200   0% (       0)   0% (    0)
tTelnetOut0  cmmtelnetO 429c5d0 200   0% (       0)   0% (    0)
tTelnetIn0   cmmtelnetI 42652e0 200   0% (       0)   0% (    0)
CliShell0    clishell_m 42611b0 200   0% (       0)   0% (    0)
tPolMonSvr   pyMonitorM 4e45ae0 210   0% (       1)   0% (    0)
tDcacheUpd   dcacheUpd  74f8e70 250   0% (      41)   0% (    0)
KERNEL                                3% (    6051)  23% (   15)
INTERRUPT                             0% (      19)   0% (    0)
IDLE                                 79% (  154794)  12% (    8)
TOTAL                                93% (  193565)  97% (   65)
2 tasks were created.
2 tasks were deleted.
spyStop
value = 0 = 0x0
In this report, the delta column shows that CPU utilization is high because of Ipedr (26%), tOspf (16%), and tNetTask (10%).
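Picking the busiest tasks out of a long spyReport by eye is error prone. The Python sketch below ranks tasks by the delta percentage in a saved report. It is an illustration only: it assumes each row ends with the two "percent (ticks)" pairs shown above, uses a simple heuristic to separate task names from entry points, and uses hypothetical function names.

```python
import re

# Rows excerpted from the spyReport sample in this chapter.
SAMPLE = """\
tWdbTask 73ae6a0 3 0% ( 0) 0% ( 0)
tNetTask netTask 741c820 50 1% ( 2047) 10% ( 7)
Ipedr ipedrMain 5327730 100 3% ( 6664) 26% ( 17)
Vrrp vrrpMain 4fba330 100 0% ( 1318) 9% ( 6)
SNMP GTW snmp_udp_g 4add0a0 100 0% ( 1) 0% ( 0)
tOspf ospfMain 4898d60 100 1% ( 3419) 16% ( 11)
KERNEL 3% ( 6051) 23% ( 15)
IDLE 79% ( 154794) 12% ( 8)
TOTAL 93% ( 193565) 97% ( 65)
"""

# total % (ticks) followed by delta % (ticks) at the end of the line.
TAIL = re.compile(r"(\d+)%\s*\(\s*(\d+)\)\s+(\d+)%\s*\(\s*(\d+)\s*\)?\s*$")
SUMMARY = {"KERNEL", "INTERRUPT", "IDLE", "TOTAL"}


def parse_spy_report(text):
    """Return one dict per task or summary row."""
    rows = []
    for line in text.splitlines():
        match = TAIL.search(line)
        if not match:
            continue
        head = line[:match.start()].split()
        if len(head) == 1 and head[0] in SUMMARY:
            name = head[0]
        elif len(head) >= 3:
            rest = head[:-2]  # drop TID and PRI; what remains is name [+ entry]
            # Heuristic: the last remaining word is the entry point, if any.
            name = rest[0] if len(rest) <= 2 else " ".join(rest[:-1])
        else:
            continue
        total_pct, total_ticks, delta_pct, delta_ticks = map(int, match.groups())
        rows.append({"name": name, "total_pct": total_pct, "total_ticks": total_ticks,
                     "delta_pct": delta_pct, "delta_ticks": delta_ticks})
    return rows


def top_tasks(rows, count=3):
    """Names of the tasks with the highest delta percentage, summary rows excluded."""
    tasks = [r for r in rows if r["name"] not in SUMMARY]
    tasks.sort(key=lambda r: r["delta_pct"], reverse=True)
    return [r["name"] for r in tasks[:count]]
```

For the excerpt above, top_tasks(parse_spy_report(SAMPLE)) returns Ipedr, tOspf, and tNetTask, in that order.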
Check to see whether any task is suspended on the CMM.
Working: [Kernel]->i
NAME       ENTRY        TID      PRI STATUS     PC       SP       ERRNO   DELAY
---------- ------------ -------- --- ---------- -------- -------- ------- -----
tExcTask   excTask      7545100    0 PEND       17fd68   7544d40   3d0001     0
tLogTask   logTask      753f800    0 PEND       17fd68   753f430        0     0
tShell     shell        41b1600    1 READY      15c0e0   41b09a0    30065     0
tWdbTask   150520       73ae6a0    3 PEND       158540   73ae130        0     0
IPC_tick   IPC_tick     6862660    4 READY      158540   6862340        0     0
tDrcIprm   iprmMain     499bba0  100 PEND+T     158540   499b520        b   243
tOspf      ospfMain     4898d60  100 SUSPEND    158540   48986d0        b   299
value = 0 = 0x0
Working: [Kernel]->
In the above example, the OSPF task is suspended. Typically, when a task is suspended, the system automatically reboots and generates a system dump file. If the system does not reboot, try to gather the task trace and memory dump for that specific task using the following commands:
Working: [Kernel]->tt 0x4898d60
108e9c vxTaskEntry +c : Letext (&dataInfo, 67f3920, 67f3a20, 34000000, 66ff800, 6a69800)
66b69b4 Letext +2d4: zcSelect (5, 67f3a20, 0, 0, 6a6c800, 247)
6ff56f8 zcSelect +458: semTake (67eedc0, ffffffff, a, 28, a, 0)
158b4c semTake +2c : semBTake (67eedc0, ffffffff, &semTakeTbl, 0, &semBTake, 264c00)
value = 0 = 0x0
Working: [Kernel]->ti 0x4898d60
NAME       ENTRY        TID      PRI STATUS     PC       SP       ERRNO   DELAY
---------- ------------ -------- --- ---------- -------- -------- ------- -----
tOspf      ospfMain     4898d60  100 SUSPEND    13e060   490f920        b     0
stack: base 0x49103d0 end 0x49009d0 size 63312 high 10036 margin 53276
options: 0x4
VX_DEALLOC_STACK
%pc = 13e060 %npc = 13e064 %ccr = 0 %y = 0
%asi = 0 %cwp = 0 %tt = 0 %tl = 0
%pil = 0 %pstate = 1e
%g0 = 0 %g1 = 0 %g2 = 0 %g3 = 0
%g4 = 0 %g5 = 0 %g6 = 0 %g7 = 0
%i0 = 0 %i1 = 0 %i2 = 0 %i3 = 0
%i4 = 0 %i5 = 0 %fp = 490f9e0 %i7 = 0
%l0 = 0 %l1 = 0 %l2 = 0 %l3 = 0
%l4 = 0 %l5 = 0 %l6 = 0 %l7 = 0
%o0 = 490f9e0 %o1 = 0 %o2 = 0 %o3 = 0
%o4 = 0 %o5 = 0 %sp = 490f920 %o7 = 0
value = 76612560 = 0x49103d0
Certified: [Kernel]->
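When the task list is long, suspended tasks can be extracted from a saved copy of the i command output, together with the task IDs needed for tt and ti. The Python sketch below assumes the nine-column layout shown earlier in this section and only looks for the SUSPEND status that appears in this guide. The function name is hypothetical.

```python
# Rows excerpted from the "i" command sample in this section.
SAMPLE = """\
NAME ENTRY TID PRI STATUS PC SP ERRNO DELAY
tExcTask excTask 7545100 0 PEND 17fd68 7544d40 3d0001 0
tShell shell 41b1600 1 READY 15c0e0 41b09a0 30065 0
tDrcIprm iprmMain 499bba0 100 PEND+T 158540 499b520 b 243
tOspf ospfMain 4898d60 100 SUSPEND 158540 48986d0 b 299
"""


def suspended_tasks(text):
    """Return (task name, task ID as a 0x-prefixed string) for each suspended task."""
    found = []
    for line in text.splitlines():
        parts = line.split()
        # Counting from the right: DELAY, ERRNO, SP, PC, STATUS, PRI, TID.
        if len(parts) < 9 or not parts[-6].isdigit():
            continue  # header, separator, or shell prompt line
        if parts[-5] == "SUSPEND":
            found.append((parts[0], "0x" + parts[-7]))
    return found
```

For the excerpt above the result is a single entry, tOspf with task ID 0x4898d60, which is the argument passed to tt and ti in this section.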
To troubleshoot a CPU or memory spike with 5.1.5.X, you can start a software routine in Dshell that logs the task name to the swlog whenever there is a spike in CPU or memory usage.
Switch/> dshell
Certified: [Kernel]->lkup "Hog"
catchCpuHog    0x00152700 text   => to turn on CPU watch
catchMemHog    0x0013fa80 text   => to turn on Memory watch
releaseCpuHog  0x00152720 text   => to turn off CPU watch
releaseMemHog  0x0013fb60 text   => to turn off Memory watch
value = 58685232 = 0x37f7730
Certified: [Kernel]->
To troubleshoot a problem related to stack overflow:
Working: [Kernel]->checkStack
NAME         ENTRY        TID       SIZE   CUR  HIGH MARGIN
------------ ------------ -------- ----- ----- ----- ------
tExcTask     excTask      7525070  19992   960  4344  15648
tLogTask     logTask      751f750   8184   976  2176   6008
tPingTmo854  0x0000103d60 3d9c580   8184   800  1068   7116
tShell       shell        3c13ef0  19048  6368  8644  10404
tWdbTask     0x0000168200 7395dc0   7904  1392  2060   5844
IPC_tick     IPC_tick     6efb040  32760   800  4952  27808
tCsCSMtask2  csCsmHelloBa 7214c40  19992  1200  4264  15728
tCS_PTB      csPtbMain    5ca5960   8184   800  5152   3032
tCS_CCM      csCcmMain    720f8e0  13312  1632  9436   3876
tCS_PRB      csPrbMain    720bc50   9320  1504  5588   3732
tCS_CMS      csCmsMain    7209240   8176  1872  3044   5132
tCS_HSM      csHsmMain    7206f20  29320  1472  7556  21764
tCsCSMtask   csCsmMain    5cb2240  29320  2304 14516  14804
tCsCSMtask3  csCsmChecksu 5caaa40  19984   848  6724  13260
tAioIoTask1  aioIoTask    7508450  28664   944  1136  27528
tAioIoTask0  aioIoTask    7501180  28656   944  1136  27520
tNetTask     netTask      74040a0  14992   944  5204   9788
tIpedrMsg    ipedrKernelM 4d4a7a0  19984   944  2828  17156
tTrapPing    0x0005b89e60 3aa7830  19984  2864  3172  16812
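In the sample above, MARGIN equals SIZE minus HIGH for every row, so a task whose high-water mark approaches its stack size is a candidate for overflow. The Python sketch below flags such tasks in a saved checkStack capture. The 60% ratio is an arbitrary illustration, not an Alcatel recommendation, and the function names are hypothetical.

```python
# Rows excerpted from the checkStack sample in this section.
SAMPLE = """\
NAME ENTRY TID SIZE CUR HIGH MARGIN
tExcTask excTask 7525070 19992 960 4344 15648
tCS_PTB csPtbMain 5ca5960 8184 800 5152 3032
tCS_CCM csCcmMain 720f8e0 13312 1632 9436 3876
tCS_PRB csPrbMain 720bc50 9320 1504 5588 3732
"""


def parse_check_stack(text):
    """Return one dict per task row with the four numeric stack columns."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 7 or not all(p.isdigit() for p in parts[-4:]):
            continue  # header or separator line
        size, cur, high, margin = map(int, parts[-4:])
        rows.append({"name": parts[0], "size": size, "cur": cur,
                     "high": high, "margin": margin})
    return rows


def near_overflow(rows, ratio=0.6):
    """Tasks whose high-water mark has reached the given fraction of the stack size."""
    return [r["name"] for r in rows if r["high"] >= ratio * r["size"]]
```

With the excerpt above and the illustrative 60% ratio, tCS_PTB and tCS_CCM are flagged, while tCS_PRB falls just under the ratio.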
OmniSwitch 7700/7800/8800 Dshell Task Definitions
tExcTask Exception Handling Task
tLogTask Log Task
tShell Shell Task
tWdbTask Wind Debug Agent
IPC_tick IPC ticks
tSpyTask Spy Task (monitors system utilization)
tAioIoTask1 Asynchronous I/O Support
tAioIoTask0 Asynchronous I/O Support
tNetTask Routing Task
tIpedrMsg IP Ethernet Driver Message Handle Task
tAioWait Asynchronous I/O Support
bbussIntMoni BBUS monitor Task
ipc_monitor IPC monitor Task
tL2Stat L2 statistics gathering task
Gateway Management Information Protocol Gateway
EIpc Extended IPC task
EsmDrv Ethernet switching manager Driver Task
tMemMon Memory Monitor Task
tCS_PTB Chassis Supervision Pass-through Support
tCS_CCM Chassis Configuration Manager
tCS_PRB Chassis Supervision Prober Task
tCS_CMS Chassis MAC Server
tCS_HSM Chassis Supervision Hardware Services Manager
tCsCSMtask Chassis Supervision Chassis State Manager
tNanISR Nantucket Interrupt Service Routine
SwLogging Switch Logging Task
DSTwatcher Clock Task of the switch
tWhirlpool Encryption Support
ipc_tests IPC debugging and test support
PortMgr Port Manager Task
PsMgr Power Supply Manager Task
VlanMgr VLAN Manager Task
TrapMgr Trap Manager Task
PartMgr Partition management task
SNMPagt SNMP agent task
SesMgr Session Manager Task
SsApp Session Application Task
Ftpd FTP Daemon Task
NanDrvr Nantucket Driver Task
Health Health Task
L3Hre Layer 3 HRE Task
DbgNiGw NI Debug support
SrcLrn Source Learning Task
GrpMob Group Mobility Task
Stp Spanning Tree Task
8021q 802.1Q Task
LnkAgg Link Aggregation Task
tSlcMsgHdl Source Learning Message Handler Task
AmapMgr AMAP Manager Task
GmapMgr GMAP Manager Task
PMirMon Port Mirror Monitoring Task
Ipedr IP Extended Dynamic Routing Task
AAA AAA task
stpTick STP Timing Task
tIpedrPkt IP Ethernet Driver Task
AVLAN Authenticated VLAN Task
onex 802.1X Task
Ipmem IP Multicast Task
la_cmm_tick CMM Link Aggregation Timer
ipmfm IP Multicast Forwarding
ipmpm IP Multicast Management
Ipx IPX Task
Vrrp VRRP Task
UdpRly UDP Relay Task
Qos QOS Task
PolMgr Policy Manager Task
SlbCtrl Server Load Balancing Control Task
WebView WebView Task
SNMP GTW SNMP Gateway
SNMP TIMER SNMP Timer
GmapTimer GMAP Timer Task
DrcTm Dynamic Routing Control Timer Task
tDrcIprm Dynamic Routing Control Task for IP Route Manager
tOspf OSPF Task
tPimsm PIM-SM Task
tDrcIprm Dynamic Routing Control IP Route Manager task
cliConsole CLI Console Task
tWebTimer Web Session Timer
tssApp_SNMP Temporary System Services task to support SNMP
tssApp_3_4 Temporary System Services task to support CLI
CfgMgr Configuration Manager Task
tCS_CCM2 Chassis Configuration Manager
Sshd Secure Shell Daemon Task
Telnetd Telnet Task
Rmon RMON Task
tCS_CVM Chassis Version Manager Task
SmNiMgr CMM-NI Shared Memory Manager
TIpxTimer IPX Timer
TIpxGapper IPX Routing Protocol InterPacket Gap Control
SesMon_3 Session Monitor for Session Number
tTelnetOut0 Telnet Session 0 out task
tTelnetIn0 Telnet Session 0 in Task
CliShell0 CLI session 0 shell Task
TPolMonSvr Policy Manager Monitor LDAP Servers
TDcacheUpd FPGA Support
OmniSwitch 6624/6648 Dshell Task Definitions
tExcTask Exception Handling Task
tLogTask Log Task
tShell Shell Task
tNetTask Routing Task
qdrCpu Queue Driver for the from-CPU queues
qdsCpu Queue Dispatcher for the to-CPU queues
tIpedrMsg IP Ethernet Driver Task Message handler
tahw_sch Spanning Tree Support
qdsUnr Queue Dispatcher of unresolved queues
taSM_DVR NI Stack Manager
ipcReceive IPC Receive Task
taSM_NI NI Stack Manager
la_ni_tick_ NI Link Aggregation Timer
tahw_stp Spanning Tree Support
IPCHAWKTIME IPC Timer
ipc_monitor IPC monitor task
tNiSup&Prb NI supervision and Prober task
tL2Stat L2 statistics gathering task
taEipc Extended IPC task
CfgMgr Configuration Manager Task
Gateway MIP gateway
EIpc Extended IPC task
Ftpd FTP Daemon Task
taStp Spanning Tree task
tMemMon Memory Monitor task
tssApp_SNMP Temporary task to support SNMP
tssApp_12_4 Temporary task to support CLI
tCS_CCM Chassis Configuration Manager
tCS_PRB Chassis supervision Prober task
tCS_CMS Chassis MAC Server
tCS_HSM Chassis Supervision Hardware Services Manager
tCsCSMtask Chassis Supervision Chassis State Manager
SwLogging Switch Logging task
DSTwatcher Daylight saving task
tWhirlpool Encryption Support
ipc_tests IPC debugging and test support
ipc_ping IPC ping task
IXE2424_ IXE2424 task
taNiEsmDrv NI Ethernet switching driver task
tsLnkState Link State monitor task
PortMgr Port manager task
PsMgr Power supply Manager task
VlanMgr VLAN Manager task
TrapMgr Trap manager task
SM_CMM CMM Stack Manager
PartMgr Partition Manager task
SNMPagt SNMP agent
SNMP GTW SNMP Gateway
SNMP TIMER SNMP Agent Timer
SesMgr Session Manager Task
SsApp Session Application Task
Ntpd NTP Daemon Task
Health Health Monitor task
EsmDrv Ethernet NI software (ESM) driver task
SrcLrn Source learning task
tSlcMsgHdl Source learning message handler task
GrpMob Group Mobility task
Stp Spanning tree task
stpTick CMM Spanning tree timer
8021q 802.1Q task
LnkAgg Link Aggregation task
la_cmm_tick CMM Link Aggregation timer
AmapMgr AMAP manager task
GmapMgr GMAP manager task
GmapTimer GMAP timer task
PMirMon Port Mirroring task
Ipedr IP Ethernet driver task
tIpedrPkt IP ethernet packet handler task
AAA AAA task
AVLAN Authenticated VLAN task
onex 802.1X
Vrrp VRRP task
UdpRly UDP Relay task
Qos CMM QOS
PolMgr Policy Manager task
Ipmem IP Multicast Task
ipmfm IP Multicast Forwarding
ipmpm IP Multicast Management
DrcTm Dynamic Routing Control Timer task
TDrcIprm Dynamic Routing Control IP Route Manager task
taDot1q_ 802.1Q task
taSLNEvent Source learning event handler
taGmnCtrl NI group mobility
taVmnCtrl NI VLAN manager
taLnkAgg NI link aggregation
taQoS NI QOS task
taIpni IP task on a NI
taIpms IPMS task
taXMAP_ni XMAP task on a NI
taUdpRelay NI UDP relay
taAvlan NI Authenticated VLAN
taPortMir NI Port Mirroring
taQFab Software fabric for stacks
tSLNAdrLrn NI source learning task
RADIUS Radius task
cliConsole Console
tWebTimer Web Session Timer
tCS_CCM2 Chassis Configuration Manager
Sshd SSH daemon (secure shell)
NtpDaemon NTP daemon (network time protocol)
Rmon RMON task
WebView WebView Task
tCS_CVM Chassis Version Manager
SesMon_12 Session Monitor
tTelnetOut0e4208c Telnet Outgoing
tTelnetIn0 Telnet Incoming
CliShell0 CLI session 0 shell Task
tPolMonSvr Policy Manager Monitor LDAP Servers
tDcacheUpd FPGA Support
To further isolate the source of the problem, examine each NI individually.

Troubleshooting NIs on OmniSwitch 7700/7800/8800

The health statistics of each NI give an indication of which one is causing the problem. The following CLI command can be used to diagnose:
show health <slot number>
Example:
-> show health 5
* - current value exceeds threshold
Slot 05                    1 Min  1 Hr  1 Hr
Resources          Limit  Curr   Avg   Avg   Max
----------------+-------+-----+------+-----+-----+-------
Receive               80    01    01    01    01
Transmit/Receive      80    01    01    01    01
Memory                80    39    39    39    39
Cpu                   80    21    22    21    24
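When several slots have to be compared, the show health output can be captured and checked offline. The following Python sketch parses the resource rows and reports any resource whose current value is above its limit. The function names are hypothetical, and the handling of the "*" marker (assumed to be attached to the value it flags) is an assumption, since the sample output above contains no flagged values.

```python
def parse_health(text):
    """Parse 'show health <slot>' resource rows into a dictionary."""
    stats = {}
    for line in text.splitlines():
        # Assumption: a '*' threshold marker, if present, is attached to a value.
        fields = line.replace("*", " ").split()
        if len(fields) == 6 and all(f.isdigit() for f in fields[1:]):
            limit, curr, avg_1min, avg_1hr, max_1hr = map(int, fields[1:])
            stats[fields[0]] = {"limit": limit, "curr": curr,
                                "avg_1min": avg_1min, "avg_1hr": avg_1hr,
                                "max_1hr": max_1hr}
    return stats


def over_limit(stats):
    """Return the resources whose current value exceeds the limit."""
    return [name for name, s in stats.items() if s["curr"] > s["limit"]]


# Rows copied from the slot 5 example above.
SAMPLE = """\
Receive 80 01 01 01 01
Transmit/Receive 80 01 01 01 01
Memory 80 39 39 39 39
Cpu 80 21 22 21 24
"""

stats = parse_health(SAMPLE)
print(stats["Cpu"], over_limit(stats))
```

For the slot 5 example, no resource is over its limit, so the second value printed is an empty list.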
The NI Debugger software can be launched in Dshell using the following command:
Working: [Kernel]->NiDebug
This launches the NI Debugger. To change to a specific slot and slice (Coronado), use the following command:
changeSlot slot,slice
The processor on that slot can now be accessed just like the CMM to see all tasks (running or suspended) and the tasks consuming the most CPU, and to run other commands such as task trace (tt) or task info (ti).
Working: [Kernel]->NiDebug
1:0 nidbg>
1:0 nidbg> nisup_cpuShow
1:0
1:0 Task                 Cpu
1:0 Id       Name        Abs  Rel
1:0 -------- ----------- ---- ----
1:0 017fd170 tsHw_qdisp   13%  13%
1:0 015ea1c0 taIpni        2%   9%
1:0 015fae50 taVmnCtrl     0%   2%
1:0 0160cef8 t_ipc_cmm_p   1%   1%
1:0 015f61d0 taL3Hre       0%   1%
1:0 015ee670 taXMAP_ni     0%   1%
1:0 015f7768 taStp         0%   0%
1:0 015f4080 taQoS         0%   0%
1:0 015ed4c0 taIpms        0%   0%
1:0 017fb470 tExcTask      0%   0%
1:0 017f8fb8 tDBG_sp_tk    0%   0%
1:0 017f6290 tNiSup&Prb    0%   0%
1:0 01602bf8 taHw_qdrv     0%   0%
1:0 01601e30 taIpc_ni      0%   0%
1:0 01601450 taEipc        0%   0%
1:0 015fed08 taSLNEvent    0%   0%
1:0 015fbc18 taGmnCtrl     0%   0%
1:0 015fa088 taLnkAgg      0%   0%
1:0 015f0f90 taDot1q_ni    0%   0%
1:0 015eb370 taIpx         0%   0%
1:0 015e70d0 taUdpRelay    0%   0%
1:0 015e47b0 taAvlan       0%   0%
1:0 015e2e30 taPortMir     0%   0%
1:0 015e1030 tQDriverSub   0%   0%
1:0 015c09e0 la_ni_tick_   0%   0%
1:0 015a2b28 taEniMsgHdl   0%   0%
1:0 015a16d0 tahw_stp      0%   0%
1:0 015a0ca0 tahw_sch      0%   0%
1:0 01593e98 tSLNAdrLrn    0%   0%
1:0 01590da8 tSLNDAMgr     0%   0%
1:0 014f5e80 tsLnkState    0%   0%
1:0 014f4cd0 tsStatistic   0%   0%
1:0          KERNEL        1%   1%
1:0          INTERRUPT     0%   0%
1:0          IDLE         78%  69%
1:0 value = 0 = 0x0
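Because the nisup_cpuShow listing is long, it can help to rank the tasks offline. The following Python sketch parses captured output and returns the busiest tasks by absolute CPU percentage, leaving out the KERNEL, INTERRUPT, and IDLE pseudo-entries. The function names and the choice to sort on the Abs column first are assumptions made for illustration.

```python
import re

# Matches "slot:slice [task id] name abs% rel%"; the task id is absent on
# the KERNEL, INTERRUPT, and IDLE rows.
ROW_RE = re.compile(
    r"^\d+:\d+\s+(?:[0-9a-fA-F]{8}\s+)?(\S+)\s+(\d+)%\s+(\d+)%\s*$")
PSEUDO_ENTRIES = {"KERNEL", "INTERRUPT", "IDLE"}


def parse_cpu_show(text):
    """Return {name: (abs_percent, rel_percent)} from nisup_cpuShow output."""
    usage = {}
    for line in text.splitlines():
        match = ROW_RE.match(line.strip())
        if match:
            usage[match.group(1)] = (int(match.group(2)), int(match.group(3)))
    return usage


def busiest_tasks(usage, count=3):
    """Return the names of the tasks with the highest (Abs, Rel) usage."""
    tasks = [(name, pct) for name, pct in usage.items()
             if name not in PSEUDO_ENTRIES]
    tasks.sort(key=lambda item: item[1], reverse=True)
    return [name for name, _ in tasks[:count]]


# Rows copied from the sample output above.
SAMPLE = """\
1:0 Id Name Abs Rel
1:0 017fd170 tsHw_qdisp 13% 13%
1:0 015ea1c0 taIpni 2% 9%
1:0 015fae50 taVmnCtrl 0% 2%
1:0 0160cef8 t_ipc_cmm_p 1% 1%
1:0 KERNEL 1% 1%
1:0 INTERRUPT 0% 0%
1:0 IDLE 78% 69%
"""

usage = parse_cpu_show(SAMPLE)
print(busiest_tasks(usage), "idle:", usage["IDLE"][0])
```

A low IDLE percentage together with one task at the top of the ranking points to the task worth tracing with tt.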
To force a NI to create a dump file the following command can be used in Dshell:
Working: [Kernel]->pmdni_generate 1,0,"slo1slic0.pmd"
Syntax is pmdni_generate slot,slice, file_name.
This generates a PMD file for slot 1, slice 0 in the /flash directory, which can then be forwarded to Engineering for analysis. In addition, a software tool known as “ni_pmdexploit” can be used on a UNIX OS to exploit the PMD files into VI format. The PMD files generated on the switch for an NI are in binary format and cannot be viewed with the switch log commands on the switch. These files need to be converted to VI format to be analyzed.
The format to exploit an NI PMD file is “ni_pmdexploit <filename> > <new filename>”. Once the file is exploited, it can be viewed using normal UNIX editors.

OmniSwitch 6624/6648 Dshell Troubleshooting

An important step on the OS-6600 is to confirm the stack topology. This can be done using the command:
Working: [Kernel]->smctx
****************************************************
local_slot=1 * base_mac= 00:d0:95:84:4b:d2 * local_mac=0000 1111 1111 * TYPE_48 * heart_beat=19007
state=SUPERV role=PRIMARY (primary_slot=1 secondary_slot=2) opposite_way=0
nb=7 elements=0x300ff in_loop=1 supervision=ON (check=0x10100 change=0x0)
gport1=0x1a lport1=0x1a status=1 * gport2=0x1b lport2=0x1b status=1
neighbor1 (nb1=7) [0]=0|0 [1]=8|1a [2]=7|1b [3]=6|1a [4]=5|1b [5]=4|1a [6]=3|1b [7]=2|1a
neighbor2 (nb2=7) [0]=2|1b [1]=2|1b [2]=3|1a [3]=4|1b [4]=5|1a [5]=6|1b [6]=7|1a [7]=8|1b
topology role [1]=1 [2]=2 [3]=3 [4]=3 [5]=3 [6]=3 [7]=3 [8]=3
topology outport [1]=ff [2]=1b [3]=1b [4]=1b [5]=1a [6]=1a [7]=1a [8]=1a
topology base mac
[2]= 00:d0:95:84:3d:24
[3]= 00:d0:95:86:50:f2
[4]= 00:d0:95:84:49:bc
[5]= 00:d0:95:84:39:bc
[6]= 00:d0:95:84:4a:8e
[7]= 00:d0:95:84:39:f2
[8]= 00:d0:95:84:3c:42
netid [1]=1|1 [2]=0|0 [3]=0|0 [4]=0|0 [5]=0|0 [6]=0|0 [7]=0|0 [8]=0|0
lookup [1]=ff [2]=1b [3]=1b [4]=1b [5]=1b [6]=1b [7]=1b [8]=1a
subrole [1]=2 [2]=4 [3]=7 [4]=7 [5]=7 [6]=7 [7]=6 [8]=5
list [1]=1 [2]=2 [3]=3 [4]=4 [5]=5 [6]=6 [7]=7 [8]=8 [0]=8
hop [0] [1] [2] [3] [4] [5] [6] [7] [8]
[0]  -1  -1  -1  -1  -1  -1  -1  -1  -1
[1]  -1   0   1   2   3   4   5   6   1
[2]  -1   1   0   1   2   3   4   5   6
[3]  -1   2   1   0   1   2   3   4   5
[4]  -1   3   2   1   0   1   2   3   4
[5]  -1   4   3   2   1   0   1   2   3
[6]  -1   5   4   3   2   1   0   1   2
[7]  -1   6   5   4   3   2   1   0   1
[8]  -1   1   6   5   4   3   2   1   0
*****************************************************************
value = 2 = 0x2
This command indicates the role of the local stack.
output definitions
Local slot Local Stack ID.
Base Mac Base Mac Address of this Stack.
Local Mac Local Mac address used for IPC communication across the stacking cables.
Role Primary or Secondary.
Nb Neighbor ID (1-Based).
In_loop 1 if the stacks are connected in a loop for redundant path.
Neighbor1 Shows the connections to other stacks through the port number.
Topology Role 1=Primary, 2= Secondary, 3=Idle.
Topology Outport Displays the port used to access the other stacks.
Topology base Mac Displays the base mac addresses of all the other stacks.
Lookup The stacking port to be used to do a lookup for another stack.
Hop Displays the hops for each stack to the other stack.
Gport Global port used for stacking (either stack_number a or stack_number b).
Lport Logical port used for stacking (either stack_number a or stack_number b).
Status 1=up, 0=down.
To view the stack topology in detail, use the following command:
Working: [Kernel]->stack_topo
local_slot=1 role=PRIMARY P=1 S=2 (elements=0x300ff nb=8 loop=1 sup=2 type=2)
7 elements seen by link1 (gport=0x1a lport=0x1a status=1)
slot=8 originate_port=26 role=IDLE
slot=7 originate_port=27 role=IDLE
slot=6 originate_port=26 role=IDLE
slot=5 originate_port=27 role=IDLE
slot=4 originate_port=26 role=IDLE
slot=3 originate_port=27 role=IDLE
slot=2 originate_port=26 role=SECONDARY
7 elements seen by link2 (gport=0x1b lport=0x1b status=1)
slot=2 originate_port=27 role=SECONDARY
slot=3 originate_port=26 role=IDLE
slot=4 originate_port=27 role=IDLE
slot=5 originate_port=26 role=IDLE
slot=6 originate_port=27 role=IDLE
slot=7 originate_port=26 role=IDLE
slot=8 originate_port=27 role=IDLE
NI=1 CMM=65 role=1
* state_linkA=1 remote_slotA=8 remote_linkA=51
* state_linkB=1 remote_slotB=2 remote_linkB=52
NI=2 CMM=66 role=2
* state_linkA=1 remote_slotA=3 remote_linkA=27
* state_linkB=1 remote_slotB=1 remote_linkB=52
NI=3 CMM=0 role=3
* state_linkA=1 remote_slotA=2 remote_linkA=51
* state_linkB=1 remote_slotB=4 remote_linkB=52
NI=4 CMM=0 role=3
* state_linkA=1 remote_slotA=5 remote_linkA=51
* state_linkB=1 remote_slotB=3 remote_linkB=28
NI=5 CMM=0 role=3
* state_linkA=1 remote_slotA=4 remote_linkA=51
* state_linkB=1 remote_slotB=6 remote_linkB=52
NI=6 CMM=0 role=3
* state_linkA=1 remote_slotA=7 remote_linkA=51
* state_linkB=1 remote_slotB=5 remote_linkB=52
NI=7 CMM=0 role=3
* state_linkA=1 remote_slotA=6 remote_linkA=51
* state_linkB=1 remote_slotB=8 remote_linkB=52
NI=8 CMM=0 role=3
* state_linkA=1 remote_slotA=1 remote_linkA=51
* state_linkB=1 remote_slotB=7 remote_linkB=52
output definitions
local slot number Local stack number.
role Either Primary, secondary or idle.
nb Number of stacks.
loop Indicates whether a redundant path is available.
elements seen by link Number of elements seen by the link with the global/local port number as 1a, in the order they are seen, and the role of each stack.
NI NI number of the switch in the stack.
CMM CMM number of the switch in the stack. CMM number can be 65 (Primary), 66 (Secondary) or 0 (Idle). Role can be 1 (Primary), 2 (Secondary) or 3 (Idle).
state_link Status of link A and B which can be 1 if up or 0 if down.
remote_slot Remote slot number.
remote_link Remote link number.
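The NI entries in the stack_topo output can be cross-checked against each other: if slot X reports slot Y as the remote slot on a link that is up, slot Y would be expected to report slot X on one of its own links. The following Python sketch applies that check to captured output. The helper names are hypothetical, and the reciprocity rule itself is an assumption drawn from the sample output above, not a documented guarantee.

```python
import re

NI_RE = re.compile(
    r"NI=(\d+)\s+CMM=(\d+)\s+role=(\d+)\s*"
    r"\*\s*state_linkA=(\d+)\s+remote_slotA=(\d+)\s+remote_linkA=\d+\s*"
    r"\*\s*state_linkB=(\d+)\s+remote_slotB=(\d+)\s+remote_linkB=\d+")


def parse_ni_links(text):
    """Return {ni: {"A": (state, remote_slot), "B": (state, remote_slot)}}."""
    links = {}
    for match in NI_RE.finditer(text):
        ni, _cmm, _role, state_a, slot_a, state_b, slot_b = map(int, match.groups())
        links[ni] = {"A": (state_a, slot_a), "B": (state_b, slot_b)}
    return links


def one_way_links(links):
    """List (ni, link, remote_slot) for up links the remote slot does not mirror."""
    problems = []
    for ni, ports in links.items():
        for name, (state, remote) in ports.items():
            if state != 1:
                continue  # link reported down, nothing to cross-check
            peers = {slot for st, slot in links.get(remote, {}).values() if st == 1}
            if ni not in peers:
                problems.append((ni, name, remote))
    return problems


# NI entries copied from the sample stack_topo output above.
SAMPLE = """\
NI=1 CMM=65 role=1 * state_linkA=1 remote_slotA=8 remote_linkA=51 * state_linkB=1 remote_slotB=2 remote_linkB=52
NI=2 CMM=66 role=2 * state_linkA=1 remote_slotA=3 remote_linkA=27 * state_linkB=1 remote_slotB=1 remote_linkB=52
NI=3 CMM=0 role=3 * state_linkA=1 remote_slotA=2 remote_linkA=51 * state_linkB=1 remote_slotB=4 remote_linkB=52
NI=4 CMM=0 role=3 * state_linkA=1 remote_slotA=5 remote_linkA=51 * state_linkB=1 remote_slotB=3 remote_linkB=28
NI=5 CMM=0 role=3 * state_linkA=1 remote_slotA=4 remote_linkA=51 * state_linkB=1 remote_slotB=6 remote_linkB=52
NI=6 CMM=0 role=3 * state_linkA=1 remote_slotA=7 remote_linkA=51 * state_linkB=1 remote_slotB=5 remote_linkB=52
NI=7 CMM=0 role=3 * state_linkA=1 remote_slotA=6 remote_linkA=51 * state_linkB=1 remote_slotB=8 remote_linkB=52
NI=8 CMM=0 role=3 * state_linkA=1 remote_slotA=1 remote_linkA=51 * state_linkB=1 remote_slotB=7 remote_linkB=52
"""

links = parse_ni_links(SAMPLE)
print(len(links), one_way_links(links))
```

For the eight-switch loop shown above, every link is mirrored by its neighbor, so no problems are listed.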
Accessing Dshell on Idle Switches
An OS6600 in a standalone environment behaves like a single NI in an OmniSwitch 7000 or 8000 series switch. Entering Dshell gives access to the normal VxWorks commands.
There are two ways to access Dshell. The first is to enter the dshell command from the CLI. The second is to press Ctrl-W twice, and it is used when the console or Telnet session is not accessible. Before the second method can be used, it must be enabled by following the steps below on the primary and secondary switches:
1 From the CLI prompt enter:
->dshell
2 From the Dshell prompt enter
Certified: [Kernel]->WWON=1
In a stacking environment, only the primary and secondary switches have the console enabled, whereas the console is disabled on the idle switches. To enable Dshell access to the idle switches, use the following command on the primary switch:
Nisup_control_WW_on slot
You must execute this command for each idle switch in the stack. Note that these switches will not let you leave Dshell with the exit command. To restore normal Dshell access, you will need to reboot the switch.

Using AlcatelDebug.cfg

When you are using IPMS/DVMRP with 802.1Q, it is recommended that you use the debug interfaces set backpressure enable command. This command can be put in the boot.cfg file, but because it is a debug command it is overwritten as soon as write memory is issued, and the setting is lost after a reboot. To retain debug settings after a system reboot, put the debug commands into a file called AlcatelDebug.cfg in both the working and certified directories. Use Notepad or the VI editor to create the AlcatelDebug.cfg file.
Example:
-> vi AlcatelDebug.cfg
-> debug set WWON 1 => to allow dshell access in the event of the console lockup
-> debug set esmDebugLevel 4 => see port up/down event on swlog
-> debug interfaces set backpressure enable => to enable system backpressure

Troubleshooting IPC on OS-6/7/8XXX Series of Switches

IPC (Inter-Process Communication) is used by the system to communicate between different software modules. This communication can be between different processes in the same software module or between two entirely separate modules. It can take place between an NI and the CMM or between one CMM and another.
The Burst Bus, commonly known as the BBUS (management bus), is used for IPC communication. IPC uses connectionless, built-in VxWorks sockets to communicate.
Problems with IPC typically cause the following symptoms:
Loss of access to the console of the switch
Loss of messages between CMM and NI resulting in switching and routing problems.
High CPU utilization on CMM

Debugging IPC

IPC has 5 different buffer pools:
Urgent Pools
Control Pools for control messages
Normal Pools for some control messages as well as other messages
Jumbo Pools
Local Pools
Each of these pools has some dedicated buffers available. When a process opens a socket to communicate, it is supposed to tear the socket down after the communication is done. If it does not tear the socket down, the socket may continue to occupy buffer space, which is then unavailable to other processes.
IPC pools can be examined in Dshell using the command:
Working: [Kernel]->ipc_pools
UrgentPool: Full size is 1024, remaining: 1024
In socket queues: 0 Not queued: 0: In DMA queues: 0
ControlPool: Full size is 5096, remaining: 5090
In socket queues: 1 Not queued: 3: In DMA queues: 2
NormalPool: Full size is 2024, remaining: 2022
In socket queues: 0 Not queued: 2: In DMA queues: 0
JumboPool: Full size is 256, remaining: 255
In socket queues: 1 Not queued: 0: In DMA queues: 0
LocalPool: Full size is 64, remaining: 64
In socket queues: 0 Not queued: 0: In DMA queues: 0
Each type of pool has the following listed in the command output:
Maximum size of buffers available
Currently available buffers
Socket Queues being used
Not Queued in pool
Direct Memory Access Queues
In normal operation, the number of currently available buffers should always be close to the maximum. In some scenarios, the remaining count for a pool decreases at a fast rate and the buffers are never freed. This can lead to problems with IPC.
Running the command repeatedly will help to identify this situation.
An example is as follows:
Working: [Kernel]->ipc_pools
UrgentPool: Full size is 1024, remaining: 1024
In socket queues: 0 Not queued: 0: In DMA queues: 0
ControlPool: Full size is 5096, remaining: 5062
In socket queues: 4 Not queued: 20: In DMA queues: 10
NormalPool: Full size is 2024, remaining: 2022
In socket queues: 0 Not queued: 2: In DMA queues: 0
JumboPool: Full size is 256, remaining: 255
In socket queues: 1 Not queued: 0: In DMA queues: 0
LocalPool: Full size is 64, remaining: 64
In socket queues: 0 Not queued: 0: In DMA queues: 0
Working: [Kernel]->ipc_pools
UrgentPool: Full size is 1024, remaining: 1024
In socket queues: 0 Not queued: 0: In DMA queues: 0
ControlPool: Full size is 5096, remaining: 5060
In socket queues: 6 Not queued: 20: In DMA queues: 10
NormalPool: Full size is 2024, remaining: 2022
In socket queues: 0 Not queued: 2: In DMA queues: 0
JumboPool: Full size is 256, remaining: 255
In socket queues: 1 Not queued: 0: In DMA queues: 0
LocalPool: Full size is 64, remaining: 64
In socket queues: 0 Not queued: 0: In DMA queues: 0
In the above two outputs it seems that the control pool is stuck and the socket queues are incrementing. In order to find out which task is using these queues we need to look at the socket information.
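Comparing successive ipc_pools captures by eye is error-prone when the counts change by only a few buffers. The following Python sketch parses two captured snapshots and reports each pool whose "remaining" count has dropped between them. It is an offline helper with hypothetical function names, not a switch command, and a single drop between two snapshots is only a hint that the pool may be draining.

```python
import re

POOL_RE = re.compile(
    r"(\w+Pool): Full size is (\d+), remaining: (\d+)\s+"
    r"In socket queues: (\d+) Not queued: (\d+):\s*(?:In )?DMA queues: (\d+)")


def parse_ipc_pools(text):
    """Return {pool_name: counters} from captured ipc_pools output."""
    pools = {}
    for match in POOL_RE.finditer(text):
        full, remaining, sock_q, not_q, dma_q = map(int, match.groups()[1:])
        pools[match.group(1)] = {"full": full, "remaining": remaining,
                                 "socket_queues": sock_q,
                                 "not_queued": not_q, "dma_queues": dma_q}
    return pools


def draining_pools(before, after):
    """Return {pool_name: buffers lost} for pools whose remaining count fell."""
    return {name: before[name]["remaining"] - after[name]["remaining"]
            for name in before
            if name in after
            and after[name]["remaining"] < before[name]["remaining"]}


# Control and Normal pool lines copied from the two captures above.
FIRST = """\
ControlPool: Full size is 5096, remaining: 5062
In socket queues: 4 Not queued: 20: In DMA queues: 10
NormalPool: Full size is 2024, remaining: 2022
In socket queues: 0 Not queued: 2: In DMA queues: 0
"""
SECOND = """\
ControlPool: Full size is 5096, remaining: 5060
In socket queues: 6 Not queued: 20: In DMA queues: 10
NormalPool: Full size is 2024, remaining: 2022
In socket queues: 0 Not queued: 2: In DMA queues: 0
"""

print(draining_pools(parse_ipc_pools(FIRST), parse_ipc_pools(SECOND)))
```

For the two captures above, only ControlPool is reported, having lost two buffers between the snapshots.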
To look at these pools in detail, the following commands can be used in Dshell:
ipc_urgent_pools_detail number
ipc_control_pools_detail number
ipc_normal_pools_detail number
ipc_jumbo_pools_detail number
ipc_local_pools_detail number
The above commands take an optional number that limits how many sockets are displayed in Dshell. If no number is specified, all the sockets in use are displayed, which can be a real problem when thousands of sockets are in use.
Working: [Kernel]->ipc_control_pools_detail
ipc_control_pools_details
ControlPool: Full size is 5096, remaining: 5090
Socket ID = 0x3, dest slot = 66, remote addr = 0x0, ipc status = D
Task ID = 0x67f3c10, PayLoad Len= 68, ipc priority = 0x1, data ptr = 0x5e09f98
next = 0x0, pFreeQ = 0x6f565d0, data_offset = 0, free_list_num = 3
Socket ID = 0x5, dest slot = 66, remote addr = 0x8400041, ipc status = D
Task ID = 0x67f3c10, PayLoad Len= 68, ipc priority = 0x1, data ptr = 0x5e09ff8
next = 0x0, pFreeQ = 0x6f565d0, data_offset = 0, free_list_num = 3
Socket ID = 0x8, dest slot = 66, remote addr = 0xf400042, ipc status = G
Task ID = 0x6862660, PayLoad Len= 64, ipc priority = 0x1, data ptr = 0x5e0a1d8
next = 0x6818ba4, pFreeQ = 0x6f565d0, data_offset = 0, free_list_num = 3
Socket ID = 0x8, dest slot = 66, remote addr = 0xf400042, ipc status = G
Task ID = 0x6862660, PayLoad Len= 64, ipc priority = 0x1, data ptr = 0x5e1ba58
next = 0x68231d8, pFreeQ = 0x6f565d0, data_offset = 0, free_list_num = 3
Socket ID = 0x8, dest slot = 65, remote addr = 0x5090041, ipc status = S
Task ID = 0x6862660, PayLoad Len= 68, ipc priority = 0x1, data ptr = 0x5e202b8
next = 0x0, pFreeQ = 0x6f565d0, data_offset = 0, free_list_num = 3
Socket ID = 0x1, dest slot = 66, remote addr = 0x10400042, ipc status = G
Task ID = 0x6862660, PayLoad Len= 64, ipc priority = 0x1, data ptr = 0x5e69998
next = 0x0, pFreeQ = 0x6f565d0, data_offset = 0, free_list_num = 3
In socket queues: 1 Not queued: 3: In DMA queues: 2
value = 10 = 0xa
Working: [Kernel]->
The above command displays a lot of information, but the item of interest is the socket ID that repeats most often. In the above example it is 0x8. To find out what this socket is used for, the following command can be used in Dshell:
Working: [Kernel]->ipc_socket_info 0x8
ipc_socket_info
Socket 8:
LocalSocketID = 0x8, localidx = 0x8, Local_address = 0xf400041
RemoteSocketID = 0x8, Remote_Address = 0xf400042
QnumBufs = 1, NumBufs = 1588, seqSent = 1588, seqRecv = 1588
USRnumBufs = 1, State = 0x3, OptionFlgs = 0x0, priority = 1
blk_timeout = 0, LingerTime = 0, RxQ_Full_Threshold = 65536, RxQ_Numbuf_Threshold = 128
congestion = 0, SockMask = 0x100, SockMsbs = 0x0, use_sw_buf = 1
remote_cong = 0, init_done = 13, sem_use = 0, alignmentSpace = 0
Task id = 0x67f3c10 (tCsCSMtask), LastTimeStamp = 1046954691
recvErrs = 0, txCnt = 1588, txErr = 0, eagainCnt = 0
xoffsent = 0, xonsent = 0, xoffrecv = 0, xonrecv = 0, congcount = 0
value = 8 = 0x8
Working: [Kernel]->
The output of the above command shows that tCsCSMtask is the one consuming this socket.
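When a pool detail listing runs to hundreds of entries, the most repeated socket ID can be found by counting rather than by reading. The following Python sketch counts the "Socket ID" values in captured pool detail output. The function name is hypothetical and the script runs offline on a saved capture, not on the switch.

```python
import re
from collections import Counter

SOCKET_RE = re.compile(r"Socket ID = (0x[0-9a-fA-F]+)")


def socket_id_counts(text):
    """Count how many buffers each socket ID holds in a pool detail capture."""
    return Counter(int(value, 16) for value in SOCKET_RE.findall(text))


# "Socket ID" lines copied from the ipc_control_pools_detail example above.
SAMPLE = """\
Socket ID = 0x3, dest slot = 66, remote addr = 0x0, ipc status = D
Socket ID = 0x5, dest slot = 66, remote addr = 0x8400041, ipc status = D
Socket ID = 0x8, dest slot = 66, remote addr = 0xf400042, ipc status = G
Socket ID = 0x8, dest slot = 66, remote addr = 0xf400042, ipc status = G
Socket ID = 0x8, dest slot = 65, remote addr = 0x5090041, ipc status = S
Socket ID = 0x1, dest slot = 66, remote addr = 0x10400042, ipc status = G
"""

socket_id, count = socket_id_counts(SAMPLE).most_common(1)[0]
print(hex(socket_id), count)
```

For the control pool example, this reports socket 0x8 with three entries, which is the ID passed to ipc_socket_info.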
Older versions of the code might not show the task name next to the task ID, so the following command can be used to find out the task name:
Working: [Kernel]->ti 0x67f3c10
NAME       ENTRY        TID      PRI STATUS     PC       SP       ERRNO   DELAY
---------- ------------ -------- --- ---------- -------- -------- ------- -----
tCsCSMtask csCsmMain    67f3c10   94 PEND         158540  67f34a0  3d0002     0
stack: base 0x67f3c10 end 0x67eede8 size 19320 high 15072 margin 4248
options: 0x4 VX_DEALLOC_STACK
%pc = 158540 %npc = 158544 %ccr = 44 %y = 0
%asi = 15 %cwp = 0 %tt = 0 %tl = 0
%pil = 0 %pstate = 1e
%g0 = 0 %g1 = 0 %g2 = 0 %g3 = 0
%g4 = 0 %g5 = 0 %g6 = 0 %g7 = 0
%i0 = 67eedc0 %i1 = ffffffffffffffff %i2 = 1e5c54 %i3 = 0
%i4 = 158440 %i5 = 264c00 %fp = 67f3560 %i7 = 158b4c
%l0 = 0 %l1 = 67eedc0 %l2 = 0 %l3 = 14
%l4 = 66fc800 %l5 = 6a62038 %l6 = 66ff810 %l7 = 4
%o0 = 0 %o1 = 0 %o2 = 0 %o3 = 0
%o4 = 0 %o5 = 0 %sp = 67f34a0 %o7 = 0
value = 109001744 = 0x67f3c10
Working: [Kernel]->
A task trace on this task can then help to show whether the task is making progress:
Working: [Kernel]->tt 0x67f3c10
108e9c vxTaskEntry +c : Letext (&dataInfo, 67f3920, 67f3a20, 34000000, 66ff800, 6a69800)
66b69b4 Letext +2d4: zcSelect (5, 67f3a20, 0, 0, 6a6c800, 247)
6ff56f8 zcSelect +458: semTake (67eedc0, ffffffff, a, 28, a, 0)
158b4c semTake +2c : semBTake (67eedc0, ffffffff, &semTakeTbl, 0, &semBTake, 264c00)
value = 0 = 0x0
Working: [Kernel]->
Running this command multiple times will indicate whether the task is stuck in some routine.
Gathering this data and attaching it to the Problem Report will help Engineering to identify the source of the problem.
The CMM also keeps its own perspective of each NI's IPC pools. These can be displayed using the following commands:
ipcSlotPools slot,slice
ipcSlotUrgentPoolsDetail slot,slice
ipcSlotControlPoolsDetail slot,slice
ipcSlotNormalPoolsDetail slot,slice
ipcSlotJumboPoolsDetail slot,slice
ipcSlotLocalPoolsDetail slot,slice
The rest of the information about the sockets and the tasks can be found using the same commands as discussed above.
If an NI is generating many IPC messages, the CMM might not be able to see the IPC pools of that NI or of any other NI. For example:
Certified: [Kernel]->ipcSlotPools 6,0
ipcSlotPools slot 6, slice 0
UrgentPool: Full size is 0, remaining: 256
In socket queues: 0 Not queued: 0: In DMA queues: 0
ControlPool: Full size is 0, remaining: 1024
In socket queues: 0 Not queued: 0: In DMA queues: 0
NormalPool: Full size is 0, remaining: 255
In socket queues: 0 Not queued: 0: In DMA queues: 0
JumboPool: Full size is 0, remaining: 64
In socket queues: 0 Not queued: 0: In DMA queues: 0
LocalPool: Full size is 0, remaining: 1024
In socket queues: 0 Not queued: 0: In DMA queues: 0
value = 6 = 0x6
The above output does not show the full size of any of the pools (each is reported as 0). This indicates that the CMM is unable to view the IPC pools of the NI. In this scenario, load the NI Debugger, go to the NI, and look at its IPC pools. One of the NIs would be generating so many IPC messages that it uses up the IPC sockets, flooding the system with IPC messages and in turn losing communication with the CMM.
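The "Full size is 0" condition can be detected automatically when ipcSlotPools output is collected for many slots. The following Python sketch returns True when every pool in a capture reports a full size of 0, which the text above treats as a sign that the CMM cannot view the NI's pools. The function name is hypothetical, and treating "all pools report 0" as the trigger is an assumption based on the single example shown.

```python
import re

FULL_SIZE_RE = re.compile(r"(\w+Pool): Full size is (\d+), remaining: (\d+)")


def cmm_view_missing(text):
    """Return True if the capture lists pools and all report a full size of 0."""
    sizes = [int(full) for _name, full, _remaining in FULL_SIZE_RE.findall(text)]
    return bool(sizes) and all(size == 0 for size in sizes)


# Pool lines copied from the ipcSlotPools 6,0 example above.
CMM_VIEW = """\
UrgentPool: Full size is 0, remaining: 256
ControlPool: Full size is 0, remaining: 1024
NormalPool: Full size is 0, remaining: 255
JumboPool: Full size is 0, remaining: 64
LocalPool: Full size is 0, remaining: 1024
"""

# Hypothetical healthy capture, used only for contrast.
HEALTHY_VIEW = """\
UrgentPool: Full size is 256, remaining: 256
NormalPool: Full size is 256, remaining: 131
"""

print(cmm_view_missing(CMM_VIEW), cmm_view_missing(HEALTHY_VIEW))
```

A True result for a slot is the cue to open the NI Debugger and read the pools on the NI itself.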
The following is an example of using the NiDebug command to display the IPC pools of all NIs.
Certified: [Kernel]->NiDebug
nidbg> ipc_pools
ipc_pools
UrgentPool: Full size is 256, remaining: 256
In socket queues: 0 Not queued: 0: In DMA queues: 0
ControlPool: Full size is 1024, remaining: 1024
In socket queues: 0 Not queued: 0: In DMA queues: 0
NormalPool: Full size is 256, remaining: 131
In socket queues: 123 Not queued: 2: In DMA queues: 0
JumboPool: Full size is 64, remaining: 64
In socket queues: 0 Not queued: 0: In DMA queues: 0
LocalPool: Full size is 1024, remaining: 1024
In socket queues: 0 Not queued: 0: In DMA queues: 0
value = 0 = 0x0
nidbg> ipc_normal_pools_detail 10
ipc_normal_pools_details
NormalPool: Full size is 256, remaining: 135
Socket ID = 0x19, dest slot = 2, remote addr = 0x3030002, ipc status = S
Task ID = 0x17fd170, PayLoad Len= 128, ipc priority = 0x1, data ptr = 0x162d108
next = 0x17ca60c, pFreeQ = 0x2fc7a8, data_offset = 0, free_list_num = 6
Socket ID = 0x19, dest slot = 2, remote addr = 0x3030002, ipc status = S
Task ID = 0x17fd170, PayLoad Len= 128, ipc priority = 0x1, data ptr = 0x1630108
next = 0x17c8bec, pFreeQ = 0x2fc7a8, data_offset = 0, free_list_num = 6
Socket ID = 0x19, dest slot = 2, remote addr = 0x3030002, ipc status = S
Task ID = 0x17fd170, PayLoad Len= 128, ipc priority = 0x1, data ptr = 0x1631908
next = 0x17c8c44, pFreeQ = 0x2fc7a8, data_offset = 0, free_list_num = 6
Socket ID = 0x19, dest slot = 2, remote addr = 0x3030002, ipc status = S
Task ID = 0x17fd170, PayLoad Len= 128, ipc priority = 0x1, data ptr = 0x1632108
next = 0x17caab0, pFreeQ = 0x2fc7a8, data_offset = 0, free_list_num = 6
Socket ID = 0x19, dest slot = 2, remote addr = 0x3030002, ipc status = S
Task ID = 0x17fd170, PayLoad Len= 128, ipc priority = 0x1, data ptr = 0x1632908
next = 0x17c98d0, pFreeQ = 0x2fc7a8, data_offset = 0, free_list_num = 6
Socket ID = 0x19, dest slot = 2, remote addr = 0x3030002, ipc status = S
Task ID = 0x17fd170, PayLoad Len= 128, ipc priority = 0x1, data ptr = 0x1633908
next = 0x17c9d1c, pFreeQ = 0x2fc7a8, data_offset = 0, free_list_num = 6
Socket ID = 0x19, dest slot = 2, remote addr = 0x3030002, ipc status = S
Task ID = 0x17fd170, PayLoad Len= 128, ipc priority = 0x1, data ptr = 0x1634108
next = 0x17c9e7c, pFreeQ = 0x2fc7a8, data_offset = 0, free_list_num = 6
Socket ID = 0x19, dest slot = 2, remote addr = 0x3030002, ipc status = S
Task ID = 0x17fd170, PayLoad Len= 128, ipc priority = 0x1, data ptr = 0x1634908
next = 0x17ca244, pFreeQ = 0x2fc7a8, data_offset = 0, free_list_num = 6
Socket ID = 0x19, dest slot = 2, remote addr = 0x3030002, ipc status = S
Task ID = 0x17fd170, PayLoad Len= 128, ipc priority = 0x1, data ptr = 0x1635908
next = 0x17ca1ec, pFreeQ = 0x2fc7a8, data_offset = 0, free_list_num = 6
Socket ID = 0x19, dest slot = 2, remote addr = 0x3030002, ipc status = S
Task ID = 0x17fd170, PayLoad Len= 128, ipc priority = 0x1, data ptr = 0x1636908
next = 0x0, pFreeQ = 0x2fc7a8, data_offset = 0, free_list_num = 6
In socket queues: 119 Not queued: 2: In DMA queues: 0
value = 10 = 0xa
LocalSocketID = 0x19, localidx = 0x19, Local_address = 0x100b0002
RemoteSocketID = 0x0, Remote_Address = 0x0
QnumBufs = 124, NumBufs = 4276, seqSent = 0, seqRecv = 0
USRnumBufs = 0, State = 0x2, OptionFlgs = 0x0, priority = 1
blk_timeout = 0, LingerTime = 0, RxQ_Full_Threshold = 65536, RxQ_Numbuf_Threshold = 128
congestion = 0, SockMask = 0x2000000, SockMsbs = 0x0, use_sw_buf = 0
remote_cong = 0, init_done = 0, sem_use = 0, alignmentSpace = 0
Task id = 0x15f7768 (taStp), LastTimeStamp = 0
recvErrs = 0, txCnt = 68, txErr = 0, eagainCnt = 0
xoffsent = 0, xonsent = 0, xoffrecv = 0, xonrecv = 0, congcount = 0
value = 25 = 0x19
nidbg> tt 0x15f7768
1e6ce0 vxTaskEntry +c : stp_task_entry (0, 0, 0, 0, 0, 0)
f22e8 stp_task_entry +80 : stpNISock_start (22bc00, 22bea0, 22bdc4, 3, 22bdf4, 3)
Multiple task traces of the task, together with the IPC pool output, should be taken. This process might have to be repeated on multiple NIs in order to find the cause of the problem and identify the NI responsible.

OmniSwitch 6624/6648 Example

Follow the steps below for an example of displaying IPC pool data on an OmniSwitch 6624/6648;r
1 Check the In socket queues and Not queued fields for all the pools and identify the pool that has the
highest value with the ipc_pools command as shown below:
Working: [Kernel]->ipc_pools
ipc_pools
UrgentPool: Full size is 1024, remaining: 1024
In socket queues: 0 Not queued: 0: In DMA queues: 0
ControlPool: Full size is 4096, remaining: 3451
In socket queues: 640 Not queued: 5: In DMA queues: 0
NormalPool: Full size is 1024, remaining: 377
In socket queues: 620 Not queued: 16: In DMA queues: 0
JumboPool: Full size is 256, remaining: 256
In socket queues: 0 Not queued: 0: In DMA queues: 0
LocalPool: Full size is 1024, remaining: 1024
In socket queues: 0 Not queued: 0: In DMA queues: 0
value = 1 = 0x1
2 Find the most repeated socket ID with the ipc_normal_pools_detail command as shown below:
Working: [Kernel]->ipc_pools_detail 1,0
NormalPool: Full size is 1024, remaining: 377
Socket ID = 0x7, dest slot = 1, remote addr = 0x60001, ipc status = G
Task ID = 0x756ba38, PayLoad Len= 20, ipc priority = 0x1, data ptr = 0x6cfcba0
next = 0x0, pFreeQ = 0x74fb4e0, data_offset = 0, free_list_num = 6
Socket ID = 0x100, dest slot = 90, remote addr = 0x50001, ipc status = S
Task ID = 0x7571700, PayLoad Len= 812, ipc priority = 0x1, data ptr = 0x6cfd3a0
next = 0x739fac0, pFreeQ = 0x74fb4e0, data_offset = 0, free_list_num = 6
Socket ID = 0x100, dest slot = 5, remote addr = 0x5400042, ipc status = S
Task ID = 0x7571700, PayLoad Len= 812, ipc priority = 0x1, data ptr = 0x6cfe3a0
next = 0x739b810, pFreeQ = 0x74fb4e0, data_offset = 0, free_list_num = 6
Socket ID = 0x100, dest slot = 65, remote addr = 0x8440041, ipc status = S
Task ID = 0x7571700, PayLoad Len= 812, ipc priority = 0x1, data ptr = 0x6cfeba0
next = 0x7396da4, pFreeQ = 0x74fb4e0, data_offset = 0, free_list_num = 6
Socket ID = 0x2, dest slot = 65, remote addr = 0x11b0001, ipc status = G
Task ID = 0x5514c80, PayLoad Len= 20, ipc priority = 0x1, data ptr = 0x6cffba0
next = 0x73a1cac, pFreeQ = 0x74fb4e0, data_offset = 0, free_list_num = 6
3 Obtain the task ID with the ipc_socket_info command. Use the most-repeated socket ID discovered in
Step 2.
Certified: [Kernel]->ipc_socket_info 0x100
ipc_socket_info Socket 100:
LocalSocketID = 0x100, localidx = 0x100, Local_address = 0x5450041
RemoteSocketID = 0x0, Remote_Address = 0x0
QnumBufs = 128, NumBufs = 193, seqSent = 0, seqRecv = 0
USRnumBufs = -65, State = 0x2, OptionFlgs = 0x0, priority = 1
blk_timeout = 0, LingerTime = 0, RxQ_Full_Threshold = 65536, RxQ_Numbuf_Threshold = 128
congestion = 0, SockMask = 0x200, SockMsbs = 0x5, use_sw_buf = 0
remote_cong = 0, init_done = 0, sem_use = 0, alignmentSpace = 0
Task id = 0x4e105c0 (WebView), LastTimeStamp = 1063601688
recvErrs = 0, txCnt = 0, txErr = 0, eagainCnt = 0
xoffsent = 0, xonsent = 0, xoffrecv = 0, xonrecv = 0, congcount = 0
value = 68 = 0x44 = 'D'
4 Dump the task ID discovered in Step 3 with the tt command as shown below:
Certified: [Kernel]->tt 0x4e105c0
Run this command 3–4 times.
On the primary switch in the stack you can execute the debugDisplayRcvDesc Dshell command to see the near-end of IPC health as shown below:
->dshell
Certified: [Kernel]-> debugDisplayRcvDesc
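Step 2 above (finding the most repeated socket ID) can be done offline once the ipc_pools_detail output has been saved to a text file. The following Python sketch is illustrative only: the function name most_repeated_socket is hypothetical, and the regular expression assumes the record layout shown in the example output.

```python
import re
from collections import Counter


def most_repeated_socket(detail_text):
    """Return (socket_id, count) for the Socket ID seen most often, or None."""
    # Each buffer record in the capture starts with "Socket ID = 0x...".
    ids = re.findall(r"Socket ID = (0x[0-9a-fA-F]+)", detail_text)
    if not ids:
        return None
    return Counter(ids).most_common(1)[0]


# Abbreviated records taken from the example output above.
capture = """\
Socket ID = 0x7, dest slot = 1, remote addr = 0x60001
Socket ID = 0x100, dest slot = 90, remote addr = 0x50001
Socket ID = 0x100, dest slot = 5, remote addr = 0x5400042
Socket ID = 0x100, dest slot = 65, remote addr = 0x8440041
Socket ID = 0x2, dest slot = 65, remote addr = 0x11b0001
"""
print(most_repeated_socket(capture))
```

For the abbreviated capture, socket 0x100 is reported with three records, which is the socket ID passed to ipc_socket_info in Step 3.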

Port Numbering Conversion Overview

The sections below document how to convert port number parameters.
Note. Dshell commands should only be used by Alcatel personnel or under the direction of Alcatel. Misuse or failure to follow procedures that use Dshell commands in this guide correctly can cause lengthy network down time and/or permanent damage to hardware.

ifindex to gport

To convert from ifindex to global port (gport) number use the findGlobalPortFromIfIndex Dshell command as shown below:
-> dshell
Working: [Kernel]->findGlobalPortFromIfIndex 16011
value = 505 = 0x1f9

gport to ifindex

To convert from global port (gport) to ifindex use the findIfIndexFromGlobalPort Dshell command as shown below:
-> dshell
Working: [Kernel]->findIfIndexFromGlobalPort 505
value = 16011 = 0x3e8b
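Both conversion commands report their result on a return line of the form "value = N = 0xH". As a hedged illustration of how to read that line from a saved capture, the Python sketch below extracts the decimal value and checks that it agrees with the hexadecimal field. The helper name dshell_value is hypothetical and is not part of the switch software.

```python
import re


def dshell_value(line):
    """Extract N from a Dshell return line such as 'value = 505 = 0x1f9'."""
    m = re.search(r"value = (\d+) = 0x([0-9a-fA-F]+)", line)
    if m is None:
        raise ValueError("no 'value = N = 0xH' field found")
    decimal, from_hex = int(m.group(1)), int(m.group(2), 16)
    # The two fields are the same number in different bases.
    if decimal != from_hex:
        raise ValueError("decimal and hex fields disagree")
    return decimal


# Return lines copied from the two examples above.
gport = dshell_value("value = 505 = 0x1f9")
ifindex = dshell_value("value = 16011 = 0x3e8b")
print(gport, ifindex)
```

Converting ifindex 16011 gives gport 505, and converting gport 505 gives ifindex 16011 again, so the two examples are consistent with each other.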

Converting from lport

The lport numbering process varies on each platform type (e.g., Falcon/Eagle or Hawk), as well as module type (e.g., ENI-C24, GNI-C2, GNI-U12, GNI-U8, GNI-C24, GNI-U24, etc.). To determine the lport value use two Dshell commands: dmpValidPorts and dmpAbsPort.
The following subsections describe conversions based on platform type. Note that, depending on the platform type, these commands are run from either Dshell or NiDebug. In addition, the input values for dmpAbsPort vary depending on platform type.
OmniSwitch 7700/7800/8800 (Falcon/Eagle) Example
The following displays all valid lport values with the dmpValidPorts command from NiDebug. Afterwards, you should do a dump for each slice.
1 Use the dmpValidPorts command as shown below:
8:0 nidbg> dmpValidPorts
8:0 valid lports: [ 0 ][ 1 ][ 2 ][ 3 ][ 4 ][ 5 ]
8:0
8:0 valid uports: [ 1 ][ 2 ][ 3 ][ 4 ][ 5 ][ 6 ]
2 Find the corresponding lport value from the uport value using dmpAbsPort command. Please note
that you must use the uport value for this command.
8:0 nidbg> dmpAbsPort 1
Note that 1 is the uport number. Output similar to the following will be displayed:
8:0
8:0 Valid 1
8:0 in LSM 0
8:0 ---------- Port Numbers ------------------
8:0 Slot 8
8:0 Slice 0
8:0 Mac 0
8:0 Bus 0
8:0 phy 0
8:0 gport 224
8:0 lport 0
8:0 iport 0
8:0 pport 0
8:0 uport 1
OmniSwitch 6624/6648 (Hawk) Example
Find all valid lport values with the dmpValidPorts command from Dshell on each element (i.e., each slot in a stack). Afterwards, you should do a dump for each slot.
1 Use the dmpValidPorts command as shown below:
Certified: [Kernel]->dmpValidPorts
valid lports: [ 0 ][ 1 ][ 2 ][ 3 ][ 4 ][ 5 ][ 6 ][ 7 ][ 8 ][ 9 ][ 10 ][ 11 ][ 12 ][ 13 ][ 14 ][ 15 ][ 16 ][ 17 ][ 18 ][ 19 ][ 20 ][ 21 ][ 22 ][ 23 ][ 24 ][ 25 ][ 26 ][ 27 ][ 32 ][ 33 ][ 34 ][ 35 ][ 36 ][ 37 ][ 38 ][ 39 ][ 40 ][ 41 ][ 42 ][ 43 ][ 44 ][ 45 ][ 46 ][ 47 ][ 48 ][ 49 ][ 50 ][ 51 ][ 52 ][ 53 ][ 54 ][ 55 ][ 58 ][ 59 ]
value = 1 = 0x1
2 Find the corresponding uport value from lport value using the dmpAbsPort command. Make sure you
use the lport value as the input value. This is different from Falcon/Eagle.
Certified: [Kernel]->dmpAbsPort 49
Note that 49 is the lport number. Output similar to the following will be displayed:
Valid 1
in LSM 0
portType 4
---------- Port Numbers ------------------
Slot 3
gport 177
lport 49
dport 17
pport 17
uport 42
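When many ports have to be converted, the dmpAbsPort output can be saved and the port numbers read back programmatically. The Python sketch below is a minimal illustration, not an Alcatel tool: parse_abs_port is a hypothetical helper, and the field names it looks for are simply those that appear in the two example outputs above.

```python
import re

# Field names as they appear in the Falcon/Eagle and Hawk examples above.
PORT_FIELDS = "Slot|Slice|Mac|Bus|phy|gport|lport|iport|dport|pport|uport"


def parse_abs_port(text):
    """Return {field_name: number} for every port field found in the capture."""
    pairs = re.findall(r"\b(%s)\s+(\d+)" % PORT_FIELDS, text)
    return {name: int(value) for name, value in pairs}


# Hawk (6624/6648) example: the input to dmpAbsPort was lport 49.
hawk = "Valid 1 in LSM 0 portType 4\nSlot 3 gport 177 lport 49 dport 17 pport 17 uport 42"
ports = parse_abs_port(hawk)
print("lport", ports["lport"], "is uport", ports["uport"], "on slot", ports["Slot"])
```

The same helper also accepts the Falcon/Eagle capture, because the "8:0" slot/slice prefix on each line is ignored by the pattern.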
2 Troubleshooting Switched Ethernet Connectivity
This chapter assumes you have already verified that the connectivity problem is across Ethernet media and that the connection between the non-communicating devices is switched/bridged, not routed (i.e., the devices are in the same IP subnet).
For assistance in designing and configuring switched Ethernet connectivity, refer to the “Configuring Ethernet Ports” chapter in the appropriate OmniSwitch Network Configuration Guide. For known specifications and limitations, refer to the appropriate Release Notes revision.

In This Chapter

“Overview of Troubleshooting Approach” on page 2-2
“Verify Physical Layer Connectivity” on page 2-3
“Verify Current Running Configuration” on page 2-5
“Verify Source Learning” on page 2-6
“Verify Switch Health” on page 2-7
“Verify ARP” on page 2-7
“Using the Log File” on page 2-8

Overview of Troubleshooting Approach

Verify physical layer connectivity.
Verify current running configuration is accurate.
Verify source learning.
Investigate any error conditions.
Verify health of NIs involved.
Verify health of CMM.
Diagram 1: Client A (port 5/1, IP = 192.168.10.2) and Client B (port 5/2, IP = 192.168.10.3) attached to an OmniSwitch 7800 in VLAN 7.

Verify Physical Layer Connectivity

Verify that there is a valid link light along the entire data path between the devices that cannot switch to each other. Make sure to include all interswitch links. Verify that the LEDs on all involved CMMs and NIs show solid OK1 and blinking OK2. If this is not the case, contact technical support.
Use the show interfaces command to verify that the operational status is Up and that speed and duplex are correct and match the other side of the connection. Run this command on the same interface multiple times to verify that the error counters (Error Frames, CRC Error Frames, Alignment Errors) are not incrementing. If the error counts are incrementing, verify the health of the cabling as well as the NIC involved. Note that an incrementing Collision Frames counter is normal for a half duplex connection. If the port is set to full duplex and collisions are still incrementing, verify the duplex setting on the other side of the connection. Finally, if these commands were run while the end stations were trying to ping each other, verify that Bytes Received is incrementing. If it is not, verify the NIC.
Note. Remember to do this for each port along the data path, not just the ports that are directly attached to the end stations.
-> show interfaces 5/1
Slot/Port 5/1 :
Operational Status : up,
Type : Fast Ethernet,
MAC address : 00:d0:95:7a:63:87,
BandWidth (Megabits) : 100,
Duplex : Full,
Long Accept : Enable,
Runt Accept : Disable,
Long Frame Size(Bytes) : 1553,
Runt Size(Bytes) : 64
Input :
Bytes Received : 14397,
Lost Frames : 0,
Unicast Frames : 6,
Broadcast Frames : 93,
Multicast Frames : 7,
UnderSize Frames : 0,
OverSize Frames : 0,
Collision Frames : 0,
Error Frames : 0,
CRC Error Frames : 0,
Alignments Error : 0
Output :
Bytes transmitted : 83244,
Lost Frames : 0,
Unicast Frames : 10,
Broadcast Frames : 84,
Multicast Frames : 1106,
UnderSize Frames : 0,
OverSize Frames : 0,
Collision Frames : 0,
Error Frames : 0
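The repeated-run check described above (are the input error counters incrementing between two runs of show interfaces?) can be automated on saved captures. The Python sketch below is a hedged example: the function names are hypothetical, and it assumes the "name : value" layout of the Input section shown in the sample output.

```python
import re

# Input error counters named in the text above.
ERROR_COUNTERS = ("Error Frames", "CRC Error Frames", "Alignments Error")


def input_counters(show_interfaces_text):
    """Return {counter_name: value} for the Input section of one capture."""
    section = show_interfaces_text.split("Input", 1)[-1].split("Output", 1)[0]
    found = re.findall(r"([A-Za-z ]+?)\s*:\s*(\d+)", section)
    return {name.strip(): int(value) for name, value in found}


def incrementing_errors(first_capture, second_capture):
    """List the error counters that grew between two captures of the same port."""
    before, after = input_counters(first_capture), input_counters(second_capture)
    return [n for n in ERROR_COUNTERS if after.get(n, 0) > before.get(n, 0)]


# Two abbreviated captures of the same port; the second one is made up
# for illustration and shows CRC errors appearing.
first = ("Input :\n Bytes Received : 14397, Lost Frames : 0, Error Frames : 0,"
         " CRC Error Frames : 0, Alignments Error : 0\nOutput :\n Error Frames : 0")
second = ("Input :\n Bytes Received : 20011, Lost Frames : 0, Error Frames : 12,"
          " CRC Error Frames : 12, Alignments Error : 0\nOutput :\n Error Frames : 0")
print(incrementing_errors(first, second))
```

An empty list means none of the three counters grew between the two runs; any name in the list points to the cabling or NIC checks described above.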
If the port reports operational status down, verify the physical link, but also verify the necessary NIs and CMM are receiving power and are up and operational. Use the show ni command followed by the slot number and the show cmm command to verify this.
-> show ni 5
Module in slot 5
Model Name: OS7-ENI-C24 ,
Description: 24PT 10/100 MOD,
Part Number: 902136-10,
Hardware Revision: A02,
Serial Number: 22030298,
Manufacture Date: MAY 18 2002,
Firmware Version: 6,
Admin Status: POWER ON,
Operational Status: UP,
Power Consumption: 44,
Power Control Checksum: 0x808,
MAC Address: 00:d0:95:7a:63:87,
ASIC - Physical: 0x1a01 0x0201 0x0201 0x001e 0x001e 0x001e
-> show cmm
Module in slot CMM-A-1
Model Name: OS7800-CMM ,
Description: BBUS Bridge,
Part Number: 901753-10,
Hardware Revision: 306,
Serial Number: 2153117A,
Manufacture Date: APR 11 2002,
Firmware Version: 38,
Admin Status: POWER ON,
Operational Status: UP,
Power Consumption: 85,
Power Control Checksum: 0x80e,
MAC Address: 00:d0:95:79:62:8a,
ASIC - Physical: 0x0801 0x0801 0x0801 0x0801 0x0801 0x0801 0x08
Module in slot CMM-A-2
Model Name: ,
Description: Processor,
Part Number: 901753-10,
Hardware Revision: 303,
Serial Number: 2133035A,
Manufacture Date: APR 11 2002,
Firmware Version: 38

Verify Current Running Configuration

If the physical layer looks OK, verify the configuration. Use the show configuration snapshot all command to display the current running configuration and to verify that the ports involved are in the correct VLAN. Also review the output to verify that nothing explicit in the configuration would cause the problem, such as a deny ACL, which would appear under the QoS subsection.
-> show configuration snapshot all
! Chassis :
system name OS7800
! Configuration:
! VLAN :
vlan 7 enable name "VLAN 7"
vlan 7 port default 5/1
vlan 7 port default 5/2
! 802.1Q :
! Spanning tree :
! Bridging :
! IPMS :
! AAA :
aaa authentication console "local"
! QOS :
qos apply
! Policy manager :
! Session manager :
! SNMP :
! IP route manager :
ip router router-id 127.0.0.1
ip router primary-address 127.0.0.1
! RIP :
! OSPF :
! BGP :
! IP multicast :
! Health monitor :
! Interface :
! Link Aggregate :
! Port mirroring :
! UDP Relay :
! Server load balance :
! System service :
! VRRP :
! Web :
! AMAP :
! GMAP :
! Module :
To further verify that the ports are in the correct VLAN and that they are in spanning tree forwarding rather than blocking, use the show vlan port command. Also note that the port type must match the device it connects to: if the port is 802.1Q tagged for the required VLAN, the attached device must also be 802.1Q tagged for that VLAN. Remember to run this command for all ports in the data path.
-> show vlan 7 port
  port     type       status
--------+---------+--------------
  5/1     default    forwarding
  5/2     default    forwarding
  5/9     qtagged    inactive
If ports that should be in forwarding are in blocking, or vice versa, please consult Chapter 4, “Troubleshooting Spanning Tree.”
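When the data path crosses many ports, the show vlan port output can be scanned for ports that are not forwarding. The following Python sketch is illustrative only; ports_not_forwarding is a hypothetical helper and it assumes the three-column (port, type, status) layout of the example above.

```python
import re


def ports_not_forwarding(show_vlan_port_text):
    """Return [(port, status)] for every listed port whose status is not 'forwarding'."""
    rows = re.findall(r"(\d+/\d+)\s+(\w+)\s+(\w+)", show_vlan_port_text)
    return [(port, status) for port, _type, status in rows if status != "forwarding"]


# Rows from the example output above.
capture = """\
-> show vlan 7 port
  port     type       status
--------+---------+--------------
  5/1     default    forwarding
  5/2     default    forwarding
  5/9     qtagged    inactive
"""
print(ports_not_forwarding(capture))
```

The helper only lists the ports and their status; whether a given status is expected (for example, an unused port being inactive) still has to be judged against the network design.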

Verify Source Learning

If the configuration looks correct, examine source learning. If connectivity exists but is slow or intermittent, source learning could be the root cause, since data packets would be flooded. However, if there is no packet throughput between the devices, the problem is unlikely to be a source learning problem.
To verify that MAC addresses are being learned correctly, use the show mac-address-table slot command. Verify that the correct MAC address is being learned on the correct port, in the correct VLAN.
-> show mac-address-table slot 5
Legend: Mac Address: * = address not valid
 Vlan    Mac Address          Type        Protocol    Operation    Interface
------+-------------------+--------------+-----------+------------+-----------
   7    00:00:39:73:13:0e    learned       10800       bridging     5/1
   7    00:b0:d0:75:f1:97    learned       10800       bridging     5/2
Total number of Valid MAC addresses above = 2
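The check described above (correct MAC address, correct port, correct VLAN) can be applied to a saved show mac-address-table capture. The Python sketch below is a minimal example under the assumption that each row has the six columns shown in the output above; learned_on is a hypothetical helper name.

```python
import re

# One table row: Vlan, Mac Address, Type, Protocol, Operation, Interface.
ROW = re.compile(
    r"(\d+)\s+((?:[0-9a-f]{2}:){5}[0-9a-f]{2})\s+(\w+)\s+(\w+)\s+(\w+)\s+(\d+/\d+)",
    re.IGNORECASE,
)


def learned_on(table_text, mac):
    """Return [(vlan, port)] for every row whose MAC address equals mac."""
    return [
        (int(vlan), port)
        for vlan, addr, _type, _proto, _oper, port in ROW.findall(table_text)
        if addr.lower() == mac.lower()
    ]


# Rows from the example output above.
table = """\
   7    00:00:39:73:13:0e    learned       10800       bridging     5/1
   7    00:b0:d0:75:f1:97    learned       10800       bridging     5/2
"""
print(learned_on(table, "00:00:39:73:13:0e"))
```

An empty result means the address was not learned on that slot; more than one entry means the address appears in several VLANs or on several ports.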

Verify Switch Health

If source learning does not appear to be working correctly, verify the health of the switch with the show health and show health slot commands. Be sure to run the latter command on all necessary NIs. Any variable that has reached or exceeded its limit value could cause forwarding problems on the switch; in this case, contact Technical Support. For more detailed source learning troubleshooting, see Chapter 3, “Troubleshooting Source Learning.”
-> show health
* - current value exceeds threshold
Device                             1 Min  1 Hr  1 Hr
Resources            Limit  Curr   Avg    Avg   Max
-----------------+-------+------+------+-----+----
Receive               80     00     00     00    00
Transmit/Receive      80     00     00     00    00
Memory                80     39     39     39    39
Cpu                   80     02     02     02    03
Temperature Cmm       50     39     39     39    39
Temperature Cmm Cpu   50     31     31     31    31
-> show health 5
* - current value exceeds threshold
Slot 05                            1 Min  1 Hr  1 Hr
Resources            Limit  Curr   Avg    Avg   Max
-----------------+-------+------+------+-----+----
Receive               80     00     00     00    01
Transmit/Receive      80     01     01     01    01
Memory                80     16     16     16    16
Cpu                   80     29     33     32    35

Verify ARP

If everything checked appears to be valid, verify that this is not an ARP problem. On the end stations involved, enter a static mac address for the device it is trying to communicate with. If connectivity is restored, please see Chapter 11, “Troubleshooting ARP.”

Using the Log File

If none of the above suggests why Ethernet switching is not working properly, check the log file for messages that may point to the cause. Use the show log swlog command to view the system log file, and look for evidence of a system or interface problem around the time the problem began.
-> show log swlog
Displaying file contents for ’swlog2.log’
FILEID: fileName[swlog2.log], endPtr[32]
configSize[64000], currentSize[64000], mode[2]
Displaying file contents for ’swlog1.log’
FILEID: fileName[swlog1.log], endPtr[48903]
configSize[64000], currentSize[64000], mode[1]
Time Stamp               Application    Level   Log Message
------------------------+--------------+-------+--------------------------------
THU DEC 12 08:13:51 2002 SYSTEM         info    Switch Logging device ’swlog1.lt
THU DEC 12 08:13:53 2002 SYSTEM         info    Switch Logging device ’swlog2.lt
THU DEC 12 08:13:56 2002 SYSTEM         info    Switch Logging device ’/dev/cont
THU DEC 12 08:13:56 2002 CSM-CHASSIS    info    == CSM == start up
THU DEC 12 08:13:56 2002 CSM-CHASSIS    info    == CSM == Activating a new vers
THU DEC 12 08:13:56 2002 CSM-CHASSIS    info    == CSM == The working version i
THU DEC 12 08:13:56 2002 CSM-CHASSIS    info    == CSM == MONITORING ON
THU DEC 12 08:13:56 2002 CSM-CHASSIS    info    == CSM == This CMM is primary
After following the CLI troubleshooting steps for physical connection, configuration validation, system health, and source learning, use the following additional Dshell commands to troubleshoot connectivity problems:

Checking the 7700/7800 Nantucket Fabric

Certified: [Kernel]->nanListB04
No SOP Interrupt: 0
Multicast FIFO Full Interrupt: 0
Multicast Buffer Full Interrupt: 0
Unicast Buffer Full Interrupt: 0
Multicast Dump Interrupt: 0
Unicast Dump Interrupt: 0
Unicast Attempt Count: 8a620
Multicast Attempt Count: acecf
Unicast In Count: 8a627
Multicast In Count: acecf
Unicast Out Count: 8a634
Multicast Out Count: 3600e
Dummy Count: 61578
Total FLength Count: 0
value = 0 = 0x0
Certified: [Kernel]->
The Total FLength Count value should be 0 or a small value; a large value indicates that frames are backing up in the fabric queue.
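The Total FLength Count check can be scripted against a saved nanListB04 capture. The Python sketch below makes two assumptions that the guide does not state: the counters are printed in hexadecimal (suggested by values such as acecf), and "small" is represented by a hypothetical small_limit parameter whose default of 16 is arbitrary. The function names are also hypothetical.

```python
import re


def fabric_counters(nan_list_text):
    """Return {counter_name: value}, reading each value as hexadecimal (assumption)."""
    counters = {}
    for line in nan_list_text.splitlines():
        m = re.match(r"\s*([A-Za-z ]+):\s*([0-9a-fA-F]+)\s*$", line)
        if m:
            counters[m.group(1).strip()] = int(m.group(2), 16)
    return counters


def fabric_backing_up(nan_list_text, small_limit=16):
    """True if Total FLength Count exceeds small_limit (an arbitrary, hypothetical threshold)."""
    return fabric_counters(nan_list_text).get("Total FLength Count", 0) > small_limit


# Lines taken from the example output above.
capture = """\
Certified: [Kernel]->nanListB04
Unicast Attempt Count: 8a620
Multicast Attempt Count: acecf
Total FLength Count: 0
value = 0 = 0x0
"""
print(fabric_counters(capture)["Total FLength Count"], fabric_backing_up(capture))
```

Choose small_limit from experience with a healthy switch of the same type; the guide itself only says that the value should be 0 or small.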

Checking the 7700/7800 Nantucket Fabric for Interrupts, Data Counts and Error Counts

Working: [Kernel]->nanListB02
HB Out of Sync Interrupts: 0
Error Count Exceeded Interrupts: 0
Framing Error Interrupts: 0
Parity Error Interrupts: 0
B02 Data Port 0 Frame Count = 690dbd37
B02 Data Port 1 Frame Count = 0
B02 Data Port 2 Frame Count = 542e70d9
B02 Data Port 3 Frame Count = 0
B02 Data Port 4 Frame Count = 0
B02 Data Port 5 Frame Count = 0
B02 Data Port 6 Frame Count = 0
B02 Data Port 7 Frame Count = 9e75d47
B02 Data Port 8 Frame Count = 690dbd39
B02 Data Port 9 Frame Count = 0
B02 Data Port 10 Frame Count = 542e70d9
B02 Data Port 11 Frame Count = 0
B02 Data Port 12 Frame Count = 0
B02 Data Port 13 Frame Count = 0
B02 Data Port 14 Frame Count = 0
B02 Data Port 15 Frame Count = 9e75d47

Checking the Traffic Queue on the NI

Working: [Kernel]->FindBuffer 3,0 => where 3 is the slot number
Queue = 0x62 length = 0x40, Address 0x6881880
Queue = 0x63 length = 0x40, Address 0x68818c0
value = 3 = 0x3
The above capture shows one of the queues is backed up on the NI. Check if the queue is sending traffic using the following command syntax:
esmDumpCoronado slot,slice,address,bytes
Working: [Kernel]->esmDumpCoronado 3,0,0x6881880,20
6881880 : 90 0 2f906d3 0 0 0 40 0
68818a0 : 40 0 0 0 d8d0620 0 0 0
68818c0 : 10090 0 2f906d3 0
value = 3 = 0x3
Working: [Kernel]->esmDumpCoronado 3,0,0x6881880,20
6881880 : 90 0 2f906d3 0 0 0 40 0
68818a0 : 40 0 0 0 d8d0620 0 0 0
68818c0 : 10090 0 2f906d3 0
value = 3 = 0x3
The above capture shows the queue is stuck and not moving.
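The conclusion above rests on the two esmDumpCoronado captures being identical. As a hedged illustration, the Python sketch below compares two saved dumps word by word. The function names are hypothetical, and it assumes each dump line has the "address : words" layout shown in the example.

```python
import re


def parse_dump(dump_text):
    """Return {address: [words]} for every 'address : words...' line in a dump."""
    rows = {}
    for line in dump_text.splitlines():
        m = re.match(r"\s*([0-9a-fA-F]+)\s*:\s*(.+)$", line)
        if m:
            rows[m.group(1)] = m.group(2).split()
    return rows


def queue_unchanged(first_dump, second_dump):
    """True if both dumps contain data and every word is identical."""
    first = parse_dump(first_dump)
    return bool(first) and first == parse_dump(second_dump)


# The dump from the example above, as captured on both runs.
dump = """\
6881880 : 90 0 2f906d3 0 0 0 40 0
68818a0 : 40 0 0 0 d8d0620 0 0 0
68818c0 : 10090 0 2f906d3 0
value = 3 = 0x3
"""
print(queue_unchanged(dump, dump))
```

Identical dumps taken some time apart match the stuck-queue case described above; any difference between them suggests that the queue contents are still changing.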

Check for Catalina (MAC) or Port Lockup

Lab-Span1 > dshell
Working: [Kernel]->getNiResetCount
Slot 1, ASICResetCnt_p addr 0x2c3ee0
Slot 2, ASICResetCnt_p addr 0x2c3ee0
ENI HALF Duplex Reset count addr 0x2c3f60
phy 0: 0 0 0 0 0 0 0 0
phy 1: 0 0 0 0 0 0 0 0
PHY FIFO LOCKUP Reset count addr 0x2c3fc0
phy 0: 0 0 0 0 0 0 0 0
phy 1: 0 0 0 0 0 0 0 0
value = 0 = 0x0
Working: [Kernel]->
3 Troubleshooting Source Learning
To troubleshoot Source Learning problems, a basic understanding of the process is required. Review the “Managing Source Learning” chapter in the appropriate OmniSwitch Network Configuration Guide before proceeding. The following RFC and IEEE standards are supported:
RFCs supported: 2674 - Definitions of Managed Objects for Bridges with Traffic Classes, Multicast Filtering and Virtual LAN Extensions
IEEE Standards supported: 802.1Q - Virtual Bridged Local Area Networks; 802.1D - Media Access Control Bridges

In This Chapter

“Introduction” on page 3-2
“Troubleshooting a Source Learning Problem” on page 3-3
“Advanced Troubleshooting” on page 3-5
“Dshell Troubleshooting” on page 3-7

Introduction

Source Learning Example: an OmniSwitch 7800 with two workstations in VLAN 114. One is on port 8/23 (IP: 10.40.114.50, MAC: 00-C0-4f-12-F7-1B) and the other is on port 16/16 (IP: 10.40.114.100, MAC: 00-10-A4-B5-B5-38).
When a packet first arrives on an NI, source learning examines the packet and tries to classify it into the correct VLAN. If the port is statically defined in a VLAN, the MAC address is classified in the default VLAN. Otherwise, if Group Mobility is being used, the MAC address is classified into the correct VLAN based on the rules defined.
As soon as the MAC address is classified in a VLAN, an entry is made in Source Address Pseudo-CAM associating the MAC address with the VLAN ID and the Source Port. This Source Address is then relayed to the CMM for management purposes.
If an entry already exists in the MAC address database with the same VLAN ID and the same source port number, no new entry is made. If the VLAN ID or the source port differs from the existing entry in the MAC address database, the previous entry is aged out and a new entry is made. This process of adding a MAC address to the MAC address database is known as Source Learning.
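The update rule just described can be summarized in a few lines of Python. This is a simplified model for illustration only, not the switch implementation: the table is keyed by MAC address alone, so it ignores protocol-based classification into multiple VLANs, and the function name learn and its return strings are hypothetical.

```python
def learn(mac_table, mac, vlan, port):
    """Apply the learning rule to mac_table, a dict mapping MAC -> (vlan, port)."""
    entry = mac_table.get(mac)
    if entry == (vlan, port):
        # Same VLAN ID and same source port: no new entry is made.
        return "unchanged"
    # No entry yet, or the VLAN ID or port differs: the old entry (if any)
    # is dropped and a new one is recorded.
    mac_table[mac] = (vlan, port)
    return "learned" if entry is None else "replaced"


table = {}
print(learn(table, "00:c0:4f:12:f7:1b", 114, "8/23"))   # first packet from the station
print(learn(table, "00:c0:4f:12:f7:1b", 114, "8/23"))   # same VLAN and port again
print(learn(table, "00:c0:4f:12:f7:1b", 114, "8/24"))   # station moved to another port
```

The three calls return "learned", "unchanged", and "replaced" in turn, mirroring the three cases in the paragraph above.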
A MAC address can be prevented from being learned on a port by policies configured through QoS or Learned Port Security. A MAC address may also be learned in the wrong VLAN because of the policies defined for the port.
Note: This document does not discuss the basic operation of Source Learning. To learn how Source Learning works, refer to the “Managing Source Learning” chapter in the appropriate OmniSwitch Network Configuration Guide.

Troubleshooting a Source Learning Problem

To troubleshoot a source learning problem, first verify that the physical link is up and that the port has correctly auto-negotiated with the end station.
Next, verify that the port is a member of the right VLAN (if the port is statically configured for a VLAN) or that the Group Mobility policies are correctly defined. The workstation configuration should also be verified.
Then check the MAC address table to verify that the MAC address is being learned:
-> show mac-address-table
 Vlan    Mac Address          Type        Protocol    Operation    Interface
------+-------------------+--------------+-----------+------------+-----------
 105    00:00:5e:00:01:69    learned       10800       bridging     4/2
 105    00:d0:95:6b:4c:d8    learned       10800       bridging     4/2
 105    00:d0:95:79:62:eb    learned       10806       bridging     4/2
 150    00:d0:95:6b:4c:e7    learned       10800       bridging     4/2
   1    00:d0:95:79:65:ea    learned       10800       bridging     6/1
 108    00:d0:95:6b:4c:db    learned       10800       bridging     6/1
 110    00:d0:95:6b:4c:dd    learned       10800       bridging     7/1
 114    00:c0:4f:12:f7:1b    learned       10800       bridging     8/23
 112    00:d0:95:6b:4c:df    learned       10800       bridging     9/2
 112    00:d0:95:79:65:10    learned       10806       bridging     9/2
  50    00:00:5e:00:01:32    learned       10800       bridging     11/1
  50    00:d0:95:83:e7:81    learned       10800       bridging     11/1
  51    00:00:5e:00:01:33    learned       10800       bridging     11/1
  51    00:d0:95:83:e7:82    learned       10800       bridging     11/1
  52    00:00:5e:00:01:34    learned       10800       bridging     11/1
  52    00:d0:95:83:e7:83    learned       10800       bridging     11/1
  53    00:00:5e:00:01:35    learned       10800       bridging     11/1
  53    00:d0:95:83:e7:84    learned       10800       bridging     11/1
  54    00:00:5e:00:01:36    learned       10800       bridging     11/1
  54    00:d0:95:83:e7:85    learned       10800       bridging     11/1
  55    00:00:5e:00:01:37    learned       10800       bridging     11/1
  55    00:d0:95:83:e7:86    learned       10800       bridging     11/1
  56    00:00:5e:00:01:38    learned       10800       bridging     11/1
  56    00:d0:95:83:e7:87    learned       10800       bridging     11/1
  57    00:00:5e:00:01:39    learned       10800       bridging     11/1
  57    00:d0:95:83:e7:88    learned       10800       bridging     11/1
  58    00:00:5e:00:01:3a    learned       10800       bridging     11/1
  58    00:d0:95:83:e7:89    learned       10800       bridging     11/1
  59    00:00:5e:00:01:3b    learned       10800       bridging     11/1
  59    00:d0:95:83:e7:8a    learned       10800       bridging     11/1
  60    00:00:5e:00:01:3c    learned       10800       bridging     11/1
  60    00:d0:95:83:e7:8b    learned       10800       bridging     11/1
  61    00:00:5e:00:01:3d    learned       10800       bridging     11/1
  61    00:d0:95:83:e7:8c    learned       10800       bridging     11/1
  62    00:00:5e:00:01:3e    learned       10800       bridging     11/1
  62    00:d0:95:83:e7:8d    learned       10800       bridging     11/1
 114    00:10:a4:b5:b5:38    learned       10806       bridging     16/16
Total number of Valid MAC addresses above = 37
The above command shows all the MAC addresses learned by the switch.
In order to narrow down to a specific NI the following command can be used (any valid slot number can be specified):
-> show mac-address-table slot 8
Legend: Mac Address: * = address not valid
 Vlan    Mac Address          Type        Protocol    Operation    Interface
------+-------------------+--------------+-----------+------------+-----------
 114    00:c0:4f:12:f7:1b    learned       10800       bridging     8/23
Total number of Valid MAC addresses above = 1
This shows that the MAC address 00:c0:4f:12:f7:1b is learned on port 8/23 (see the figure on page 3-2), so the source learning process for this workstation has completed successfully.
A single MAC address can be a member of multiple VLANs based on different protocols. To verify that the MAC address has been learned in all of the VLANs, use the above command. The protocol field differs according to the protocol being used and classified into each VLAN.
MAC addresses can also be viewed based on VLAN ID, using the following command:
->show mac-address-table 114
Legend: Mac Address: * = address not valid
 Vlan    Mac Address          Type        Protocol    Operation    Interface
------+-------------------+--------------+-----------+------------+-----------
 114    00:c0:4f:12:f7:1b    learned       10800       bridging     8/23
 114    00:10:a4:b5:b5:38    learned       10806       bridging     16/16
Total number of Valid MAC addresses above = 2
The above command shows the two workstations learned in VLAN 114 on NIs 8 and 16.
Whether the packet is Layer 2 or Layer 3, the first step is to have the source MAC address learned in the MAC address table. Layer 3 also involves ARP resolution (for more details, see the ARP troubleshooting chapter) and the available routes to the destination (for more details, see the routing troubleshooting chapter).
By default the MAC address aging time is set to 300 seconds. This can be viewed:
->show mac-address-table aging-time
Mac Address Aging Time (seconds) for Vlan 1 = 300
Mac Address Aging Time (seconds) for Vlan 114 = 300
This can be changed using the command:
->mac-address-table aging-time 500
Mac Address Aging Time (seconds) for Vlan 1 = 500
Mac Address Aging Time (seconds) for Vlan 114 = 500
This can also be changed on a particular VLAN:
->mac-address-table aging-time 600 vlan 114
It may be necessary to raise the aging timer to prevent silent devices from aging out.
Another way to accommodate silent devices is to assign a permanent (static) MAC address to a port using the command:
->mac-address-table permanent 00:10:a4:b5:b5:38 16/16 114
Once the MAC addresses are learned on the ports, the devices should be able to communicate, depending on the upper layers. Variations of the MAC-related commands are described in the “Managing Source Learning” chapter of the appropriate OmniSwitch Network Configuration Guide.

Advanced Troubleshooting

Advanced troubleshooting of source learning problems starts by checking whether traffic is arriving on a port while the NI fails to learn the MAC address, assuming that learning is not being prevented by other rules.
->debug ip packet board ni 8 start
R 8/23 00c04f12f71b->00d0957962c4 IP 10.40.114.50->10.40.114.2 ICMP 8,0 seq=58460.
8 S 8/23 00d0957962c4->00c04f12f71b IP 10.40.114.2->10.40.114.50 ICMP 0,0 seq=58460.
8 R 8/23 00c04f12f71b->00d0957962c4 IP 10.40.114.50->10.40.114.2 ICMP 8,0 seq=58716.
8 S 8/23 00d0957962c4->00c04f12f71b IP 10.40.114.2->10.40.114.50 ICMP 0,0 seq=58716.
8 R 8/23 00c04f12f71b->00d0957962c4 IP 10.40.114.50->10.40.114.2 ICMP 8,0 seq=58972.
8 S 8/23 00d0957962c4->00c04f12f71b IP 10.40.114.2->10.40.114.50 ICMP 0,0 seq=58972.
8 R 8/23 00c04f12f71b->00d0957962c4 IP 10.40.114.50->10.40.114.2 ICMP 8,0 seq=59228.
8 S 8/23 00d0957962c4->00c04f12f71b IP 10.40.114.2->10.40.114.50 ICMP 0,0 seq=59228.
->debug ip packet stop
This command shows that the packets are coming into the switch and a reply is being sent by the switch to the end station.
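To confirm from a saved debug capture that every incoming ping was answered, the request and reply lines can be paired by sequence number. The Python sketch below is illustrative only. It assumes that R marks a packet received by the switch and S a packet sent by it, as the explanation above implies, and it relies on the standard ICMP type values (8 for echo request, 0 for echo reply). The helper name unanswered_echoes is hypothetical.

```python
import re

# One debug line: direction, port, MACs, IP addresses, ICMP type,code and sequence.
PACKET = re.compile(
    r"\b([RS]) (\d+/\d+) [0-9a-f]{12}->[0-9a-f]{12} "
    r"IP ([\d.]+)->([\d.]+) ICMP (\d+),(\d+) seq=(\d+)"
)


def unanswered_echoes(debug_text):
    """Return the sequence numbers of received echo requests with no echo reply sent."""
    requests, replies = set(), set()
    for direction, _port, _src, _dst, icmp_type, _code, seq in PACKET.findall(debug_text):
        if direction == "R" and icmp_type == "8":
            requests.add(int(seq))
        elif direction == "S" and icmp_type == "0":
            replies.add(int(seq))
    return sorted(requests - replies)


# Three lines from the capture above; the reply for seq=58716 is left out on purpose.
capture = """\
R 8/23 00c04f12f71b->00d0957962c4 IP 10.40.114.50->10.40.114.2 ICMP 8,0 seq=58460.
8 S 8/23 00d0957962c4->00c04f12f71b IP 10.40.114.2->10.40.114.50 ICMP 0,0 seq=58460.
8 R 8/23 00c04f12f71b->00d0957962c4 IP 10.40.114.50->10.40.114.2 ICMP 8,0 seq=58716.
"""
print(unanswered_echoes(capture))
```

An empty list means every echo request seen on the NI was answered by the switch, which is the healthy case shown in the capture above.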
Various combinations of debug ip packet options can be used to examine incoming traffic. The syntax is as follows:
debug ip packet [start] [timeout seconds] [stop] [direction {in | out | all}] [format {header | text | all}] [output {screen | switchlog}] [board {cmm | ni [1-16] | all | none}] [ether-type {arp | ip | hex [hex] | all}] [ip-address ip_address] [ip-pair [ip1] [ip2]] [protocol {tcp | udp | icmp | igmp | num [integer] | all}] [show-broadcast {on | off}] [show-multicast {on | off}]
start              Starts an IP packet debug session.
timeout            Sets the duration of the debug session, in seconds. To specify a duration for the debug session, enter timeout, then enter the session length.
seconds            The debug session length, in seconds.
stop               Stops IP packet debug session.
direction in       Debugs incoming packets.
direction out      Debugs outgoing packets.
direction all      Debugs both incoming and outgoing packets.
format header      Debugs the packet header.
format text        Debugs the packet text.
format all         Debugs the entire packet.
output screen      Output will appear on screen.
output switchlog   Output will be saved to a log file.
board cmm          Debugs CMM packets.
board ni           Debugs packets for a Network Interface (NI). To debug a specific interface, enter ni, then enter the slot number of the NI.
board all          Debugs packets for all CMMs and NIs on the switch.
board none         Clears the previous board settings.
If the problems are associated with source learning on a specific NI, the limits on the number of learned MAC addresses should also be considered. The current limits are:
Number of learned MAC addresses per network interface (NI) module: 32K
Number of learned MAC addresses per switch: 64K
The total number of MAC addresses learned per switch can be viewed using the command:
-> show mac-address-table count
Mac Address Table Count:
Permanent Address Count = 0,
DeleteOnReset Address Count = 0,
DeleteOnTimeout Address Count = 0,
Dynamic Learned Address Count = 36,
Total MAC Address In Use = 36
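The count can be compared against the per-switch limit with a small offline script. The Python sketch below is illustrative only: it assumes that 64K means 64 x 1024 addresses and that the output has been saved as text. The function names are hypothetical.

```python
import re

SWITCH_MAC_LIMIT = 64 * 1024  # assumption: "64K" per switch means 65536 addresses


def parse_mac_counts(output):
    """Map each 'name = number' field of the command output to an integer."""
    pairs = re.findall(r"([A-Za-z ]+?)\s*=\s*(\d+)", output)
    return {name.strip(): int(value) for name, value in pairs}


def mac_table_headroom(output):
    """Return how many more MAC addresses fit before the per-switch limit."""
    counts = parse_mac_counts(output)
    return SWITCH_MAC_LIMIT - counts["Total MAC Address In Use"]


SAMPLE = """\
Mac Address Table Count:
Permanent Address Count = 0,
DeleteOnReset Address Count = 0,
DeleteOnTimeout Address Count = 0,
Dynamic Learned Address Count = 36,
Total MAC Address In Use = 36
"""

print(mac_table_headroom(SAMPLE))
```

A headroom value close to zero suggests that the table limit, and not a learning fault, may explain why new addresses are not being learned.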
If the problem is still not resolved, contact Technical Support for further troubleshooting.

Dshell Troubleshooting

The OmniSwitch 6/7/8XXX has a distributed architecture. Source learning is specific to an NI. Each NI has a Layer 2 pseudo-CAM that can hold 64K entries. 32K entries are reserved in the L2SA table for L2 source addresses that are local to that NI, and the remaining 32K entries are reserved in the L2DA table for L2 destination addresses, which can be from the local NI or a remote NI.
Note. Dshell commands should only be used by Alcatel personnel or under the direction of Alcatel. Misuse or failure to follow procedures that use Dshell commands in this guide correctly can cause lengthy network down time and/or permanent damage to hardware.
If a problem is specific to an NI and the MAC address is not being learned by the switch, the first step is to check the pseudo-CAM of that NI to verify whether the MAC address has been learned. It is possible that the NI has learned the MAC address but the CMM is not reporting it because IPC messages were lost between the CMM and the NI.
The commands available to troubleshoot this problem are:
slcDumpL2SA: Display all the SA PseudoCAM entries on one slot/slice.
Format: slcDumpL2SA slot_num, slice_num
slcDumpL2DA: Display all the Destination Address (DA) PseudoCAM entries on one slot/slice.
Format: slcDumpL2DA slot_num, slice_num
slcLkupL2SA: Display the SA PCAM entries matching a (MAC, VLAN) tuple on a slot/slice. The high 4 bytes of the MAC address are macHi and the other 2 bytes are macLo. If the VLAN is not significant, use the value 0.
Format: slcLkupL2SA slot_num, slice_num, macHi, macLo, vlanId
slcLkupL2DA: Display the DA PCAM entries matching a (MAC, VLAN) tuple on a slot/slice. The high 4 bytes of the MAC address are macHi and the other 2 bytes are macLo. If the VLAN is not significant, use the value 0.
Format: slcLkupL2DA slot_num, slice_num, macHi, macLo, vlanId
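Splitting a MAC address into macHi and macLo by hand is error prone. The Python sketch below does the split as described above (high 4 bytes, low 2 bytes) and builds a lookup command string. It is a sketch under assumptions: the helper names are hypothetical, and passing the values as 0x-prefixed hex literals is assumed to be acceptable to Dshell, as it is in the other Dshell examples in this guide.

```python
def split_mac(mac):
    """Split a MAC address into (macHi, macLo): high 4 bytes and low 2 bytes."""
    digits = mac.replace(":", "").replace("-", "")
    if len(digits) != 12:
        raise ValueError("expected a 6-byte MAC address, got %r" % mac)
    value = int(digits, 16)
    mac_hi = value >> 16      # top 32 bits
    mac_lo = value & 0xFFFF   # bottom 16 bits
    return mac_hi, mac_lo


def lookup_command(table, slot, slice_num, mac, vlan_id=0):
    """Build a slcLkupL2SA / slcLkupL2DA command line; table is 'L2SA' or 'L2DA'."""
    if table not in ("L2SA", "L2DA"):
        raise ValueError("table must be 'L2SA' or 'L2DA'")
    mac_hi, mac_lo = split_mac(mac)
    return "slcLkup%s %d,%d,0x%08x,0x%04x,%d" % (
        table, slot, slice_num, mac_hi, mac_lo, vlan_id)


print(lookup_command("L2SA", 8, 0, "00:c0:4f:12:f7:1b", 114))
```

For the address 00:c0:4f:12:f7:1b this yields macHi 0x00c04f12 and macLo 0xf71b.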
If device A, connected on slot 8, is unable to communicate with device B on slot 16, the following steps can be taken to verify the configuration on the NIs.
First look at the source MAC on slot 8 using the command:
Working: [Kernel]->slcDumpL2SA 8,0
Index   Mac Address         Vlan   GlobalPort  4-words content
-------+-------------------+------+-----------+-------------------------------------
0x371b  00:c0:4f:12:f7:1b   114    250         007200c0 4f12f71b 00000000 0000003a
Total L2 SA entry amount = 1
Look at the source MAC on slot 16:
Working: [Kernel]->slcDumpL2SA 16,0
Index   Mac Address         Vlan   GlobalPort  4-words content
-------+-------------------+------+-----------+-------------------------------------
0x3538  00:10:a4:b5:b5:38   114    499         00720010 a4b5b538 00000000 000001f3
Total L2 SA entry amount = 1
Both of the MAC addresses are learned in the correct VLANs on the right NI.
If device A is trying to communicate with device B, the next thing to check is the destination MAC address table, to verify that it has an entry for device B.
Working: [Kernel]->slcDumpL2DA 8,0
Index   Mac Address         Vlan   GlobalPort  4-words content
-------+-------------------+------+-----------+-------------------------------------
0x3004  00:20:da:00:70:04   1      0           00010020 da007004 c0004000 00024000
0x3538  00:10:a4:b5:b5:38   114    499         00720010 a4b5b538 00180000 1f057f3
The entry for the destination device does show up.
Similarly, for bidirectional traffic, the entry for device A should show up on slot 16.
Working: [Kernel]->slcDumpL2DA 16,0
Index   Mac Address         Vlan   GlobalPort  4-words content
-------+-------------------+------+-----------+-------------------------------------
0x3004  00:20:da:00:70:04   1      0           00010020 da007004 c0004000 00024000
0x371b  00:c0:4f:12:f7:1b   114    250         007200c0 4f12f71b 00180000 1f05b3a
So, the two devices should be able to communicate.
The L2SA and L2DA tables are different for each slot. The L2SA table is based on the MAC addresses learned on that slot. It is not synchronized to the other modules; only the CMM knows about it. When a request comes in from device A for device B, a lookup is first done on the local L2SA and L2DA tables to see if there is a matching entry. If there is no matching entry, a request is sent on the BBUS to all the other Coronados. If any Coronado has a matching entry in its L2SA table, it responds with the global port number of that entry. The L2DA table is then updated on the originating Coronado and the packet is forwarded to the global port to reach the destination.
If no other Coronado responds to the request, the packet is sent over the flood queue to all the other Coronados to be flooded out of the ports in the same VLAN. If a device responds to the flooded request, the L2SA table for that NI is updated and the global port number is sent to the originating device using the same lookup, because the response will be a unicast packet.
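The lookup order described above can be summarized with a toy model. The Python sketch below is only an illustration of the decision sequence (local tables, then remote L2SA tables, then flood). It does not model the BBUS, the queues, or the real table layout, and the class and function names are hypothetical.

```python
class Coronado:
    """Toy model of one NI: two tables keyed by (mac, vlan) -> global port."""

    def __init__(self, slot):
        self.slot = slot
        self.l2sa = {}  # source addresses learned locally on this NI
        self.l2da = {}  # destination addresses, local or remote


def resolve(origin, others, mac, vlan):
    """Return ('forward', gport) or ('flood', None) for a destination lookup."""
    key = (mac, vlan)
    # Step 1: look in the local L2SA and L2DA tables.
    for table in (origin.l2sa, origin.l2da):
        if key in table:
            return ("forward", table[key])
    # Step 2: ask the other Coronados; a match in a remote L2SA table
    # answers with its global port, which is cached in the local L2DA table.
    for ni in others:
        if key in ni.l2sa:
            origin.l2da[key] = ni.l2sa[key]
            return ("forward", ni.l2sa[key])
    # Step 3: nobody knows the address, so the packet is flooded in the VLAN.
    return ("flood", None)


slot8, slot16 = Coronado(8), Coronado(16)
slot8.l2sa[("00:c0:4f:12:f7:1b", 114)] = 250   # device A, from the sample output
slot16.l2sa[("00:10:a4:b5:b5:38", 114)] = 499  # device B, from the sample output

print(resolve(slot8, [slot16], "00:10:a4:b5:b5:38", 114))
print(resolve(slot8, [slot16], "00:00:00:00:00:01", 114))
```

After the first call, the L2DA table of the slot 8 model contains the entry for device B, which mirrors the slcDumpL2DA 8,0 output shown earlier.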
To see Source learning in action on an NI, set the debug level higher (levels are 1-6):
-> Sl_NiDebug=4
To see Source Learning in action on a CMM, set the debug level higher (levels are 1-6):
-> Sl_CmmDebug=5
To view the messages on the console, disable systrace:
-> Sl_no_systrace=1
The following is a sample output:
Working: [Kernel]->Sl_no_systrace=1
Sl_no_systrace = 0x56402f4: value = 1 = 0x1
Working: [Kernel]->nidbg
3:0 nidbg> Sl_NiDebug=4
3:0 Sl_NiDebug = 0x2d1fc4: value = 4 = 0x4
3:0 nidbg>
3:0
3:0 ----------------------------- HRE PACKET HRADER -----------------------
3:0 isIPMS = 0, isSAMatched = 0, isDAMatched = 0, isMcst = 1, qId = 49, isRouted = 0, isTagged = 0, isFlood = 1, protocol = 0, sPort = 64
3:0 payLoadLength = 66, isLocked = 0, lockId = 0
3:0 isFBMsg = 0, isIPCMsg = 0, isSTPfrm = 0, isPrtTagged = 0, sVlanId = 21, reQId = 2, mcVlanId = 21
3:0 conditionCodes = 0x180, daMac = 0x00005e000115
3:0 saMac = 0x006008:91bb72, tagType = 0x8100, taginfo = 15, ethType = 800
3:0 ------------------------------ HRE PACKET HEADER END -----------------------
3:0
3:0 sln_salrn: gport = 64, vlanId = 21
3:0 SA 00:60:08:91:bb:72 successfully added to SA CAM

OS-6600

On the OS-6600, use the slcDumpSlotSlice command in Dshell to display which slot/slice is considered to be up and operational by the source learning software:
Certified: [Kernel]->slcDumpSlotSlice
Source Learning Slice Up List:
slot/slice 2/0, type = 838930434, firstgport = 64, lastgport = 123
value = 68 = 0x44 = ’D’
To look at the forwarding database on OS-6600 in Dshell use the dumpL2 command:
Certified: [Kernel]->dumpL2
Addr#  VID   Addr               DN  PN  Age      AVID
-----------------------------------------------
00000  0001  00:01:02:03:00:00  00  30  STATIC   xxxx
00001  0001  00:10:a4:f5:89:e2  03  00  DYNAMIC  xxxx
00002  0002  00:00:5e:00:01:02  02  26  DYNAMIC  xxxx
00003  0002  00:d0:95:84:07:1e  02  26  STATIC   xxxx
00004  0003  00:00:5e:00:01:03  02  26  DYNAMIC  xxxx
00005  0003  00:d0:95:84:07:1e  02  26  STATIC   xxxx
00006  0004  00:d0:95:84:07:1e  02  26  STATIC   xxxx
00007  0320  00:d0:95:84:3c:ce  02  01  DYNAMIC  xxxx
00008  0333  00:d0:95:84:3c:ce  02  13  DYNAMIC  xxxx
00009  0334  00:d0:95:82:12:ef  02  08  STATIC   xxxx
00010  0334  00:d0:95:84:3c:ce  02  08  DYNAMIC  xxxx
00011  0336  00:d0:95:79:64:ab  03  24  STATIC   xxxx
00012  0340  00:d0:95:84:3c:ce  03  10  DYNAMIC  xxxx
00013  0451  00:d0:95:84:3c:ce  03  11  DYNAMIC  xxxx
00014  0999  00:00:5e:00:01:02  02  00  STATIC   xxxx
00015  0999  00:00:c0:e0:29:e6  02  00  DYNAMIC  xxxx
00016  0999  00:20:da:0a:54:10  02  00  STATIC   xxxx
00017  0999  00:20:da:6c:20:4c  02  00  STATIC   xxxx
00018  0999  00:90:27:17:f7:eb  02  00  STATIC   xxxx
00019  0999  00:a0:24:d2:3f:cb  02  00  STATIC   xxxx
Do you want to printf more addresses 0 -> No 1 -> Yes a -> all 1
Addr#  VID   Addr               DN  PN  Age      AVID
-----------------------------------------------
00020  0999  00:b0:d0:77:3e:3d  02  00  STATIC   xxxx
00021  0999  00:d0:95:2a:02:4c  02  00  STATIC   xxxx
00022  0999  00:d0:95:6a:84:51  02  00  STATIC   xxxx
00023  0999  00:d0:95:84:3b:a0  02  00  DYNAMIC  xxxx
00024  0999  00:d0:95:84:3d:90  02  00  DYNAMIC  xxxx
00025  0999  00:d0:95:88:a7:28  02  00  STATIC   xxxx
00026  0999  08:00:20:87:44:61  02  00  STATIC   xxxx
No more addr in Master DB.
L2 Physical Pool Stats:
                  Total   Used   Free
DstSwp Tables     16384      0  16384
NetID Tables      16384      0  16384
Protocol Tables    2046      1   2045
ASIC Rsrc Wraps    2048     26   2022
value = 294 = 0x126
The output fields are described below:
output definitions
Addr# The index.
VID The VLAN ID.
Addr The MAC address learned.
DN The device number (stack number).
PN The port number.
Age The MAC address type, which can be Dynamic or Static.
AVID The Authenticated VLAN ID.
DstSwp Tables The entry for Next Hop info.
NetID Tables Contains transmit enables, prepend information, and address based VLAN information.
To see Source learning in action, set the debug level higher (levels are 1-6):
SlnDebugLevel=1
The following is a sample output:
Certified: [Kernel]->SlnDebugLevel=1
SlnDebugLevel = 0x65c8af8: value = 1 = 0x1
=============== Start of CPU Unresolved Packet ===============
TxFlags = 0x2017, BufSize = 64, DiffservCodePoint = 0x0, CpuCode = 0x20, PrtclCode = 0x1f, RxPNum = 10
PrepRxDevNum = 1, PrepRxPNum = 10, DstUnrCode = 0x1f, SrcUnrCode = 0x0, PacketRamAddr = 0x68228
DstMacAddr16_48 = 0x3d9f8000, DstMacAddr0_15 = 0xe639
SrcMacAddr32_47 = 0x8000, SrcMacAddr0_31 = 0x180a539f
IPPayLoadOffset = 38, EnetType = 0x800, TagPriority = 1, TagVID = 3072
DstIPAddr = 0xc0a80b1b, SrcIPAddr = 0xc0a80b06
SrcIPSkt = 0x7f80, DstIPSkt = 0x7d00
hslnProcessL2Packet(258): vlanid = 0, gport = 42.
hsln_core_adrlrn_handler: Get the packet from Q-Dispatcher...
======================= address pktPtr = 0x63e255c
queue_port_id = 0x402a
length = 60
lock = 0
packet_info = 0x0
ccode = 0x80
da = 00:80:9f:3d:50:b3
sa = 00:80:9f:53:0a:18
=== End of E_FRAME_PARAMS ===
4 Troubleshooting Spanning Tree
To troubleshoot spanning tree problems, an understanding of the protocol and its features is needed. The OmniSwitch supports two Spanning Tree Algorithms: 802.1D (standard) and 802.1w (rapid reconfiguration). In addition, the OmniSwitch supports two Spanning Tree operating modes: flat (single STP instance per switch) and 1x1 (single STP instance per VLAN).
Spanning Tree Protocol is defined in the IEEE 802.1D standard.
The 802.1w amendment to that standard, Rapid Reconfiguration of Spanning Tree, improves upon STP by providing rapid reconfiguration capability via Rapid Spanning Tree Protocol (RSTP).
For configuration assistance, please read “Configuring Spanning Tree Parameters” in the appropriate OmniSwitch Network Configuration Guide.

In This Chapter

“Introduction” on page 4-1
“Troubleshooting Spanning Tree” on page 4-2
“Dshell” on page 4-5
“Generic Troubleshooting in Dshell” on page 4-10
“CMM Spanning Tree Traces” on page 4-25

Introduction

The primary purpose for spanning tree is to allow for physical redundancy in a bridged network, while assuring the absence of data loops. The protocol allows for dynamic fail-over as well.
One of the most important steps in troubleshooting an STP problem is to be prepared before it happens. It is essential to have a network diagram that depicts both the physical (cables) and logical (VLANs) configurations. It is also very useful to know which ports are normally blocking or forwarding before any problem occurs.

Troubleshooting Spanning Tree

A failure of the Spanning Tree Protocol (STP) will usually cause either a bridge loop on the LAN or constant reconvergence of STP. This in turn can cause several resultant problems.
If there is a bridge loop on the LAN, there can appear to be a broadcast storm, since broadcast packets will continuously loop the network. In addition, unicast traffic can be affected, because the port on which a unicast address is learned can toggle from one port to another in a very short time period.
If STP is constantly reconverging, this can cause temporary network outages as ports go through the 30 seconds of listening and learning defined by 802.1D. If the reconvergence never stops, the LAN could be perpetually down.
In determining the cause of the STP problem, it is useful to first verify the configuration, especially if the network having problems has recently been installed.
Use the show spantree command to verify that STP is enabled and that both sides of the link are running the same STP protocol.
-> show spantree
Vlan  STP Status  Protocol  Priority
-----+----------+--------+--------
1     ON          802.1D    32768
10    ON          802.1D    32768
Use the show spantree command and specify a VLAN to verify the correct mode, designated root ID, root port, and configurable timers. The timers need to be consistent across a physical link running STP. Also very useful to note in this command are Topology Changes and Topology age. If topology changes are incrementing quickly, the LAN cannot agree on which bridge is root. This can be caused by dropped BPDUs (which will be discussed later), a bridge that insists it is root regardless of received BPDUs, or a physical link going in and out of service.
-> show spantree 10
Spanning Tree Parameters for Vlan 10
Spanning Tree Status : ON,
Protocol : IEEE 802.1D,
mode : 1X1 (1 STP per Vlan),
Priority : 32768 (0x8000),
Bridge ID : 8000-00:d0:95:79:62:8a,
Designated Root : 8000-00:d0:95:79:62:8a,
Cost to Root Bridge : 0,
Root Port : None,
Next Best Root Cost : 0,
Next Best Root Port : None,
Hold Time : 1,
Topology Changes : 0,
Topology age : 0:0:0
Current Parameters (seconds)
Max Age = 20, Forward Delay = 15, Hello Time = 2
Parameters system uses when attempting to become root
System Max Age = 20, System Forward Delay = 15, System Hello Time = 2
Use the show spantree ports command to determine whether a port is forwarding or blocking and whether it is in the correct VLAN. Remember that in any LAN with physical redundancy there must be at least one port in blocking status. If it is known which ports are usually blocking, those ports are a good place to start: verify that they are still in blocking status.
-> show spantree ports
Vlan  Port  Oper Status  Path Cost  Role
-----+-----+------------+---------+-----
10    5/10  FORW         100        DESG
If ports that should be in blocking are now in forwarding, there are two likely causes. The first is that there was a physical failure in a link that was previously in forwarding. The second is that the BPDUs from the root are being dropped. If it appears that BPDUs are being dropped, troubleshoot this as if it were any other packet being dropped.
Use the show interfaces command to look for errors incrementing on the port as well as to verify duplex settings match on either side of the link.
-> show interfaces 5/10
Slot/Port 5/10 :
Operational Status : up,
Type : Fast Ethernet,
MAC address : 00:d0:95:7a:63:90,
BandWidth (Megabits) : 10, Duplex : Half,
Long Accept : Enable, Runt Accept : Disable,
Long Frame Size(Bytes) : 1553, Runt Size(Bytes) : 64
Input :
Bytes Received : 765702, Lost Frames : 0,
Unicast Frames : 2317, Broadcast Frames : 3855,
Multicast Frames : 480, UnderSize Frames : 0,
OverSize Frames : 0, Collision Frames : 0,
Error Frames : 0, CRC Error Frames : 0,
Alignments Error : 0
Output :
Bytes transmitted : 566131, Lost Frames : 0,
Unicast Frames : 2153, Broadcast Frames : 8,
Multicast Frames : 5931, UnderSize Frames : 0,
OverSize Frames : 0, Collision Frames : 0,
Error Frames : 0
Since STP is run in a distributed fashion it is important to verify that each NI that is involved is not having a resource problem. Use the show health command to verify the resources available on an NI.
-> show health 5
* - current value exceeds threshold
Slot 05                            1 Min  1 Hr  1 Hr
Resources          Limit   Curr    Avg    Avg   Max
-----------------+-------+------+------+-----+----
Receive            80      01      01     01    01
Transmit/Receive   80      01      01     01    01
Memory             80      39      39     39    39
Cpu                80      26      29     28    30
If the problem has been determined to be a Layer 2 data loop and network connectivity must be restored quickly, it is recommended to disable all redundant links, either administratively or by disconnecting cables.

Dshell

As mentioned previously, it is important to verify the health of the NI as well as the CMM. Please refer to Chapter 1, “Troubleshooting the Switch System,” for directions.
Note. Dshell commands should only be used by Alcatel personnel or under the direction of Alcatel. Misuse or failure to follow procedures that use Dshell commands in this guide correctly can cause lengthy network down time and/or permanent damage to hardware.
The commands run above to verify STP configuration on a particular port give the CMM perspective. Since STP runs on the NI, it is important to query the NI to verify what was seen from the CMM. To verify a port's forwarding status, use the esmDumpCoronado slot,slice,0x6608000+vlan_id*4,32 command. This indicates whether the port, as the NI sees it, is forwarding or blocking. The 32 in the command displays 32 register values, one per VLAN, starting from the vlan_id specified. If the vlan_id used is 1, the command displays the values for VLAN 1 through VLAN 32. The bits are dedicated to the ports in the order listed below, starting from the least significant bit. A bit that is set (value=1) indicates that the port is forwarding for that VLAN. If the bit is 0, the port is blocking for that VLAN.
Please note that the examples in this section assume the following bit assignments:
Bits 1-12: First 12 Ethernet ports.
Bit 13: First Gigabit port.
Bits 14, 15, 16: Not used.
Bits 17-28: Second 12 Ethernet ports.
Bit 29: Second Gigabit port.
Port 1/1 is a member of VLANs 1,140,141,150, and 511.
-> show vlan port 1/1
vlan      type      status
--------+---------+--------------
1         default   forwarding
140       qtagged   forwarding
141       qtagged   forwarding
150       qtagged   forwarding
511       qtagged   forwarding
-> dshell
Working: [Kernel]->esmDumpCoronado 1,0,0x6608000+1*4,32
6608004 : 1000 0 0 0 0 0 0 0
6608024 : 0 0 0 0 0 0 0 0
6608044 : 0 0 0 0 0 0 0 0
6608064 : 0 0 0 0 0 0 0 0
value = 1 = 0x1
Working: [Kernel]->esmDumpCoronado 1,0,0x6608000+140*4,32
6608230 : 1000 1000 0 0 0 0 0 0
6608250 : 0 0 1000 0 0 0 0 0
6608270 : 0 0 0 0 0 0 0 0
6608290 : 0 0 0 0 0 0 0 0
value = 1 = 0x1
Working: [Kernel]->esmDumpCoronado 1,0,0x6608000+511*4,32
66087fc : 1000 0 0 0 0 0 0 0
660881c : 0 0 0 0 0 0 0 0
660883c : 0 0 0 0 0 0 0 0
660885c : 0 0 0 0 0 0 0 0
value = 1 = 0x1
The above commands show that the spanning tree vector is set for Gigabit port 1/1 for VLANs 1, 140, 141, 150, and 511.
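The register addresses in the dumps follow directly from the 0x6608000+vlan_id*4 expression. The Python sketch below reproduces that arithmetic and maps the dumped values back to VLAN IDs. It is an offline aid only, the function names are hypothetical, and the value list is transcribed from the VLAN 140 dump above.

```python
STP_VECTOR_BASE = 0x6608000  # base address taken from the examples in this section
BYTES_PER_VLAN = 4           # one 32-bit register per VLAN


def vlan_register_address(vlan_id):
    """Address of the spanning tree vector register for a VLAN."""
    return STP_VECTOR_BASE + vlan_id * BYTES_PER_VLAN


def nonzero_vlans(first_vlan, values):
    """Map consecutive dumped register values to VLAN IDs, keeping non-zero ones."""
    return {first_vlan + i: v for i, v in enumerate(values) if v}


# The 32 values of the dump that starts at VLAN 140 (8 values per output row).
dump_140 = [0x1000, 0x1000, 0, 0, 0, 0, 0, 0,
            0, 0, 0x1000, 0, 0, 0, 0, 0] + [0] * 16

for vlan in (1, 140, 511):
    print(vlan, hex(vlan_register_address(vlan)))
print(sorted(nonzero_vlans(140, dump_140)))
```

The computed addresses are 0x6608004, 0x6608230, and 0x66087fc, matching the first address of each dump, and the non-zero registers of the second dump correspond to VLANs 140, 141, and 150.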
Now consider the following:
-> show vlan port 9/1
vlan      type      status
--------+---------+--------------
1         default   forwarding
-> show vlan port 9/2
vlan      type      status
--------+---------+--------------
1         default   forwarding
-> show vlan port 9/24
vlan      type      status
--------+---------+--------------
2         default   forwarding
-> show vlan 3 port
port      type      status
--------+---------+--------------
9/11      default   forwarding
9/12      default   forwarding
-> dshell
Working: [Kernel]->esmDumpCoronado 1,0,0x6608000+1*4,32
66087fc : 203 8000000 c00 0 0 0 0 0
660881c : 0 0 0 0 0 0 0 0
660883c : 0 0 0 0 0 0 0 0
660885c : 0 0 0 0 0 0 0 0
value = 1 = 0x1
Binary: 0010 0000 0011
For VLAN 1 the register value is hex 203, which is equivalent to binary 0010 0000 0011. Bits 1 and 2 are set, indicating that ports 1 and 2 have the spanning tree vector set for VLAN 1; bit 10 is also set in this value. The next register value is for VLAN 2; its hex value is 8000000.
Binary: 1000 0000 0000 0000 0000 0000 0000
The binary value indicates that bit 28 is set, which means that port 24 is set for VLAN 2. The next register value is for VLAN 3. Its hex value is c00.
Binary: 1100 0000 0000
Bits 11 and 12 are set, indicating that the spanning tree vector has been set for ports 11 and 12. These ports are forwarding.
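Converting register values to bit positions by hand is easy to get wrong. The Python sketch below lists the set bits of a register value (bit 1 being the least significant bit) and translates each bit into a port using the assignments assumed for the examples in this section. The function names are hypothetical, and the mapping applies only to a module laid out as in those assumptions.

```python
def set_bits(value):
    """Return the 1-based positions of the bits set in a 32-bit register value."""
    return [bit for bit in range(1, 33) if value & (1 << (bit - 1))]


def bit_to_port(bit):
    """Translate a bit position into a port, per the assumptions in this section."""
    if 1 <= bit <= 12:
        return "Ethernet port %d" % bit
    if bit == 13:
        return "Gigabit port 1"
    if 14 <= bit <= 16:
        return "not used"
    if 17 <= bit <= 28:
        return "Ethernet port %d" % (bit - 4)  # bits 17-28 map to ports 13-24
    if bit == 29:
        return "Gigabit port 2"
    return "unspecified"


for register in (0x1000, 0x8000000, 0xC00):
    print(hex(register), [(b, bit_to_port(b)) for b in set_bits(register)])
```

With these inputs, 0x1000 decodes to bit 13 (the first Gigabit port), 0x8000000 to bit 28 (Ethernet port 24), and 0xc00 to bits 11 and 12.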
When each NI boots up, it sends a message to every other NI indicating that it is up and running. This message is critical for setting up the port queues to transfer data, as well as for Spanning Tree. If the IPC message from a particular NI is lost, the other NIs will not see that NI as being part of the spanning tree domain. This may result in a split spanning tree, leading to a Layer 2 loop. This kind of scenario might happen in the case of hot swaps.
To verify that each NI knows about every other NI, use the following command in the NI Debugger. It should be run on all NIs that participate in STP.
Working: [Kernel]->NiDebug
1:0 nidbg> stpNISock_boardupprint
1:0
1:0 STP boards up :
1:0 board in slot : 2 slice : 0 is up
1:0 board in slot : 4 slice : 0 is up
1:0 board in slot : 5 slice : 0 is up
1:0 board in slot : 6 slice : 0 is up
1:0 board in slot : 7 slice : 0 is up
1:0 board in slot : 8 slice : 0 is up
1:0 board in slot : 9 slice : 0 is up
1:0 board in slot : 10 slice : 0 is up
1:0 board in slot : 11 slice : 0 is up
1:0 board in slot : 12 slice : 0 is up
1:0 board in slot : 13 slice : 0 is up
1:0 board in slot : 14 slice : 0 is up
1:0 board in slot : 16 slice : 0 is up
1:0 value = 0 = 0x0
This command shows all the other slots, but not the slot on which it is run.
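When the command has to be run on many NIs, the comparison against the installed slots can be automated offline. The Python sketch below is one possible approach. The list of expected slots must come from the actual chassis inventory, and the function names are hypothetical.

```python
import re

BOARD_RE = re.compile(r"board in slot\s*:\s*(\d+)\s+slice\s*:\s*\d+\s+is up")


def boards_up(output):
    """Return the set of slots reported as up in stpNISock_boardupprint output."""
    return {int(slot) for slot in BOARD_RE.findall(output)}


def missing_boards(output, local_slot, expected_slots):
    """Slots that should be visible from local_slot but are not reported as up."""
    return sorted(set(expected_slots) - {local_slot} - boards_up(output))


# Shortened sample as seen from slot 1; slot 3 is assumed installed here
# purely to demonstrate a missing entry.
SAMPLE = """\
1:0 STP boards up :
1:0 board in slot : 2 slice : 0 is up
1:0 board in slot : 4 slice : 0 is up
1:0 board in slot : 5 slice : 0 is up
"""

print(missing_boards(SAMPLE, local_slot=1, expected_slots=[1, 2, 3, 4, 5]))
```

Any slot returned by missing_boards is an installed NI that the local NI does not consider part of the spanning tree domain.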
To look at all the BPDUs being received and transmitted on a particular slot and slice, the following command can be used in NiDebug. It displays BPDUs, as well as notifications when there is a topology change, in real time.
1:0 nidbg> stp_printf_flag=1
1:0 *** stpkern_bpduIn stp_id=511 portid=c type=2
1:0 PIM port c state 4 1024 0
1:0 Message age of received BPDU : 0
1:0 PIM port c state 5 1024 0
1:0 recordProposed operPointToPointMAC=1
1:0 PIM port c state 7 1536 0
1:0 PIM port c state 4 1536 0
1:0 port 12 is forward (5)
1:0 tick (tack) time is now 701603
1:0
1:0 RSTBPDU transmitted on port 33 on STP 57
1:0 Root bridge ID = 3200d0 95820514
1:0 Path to Root cost = 3
1:0 Designated bridge ID = 800000d0 957962aa
1:0 Designated portId = 29697
1:0 Bridge portId = 29697
1:0 Message age : 256
1:0 Proposing
1:0
1:0 RSTBPDU transmitted on port 33 on STP 51
1:0 Root bridge ID = 3200d0 95820514
1:0 Path to Root cost = 3
1:0 Designated bridge ID = 800000d0 957962aa
1:0 Designated portId = 29697
1:0 Bridge portId = 29697
1:0 Message age : 256
1:0 Proposing
1:0 tick (tack) time is now 701628
1:0 tick (tack) time is now 701634
1:0 tick (tack) time is now 701635
1:0
1:0 RSTBPDU transmitted on port 33 on STP 60
1:0 Root bridge ID = 3200d0 95820514
1:0 Path to Root cost = 3
1:0 Designated bridge ID = 800000d0 957962aa
1:0 Designated portId = 29697
1:0 Bridge portId = 29697
1:0 Message age : 256
1:0 Proposing
1:0 tick (tack) time is now 701636
1:0
1:0 RSTBPDU transmitted on port 12 on STP 140
1:0 Root bridge ID = c800d0 957962aa
1:0 Path to Root cost = 0
1:0 Designated bridge ID = c800d0 957962aa
1:0 Designated portId = 29196
1:0 Bridge portId = 29196
1:0 Message age : 0
1:0 tick (tack) time is now 701637
1:0
1:0 RSTBPDU transmitted on port 33 on STP 52
1:0 Root bridge ID = 3200d0 95820514
1:0 Path to Root cost = 3
1:0 Designated bridge ID = 800000d0 957962aa
1:0 Designated portId = 29697
1:0 Bridge portId = 29697
1:0 Message age : 256
1:0 Proposing
1:0 RSTBPDU transmitted on port 33 on STP 61
1:0 Root bridge ID = 3200d0 95820514
1:0 Path to Root cost = 3
1:0 Designated bridge ID = 800000d0 957962aa
1:0 Designated portId = 29697
1:0 Bridge portId = 29697
1:0 Message age : 256
1:0 Proposing
1:0 tick (tack) time is now 701647
1:0 tick (tack) time is now 701648
1:0
1:0 RSTBPDU transmitted on port 33 on STP 53
1:0 Root bridge ID = 3200d0 95820514
1:0 Path to Root cost = 3
1:0 Designated bridge ID = 800000d0 957962aa
1:0 Designated portId = 29697
1:0 Bridge portId = 29697
1:0 Message age : 256
1:0 Proposing

Generic Troubleshooting in Dshell

Note. Dshell commands should only be used by Alcatel personnel or under the direction of Alcatel. Misuse or failure to follow procedures that use Dshell commands in this guide correctly can cause lengthy network down time and/or permanent damage to hardware.
The stp_help command (executed from the NiDebug Dshell command prompt) displays the trace menu for the Spanning Tree algorithm on NIs.
-> dshell
Working: [Kernel]->NiDebug
NiDebug>>stp_help
stpNISock_globals : Global variables
stpNISock_warningprint : warning trace
stpNISock_totraceprint : time-out trace
stpNISock_traceprint : event trace
stpNISock_intraceprint : inter-NI trace
stpNISock_boardupprint : boards up
stpNISock_printon : activates STP Socket Handler printf
stpNISock_printoff : desactivates STP Socket Handler printf
stpni_printStaFied : status field description trace
stpni_debugPport : Physical Port editing trace
stpni_debugLport : Logical Port editing trace
stpni_debugport : Physical & Logical Port editing trace
stpni_traceprint : event and warning trace
stpni_printon : activates STP NI printf
stpni_printoff : desactivates STP NI printf
These NI spanning tree trace utilities are described in the subsections that follow.

Event Trace (stpni_traceprint)

This trace includes the events received and generated by the Spanning Tree and the warnings detected while processing an event. A warning entry contains the name of the C source file and a line number. The explanation of the warning can be given by Engineering.
Each event trace entry is built as follows:
An ASCII pattern reflecting the event.
Up to 4 parameters (a -1 (or 0xffffffff) indicates that the parameter is not significant).
The following is an example of the stpni_traceprint command printout:
Nidebug>> stpni_traceprint
64 - PVLANBLK (1,1000000,18,ffffffff)
65 - PORTATCH (19,1,ffffffff,ffffffff)
66 - PVLANBLK (1,2000000,19,ffffffff)
67 - PORTATCH (1a,1,ffffffff,ffffffff)
68 - PVLANBLK (1,4000000,1a,ffffffff)
69 - PORTATCH (1b,1,ffffffff,ffffffff)
70 - PVLANBLK (1,8000000,1b,ffffffff)
71 - PORTATCH (1900001,1,ffffffff,ffffffff)
72 - PORTDELE (1,ffffffff,ffffffff,ffffffff)
73 - PORTATCH (1,1,ffffffff,ffffffff)
74 - PORTDELE (2,ffffffff,ffffffff,ffffffff)
75 - PORTATCH (2,1,ffffffff,ffffffff)
76 - LINK_UP (1,64,1,ffffffff)
77 - LINK_UP (2,64,1,ffffffff)
78 - LINK_UP (14,64,1,ffffffff)
79 - LINKDOWN (1,ffffffff,ffffffff,ffffffff)
80 - LINKDOWN (2,ffffffff,ffffffff,ffffffff)
81 - LINK_UP (1,64,1,ffffffff)
82 - LINK_UP (2,64,1,ffffffff)
83 - AGGR_UP (1,120,2e,ffffffff)
84 - Warning File:stpni_bpduEvt.c line:744
85 - PORTJOIN (1,121,ffffffff,ffffffff)
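A saved trace can be filtered offline, for example to follow one port through its attach, link up, and link down events. The Python sketch below parses event entries and marks parameters that are not significant. It assumes that the parameters are printed in hexadecimal (the sample contains values such as 1a and ffffffff), skips warning entries, and uses hypothetical names throughout.

```python
import re

# <index> - <EVENT> (<p1>,<p2>,<p3>,<p4>); warning entries do not match.
EVENT_RE = re.compile(r"(\d+)\s*-\s*([A-Z_]+)\s*\(([0-9a-fA-F,]*)\)")
NOT_SIGNIFICANT = 0xFFFFFFFF  # printed as ffffffff, i.e. -1


def parse_events(trace):
    """Return a list of (index, event name, parameters); None marks unused slots."""
    events = []
    for index, name, params in EVENT_RE.findall(trace):
        values = [int(p, 16) for p in params.split(",") if p]  # assumption: hex
        values = [None if v == NOT_SIGNIFICANT else v for v in values]
        events.append((int(index), name, values))
    return events


SAMPLE = """\
71 - PORTATCH (1900001,1,ffffffff,ffffffff)
76 - LINK_UP (1,64,1,ffffffff)
84 - Warning File:stpni_bpduEvt.c line:744
85 - PORTJOIN (1,121,ffffffff,ffffffff)
"""

for event in parse_events(SAMPLE):
    print(event)
```

The first value of each parsed event can then be used as a filter key, since for most events described below the first parameter is the global port or aggregator identifier.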
Event names displayed by the stpni_traceprint command are described in the subsections that follow.
PORTATCH
This corresponds to a port attached event received from the Spanning Tree CMM. The Spanning Tree CMM generates this event when it receives a Port attach indication from the Port Manager.
The parameters are:
First parameter: Global port identifier.
Second parameter: Default VLAN associated to the port.
PORTDELE
This corresponds to a port detach event received from the Spanning Tree CMM. The Spanning Tree CMM generates this event when it receives a Port detach indication from the Port Manager or when there is a change in the port type (e.g., transition from aggregable to fixed, or mobile to fixed).
First parameter: Global port identifier.
ADDVLAN
This event is generated by the Spanning Tree CMM when it receives a VLAN added event from the VLAN Manager. The Spanning Tree CMM sends this event to all the NIs that are up and running.
The parameters are:
First parameter: The VLAN identifier.
Second parameter: The Spanning Tree type. A 1 indicates Flat Spanning Tree while a 2 indicates 1x1 Spanning Tree.
Third parameter: The VLAN administrative state. A 1 indicates Enable while a 2 indicates Disable.
Fourth parameter: The Spanning Tree administrative state. A 1 indicates Enable while a 2 indicates Disable.
MODVLADM
This event is sent by the Spanning Tree CMM to the NIs when the administrative state of a VLAN is changed (event generated by the VLAN Manager to the Spanning Tree CMM).
The parameters are:
First parameter: The VLAN identifier.
Second parameter: The VLAN administrative state. A 1 indicates Enable while a 2 indicates Disable.
MODVLSTP
This event is sent by the Spanning Tree CMM to the NIs when the Spanning Tree state of a VLAN is changed (event generated by the VLAN Manager to the Spanning Tree CMM).
The parameters are:
First parameter: The VLAN identifier.
Second parameter: The new Spanning Tree state. A 1 indicates Enable while a 2 indicates Disable.
Note. When the Spanning Tree state is Disable, all the ports (Up) are moved to the forwarding state and are removed from the Spanning Tree scope.
ADDQTAG
This event is sent by the Spanning Tree CMM to the NI when a tag is added to a port belonging to that NI. This event is generated on the CMM by the 802.1Q application.
The parameters are:
First parameter: Global port identifier.
Second parameter: The 802.1Q tag.
Note. This event is processed by the Spanning Tree NI as a port attach event.
DELQTAG
This event is sent by the Spanning Tree CMM to the NI when a tag is removed from a port belonging to that NI. This event is generated on the CMM by the 802.1Q application.
The single parameter is:
First parameter: Global port identifier.
Note. This event is processed by the Spanning Tree NI as a port attach event.
MDEFVLAN
This event is sent by the Spanning Tree CMM to the NI when the default VLAN of a fixed or q-tagged port is changed (this also applies to logical ports). This event is generated on the CMM by the VLAN Manager application.
The parameters are:
First parameter: Global port identifier.
Second parameter: new default VLAN.
PORTAGGR
This event is currently unused.
PORTDISG
This event is currently unused.
AGGR_UP
This event is sent by Link Aggregation NI when it detects that an aggregator comes up. It could be either a static aggregator (OmniChannel) or a dynamic aggregator (802.3ad). This message is generated only when the first port joins the aggregator.
The parameters are:
First parameter: The aggregator identifier (logical port ID value between 0 and 31).
Second parameter: The global port identifier of the physical port that has joined the aggregator.
Third parameter: The output QID to be used by the Spanning Tree (not significant).
Note. The output QID is no longer used by the Spanning Tree because, at the time Link Aggregation asks for the default queue associated with the physical port, Qdriver might not be ready to provide it. Link Aggregation still provides this parameter, even though it is no longer significant.
AGGRDOWN
This event is sent by the Link Aggregation NI when it detects that an aggregator has gone down. The aggregator can be either static (OmniChannel) or dynamic (802.3ad). This message is generated when the last port has left the aggregator.
The single parameter is:
First parameter: The aggregator identifier (logical port ID value between 0 and 31).
PORTJOIN
This event is sent by the Link Aggregation NI when a physical port joins an aggregator. The aggregator can be either static (OmniChannel) or dynamic (802.3ad). This message is generated only after the first port has joined the aggregator (see “AGGR_UP” on page 4-13).
The parameters are:
First parameter: The aggregator identifier (logical port ID value between 0 and 31).
Second parameter: The global port identifier of the physical port that has joined the aggregator.
PORTLEAV
This event is sent by the Link Aggregation NI when a physical port leaves an aggregator. The aggregator can be either static (OmniChannel) or dynamic (802.3ad). This message is generated after the first port has joined the aggregator (see “AGGR_UP” on page 4-13). Link Aggregation provides the aggregator identifier, the global port identifier of the port that is leaving, and the global port identifier of the new primary port.
The parameters are:
First parameter: The aggregator identifier (logical port ID value between 0 and 31).
Second parameter: The global port identifier of the physical port that is leaving the aggregator.
Third parameter: The global port identifier of the physical port that will have the primary port role.
Fourth parameter: The output QID of the new primary port (not significant; see the note under “AGGR_UP” on page 4-13).
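The four aggregator events above (AGGR_UP, PORTJOIN, PORTLEAV, AGGRDOWN) can be replayed to reconstruct which ports belong to each aggregator at a given point in a trace. The following is a minimal Python sketch, not switch code: the `apply_event` helper, the tuple layout of the trace, and the port values are hypothetical, and it assumes AGGRDOWN always clears the aggregator regardless of whether a PORTLEAV preceded it.

```python
def apply_event(members, event, aggregator_id, port_id=None):
    """Update the aggregator -> member ports map for one event (hypothetical helper)."""
    if event in ("AGGR_UP", "PORTJOIN"):
        # AGGR_UP carries the first port, PORTJOIN every later one.
        members.setdefault(aggregator_id, set()).add(port_id)
    elif event == "PORTLEAV":
        members.get(aggregator_id, set()).discard(port_id)
    elif event == "AGGRDOWN":
        # Assumption: the aggregator is empty once AGGRDOWN is seen.
        members.pop(aggregator_id, None)
    else:
        raise ValueError(f"unexpected event: {event}")
    return members


# Illustrative trace: (event, aggregator identifier, global port identifier)
trace = [
    ("AGGR_UP", 1, 0x48),
    ("PORTJOIN", 1, 0x49),
    ("PORTLEAV", 1, 0x48),
]

members = {}
for event, aggregator_id, port_id in trace:
    apply_event(members, event, aggregator_id, port_id)
print(members)
```

A PORTJOIN seen for an aggregator that never reported AGGR_UP would be worth investigating, since the text states that AGGR_UP is generated for the first port.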
BRGPARAM
This event is generated by the Spanning Tree CMM when a configuration parameter of the Spanning Tree is changed by the operator. This message is sent to all NIs that are up and running.
The parameters are:
First parameter: The spanning identifier (i.e., VLAN identifier).
Second parameter: The type of the parameter. A 1 indicates the Spanning Tree protocol (third parameter: 3 for 802.1D, 4 for 802.1w), a 2 indicates the Spanning Tree mode (third parameter: 1 for Flat, 2 for 1x1), a 3 indicates the bridge priority value, a 4 indicates the Hello timer value, a 5 indicates the forward delay value, and a 6 indicates the maximum age.
Third parameter: The value of the parameter.
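When reading BRGPARAM entries in a trace, it can help to translate the numeric type and value into text. The following Python sketch uses only the codes listed above; the function and dictionary names are hypothetical and do not correspond to any identifier on the switch.

```python
# Type codes for the second BRGPARAM parameter, as listed in the text.
BRG_PARAM_TYPES = {
    1: "protocol",
    2: "mode",
    3: "bridge priority",
    4: "hello timer",
    5: "forward delay",
    6: "maximum age",
}
# Values of the third parameter for types 1 and 2.
PROTOCOL_VALUES = {3: "802.1D", 4: "802.1w"}
MODE_VALUES = {1: "flat", 2: "1x1"}


def describe_brgparam(vlan_id, param_type, value):
    """Return a readable description of one BRGPARAM event (hypothetical helper)."""
    name = BRG_PARAM_TYPES.get(param_type, f"unknown type {param_type}")
    if param_type == 1:
        shown = PROTOCOL_VALUES.get(value, f"unknown ({value})")
    elif param_type == 2:
        shown = MODE_VALUES.get(value, f"unknown ({value})")
    else:
        # Types 3 to 6 carry a plain numeric value.
        shown = str(value)
    return f"VLAN {vlan_id}: {name} = {shown}"


print(describe_brgparam(1, 1, 4))
print(describe_brgparam(1, 4, 2))
```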
PTSTPMOD
This event is generated by the Spanning Tree CMM when a Spanning Tree configuration parameter of a port is changed by the operator.
The parameters are:
First parameter: The spanning identifier (i.e., VLAN identifier).
Second parameter: The global port identifier.
Third and fourth parameters: The type of the parameter and its value. A 0x11 indicates the mode of the port (dynamic(1), blocking(2), forwarding(3)), a 0x12 indicates the Spanning Tree administrative state of the port (enable(1), disable(2)), a 0x13 indicates the port administrative state, a 0x14 indicates the port priority, a 0x15 indicates the port path cost, and a 0x16 indicates the port connection type (half-duplex(1), point to point(2), auto point to point(3), edge(4)).
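The type/value pairs carried by PTSTPMOD can be decoded with a small lookup table. This is an illustrative Python sketch built from the codes listed above; the names are hypothetical, and the enable/disable values shown for type 0x13 are the ones this guide gives for the PORTMOD event.

```python
# Type code -> (description, enumerated values if any), as listed in the text.
PORT_PARAM_TYPES = {
    0x11: ("port mode", {1: "dynamic", 2: "blocking", 3: "forwarding"}),
    0x12: ("Spanning Tree administrative state", {1: "enable", 2: "disable"}),
    0x13: ("port administrative state", {1: "enable", 2: "disable"}),
    0x14: ("port priority", {}),
    0x15: ("port path cost", {}),
    0x16: ("port connection type", {
        1: "half-duplex",
        2: "point to point",
        3: "auto point to point",
        4: "edge",
    }),
}


def describe_port_param(param_type, value):
    """Return (description, decoded value) for one type/value pair (hypothetical helper)."""
    if param_type not in PORT_PARAM_TYPES:
        return (f"unknown type 0x{param_type:02x}", str(value))
    name, enum = PORT_PARAM_TYPES[param_type]
    # Priority and path cost are plain numbers, so fall back to the raw value.
    return (name, enum.get(value, str(value)))


print(describe_port_param(0x16, 4))
print(describe_port_param(0x15, 19))
```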
PORTMOD
This event is sent by the Spanning Tree CMM to the Spanning Tree NI when the administrative state of a port is modified by the operator.
The parameters are:
First parameter: The spanning identifier (i.e., VLAN identifier).
Second parameter: The global port identifier.
Third and fourth parameters: The type of the parameter and its value. A 0x13 indicates the port administrative state (enable(1), disable(2)).
PORTVLBK
This event is an internal event generated by the Spanning Tree when STP is processing a Port/VLAN blocking, which can take place at the VLAN level or the port level.
The parameters are:
First parameter: The blocking status. A 0x44 indicates blocking already done, a 0x88 indicates nothing to do, a 0x55 indicates blocking at port level, and a 0xaa indicates blocking at VLAN level.
Second parameter: The local port identifier.
Third parameter: The VLAN identifier.
PVLANBLK
This event is registered when the Spanning Tree generates a Port VLAN Blocking message for the Source Learning NI.
The parameters are:
First parameter: The VLAN identifier.
Second parameter: The port vector.
Third parameter: The local port identifier.
The Port VLAN blocking message sent to the Source Learning NI has the following structure:
uint16 VlanId, uint32 PortVector
This event has the following values for the message ID:
appID: APPID_SPANNING_TREE.
subMsgNum: STP_PortVlanBlocking.
These event fields are defined below:
VlanId: A value of 1 to 4095 identifies a VLAN (0 means that the message applies to the ports defined by the PortVector on all VLANs).
PortVector: A bit field, with one bit per physical port, which indicates whether the port is affected by the change of state.
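To illustrate how the two fields fit together, the following Python sketch unpacks a captured Port VLAN blocking payload. It rests on assumptions that the guide does not state: the fields are packed back to back with no padding, the byte order is big-endian (selectable through the hypothetical `byte_order` argument), and bit 0 of PortVector corresponds to local port 0.

```python
import struct


def parse_port_vlan_blocking(payload, byte_order=">"):
    """Decode uint16 VlanId + uint32 PortVector (hypothetical helper)."""
    # "H" = uint16, "I" = uint32; an explicit byte order disables padding.
    vlan_id, port_vector = struct.unpack(byte_order + "HI", payload[:6])
    # Assumption: bit n set means local port n is affected.
    ports = [bit for bit in range(32) if port_vector & (1 << bit)]
    scope = "all VLANs" if vlan_id == 0 else f"VLAN {vlan_id}"
    return scope, ports


# Illustrative payload: VLAN 10, bits 1 and 3 set in the port vector.
sample = struct.pack(">HI", 10, 0b1010)
print(parse_port_vlan_blocking(sample))
```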
GMBPDU
This message is sent by the Spanning Tree NI to the local Group Mobility NI each time a BPDU is received on a mobile port. Group Mobility can take one of two actions, depending on how the mobile port has been configured:
Ignore BPDU: In this case Spanning Tree keeps sending GMBPDU each time a BPDU is received on the port (there is no Spanning Tree computation for the port).
Move port to fixed: Group Mobility asks Spanning Tree to revert the mobile port to the fixed state, and the port is added to the Spanning Tree associated with VLAN 1.
The BPDUonMobPort message sent by the Spanning Tree NI has the following format:
uint8 LocalPortId, uint8 bpdu_lgth, uint8 bpdu_data[STP_BPDULGTH]
This event has the following values for the message ID:
appID: APPID_SPANNING_TREE
subMsgNum: STP_BPDUonMobPort
These event fields are defined below:
LocalPortId: Identifies the physical Port (local reference: 0 to 23) which received the BPDU.
bpdu_lgth: The length in bytes of the following BPDU.
bpdu_data: The BPDU.
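The BPDUonMobPort layout is a length-prefixed record, which the following Python sketch splits into its parts. The helper name is hypothetical, and the sketch assumes the three fields are contiguous; it reads only `bpdu_lgth` bytes of BPDU data because the value of STP_BPDULGTH is not given here.

```python
def parse_bpdu_on_mobile_port(payload):
    """Split a BPDUonMobPort payload into (LocalPortId, BPDU bytes)."""
    if len(payload) < 2:
        raise ValueError("payload too short for LocalPortId and bpdu_lgth")
    local_port_id = payload[0]   # uint8 LocalPortId
    bpdu_lgth = payload[1]       # uint8 bpdu_lgth
    bpdu = payload[2:2 + bpdu_lgth]
    if len(bpdu) != bpdu_lgth:
        raise ValueError("payload shorter than bpdu_lgth indicates")
    return local_port_id, bytes(bpdu)


# Illustrative payload: local port 5 carrying a 4-byte placeholder BPDU.
sample = bytes([5, 4, 0x00, 0x00, 0x02, 0x02])
print(parse_bpdu_on_mobile_port(sample))
```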
GMIGBPDU
This message is sent by the Group Mobility NI in response to a BPDU on mobile port message sent by the Spanning Tree. By sending this message, Group Mobility tells Spanning Tree to ignore BPDUs on the mobile port.
The single parameter is:
First parameter: The global port identifier.
GM2FIXED
This message is sent by the Group Mobility NI in response to a BPDU on mobile port message sent by the Spanning Tree. By sending this message, Group Mobility tells Spanning Tree that the mobile port must be reverted to the fixed state.
The parameters are:
First parameter: The global port identifier.
Second parameter: The default VLAN.
VMADDVPA
The event is sent by the VLAN manager NI when a new VLAN needs to be added to a mobile port (no longer used by the VLAN manager).
The parameters are:
First parameter: The global port identifier.
Second parameter: The default VLAN.
VMDELVPA
The event is sent by the VLAN manager NI when a VLAN needs to be removed from a mobile port (no longer used by the VLAN manager).
The parameters are:
First parameter: The global port identifier.
Second parameter: The default VLAN.
VMDEFVPA
The event is sent by the VLAN manager NI when the default VLAN of a mobile port needs to be changed.
The parameters are:
First parameter: The global port identifier.
Second parameter: The default VLAN.
TOPOCHGT
This event notifies a change of Spanning Tree topology. The format of the message is:
uint16 VlanId, uint16 aging_timer
This event has the following values for the message ID:
appID: APPID_SPANNING_TREE
subMsgNum: STP_TopologyChange
These event fields are defined below:
VlanId: A value of 1 to 4095 identifies a VLAN; 0 means that the message applies to all VLANs (single Spanning Tree per switch).
aging_timer: The value of the aging timer, in seconds.
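As a quick illustration of the TOPOCHGT layout, the Python sketch below decodes the two 16-bit fields. The byte order is an assumption (big-endian by default, selectable through the hypothetical `byte_order` argument), and the helper name is not a switch identifier.

```python
import struct


def parse_topology_change(payload, byte_order=">"):
    """Decode uint16 VlanId + uint16 aging_timer (hypothetical helper)."""
    vlan_id, aging_timer = struct.unpack(byte_order + "HH", payload[:4])
    # VlanId 0 means the change applies to all VLANs.
    scope = "all VLANs" if vlan_id == 0 else f"VLAN {vlan_id}"
    return scope, aging_timer


# Illustrative payload: all VLANs, aging timer of 300 seconds.
sample = struct.pack(">HH", 0, 300)
print(parse_topology_change(sample))
```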
LINK_UP
This event is sent by the ENI driver when a link goes up.
The parameters are:
First parameter: The global port identifier.
Second parameter: The default link bandwidth.
Third parameter: The link mode (full-duplex(1), half-duplex(2), auto-negotiate(3)).
LINKDOWN
This event is sent by the ENI driver when a link goes down.
The parameters are:
First parameter: The global port identifier.
NI_UP
This event is sent by NI Supervision when it detects that a new NI is up and running.
The parameters are:
First parameter: The slot number.
Second parameter: The slice number.
NI_DOWN
This event is sent by NI Supervision when it detects that an NI has gone down.
The parameters are:
First parameter: The slot number.
Second parameter: The slice number.

Physical and Logical Port Dumps

Logical Ports (stpni_debugLport)
The following display shows the logical ports as seen by the Spanning Tree. Each line corresponds to one local port identifier index.
Certified: [Kernel]->stpni_debugLport
Logical Ports array:
sta field:
- 0x80 -> 1:Point to point Port
- 0x20 -> 1:Aggregable port
- 0x02 -> 1:Link up ; 0:link Down
- 0x01 -> 1:Adm up ; 0:Adm Down
- 0x04 -> Fixed Port
- 0x08 -> Q-tagged Port
- 0x10 -> Mobile Port
sta dGid qid  portid   nTag vector            Prim Mac Address       Bw   Duplex
00  0000 0000 00000000 0000 00000000 00000000 ff   00:00:00:00:00:00 0000 00
0b  0001 0233 01900001 0001 00000000 03000000 38   00:00:00:00:00:00 03e8 00
0b  0001 0187 01900002 0001 00000300 00000000 09   00:00:00:00:00:00 0064 00
0b  0001 01cb 01900003 0003 0c000000 00000000 1a   00:00:00:00:00:00 03e8 00
0b  0001 01a3 01900004 0001 00030000 00000000 10   00:00:00:00:00:00 0064 00
00  0000 0000 00000000 0000 00000000 00000000 ff   00:00:00:00:00:00 0000 00
... (all remaining rows are identical to the all-zero row above)
value = 9 = 0x9
The fields displayed by the stpni_debugLport command are described below:
output definitions
dGid    Default VLAN associated with the port. A default GID of 0 indicates that the port is in the IDLE state (field sta=00).
qid     Default QID (not used).
portid  Global port identifier (0x0190xxxx indicates that it is a logical port, and 0x0001 indicates that it is logical port 1).
nTag    Number of 802.1Q tags attached to the port. This field should always be 0 when the port is FIXED or MOBILE.
vector  Bitmap of the local ports that belong to the aggregator (logical port). In the example, local ports 1 and 2 belong to the aggregator (MSB = port 31 and LSB = port 0).
Prim    Local port identifier of the primary port. If the primary port does not belong to that NI, the primary reference is set to 0xff.
Bw      Bandwidth as received from the ENI driver on Link Up.
Duplex  Duplex mode as received from the ENI driver on Link Up.
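The portid and vector columns of the stpni_debugLport display can be decoded by hand, or with a short script such as the Python sketch below. The helper names are hypothetical. The sketch assumes that the first vector word covers local ports 0 to 31 and the second word covers ports 32 to 63; this is not stated in the guide, but it is consistent with the Prim column in the sample display (for example, vector 00000000 03000000 with Prim 38).

```python
def decode_portid(portid_hex):
    """Return (is_logical, port number) for a portid column value."""
    value = int(portid_hex, 16)
    # 0x0190xxxx marks a logical port; the low 16 bits are its number.
    return (value >> 16) == 0x0190, value & 0xFFFF


def decode_vector(word0_hex, word1_hex):
    """List the local ports whose bit is set in the two vector words."""
    word0 = int(word0_hex, 16)
    word1 = int(word1_hex, 16)
    ports = [bit for bit in range(32) if (word0 >> bit) & 1]
    # Assumption: the second word continues at local port 32.
    ports += [32 + bit for bit in range(32) if (word1 >> bit) & 1]
    return ports


print(decode_portid("01900002"))              # logical port 2
print(decode_vector("00000300", "00000000"))  # members of logical port 2
```

Comparing the lowest decoded port with the Prim column is a quick sanity check on a captured display.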
Physical Port (stpni_debugPport)
The following display shows the physical ports as seen by the Spanning Tree NI:
Certified: [Kernel]->stpni_debugPport
Physical Ports array:
sta field:
- 0x80 -> 1:Point to point Port
- 0x20 -> 1:Aggregable port
- 0x02 -> 1:Link up ; 0:link Down
- 0x01 -> 1:Adm up ; 0:Adm Down
- 0x04 -> Fixed Port
- 0x08 -> Q-tagged Port
- 0x10 -> Mobile Port
sta dGid qid  portid   nTag lpid prim Mac Address       Bw   Duplex
07  03e7 0162 00000040 0000 ff   ff   00:d0:95:84:3c:d0 0064 00
07  0140 0166 00000041 0000 ff   ff   00:d0:95:84:3c:d1 0064 01
05  0001 016a 00000042 0000 ff   ff   00:00:00:00:00:00 0000 00
05  0001 016e 00000043 0000 ff   ff   00:00:00:00:00:00 0000 00
05  0001 0172 00000044 0000 ff   ff   00:00:00:00:00:00 0000 00
05  0001 0176 00000045 0000 ff   ff   00:00:00:00:00:00 0000 00
05  0001 017a 00000046 0000 ff   ff   00:00:00:00:00:00 0000 00
05  0001 017e 00000047 0000 ff   ff   00:00:00:00:00:00 0000 00
23  0000 0182 00000048 0000 82   ff   00:d0:95:84:3c:d8 0064 01
23  0000 0186 00000049 0000 82   ff   00:d0:95:84:3c:d9 0064 01
05  0001 018a 0000004a 0000 ff   ff   00:00:00:00:00:00 0000 00
05  0001 018e 0000004b 0000 ff   ff   00:00:00:00:00:00 0000 00
05  0001 0192 0000004c 0000 ff   ff   00:00:00:00:00:00 0000 00
07  014d 0196 0000004d 0000 ff   ff   00:d0:95:84:3c:dd 0064 01
05  0001 019a 0000004e 0000 ff   ff   00:00:00:00:00:00 0000 00
05  0001 019e 0000004f 0000 ff   ff   00:00:00:00:00:00 0000 00
23  0000 01a2 00000050 0000 84   ff   00:d0:95:84:3c:e0 0064 01
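The sta column is a bit field, so each value in the display combines several of the flags listed in the legend. The following Python sketch expands a sta value into flag names using only the masks from that legend; the helper name and the short flag labels are hypothetical.

```python
# Masks taken from the "sta field" legend printed by the dump command.
STA_FLAGS = [
    (0x80, "point to point"),
    (0x20, "aggregable"),
    (0x10, "mobile"),
    (0x08, "q-tagged"),
    (0x04, "fixed"),
]


def decode_sta(sta_hex):
    """Expand a sta column value (hex string) into flag names."""
    sta = int(sta_hex, 16)
    names = [name for mask, name in STA_FLAGS if sta & mask]
    # 0x02 and 0x01 are reported in both states by the legend.
    names.append("link up" if sta & 0x02 else "link down")
    names.append("adm up" if sta & 0x01 else "adm down")
    return names


for sta in ("07", "05", "23"):
    print(sta, decode_sta(sta))
```

For the sample display, 07 decodes to a fixed port with link and administrative state up, 05 to a fixed port with the link down, and 23 to an aggregable port that is up.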