Cisco Catalyst 4000 series, Catalyst 2980G series, Catalyst 4912 series, Catalyst 2948G Series Hardware Troubleshoot

Hardware Troubleshoot for Catalyst 4000/4912/2980G/2948G Series Switches
Document ID: 18935
Interactive: This document offers customized analysis of your Cisco device.
Contents
Introduction
Prerequisites
Requirements
Components Used
Conventions
Preparation for Troubleshooting Hardware on Catalyst Switches
Online Troubleshooting Tools
Catalyst 4000 Family Troubleshooting Procedures
General Problem Solving Model
General Problem Solving Flow Chart
Common Problems
Symptom Description
System/Supervisor/Module Problems and Steps to Resolve Them
Supervisor Crashes and Steps to Resolve Them
Misleading Problems
show Command Descriptions
Related Information
Introduction
This document provides procedures for diagnosing hardware problems on Catalyst 4000 family switches. The Catalyst 4000 family includes the 4003 and 4006 modular chassis and the 2948G, 2980G, and 4912G fixed models. The naming conventions for the Catalyst 4000 and Catalyst 2900 can be very confusing. Refer to Understanding Catalyst 2900 and Catalyst 4000 Naming Conventions for clarification.
The goal is to help you identify and fix some basic hardware issues, or to perform more extensive troubleshooting before you contact Cisco Technical Support. An orderly troubleshooting process with the collection of specific diagnostics ensures that information necessary to the resolution of the problem is not lost. Refining the scope of the problem saves valuable time in the search for a solution.
Prerequisites
Requirements
Cisco recommends that you have knowledge of these topics:
Catalyst 4000 Command Reference
How LAN Switches Work
Components Used
This document is not restricted to specific software and hardware versions.
Conventions
Refer to Cisco Technical Tips Conventions for more information on document conventions.
Preparation for Troubleshooting Hardware on Catalyst Switches
Many hardware problems encountered during field installations or during normal operation can be prevented by a thorough product overview ahead of time. For those customers not already familiar with general system and power requirements, proper installation procedure, switch management and software considerations for these switches, Cisco recommends that you read documents in Cisco Catalyst 4000 Series Switches Troubleshooting TechNotes.
This document covers this important information:
Which supervisor is supported in which chassis?
How do I back up my configuration?
Which software version is General Deployment (GD) for the Catalyst 4000 Family?
This document assumes familiarity with the Catalyst 4000 Command Reference. You should also have a prior understanding of switching fundamentals, or have read How LAN Switches Work. Additional online documentation is referenced throughout this document in order to assist in troubleshooting.
Online Troubleshooting Tools
Cisco has a variety of troubleshooting tools and resources in order to help you interpret switch output, determine hardware and software compatibility, track bugs, and search field notices. These tools and resources are referenced throughout this document:
Output Interpreter (registered customers only): Paste in the output of a command and get the interpretation with relevant errors, warnings, and status information.
Error Message Decoder (registered customers only): Paste in system error messages and discover their meaning.
Bug Toolkit (registered customers only): Search for bugs.
Troubleshooting Assistant: This provides step-by-step instructions for many common network issues.
Catalyst 4000 Family Troubleshooting Procedures
This section discusses troubleshooting procedures, symptoms, show commands, and diagnostics for the Catalyst 4000 family. This section assumes you have read the companion guide to this document, as described in the Introduction of this document, and that you understand your switch and its capabilities.
Note: If the switch is connected to the network, do not reset or reseat modules as a first troubleshooting step! In addition to the downtime that users experience, the internal buffer that logs system messages is erased, and potentially useful information about hardware or software errors is lost. If the switch is offline, you have more freedom to monitor LED status, pull cables, reseat modules, or reset the switch as necessary. Troubleshooting LED status is discussed in more detail later in this document.
Hidden Commands
Some commands presented in this document are known as hidden, which means that they cannot be parsed with a "?", and you cannot use Tab to complete them. When a hidden command is suggested in this document, simply gather the output and send it to the TAC engineer if you open a case. It is possible that this output is useful in solving your case. These commands are undocumented, and therefore the TAC engineer is not required to explain the output to the customer.
General Problem Solving Model
Troubleshooting any problem requires a method or set of procedures that, if followed correctly, produces a solution. Begin by understanding general problem solving for LAN networks. Hardware failures in LAN networks are characterized by certain symptoms. These symptoms can be general, such as the inability to Telnet between switches, or more specific, such as link flapping or a switch that resets itself. Each symptom can be traced to one or more causes if you use specific troubleshooting techniques. A systematic approach works best. Define the specific symptoms, identify all potential problems that could be causing the symptoms, and then eliminate each potential problem, from most likely to least likely, until the symptoms disappear.
General Problem Solving Flow Chart
This diagram outlines the steps that detail the problem−solving process:
Complete these steps:
1. Define the problem.

   It is important to first identify the problem being experienced. This allows you to identify what kinds of causes can result in these symptoms. In order to help determine the problem, ask yourself these questions:

   What is the primary symptom?
   Is the problem specific to this switch or does it affect other switches on the network as well?
   Is this a problem with one or more ports on a specific module?
   What type of ports: 10/100, Multimode Fiber (MMF), Singlemode Fiber (SMF), GigabitEthernet, and so forth?
   What device is connected to the switch ports that experience the problem?
   When did this problem first occur and has it occurred more than once?
   What happened at the time the problem was first noticed?
   Is there anything unique about traffic conditions at that time of day? For example, was this a peak time for traffic?
   Did you run any particular commands at the time or make any configuration changes?

2. Gather the facts.

   Gather diagnostics and show command output from the switch to isolate the scope of the problem. If physical access to the equipment is possible, locate and list any modules with red or yellow LEDs, disconnected cables, or loose connections.

3. Consider the possible causes.

   Consider possible problems based on the information you gathered. With certain data, you are able, for example, to eliminate hardware as a problem, so that you can focus on software problems. At every opportunity, try to narrow the number of potential problems so that you can create an effective plan of action.

4. Create and implement an action plan.

   Create an action plan based on the potential problems. Focus on only one potential problem at a time. If you alter more than one variable simultaneously, you can solve the problem, but the identification of the specific change that eliminated the symptom becomes far more difficult and does not help you solve the same problem if it occurs in the future.

5. Observe the results.

   Be sure to gather and analyze the results each time a variable is changed to determine if the problem has been fixed.

6. Repeat the process.

   Repeat testing for possible causes until the problem is resolved.
Common Problems
As described in the General Problem Solving Model section, the first step in resolving a problem is to identify the symptom. Refer to Catalyst Troubleshooting Tips for more information on common, resolvable problems that affect all Catalyst switches.
Most hardware problems with LAN networks fall into these categories and each category has various symptoms related to it:
Connectivity Problems
System/Supervisor/Module Problems
Supervisor Crashes
Connectivity Problems
These problems can occur when communication with the supervisor, module, or hosts connected to the module is intermittent or has been lost.
System/Supervisor/Module Problems
These problems can occur when system status LEDs indicate a problem, the supervisor or modules are not recognized or show faulty, or when users experience poor performance.
Supervisor Crashes
These problems can occur when the switch has reset, continually resets, or is down completely.
Symptom Description
This section discusses symptoms, troubleshooting procedures, and commands for Catalyst 4000 family switches. This section assumes you are able to identify your switch chassis, supervisor engine, modules, and feature cards, and that you understand the system specifications, cabling, power, and software requirements as described in the Cisco Catalyst 4500 Series Switches Install and Upgrade Guides.
If you have not determined what your primary symptom is, see the General Problem Solving Model section of this document and apply the steps to your problem.
Connectivity Problems and Steps to Resolve Them
This section covers common connectivity issues that the customer can encounter with the Catalyst 4000.
These commands are supported by the Output Interpreter tool for CatOS and can be used to assist in troubleshooting switch port problems:
show version
show module
show system
show port
show mac
show counters
show cdp neighbors detail
If you have the output of the supported commands from your Cisco device, you can use the Output Interpreter (registered customers only) to display potential issues and fixes. In order to use Output Interpreter, you must be a registered user, be logged in, and have JavaScript enabled.
Can not console/Telnet into the supervisor
Both of these problems are covered in the Catalyst Troubleshooting Tips document that is mentioned earlier.
Not able to console
1. Verify that the power switch is in the ON position (|) and the system OK LED is ON.
2. Connect the cable directly to the console port and not through a patch panel.
3. Verify that the correct cabling and hardware are used to connect to your particular supervisor engine. Refer to the Connecting a Terminal to the Console Port on Catalyst Switches document for more information.
Not able to Telnet
1. Complete the steps in the detailed procedure described in Catalyst Troubleshooting Tips. If it is determined that the sc0 management interface is not configured or is not configured correctly, refer to Configuring an IP Address on Catalyst Switches for more information.
2. Attempt to Telnet from a PC directly connected to the switch in the same VLAN as the sc0 interface in order to eliminate any routing issues.
3. Gain console access to the switch and make sure the supervisor is not in boot> or rommon>. If the switch is in one of these modes, you need to complete the steps in the recovery procedures. Refer to Recovering Catalyst 4000 and Catalyst 5000 Switches from Corrupted or Missing Software, or an Upgrade Failure, or from ROMMON Mode for more information on recovery.
Receiving the "Failed to allocate session block" error message
If you receive the Failed to allocate session block error message when you access the switch through Telnet, the switch cannot allocate the memory that the Telnet application requires. The available free memory is low either because a process uses a large amount of memory or because of a memory leak in the switch.
In order to avoid the error, issue the show proc mem command and identify the process that uses the most memory in the switch. In order to resolve the problem, either add more memory to the system or disable some features in order to free some of the existing memory.
If there is a memory leak in the switch, reset the switch in order to release the memory held by all processes. If the error message still appears after the reboot, upgrade the software version of the switch.
Can not connect to a remote host, router, or another switch
Complete these steps:
1. Verify that the port LED status is green. If the link LED is solid orange, the port has been disabled by the software. If it is blinking orange after supervisor bootup and module initialization, this is a hardware failure. If there is no link LED, check and swap the cables. Verify operation of the end device and NIC.

   Refer to Troubleshooting Cisco Catalyst Switches to NIC Compatibility Issues for more information on NIC troubleshooting.

2. What type of media is involved? Fiber? Gigabit Interface Converter (GBIC)? Gigabit Ethernet? 10/100 BaseTX? If this is a physical layer issue, refer to the Physical Layer Troubleshooting section of Troubleshooting Switch Port Problems for more information.

3. Issue the show port <mod/port> command in order to verify that the status is connected, which means that the port is operational. If any other status is displayed, see the Port status shows not connected, faulty, disabled, inactive, or errdisable section for troubleshooting steps.

   If the end device is a Cisco router or switch, and Cisco Discovery Protocol (CDP) is enabled, issue the show cdp neighbor detail command in order to identify the device, remote interface type, and remote IP address.

   Note: A status of connected does not mean that the ports are free of errors. If there are errors on the ports, proceed to the Seeing errors on the ports section of this document.

4. Swap the cables. Move the cable to a different port. Eliminate patch panels. Patch panels are a common source of connectivity failures, so attempt to connect directly to the end device. Verify the operation of the end device.

5. Capture the output of the show config, show module, and show test 0 commands.

   a. Issue the show module command in order to verify that the status is ok for that module and not disabled or faulty.

      If the status is disabled, issue the set module enable <mod> command. If the status is faulty, establish a console connection to capture bootup Power On Self Test (POST) diagnostics and any system error messages. Issue the reset <mod> command in order to reset the module. Issue the show test 0 command in order to determine if this module passed all of its diagnostic tests on bootup.

      Remove the module and inspect for bent pins. Reseat the module, firmly press down the ejector levers, and tighten the captive installation screws. If the output of the show module command status is still faulty, try the module in another slot. Slot 2 accepts line cards or a supervisor engine. If necessary, power off/on the switch. If the status is still faulty, the module has failed.

   b. Issue the show test 0 command in order to verify that the port has passed its last diagnostic test on bootup. If F is indicated for that port, proceed as in step a.

6. Verify whether this device is on the same or a different VLAN. Remember that this is a Layer 2 (L2) device and a router is required to route between VLANs.

7. If you connect to another switch, ask yourself these questions:

   What type of port is this? A trunk port?
   If it is a trunk port, what trunk encapsulations does it support?
   Is the port capable of EtherChannel?

   Issue the show port capabilities command for a quick look at port capabilities. Refer to LAN Technical Tips for more information on how to troubleshoot issues with trunking or EtherChannel.
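The link LED checks in the first step of this procedure can be summarized as a small lookup. The following Python sketch is illustrative only: the function name and the color labels are assumptions made for this example, and the meanings are the ones stated in that step.

```python
def interpret_link_led(color, blinking=False):
    """Return the meaning of a port link LED as described in this section.

    `color` is "green", "orange", or "off". These labels are hypothetical
    names for this sketch, not values reported by the switch.
    """
    color = color.lower()
    if color == "green":
        return "expected state; continue with show port <mod/port>"
    if color == "orange" and blinking:
        # Only meaningful after supervisor bootup and module initialization.
        return "hardware failure"
    if color == "orange":
        return "port disabled by software"
    if color == "off":
        return "no link; check and swap cables, verify end device and NIC"
    raise ValueError(f"unknown LED color: {color}")
```

For example, interpret_link_led("orange", blinking=True) returns "hardware failure", which matches the blinking orange case described above.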
Port status shows not connected, faulty, disabled, inactive, or errdisable
Possible Port Status
Status       Description and Work Around
connected    Port is operational and connected to end device. A status of connected does not mean the ports are error-free. If there are errors on the ports, proceed to the Seeing errors on the ports section of this document.
notconnect   Nothing is connected to the port. Check or swap cables. Verify the operation of the end device.
faulty       Possible hardware failure. Issue the show test command in order to verify. If F displays for a port, proceed as in step 5 of the Can not connect to a remote host, router, or another switch section of this document.
disabled     Manually disabled. Issue the set port enable <mod/port> command in order to enable the port. If the port status does not change to enabled, issue the show module command in order to determine if the module is disabled.
inactive     Port belongs to a VLAN that does not exist. Issue the set vlan <vlan> command in order to add a VLAN.
errdisable   Port has been shut down due to errors. Refer to the Recovering From errDisable Port State on the CatOS Platforms document for more information.
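When the status of many ports is checked, the Possible Port Status list can be kept as a simple lookup. This Python sketch is an illustration under stated assumptions: the dictionary name, the function name, and the fallback text are hypothetical, while the status values and suggested commands come from the list above.

```python
# Status values and actions are taken from the Possible Port Status list.
PORT_STATUS_ACTIONS = {
    "connected": "operational; check for port errors before ruling the port out",
    "notconnect": "check or swap cables; verify the end device",
    "faulty": "issue show test to confirm a possible hardware failure",
    "disabled": "issue set port enable <mod/port>",
    "inactive": "issue set vlan <vlan> to add the missing VLAN",
    "errdisable": "see Recovering From errDisable Port State on the CatOS Platforms",
}


def next_action(status):
    """Map a port status string to the work around suggested in this document."""
    key = status.strip().lower()
    # The fallback text is an assumption for this sketch, not switch behavior.
    return PORT_STATUS_ACTIONS.get(key, "unrecognized status; capture show port output")
```

For example, next_action("inactive") returns the reminder to add the missing VLAN with set vlan <vlan>.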
Seeing errors on the ports
Complaints of poor performance by users can sometimes translate to errors on switch ports. Output from the port error counter commands helps you troubleshoot connectivity problems.
1. Verify port status and troubleshoot accordingly. Refer to the Port status shows not connected, faulty, disabled, inactive, or errdisable section of this document.

2. Capture the output of the show port <mod/port>, show mac <mod/port>, and show counters <mod/port> commands.

   These are common causes for data link errors on ports:

   speed/duplex misconfiguration
   network congestion
   NICs or drivers (refer to Troubleshooting Cisco Catalyst Switches to NIC Compatibility Issues for more information)
   cabling
   bad port

   The show port <mod/port> command can show Late-Coll, Align-Err, FCS-Err, Xmit-Err, and Rcv-Err errors. Refer to the Show Port for CatOS and Show Interfaces for Cisco IOS section of Troubleshooting Switch Port Problems for more information on these errors and possible causes.

   The show mac <mod/port> command shows the number of unicast, multicast, and broadcast frames transmitted. Issue this command in order to verify if frames are received and transmitted.

   In-Discards show frames that do not need to be switched. This is normal if the port is connected to a hub and two devices on the hub exchange data. Lrn-Discards indicate that Content Addressable Memory (CAM) entries are being discarded. The In-Lost counter displays the sum of all error packets received on the port. The Out-Lost counter indicates egress port buffer overflows. Refer to the Show Mac for CatOS and Show Interfaces Counters for Cisco IOS section of Troubleshooting Switch Port Problems for more information on these errors and possible causes.

   The show counters <mod/port> command is particularly useful for troubleshooting port problems.

   For example, this counter results if you issue the command:

   5 badTxCRC = 0

   If badTxCRC is incrementing, bad hardware may be corrupting packets. Capture the output of the show counters <mod/port> command and open a case with Cisco Technical Support.

3. Issue the clear counters command in order to reset the output of the show port <mod/port>, show mac <mod/port>, and show counters <mod/port> commands. View the command outputs several times in order to see if errors are incrementing.

   If you have not been able to track down any reason for intermittent connectivity loss on the switch in the previous steps, capture the output of the show nvramenv 1 command, as well as the other commands in the previous steps, and open a case with Cisco Technical Support.

4. Refer to these documents for more information on how to troubleshoot the other causes of port errors:

   Troubleshooting Cisco Catalyst Switches to NIC Compatibility Issues
   Configuring and Troubleshooting Ethernet 10/100Mb Half/Full Duplex Auto-Negotiation
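The advice to view the command outputs several times and check whether errors are incrementing amounts to comparing two snapshots of the same counters. The Python sketch below shows one way to do that comparison, assuming the counter values have already been copied by hand into dictionaries. The function name and the sample values are hypothetical; only the counter names come from this document.

```python
def incrementing_counters(before, after):
    """Return {name: delta} for counters that grew between two snapshots."""
    return {
        name: after[name] - before[name]
        for name in before
        if name in after and after[name] > before[name]
    }


# Hypothetical sample values, copied by hand from two runs of the commands.
before = {"badTxCRC": 0, "FCS-Err": 12, "Align-Err": 3}
after = {"badTxCRC": 4, "FCS-Err": 12, "Align-Err": 5}

print(incrementing_counters(before, after))
# prints {'badTxCRC': 4, 'Align-Err': 2}
```

A counter that is non-zero but unchanged, such as FCS-Err in the sample, is left out of the result because it is not currently incrementing.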
Experiencing poor performance
Poor performance is often perceived to be a hardware problem, when in fact it can be attributed most often to connectivity problems. See the Seeing errors on the ports section for troubleshooting steps.
Getting continuous %PAGP-5 left/joined bridge messages
Complete these steps:
1. Capture the show port <mod/port>, show mac <mod/port>, and show spantree summary command output.

   System messages similar to these are informational, although if they continue to repeat, the link can be flapping.

   2002 Jan 19 14:59:05 %PAGP-5-PORTFROMSTP:Port 2/11 left bridge port 2/11
   2002 Jan 19 14:59:23 %PAGP-5-PORTTOSTP:Port 2/11 joined bridge port 2/11

2. If these messages occur repeatedly on certain ports, refer to these documents for possible causes:

   Common CatOS Error Messages on Catalyst 4000 Series Switches
   Common CatOS Error Messages on Catalyst 5000/5500 Series Switches
   Common CatOS Error Messages on Catalyst 6000/6500 Series Switches

3. If you also see errors on the port in the show port <mod/port> and show mac <mod/port> command output, see the Seeing errors on the ports section for troubleshooting steps.

4. Issue the show spantree summary command in order to verify how many ports are in each VLAN, if any ports on the switch are blocking, and which VLANs are being blocked. Since Spanning-Tree Protocol (STP) loops can cause link flaps or actually bring down a switch or network, with the appearance of a hardware failure, this is vital information to capture, whether troubleshooting hardware or software. Refer to LAN Technical Tips for more information on how to troubleshoot STP.
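To judge whether the left/joined messages repeat on certain ports, it can help to count them per port in a saved log. The Python sketch below assumes the log lines follow the format of the two sample messages in this section. The function name and the regular expression are assumptions for this example, not a Cisco-supplied parser.

```python
import re
from collections import Counter

# Pattern modeled on the two sample %PAGP-5 messages in this section.
PAGP_RE = re.compile(
    r"%PAGP-5-PORT(?:FROM|TO)STP:Port (\d+/\d+) (left|joined) bridge port"
)


def count_left_events(log_lines):
    """Count 'left bridge' events per port in an iterable of log lines."""
    events = Counter()
    for line in log_lines:
        # Text copied from a PDF may carry a typographic minus (U+2212).
        line = line.replace("\u2212", "-")
        match = PAGP_RE.search(line)
        if match and match.group(2) == "left":
            events[match.group(1)] += 1
    return events


sample = [
    "2002 Jan 19 14:59:05 %PAGP-5-PORTFROMSTP:Port 2/11 left bridge port 2/11",
    "2002 Jan 19 14:59:23 %PAGP-5-PORTTOSTP:Port 2/11 joined bridge port 2/11",
]
print(count_left_events(sample)["2/11"])  # prints 1
```

A port whose count keeps rising over a short period is a candidate for the link flapping described above.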
Can not autonegotiate or speed/duplex mismatch
Complete these steps:
1. Make sure you have speed and duplex configured identically on both sides of the link. Catalyst 4000 switchports are set to auto by default. When both sides of a 100 BaseTX link autonegotiate correctly, the show port <mod/port> command output is as follows:

   Duplex  Speed
   ------- -------
   a-full  a-100

   Hardcode both sides. Remember when hardcoding the port, the port speed must be set first and then the duplex setting must be set. Issue the show port <mod/port> command. The switch output is as follows:

   Duplex  Speed
   ------- -------
   full    100

   Note: Even though the switch has been hardcoded, the connecting device must still be hardcoded to eliminate problems.

2. If there is an autonegotiation problem caused by a speed/duplex mismatch or NIC incompatibility, errors show up on the ports. Refer to these documents for more information:

   Configuring and Troubleshooting Ethernet 10/100Mb Half/Full Duplex Auto-Negotiation
   Troubleshooting Cisco Catalyst Switches to NIC Compatibility Issues
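The two sample outputs above differ only in the a- prefix, which appears when the value was autonegotiated and is absent when the port is hardcoded. The Python sketch below separates the prefix from the value. The function name is hypothetical, and the sketch assumes the Duplex and Speed fields have already been extracted from the show port <mod/port> output.

```python
def describe_setting(field):
    """Split a Duplex or Speed field into (value, autonegotiated)."""
    # Text copied from a PDF may carry a typographic minus (U+2212).
    field = field.strip().replace("\u2212", "-")
    if field.startswith("a-"):
        return field[2:], True
    return field, False


print(describe_setting("a-full"))  # prints ('full', True)
print(describe_setting("100"))     # prints ('100', False)
```

If one side of a link reports an autonegotiated value and the other side is hardcoded, treat that as the mismatch this procedure is meant to find.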