Cisco Systems 6503 User Manual

White Paper

High Availability for the Cisco Catalyst 6500 Series Switches

Overview

Cisco Catalyst® 6500 Series multilayer switches have become an essential component of a sound network design in today’s enterprise and service provider environments. Given such a critical role, the Cisco Catalyst 6500 Series must provide a reliable switching platform and offer high performance and intelligent network services. The Cisco Catalyst 6500 Series can even maintain an IP phone call during a supervisor engine failover. This paper discusses how the Cisco Catalyst 6500 Series provides high system availability through hardware and software redundancy features, and focuses specifically on the following three areas:
• Fabric redundancy of the Switch Fabric Module (SFM)
• Supervisor engine redundancy with the Cisco Catalyst Operating System (Catalyst OS) High Availability feature, which includes the stateful protocol redundancy and image versioning functions
• Multilayer Switch Feature Card (MSFC) Cisco IOS® Software redundancy features: Dual Router Mode (DRM), Configuration-Synchronization (config-sync), and Single Router Mode (SRM)
This paper is based on the hybrid software model for the Cisco Catalyst 6500 Series (Cisco Catalyst OS on the supervisor engine, Cisco IOS Software on the MSFC) and not on the Cisco IOS Software model (native Cisco IOS Software). All feature set references will be specifically described as a Cisco Catalyst OS feature on a supervisor engine or a Cisco IOS Software feature on an MSFC. The Cisco Catalyst OS High Availability feature was first introduced in the Cisco Catalyst OS 5.4 release and is available for both Cisco Catalyst Supervisor Engine 1A and Catalyst Supervisor Engine 2. Support for DRM began in Cisco IOS Software Release 12.0(7)XE1. The MSFC config-sync redundancy feature for DRM is supported in Cisco IOS Software Release 12.1(3a)E4 for both the MSFC and MSFC2. The MSFC SRM feature was first supported with Cisco Catalyst OS 6.3.1 and Cisco IOS Software Release 12.1(8)E2 for the MSFC2.
Figure 1 The Cisco Catalyst 6500 Series WS-6503, WS-C6506, WS-C6509, WS-C6509-NEBS, and WS-C6513
This paper is the second version of the original that was written in September 2000. This version includes some updated sections for more precise understanding and a discussion of SRM.
Although component-level redundancy is very important, a high-availability network design relies on the proper combination of individual system redundancy and overall network redundancy. For more detail on high-availability network designs, refer to the white paper, Gigabit Campus Design, at:
http://www.cisco.com/warp/public/cc/so/neso/lnso/cpso/camp_wp.htm.
Cisco Systems, Inc.
All contents are Copyright © 1992–2002 Cisco Systems, Inc. All rights reserved. Important Notices and Privacy Statement.

Switch Fabric Module Redundancy

Since its introduction, the Cisco Catalyst 6500 Series has been built on a single 32-Gbps bus switching architecture that provides the data path for all packets through the system. The Cisco Catalyst 6500 Series also includes a 256-Gbps crossbar switching fabric (the SFM) for higher bandwidth capacities and 30+ Mpps of forwarding performance. The SFM is supported in the Cisco Catalyst 6506 and the Cisco Catalyst 6509 chassis. The SFM2 is essentially the same fabric but is designed to work in all of the Cisco Catalyst 6506, 6509, and 6513 chassis.

Switching Fabric Failover

The SFM also provides another level of hardware redundancy to the system. The single fabric channel versions of the fabric-enabled line cards provide a connection to both the switching fabric and the existing system bus backplane. This allows the Cisco Catalyst 6500 Series to use the SFM as the primary data path between fabric-enabled line cards. In the event that an SFM fails, the system will fail over to the 32-Gbps bus to ensure that packet switching continues (albeit at the bus capacities of 15 Mpps throughput and 32-Gbps bandwidth) and the network remains online. Additionally, a Cisco Catalyst 6500 Series can be configured with dual SFMs (in slots 5 and 6 of a Catalyst 6506 or Catalyst 6509 or in slots 7 and 8 of a Catalyst 6513), which provide a third level of fabric redundancy. In this configuration, a failure on the primary fabric module would result in a switchover to the secondary fabric module for continued operation at 30 Mpps. Also, in the event of further fabric module failures, the ability to switch over yet again to the system bus would still be available.
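The failover order described above (primary SFM, then secondary SFM, then the system bus) can be illustrated with a short Python sketch. This is an illustrative model only, not Catalyst OS code; the class and function names are hypothetical, and the throughput figures are the ones quoted in this paper.

```python
from dataclasses import dataclass
from typing import Tuple

# Forwarding rates quoted in this paper (Mpps).
FABRIC_MPPS = 30
BUS_MPPS = 15


@dataclass
class Chassis:
    """Hypothetical health summary of the fabric modules in one chassis."""
    primary_sfm_ok: bool
    secondary_sfm_ok: bool  # False also covers "no second SFM installed"


def active_data_path(chassis: Chassis) -> Tuple[str, int]:
    """Return (data path, forwarding rate in Mpps) for the current state."""
    if chassis.primary_sfm_ok:
        return ("primary SFM", FABRIC_MPPS)
    if chassis.secondary_sfm_ok:
        # Fabric-to-fabric failover keeps the fabric forwarding rate.
        return ("secondary SFM", FABRIC_MPPS)
    # Fabric-to-bus failover: the 32-Gbps bus keeps the switch online.
    return ("32-Gbps bus", BUS_MPPS)


if __name__ == "__main__":
    print(active_data_path(Chassis(primary_sfm_ok=False, secondary_sfm_ok=True)))
```

The sketch makes the three levels of redundancy explicit: only when both fabric modules are unavailable does the system drop to the bus forwarding rate.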

Switching Fabric Operation

Different combinations of SFMs, fabric-enabled line cards, and classic line cards in a chassis affect the internal switching operation, which in turn affects the failover characteristics. This is an important point to understand as fabric-to-fabric or fabric-to-bus failover scenarios are discussed. When an SFM is installed in a system of only fabric-enabled line cards, the switching operation is called compact mode. This allows for 32-byte compacted headers (not the entire packet) to be sent across the bus to the supervisor engine for each forwarding decision. The increase in efficiency for this operation allows for inherent system performance capable of 30 Mpps. The data path for fabric-enabled cards is via the SFM.
If a classic line card is installed in a system with an SFM, the header format on the bus must be compatible with all the line cards in the system. Because classic line cards do not support compact mode, the fabric-enabled line cards will change their switching modes to truncated mode. Truncated mode allows the fabric-enabled line card to send packets in a 64-byte header-only format that the classic line cards can understand. It is very important to note that the truncated mode still uses the SFM as the data path between fabric-enabled line cards. Although the maximum centralized forwarding performance is 15 Mpps in a system of classic and fabric cards, the switch fabric is still used to provide higher bandwidth to the system. If fabric-enabled line cards are installed in a system with no SFM, they will operate in flow-through mode even when classic cards are present. This mode essentially programs the line card to operate in a classic mode whereby the entire packet is sent across the system bus for a forwarding decision. A system in flow-through mode is capable of switching 15 Mpps and the data path is via the 32-Gbps bus.
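The three switching modes described above can be summarized as a small decision function. The following Python sketch is a simplification that models only the rules stated in the preceding paragraphs for fabric-enabled line cards; the function name and return format are hypothetical, and the mode reported for an individual module on a real system may differ.

```python
from typing import Tuple


def fabric_card_mode(sfm_present: bool, classic_card_present: bool) -> Tuple[str, int, str]:
    """Return (switching mode, system Mpps, data path) for fabric-enabled cards.

    Rules taken from the text:
      - no SFM                 -> flow-through, 15 Mpps, data over the bus
      - SFM plus classic cards -> truncated,    15 Mpps, data over the SFM
      - SFM, fabric cards only -> compact,      30 Mpps, data over the SFM
    """
    if not sfm_present:
        # The entire packet crosses the system bus, as on a classic card.
        return ("flow-through", 15, "32-Gbps bus")
    if classic_card_present:
        # 64-byte header-only format that classic cards can understand.
        return ("truncated", 15, "SFM")
    # 32-byte compacted headers are sent to the supervisor engine.
    return ("compact", 30, "SFM")


if __name__ == "__main__":
    for sfm in (True, False):
        for classic in (True, False):
            print(sfm, classic, fabric_card_mode(sfm, classic))
```

Note that the presence of a classic card changes the forwarding rate but not the data path: truncated mode still uses the SFM between fabric-enabled cards.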
The changes to the switching mode are done automatically, depending on the hardware installed. No specific configuration is necessary on the SFM for typical operation. The current switching mode of the switch fabric module can be monitored through the Catalyst OS command-line interface (CLI) using the show fabric channel switchmode command. Example 1 shows a completely fabric-enabled system (all compact mode) and Example 2 shows classic and fabric-enabled line cards in an SFM system (flow-through and truncated mode).
Example 1: Fabric-Enabled System
The following output is from a configuration with dual supervisor engines, one SFM, and a fabric-enabled line card in slot 3.
Sup2-A> (enable) show fabric channel switchmode
Module Num Fab Chan Fab Chan Switch Mode  Channel Status
------ ------------ -------- ------------ --------------
1      1            0, 0     compact      ok
2      1            0, 1     compact      ok
3      1            0, 2     compact      ok
5      18           0, 0     n/a          ok
5      18           1, 1     n/a          ok
5      18           2, 2     n/a          ok
5      18           3, 3     n/a          unknown
5      18           4, 4     n/a          unknown
5      18           5, 5     n/a          unknown
5      18           6, 6     n/a          unknown
5      18           7, 7     n/a          unknown
5      18           8, 8     n/a          unknown
5      18           9, 9     n/a          unknown
5      18           10, 10   n/a          unknown
5      18           11, 11   n/a          unknown
5      18           12, 12   n/a          unknown
5      18           13, 13   n/a          unknown
5      18           14, 14   n/a          unknown
5      18           15, 15   n/a          unknown
5      18           16, 16   n/a          unknown
5      18           17, 17   n/a          unknown
15     0            n/a      n/a          n/a
16     0            n/a      n/a          n/a
CLI output description for show fabric channel switchmode:
• Num Fab Chan: The number of fabric channels that the module is associated with.
• Fab Chan: The first number is the fabric channel number that the module is associated with. The second number is the fabric channel number that the SFM is associated with.
• Switch mode: Possible output is “flow through,” “truncated,” or “compact.” Switch mode applies only to line cards with fabric and bus connections.
• Channel status: Possible output is “ok,” “sync error,” “CRC error,” “heartbeat error,” “buffer error,” “timeout error,” or “unknown.” Channel status applies only to line cards with fabric and bus connections.
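For monitoring purposes, output in the layout shown in the examples could be turned into structured records. The following Python sketch is one possible approach, assuming the five-column layout described above; the function name, the field names, and the abbreviated sample text are illustrative and are not part of Catalyst OS.

```python
import re
from typing import Dict, List

# One data row: module, channel count, "x, y" channel pair or n/a,
# switch mode (one of the documented values or n/a), then channel status.
ROW = re.compile(
    r"^\s*(?P<module>\d+)\s+(?P<num>\d+)\s+"
    r"(?P<chan>\d+,\s*\d+|n/a)\s+"
    r"(?P<mode>flow through|truncated|compact|n/a)\s+"
    r"(?P<status>.+?)\s*$"
)


def parse_switchmode(output: str) -> List[Dict]:
    """Parse 'show fabric channel switchmode' style text into dictionaries."""
    rows = []
    for line in output.splitlines():
        match = ROW.match(line)
        if match is None:
            continue  # prompt, column header, and separator lines
        chan = match.group("chan")
        if chan == "n/a":
            module_chan = fabric_chan = None
        else:
            module_chan, fabric_chan = (int(part) for part in chan.split(","))
        rows.append({
            "module": int(match.group("module")),
            "num_fab_chan": int(match.group("num")),
            "module_channel": module_chan,
            "fabric_channel": fabric_chan,
            "switch_mode": match.group("mode"),
            "status": match.group("status"),
        })
    return rows


# Abbreviated sample in the same layout as the examples in this paper.
SAMPLE = """Sup-A> (enable) show fabric channel switchmode
Module Num Fab Chan Fab Chan Switch Mode  Channel Status
------ ------------ -------- ------------ --------------
1      1            0, 0     flow through ok
2      1            0, 1     truncated    ok
3      0            n/a      n/a          n/a
5      18           2, 2     n/a          unknown
"""

if __name__ == "__main__":
    for row in parse_switchmode(SAMPLE):
        print(row)
```

Matching the switch mode against the documented values keeps the two-word mode "flow through" from being split across columns.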
Example 2: Classic and Fabric-Enabled System
The following output is from a configuration with dual supervisor engines, one SFM, one classic line card in slot 3, and two fabric-enabled line cards in slots 7 and 9.
Sup-A> (enable) show fabric channel switchmode
Module Num Fab Chan Fab Chan Switch Mode  Channel Status
------ ------------ -------- ------------ --------------
1      1            0, 0     flow through ok
2      1            0, 1     truncated    ok
3      0            n/a      n/a          n/a
5      18           0, 0     n/a          ok
5      18           1, 1     n/a          ok
5      18           2, 2     n/a          unknown
5      18           3, 3     n/a          unknown
5      18           4, 4     n/a          unknown
5      18           5, 5     n/a          unknown
5      18           6, 6     n/a          ok
5      18           7, 7     n/a          unknown
5      18           8, 8     n/a          ok
5      18           9, 9     n/a          unknown
5      18           10, 10   n/a          unknown
5      18           11, 11   n/a          unknown
5      18           12, 12   n/a          unknown
5      18           13, 13   n/a          unknown
5      18           14, 14   n/a          unknown
5      18           15, 15   n/a          unknown
5      18           16, 16   n/a          unknown
5      18           17, 17   n/a          unknown
7      1            0, 6     truncated    ok
9      1            0, 8     truncated    ok
15     0            n/a      n/a          n/a
16     0            n/a      n/a          n/a
Automatic switching mode changes allow the acceptance of classic or fabric-enabled cards into a system with no manual configuration change. As previously stated, there is a performance versus interoperability tradeoff when installing classic line cards into a fabric-enabled system. Because many network environments hold performance in higher regard, a fabric-enabled system can be configured to reject classic cards (for example, not support flow-through mode). When the set system crossbar-fallback none command is issued, the system does not start classic line cards installed in the chassis, and therefore runs in compact switching mode (30 Mpps) only.
Sup-A> (enable) set system crossbar-fallback none
The default for the crossbar-fallback is bus mode. To determine the current system state, the show system crossbar-fallback command is available.
Sup-A> (enable) show system crossbar-fallback
Cross-fallback: bus-mode
In summary, the SFM can be redundantly configured in a chassis to provide fabric-to-fabric and fabric-to-bus failover. A system configured with dual SFMs can use the standby SFM for failover. Additionally, single SFM systems with fabric-enabled line cards can fail over to the 32-Gbps bus for continuous operation. In both of these scenarios, recovery and return to normal operation occur in less than three seconds. This quick recovery time allows for a switching mode change and a synchronization process that must take place between each line card, supervisor engine, and the SFM fabric channels in these scenarios. The capability to configure redundant SFMs provides up to three levels of backplane redundancy, helping to enable continuous operations with minimal impact to network availability in the event of a hardware failure.

Redundant Supervisor Engines

As previously mentioned, the High Availability feature on the Cisco Catalyst 6500 Series provides low-impact, stateful switchover between redundant supervisor engines. This feature was first available in Cisco Catalyst OS Software Version 5.4.

Supervisor Engine Switchover

Dual supervisor engines provide hardware redundancy for the forwarding intelligence of the Cisco Catalyst 6500 Series. The Cisco Catalyst 6500 Series can support up to two supervisor engines in slots 1 and 2 only. One is the active supervisor engine and the other is the standby supervisor engine. The active supervisor engine is the first one to go online. This can be confirmed by the “Active” LED on the supervisor engine or by typing the show module command from the console. Both supervisor engines must be the same hardware model. This means that if a Policy Feature Card (PFC) and an MSFC are on a Supervisor Engine 1A in slot 1, then a PFC and an MSFC must also be on a Supervisor Engine 1A in slot 2, or if a Supervisor Engine 2 is in slot 1, a Supervisor Engine 2 must also be in slot 2. Supervisor Engines 1A and 2 can be used in the Cisco Catalyst 6000 and 6500 series. If an active supervisor is taken offline or fails, the standby supervisor takes control of the system.
The two supervisor engines in a redundant supervisor configuration have different responsibilities. The active supervisor engine is responsible for controlling the system bus and all line cards. All protocols are running on the active supervisor engine and it performs all packet forwarding. The standby supervisor engine does not communicate with the line cards. It receives packets from the network and populates its forwarding tables with this information but does not participate in any packet forwarding. The relevant protocols on the system are initialized, but not active, on the standby supervisor engine. The Cisco Catalyst 6500 Series supervisor engines are hot swappable and the standby supervisor engine can be installed in an active system without affecting network operation. Also important to note is that redundant supervisor engines do not perform load sharing. The active supervisor engine provides the entire packet forwarding intelligence for the system (N+1 redundancy). If the active supervisor engine fails, the standby supervisor engine can maintain the same system load.
The standby supervisor engine polls the active supervisor engine via the Ethernet out-of-band channel (EOBC) every 5–10 milliseconds to monitor the online status of the active supervisor engine. The active supervisor engine may go offline for a variety of reasons such as hardware failures, system overload conditions, memory corruption issues, removal from chassis, or being reset by the operator. The standby supervisor engine detects this type of failure and becomes the new active supervisor engine. The Cisco Catalyst OS software on the supervisor engine is responsible for restoring the protocols, line cards, and forwarding engines to normal operation. This restoration takes place via a fast switchover or a high-availability switchover.
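The polling behavior described above can be illustrated with a simple model of the standby supervisor engine's monitoring loop. The following Python sketch is not Catalyst OS code. In particular, this paper does not state how many missed polls trigger a switchover, so the missed_limit parameter is a purely hypothetical value chosen for illustration; only the 5–10 millisecond polling interval comes from the text.

```python
from typing import Iterable, Optional


def poll_index_of_takeover(responses: Iterable[bool], missed_limit: int = 3) -> Optional[int]:
    """Return the index of the poll at which the standby would take over.

    responses    -- True if the active supervisor answered that poll.
    missed_limit -- HYPOTHETICAL number of consecutive missed polls tolerated;
                    the real threshold is not given in this paper.
    Returns None if the active supervisor is never declared offline.
    """
    missed = 0
    for index, answered in enumerate(responses):
        # Any answer resets the counter; a miss extends the current run.
        missed = 0 if answered else missed + 1
        if missed >= missed_limit:
            return index
    return None


def worst_case_detection_ms(missed_limit: int = 3, poll_interval_ms: int = 10) -> int:
    """Upper bound on detection time under the assumptions above."""
    return missed_limit * poll_interval_ms


if __name__ == "__main__":
    print(poll_index_of_takeover([True, True, False, False, False]))
    print(worst_case_detection_ms())
```

Requiring consecutive misses, rather than a single miss, is a common way to avoid a switchover on one delayed response; whether Catalyst OS does this is not stated here.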

Supervisor Fast Switchover

Because the Cisco Catalyst OS High Availability feature is disabled by default, the alternative is referred to as Fast Switchover. The Fast Switchover feature is the predecessor to the High Availability feature and as such is the supervisor switchover mechanism in place when high availability is disabled or not supported in the software version. This feature reduces the switchover time by skipping some events that would typically take place should a supervisor fail. Specifically, the fast switchover mechanism allows each line card to skip the respective software downloads and a portion of the diagnostics, which are normally a part of system re-initialization. The switchover still includes restarting all protocols at Layer 2 and above as well as resetting all ports. The resulting switchover performance with default settings will take approximately 28 seconds plus the time it takes for the protocols to restart. As an example, a switch with the default time values for the Spanning-Tree Protocol took approximately 58 seconds after the fast switchover to begin forwarding traffic again. However, the time to begin forwarding traffic after a fast switchover can be reduced by tuning the switch from the default settings. By enabling Portfast, disabling port channels (PAgP), and turning trunking off for ports to which workstations are directly attached, the fast switchover time can be reduced to approximately 10 seconds. In a live network environment, these switchover times present a major disruption to network operations.
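The 58-second figure above is consistent with adding the default Spanning-Tree Protocol convergence delay to the fast switchover time. The following Python sketch works through that arithmetic. The breakdown into a listening phase and a learning phase of 15 seconds each uses the IEEE 802.1D default forward delay and is an assumption about how the measured figure arises, because this paper reports only the totals.

```python
# Figure from this paper: approximate fast switchover time with default settings.
FAST_SWITCHOVER_S = 28

# Assumption: IEEE 802.1D default forward delay, applied once for the
# listening state and once for the learning state before a port forwards.
STP_FORWARD_DELAY_S = 15
STP_PHASES = 2  # listening, then learning


def time_to_forwarding_s(switchover_s: int = FAST_SWITCHOVER_S,
                         forward_delay_s: int = STP_FORWARD_DELAY_S) -> int:
    """Estimate seconds until traffic is forwarded after a fast switchover."""
    stp_restart_s = STP_PHASES * forward_delay_s
    return switchover_s + stp_restart_s


if __name__ == "__main__":
    print(time_to_forwarding_s())  # 28 + 2 * 15 = 58
```

Under these assumptions the protocol restart accounts for roughly half of the outage, which is why tuning port settings has such a large effect on the fast switchover time.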

Supervisor High Availability Feature

The High Availability software feature of Cisco Catalyst OS further enhances the Cisco Catalyst 6500 Series hardware redundancy by also providing protocol redundancy. This feature includes stateful protocol redundancy and image versioning. The High Availability feature must be enabled via the CLI for these features to operate.
Sup-A> (enable) set system highavailability enable
System high availability enabled.
As a general practice with redundant supervisors, it is recommended that the High Availability feature be enabled for normal operation.

Supervisor Stateful Protocol Redundancy

With stateful supervisor switchover, the switchover time from the active to the standby supervisor is reduced to less than three seconds. This reduced downtime is achieved by synchronizing many of the Layer 2, Layer 3, and Layer 4 protocols (see footnote 1) between the active and standby supervisor engines, which is called maintaining protocol state.
For stateful protocol redundancy between dual supervisor engines, a protocol state database is maintained on each supervisor engine for all protocols and features requiring high-availability support. Most of these protocols are only running on the active supervisor engine. In the event of a high-availability switchover, the new active supervisor engine can start the protocols from the updated database state, rather than the initialization state. This is how a redundant system can maintain stateful protocol redundancy and minimal network downtime when the active supervisor engine goes offline.
• High Availability Supported Feature: High availability is fully supported. The state of the feature is preserved between the active and standby supervisor engines in the protocol database.
• High Availability Compatible Feature: High availability is not supported for these features. The protocol database for these features is not synchronized between supervisor engines. The feature can be used if the High Availability feature is enabled. For example, if GARP Multicast Registration Protocol (GMRP) and high availability were both enabled and a high-availability supervisor engine failover took place, GMRP would be restarted from the initialization state (non-stateful). The stateful protocol redundancy is still in place for the supported features if a compatible feature is enabled.
• High Availability Incompatible Feature: High availability is not supported. The protocol database for these features is not synchronized between supervisor engines. The feature should not be enabled if the High Availability feature is enabled. These features are not supported with high availability enabled because incorrect behavior may result.
Important: Do not use these features if a high-availability system is required.
1. Layer 4 protocols include the Layer 4 information in extended IP access lists.
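The three feature categories above determine what state a feature starts from after a switchover. The following Python sketch models that behavior in a simplified way. The enumeration, function name, and state strings are hypothetical and are used only to illustrate the rules stated in this section; they do not correspond to Catalyst OS internals.

```python
from enum import Enum


class HaCategory(Enum):
    SUPPORTED = "supported"        # state is synchronized to the standby
    COMPATIBLE = "compatible"      # may be enabled, but state is not synchronized
    INCOMPATIBLE = "incompatible"  # should not be enabled with high availability


INITIAL_STATE = "initialization"


def state_after_switchover(category: HaCategory, synced_state: str,
                           ha_enabled: bool = True) -> str:
    """Return the state a feature starts from on the new active supervisor."""
    if not ha_enabled:
        # Fast switchover: all protocols at Layer 2 and above are restarted.
        return INITIAL_STATE
    if category is HaCategory.SUPPORTED:
        # Resume from the protocol state database instead of starting over.
        return synced_state
    if category is HaCategory.COMPATIBLE:
        # For example, GMRP restarts from the initialization state.
        return INITIAL_STATE
    raise ValueError("incompatible features should not be enabled with high availability")


if __name__ == "__main__":
    print(state_after_switchover(HaCategory.SUPPORTED, "forwarding"))
    print(state_after_switchover(HaCategory.COMPATIBLE, "registered"))
```

Raising an error for the incompatible category reflects the guidance that such features should not be enabled at all when high availability is required, rather than modeling their behavior, which this paper describes only as potentially incorrect.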