Ethernet Switch Blade User's Guide, Release 3.2.2j
Legal Notices
The information in this document is subject to change without notice.
Hewlett-Packard makes no warranty of any kind with regard to this manual, including, but
not limited to, the implied warranties of merchantability and fitness for a particular
purpose. Hewlett-Packard shall not be held liable for errors contained herein or direct,
indirect, special, incidental or consequential damages in connection with the furnishing,
performance, or use of this material.
Restricted Rights Legend. Use, duplication or disclosure by the U.S. Government is subject
to restrictions as set forth in subparagraph (c) (1) (ii) of the Rights in Technical Data and
Computer Software clause at DFARS 252.227-7013 for DOD agencies, and subparagraphs
(c) (1) and (c) (2) of the Commercial Computer Software Restricted Rights clause at FAR
52.227-19 for other agencies.
Information in this document is provided in connection with Intel® products. No
license, express or implied, by estoppel or otherwise, to any intellectual property rights
is granted by this document. Except as provided in Intel’s Terms and Conditions of
Sale for such products, Intel assumes no liability whatsoever, and Intel disclaims any
express or implied warranty, relating to sale and/or use of Intel products including
liability or warranties relating to fitness for a particular purpose, merchantability, or
infringement of any patent, copyright or other intellectual property right. Intel
products are not intended for use in medical, life saving, or life sustaining applications.
Intel may make changes to specifications and product descriptions at any time, without
notice.
HEWLETT-PACKARD COMPANY
3000 Hanover Street
Palo Alto, California 94304
U.S.A.
Additional Copyright Notices. AdvancedTCA® is a registered trademark of the PCI
Industrial Computer Manufacturers Group. Linux® is a registered trademark of Linus
Torvalds. ZNYX Networks, RAIN, RAINlink, OpenArchitect®, CarrierClass and HotSwap
are trademarks or registered trademarks of ZNYX Networks in the United States and/or
other countries. All other marks, trademarks or service marks are the property of their
respective owners.
About the Ethernet Switch Blade Manual
This manual includes everything you need to begin using the HP Ethernet Switch Blade with
OpenArchitect software, Release 3.2.2j.
Table of Contents
Chapter 1 Overview of the Ethernet Switch Blade
High Performance Embedded Switching
Using the S50layer3 Script
Layer 3 Switch Using Multiple VLANs
Using the S50multivlan Script
To Modify the Layer 3 Multivlan Script
Modify the example script you copied into the /etc/rcZ.d directory. Adjust and
assign the number of IP addresses as applicable. In the example below, the IP
address is changed for the interface in the ifconfig command line of the script.
Editing the S50layer2 script can change the Ethernet Switch Blade Fabric Interface
default configuration. The S50layer2 script and included example scripts
(/etc/rcZ.d/examples) can be used as templates to create custom scripts. The default
S50layer2 script configures the switch accordingly:
The Ethernet Switch Blade is a 72-port AdvancedTCA® hub providing Gigabit Ethernet.
Up to 14 ATCA node boards may be addressed via the PICMG 3.0 Base Interface and via the
ATCA PICMG 3.1 fabric. The Base and Fabric switching domains are kept totally separate, both
on the physical layer and the software layer. The Ethernet Switch Blade provides a tightly
integrated modular switching platform that enables high-density solutions.
The Ethernet Switch Blade is actually two separate switches, one for the Base ports and one for
the fabric ports. There are two OpenArchitect® operating system images, one for each switch,
allowing the maximum in separation between the control signaling and the data. The modular
design provides great flexibility and control.
Ethernet Switch Blades can support a 10 Gigabit Ethernet Inter-Switch Link (ISL) for the Fabric
Interfaces, and a Gigabit Ethernet ISL for the Base Interface switches. Depending on the version
of OpenArchitect used, the ISL for the Fabric Interface switches may be operated at 10 Gigabits
per second and provide stacking features.
Linux-based OpenArchitect 3 runs on the embedded processors, providing a comprehensive
package for the management of Layer 2 and Layer 3 packet switching. VLAN management and
Layer 2-7 packet classification are also included with a user-friendly interface. OpenArchitect can
be used with a variety of IP routing protocols.
As part of AdvancedTCA, the switch incorporates the PICMG 3.0 Intelligent Platform
Management Interface (IPMI) standard for Field Replaceable Unit (FRU) management by the
Shelf Manager.
High Performance Embedded Switching
The Ethernet Switch Blade with OpenArchitect combines the performance of silicon-based
switching fabric with the flexibility of software-managed routing policies. It provides Base fabric
PICMG 3.0 (1 Gigabit Ethernet) links to each of the payload slots, plus two to four PICMG 3.1
in-band GigE ports to each node card, and GigE links to management ports and the second
switch. The Ethernet Switch Blade maintains the forwarding table on silicon, providing the
capability to switch and route at full line rate performance on every port.
AdvancedTCA® Compliant
The AdvancedTCA® standard developed by the PCI Industrial Computer Manufacturers Group
defines an embedded Ethernet environment for high availability chassis. This environment
includes two switch fabric slots that create a dual star Ethernet network to the 14 Base node slots.
Placing the Ethernet Switch Blade in a hub slot provides embedded Ethernet services to each
node card of the chassis. A standard HA configuration is one Ethernet Switch Blade placed in
each of the two hub slots in a chassis for creation of a redundant, high availability system.
The OpenArchitect software component – open source Linux, IP protocol stack, control
applications and the OA Engine – runs on two embedded PowerPC microprocessors.
OpenArchitect provides extensive managed IP routing protocols and other open standards for
switch management. Examples include network services; Virtual Router Redundancy Protocol;
Routing Information Protocol; Open Shortest Path First; Border Gateway Protocol; Quality of
Service and Class of Service; access control lists; Simple Network Management Protocol MIBs;
Common Open Policy Service; and web.
Extensible Customization of Routing Policies
The OpenArchitect software environment enables rapid porting of other UNIX/Linux-based
protocols, including open source software conforming to RFCs and other standards. It also
enables the development of application-specific protocol configuration scripts.
Powerful CarrierClass Features
The Ethernet Switch Blade has High Availability hardware features for advanced
telecommunication applications. The switch implements PICMG 3.0 full hot swap support.
This feature provides field-replaceable capability, so a failed switch can be replaced without
impacting the operational performance of a chassis.
The PICMG 3.0 Intelligent Platform Management Interface (IPMI) standard is also supported.
IPMI uses message-based interfaces that monitor the physical health characteristics of the
Ethernet Switch Blade. The switch provides operational status information to an IPMI
management application. End customers benefit from advance notice of potential problems.
The Ethernet Switch Blade also implements the Media Dependent Interface feature called Auto MDI-X.
Auto MDI-X allows connections to any device (switches, hubs, or systems) using a regular
straight-through or crossover Cat 5 cable. The RJ-45 port will auto detect and switch MDI/MDI-X modes.
This IEEE standard makes cabling, especially between switches, faster and less error
prone.
E-Keying is supported by the Ethernet Switch Blade.
Ethernet Port Layout
The Ethernet Switch Blade has a total of 72 switched Gigabit Ethernet ports. The base fabric is
connected via 24 Gigabit Ethernet ports and the data fabric is connected via 48 Gigabit Ethernet
ports. The Ethernet Switch Blade is actually composed of two separate switches, one for Base
port activity and another for fabric port activity. The Base ports (control and signaling) are
switched on the Base switch, and the fabric ports (data) are switched on the fabric switch, which
provides total separation between system management or control packets, and customer data
packets.
You will find the Ethernet Switch Blade has a straightforward installation and configuration.
UNIX or Linux system management skills and some understanding of network protocols will be
required. Configure the Ethernet Switch Blades to your networking application before you
begin using the OpenArchitect switch.
OpenArchitect Switch Environment
The key elements of the OpenArchitect environment include two embedded Linux operating
systems, OpenArchitect-specific applications and libraries, plus an innovative switch hardware
design.
OpenArchitect hardware is in many ways similar to typical switch architectures. The primary
difference in OpenArchitect is that the PCI bus that interfaces with the embedded processor and
the switch fabric is at a higher performance level than a typical switch (see Figure 1.1: Fabric
Switch Elements). The use of PCI creates a pipe of significant bandwidth between the processor
and the switch fabric.
The embedded processors, running Linux and the OpenArchitect processes, control the flow of all
traffic by maintaining the switch forwarding tables. These tables define the flow of the switch
traffic. Because they are on the switching chips, packets proceed at line rate.
OpenArchitect Software Structure
Figure 1.1: Fabric Switch Elements
OpenArchitect is based on an embedded Linux operating system and includes a number of ZNYX
Networks-supplied modules. The key element is the Linux routing table, which is crucial in a
The purpose of the routing table is to tell the packet forwarding software where to forward the
data packets. In Linux, the packet-forwarding algorithm is operated in software. Normally, the
routing tables are maintained by operator configuration and the various routing protocols that run
in the application environment of Linux.
OpenArchitect uses an innovative approach for forwarding packets. It provides embedded
software daemons that replicate (shadow) the Linux routing tables in the silicon-based
forwarding tables (see Figure 1.1: Fabric Switch Elements). In the OpenArchitect switching
environment, the switching chips do the real-time work in switching network packets. The switch
fabric consults its own forwarding tables for each incoming packet and either filters the packet or
forwards it to any egress port, the embedded CPU, or any combination of these. The Linux routing
tables, maintained in software, are used to update the silicon-based tables. This provides both the
flexibility and control of the Linux software environment and the speed of dedicated switching
silicon.
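The shadowing idea can be illustrated with a small Python model. This is only a sketch under stated assumptions: the class and method names are hypothetical, the "silicon" table is an ordinary dictionary, and sending unmatched packets to the CPU is an illustrative choice rather than documented OpenArchitect behavior.

```python
import ipaddress


class ShadowedForwardingTable:
    """Toy model: a software routing table mirrored into a 'silicon' table."""

    def __init__(self):
        self.linux_routes = {}   # prefix -> egress port (software view)
        self.silicon_table = {}  # copy consulted for every packet

    def add_route(self, prefix, port):
        self.linux_routes[ipaddress.ip_network(prefix)] = port
        self._shadow()

    def del_route(self, prefix):
        self.linux_routes.pop(ipaddress.ip_network(prefix), None)
        self._shadow()

    def _shadow(self):
        # Stand-in for the daemon that replicates routes into the switch chips.
        self.silicon_table = dict(self.linux_routes)

    def forward(self, dst):
        """Return the egress port for dst using longest-prefix match."""
        addr = ipaddress.ip_address(dst)
        matches = [net for net in self.silicon_table if addr in net]
        if not matches:
            return "cpu"  # assumption: unmatched packets go to the embedded CPU
        best = max(matches, key=lambda net: net.prefixlen)
        return self.silicon_table[best]


table = ShadowedForwardingTable()
table.add_route("10.0.0.0/8", 1)
table.add_route("10.1.0.0/16", 2)
print(table.forward("10.1.2.3"))  # the more specific /16 route wins
```

Every route change goes through the software table first and is then copied, so the lookup path never touches the software table directly.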
The OpenArchitect environment includes additional features. For example, installing the
OpenArchitect switch gives you immediate implementation of Linux routing protocols. Also, you
have complete support of routing table updates and a standardized method for configuration.
Finally, you can quickly integrate bug fixes, protocol enhancements and additional protocol
implementations from the Linux community. You can also integrate OpenArchitect into other
Linux applications including VPN software, voice over IP protocols, Quality of Service, and
HTML configuration.
RAIN Management API (RMAPI) is a generic interface for passing control data. The
OpenArchitect libraries are implemented completely above RMAPI. The libraries provide a front end
to RMAPI to simplify application writing. Currently one library is implemented, a general
library called zlxlib. As the OpenArchitect application requirements grow, the existing library
will be expanded and additional libraries will be created.
OpenArchitect applications are used to program and configure the Ethernet Switch Blade. These
applications are implemented above the libraries and RMAPI.
The PICMG 3.1 standard defines an embedded Ethernet environment for Telco chassis. This
environment includes two switch fabric slots that create a dual star Ethernet network to the
fourteen node slots. Placing the Ethernet Switch Blade in a hub slot provides embedded Ethernet
services to each node card across the Packet Switching Backplane of the chassis. A standard
configuration is to place an Ethernet Switch Blade in each hub slot, creating a redundant, high
availability system. This chapter provides information on the Ethernet Switch Blade port
connectors and LED indicators.
Connecting the Cables
Your switch setup may require some or all of the following types of cables:
10/100/1000 Port Cabling
Category 5 cabling is required for all external ports. Be sure that your cable length is within the
minimum and maximum length restrictions for Ethernet; otherwise you could experience
signal or data loss. All copper GigE ports on the Ethernet Switch Blade are auto-MDI sensing
and will automatically determine whether an MDI (straight-through) or MDI-X (crossover)
cable is attached.
Console Port Cabling
The switch console can be accessed via one RJ-45 10/100 service port located on the front panel
of the Ethernet Switch Blade.
NOTE: There are two switch portions that make up an Ethernet Switch Blade unit. Each
switch portion, Base and fabric, has its own console ports, and requires its own console
cable or OOB Ethernet cable.
The RS-232 configured RJ-45 connector console port on the front panel can be used to recover
from a system failure. It is used for maintenance only, and is generally not connected. Use an HP
console cable (P/N A6900-63006) provided with the HP bh5700 ATCA 14-Slot Blade Server, in
combination with a Modem Eliminator cable, to access the switch software through the console
port. Refer to the HP bh5700 ATCA 14-Slot Blade Server Installation Guide for additional
information.
Connecting to the Console Port
To attach the console cable to the OpenArchitect Base or fabric switch:
1. Plug the RJ-45 end of the console cable into the RJ-45 Console Port on the front.
2. Connect the Modem Eliminator cable to the DB-9 connector on the console cable.
3. Connect the other end of the Modem Eliminator cable to a standard COM port (9600, n,
8, 1).
4. Reinsert the switch into the shelf chassis and power up.
Use a terminal emulation program to access the switch console.
Out of Band Ports (OOB Ports)
Each switch, fabric and Base, in an Ethernet Switch Blade unit has out-of-band (OOB) Ethernet
ports on the front panel. An OOB port is an alternative maintenance port that supplies Ethernet
connectivity instead of serial connectivity and is connected only when performing switch maintenance
activities. Use ifconfig to bring up and configure the OOB ports. The OOB ports run at 100 Mbps full
duplex and are not auto-sensing. The front OOB port is eth0, and the rear port (not implemented in this
release) is eth1.
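As a rough illustration of bringing up the front OOB port, the Python sketch below composes a conventional ifconfig command line for eth0. The helper name, the address, and the netmask are hypothetical examples; substitute values that fit your maintenance network. The sketch only builds and prints the command; it does not run it.

```python
import ipaddress


def oob_ifconfig_command(address, netmask, interface="eth0"):
    """Return an ifconfig argument list that brings up an OOB port."""
    # Validate the address and mask together; raises ValueError if malformed.
    ipaddress.ip_interface(f"{address}/{netmask}")
    return ["ifconfig", interface, address, "netmask", netmask, "up"]


# Hypothetical maintenance address for the front OOB port (eth0).
cmd = oob_ifconfig_command("192.168.10.5", "255.255.255.0")
print(" ".join(cmd))
```

Validating the address before building the command catches typing mistakes before they reach the switch console.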
LED Reference
See Figure 2.1 for a schematic view of the front of a typical Ethernet Switch Blade board. Note
that there are out-of-band ports, RS232 ports, a USB port, and 10 Gig egress ports (not
implemented in this release). In-band ports from the Base and fabric switches have LED status
lights controlled from the LED Mode button. Press the button successively to display the Base
switch ports, fabric switch ports 0-23, and finally the fabric switch ports 24-47. There are
separate LEDs for the out-of-band ports, and the ATCA status functions.
High availability networking is achieved by eliminating any single point of failure through
redundant connectivity: redundant cables, switches and network interfaces for hardware,
combined with HA software solutions on both the hosts and switches to control the HA hardware
and maintain connectivity. An HA solution called Surviving Partner is provided on the switch.
For host-side HA, the most common solution is to use the Linux bonding driver. HA solutions
like the Linux bonding driver present a single, virtual interface to the protocol stack while
managing multiple physical links. Figure 3.1: Host HA Architecture shows the relation of the
protocol stack, a bonding driver and physical ports.
Figure 3.1: Host HA Architecture
A failover between physical links can be made very quickly without requiring a change to the IP or
MAC address of the virtual interface, making it effectively transparent from the application's point of view.
With redundant links from a switch (or switches) to the host, one link is maintained as the
ACTIVE link and the other as STANDBY. If the ACTIVE link were to go down, the STANDBY
becomes the new ACTIVE, while presenting the same virtual interface to the host.
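The ACTIVE/STANDBY behavior can be modeled in a few lines of Python. This is a simplified sketch, not the Linux bonding driver: the class name and link names are hypothetical, and the choice to stay on the current link when the old ACTIVE link recovers is an assumption made for the example.

```python
class ActiveBackupBond:
    """Toy model of one virtual interface over several physical links."""

    def __init__(self, links):
        self.link_up = {name: True for name in links}
        self.active = links[0]  # first link starts as ACTIVE

    def set_link(self, name, up):
        """Record a link state change and fail over if ACTIVE was lost."""
        self.link_up[name] = up
        if self.active is None or not self.link_up[self.active]:
            self._fail_over()

    def _fail_over(self):
        # Promote the first link that is still up; None if all are down.
        for name, up in self.link_up.items():
            if up:
                self.active = name
                return
        self.active = None


bond = ActiveBackupBond(["eth0", "eth1"])
bond.set_link("eth0", False)  # ACTIVE link fails
print(bond.active)            # the STANDBY link has taken over
```

The caller always talks to the bond object, so the switch from one physical link to the other is invisible above it.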
NOTE: It is important that the bonding solution provide an active-backup mode. For the
Linux bonding driver, set mode=1. See the documentation at http://sourceforge.net/projects/bonding/
for more information. Use the recommendations for Linux kernel 2.4.x, not
2.6.x.
Redundant connections provide an ACTIVE and STANDBY link to a switch, or provide
redundant links between more than one switch. In the case of more than one switch, a complete
HA solution requires a switch-based HA solution.
Surviving Partner
Surviving Partner is a switch-based HA solution. Surviving Partner runs on the switches to
provide transition of Layer 2 and Layer 3 switching functionality between two or more switches.
Surviving Partner is comprised of many interactive protocols and processes including VRRP,
zlmd, zlc, and others.
Since most end nodes use default router addresses, a change of the default router address during
a switch failover would require the end nodes to reconfigure. Layer 3 switches that fail over must
preserve the default router address so that the failover is transparent to the end nodes. The Virtual
Router Redundancy Protocol (VRRP, RFC 2338) running in the Surviving Partner switches
provides transparent movement of the default router address. VRRP maintains the notion of a
Master switch and one or more Backup switches. This group of switches presents a virtual router
IP address that hosts on that network can use as their default route.
If a Backup switch determines the Master switch is no longer available, one of the Backup
switches will assume the role of Master. Physically, each switch maintains a link to the local
network. Only the Master switch answers for the default gateway address, so the hosts on that
network have no need to relearn the router address.
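The Master/Backup selection can be sketched as follows. This is a simplified model rather than the vrrpd implementation: it ignores advertisement timers and assumes the usual VRRP convention that the available router with the highest priority becomes Master, with the higher IP address breaking ties. The class name, switch names, priorities, and addresses are hypothetical.

```python
import ipaddress
from dataclasses import dataclass


@dataclass
class Router:
    name: str
    priority: int
    address: str
    alive: bool = True


def elect_master(routers):
    """Pick the Master among available routers, or None if none are up."""
    candidates = [r for r in routers if r.alive]
    if not candidates:
        return None
    # Highest priority wins; the higher IP address breaks a tie.
    return max(
        candidates,
        key=lambda r: (r.priority, ipaddress.ip_address(r.address)),
    )


switch_a = Router("switch-a", 200, "10.0.0.2")
switch_b = Router("switch-b", 100, "10.0.0.3")
print(elect_master([switch_a, switch_b]).name)  # switch-a is Master

switch_a.alive = False                          # Master disappears
print(elect_master([switch_a, switch_b]).name)  # switch-b takes over
```

The virtual router address itself never changes in this model; only the switch that answers for it does.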
In an HA configuration, the goal is to avoid any single point of failure. VRRP provides a good
mechanism to provide a static route for a local network, but a true HA configuration must also
provide redundant connections for the host. Providing a virtual router for the local network is not
enough. Take the simple case of two hosts on the local network with a connection to the virtual
router. Each host needs a connection to each physical switch participating in VRRP. In the
simplest configuration, each host would have one connection to the network. An HA solution
would include redundant connections from each host to each switch in the virtual router.
Combining the features of Surviving Partner on the switches and HA bonding drivers on the hosts
allows implementation of this true HA configuration.
zlmd
In addition to complete switch failover, single link failure must be properly handled. The Link
Monitor Daemon, zlmd, monitors the link status of each port. If a link goes down, zlmd
communicates with the VRRP daemon (vrrpd) to change its priority. Changing the VRRP
priority results in movement of switching functionality. By combining zlmd with the zlc
application, links connected to hosts that have not failed can be deterministically moved to the
new master switch if desired. Supported modes include:
•switch - The switch with the greatest number of UP links becomes the Master for all
VLANs under HA management.
•Vlan - The switch with the greatest number of UP links in that particular VLAN becomes
the Master for that particular VLAN. If the switch has additional VLANs, they each
change independently.
•Port - The Master will remain the Master for that particular VLAN until all ports in that
VLAN are down. The Backup then becomes the new Master for that VLAN. Failed links
move their connectivity through the Backup Switch and the switch interconnect to reach
the Master Switch. This option alleviates the need to move all nodes to a new switch just
because a single link goes down.
NOTE: All modes require inclusion of the interconnect in the VLAN. The ISL
connection between the two Base switches is port 23 for the Ethernet Switch Blade. The
ISL connection between the two fabric slots is port 51.
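The three modes can be compared with a short Python sketch. It is an illustration only, not zlmd itself: the function names, the lowercase mode strings, the data layout (UP link counts per switch and VLAN), and the rule that the current Master keeps its role on a tie are all assumptions made for the example.

```python
def _pick(scores, incumbent):
    """Return the switch with the most UP links; the incumbent wins ties."""
    top = max(scores.values())
    if scores.get(incumbent) == top:
        return incumbent
    return min(sw for sw, count in scores.items() if count == top)


def elect_masters(mode, up_links, current):
    """up_links: {switch: {vlan: up_count}}; current: {vlan: master switch}."""
    vlans = list(current)
    if mode == "switch":
        # One Master for all VLANs, chosen on total UP links.
        totals = {
            sw: sum(per_vlan.get(v, 0) for v in vlans)
            for sw, per_vlan in up_links.items()
        }
        master = _pick(totals, current[vlans[0]])
        return {v: master for v in vlans}
    result = {}
    for v in vlans:
        scores = {sw: per_vlan.get(v, 0) for sw, per_vlan in up_links.items()}
        if mode == "vlan":
            # Each VLAN follows the switch with the most UP links in it.
            result[v] = _pick(scores, current[v])
        elif mode == "port":
            # The Master keeps the VLAN until all its ports there are down.
            if scores.get(current[v], 0) > 0:
                result[v] = current[v]
            else:
                result[v] = _pick(scores, current[v])
        else:
            raise ValueError(f"unknown mode: {mode}")
    return result


up = {"A": {"v10": 3, "v20": 1}, "B": {"v10": 2, "v20": 4}}
now = {"v10": "A", "v20": "A"}
for mode in ("switch", "vlan", "port"):
    print(mode, elect_masters(mode, up, now))
```

With the sample counts, switch mode moves everything to B, vlan mode moves only v20, and port mode leaves both VLANs on A because A still has at least one UP link in each.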
When a switch fails, it must be replaced. The replacement switch will likely require proper
configuration. For transparent switch replacement, the newly replaced switch must learn its
configuration from its Surviving Partner.
In a simple failover scenario, Host A and Host B are configured with failover between two host
ports, one port connected to Switch A and the other connected to Switch B. Assume Switch A
provides connectivity between Host A and Host B. If Switch A fails, the active link on each host
moves over to the port connected to Switch B. Surviving Partner software on Switch B
recognizes that Switch A has failed, and assumes the role of switching traffic between Host A and
Host B. When the failed Switch A is replaced with a new Switch A', Switch A' will learn its
network configuration from the surviving partner Switch B. Switch A' is now ready as a backup
to Switch B in case of failure of Switch B.
This is achieved through the use of DHCP. When a switch becomes a VRRP Master, a DHCP
server is started with a pointer to a configuration file that contains configuration information for
its partners. The replacement switch comes up running a DHCP client to retrieve its configuration.
Proper configuration of Surviving Partner requires coordinated configuration of many different
processes, including vrrpd, zlmd, zlc, and dhcpd. The daemon processes run scripts to
perform their actions. Because these scripts are complex and inter-dependent, a configuration
application called zspconfig is used to build them.
The basic steps to configuring Surviving Partner are:
1. Determine your desired configuration.
2. Modify the configuration file (the default) to use as input to the configuration utility
(zspconfig).
3. Configure startup scripts or other scripts such as gated routing scripts and vrrp
configuration scripts.
4. Run zspconfig.
5. Run zspconfig -u
zspconfig
zspconfig performs the job of building the scripts based on a provided input file locally, or
from a remote machine. A text-based configuration file provides input to zspconfig. Example
configuration files are included on the switch in /etc/rcZ.d/surviving_partner. The
result of zspconfig is to create several configuration files and runtime shell scripts, and
optionally start the Surviving Partner processes. Scripts are generated for configuring VLANs,
starting the network, and starting the vrrpd and zlmd daemons.
zspconfig can also be used by sibling backup switches to retrieve configuration from the
Surviving Partner and start the vrrpd and zlmd daemons. zspconfig is generally only run
once to configure Surviving Partner.
The configuration and runtime scripts created are as follows:
•S70Surviving_partner - Switch initialization script that is run at boot time. This
script will restart the switch with the original configuration given to zspconfig.
Optionally, zspconfig will run this script from the initial invocation.
•zsp.conf.<n> - zspconfig configuration file that contains the configuration of
the sibling backup switches. The <n> is used to distinguish potentially more than one
backup switch. This configuration file is placed in /tftpboot, and is retrieved via
DHCP during configuration of the backup switch by zspconfig with the “-u” option
or, by a replacement switch on boot up.
•vrrpd.conf - Configuration script for the VRRP daemon. This configuration is
used when the S70Surviving_partner script launches vrrpd. There is a line in this file for
each virtual router address vrrpd will manage.
•dhcpd.conf - Configuration script used by dhcpd when the switch becomes
master. dhcpd is also used to give replacement switches their configuration scripts,
namely a zsp.conf.<n> file that can be input to zspconfig with the -u flag.
•dhclient.conf - If zspconfig is executed with the -u flag, a dhclient.conf file
is created, and then dhclient is used to retrieve a zspconfig configuration file from the
/tftpboot area of the Master switch.
•vrrpd.script - Runtime script that executes each time the vrrpd changes state.
This script starts and stops dhcpd, and toggles down RAINlink ports to force the
RAINlink nodes to a new Master switch.
•zlmd.script - Runtime script executed by zlmd when a link goes up or down. This
script modifies the priority of the vrrpd that in turn may cause the VRRP Master to move
from one sibling switch to another.
After the scripts are created, zspconfig may run the S70Surviving_partner script to start
the Surviving Partner tasks. The tasks started are vrrpd, zlmd, and dhcpd.
The vrrpd and zlmd daemons run scripts to perform their actions. When vrrpd changes state
between Master and Backup, it runs a script that starts and stops dhcpd. When zlmd sees a link
go up or down, it runs a script that communicates with vrrpd via vrrpconfig.
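The interplay between the two runtime scripts can be pictured with a small event model in Python. This sketch only mirrors the behavior described above; the class name, the starting priority, and the per-link priority step are hypothetical values, and the real scripts are generated by zspconfig.

```python
class SurvivingPartnerModel:
    """Toy model of the actions taken by vrrpd.script and zlmd.script."""

    def __init__(self, base_priority=100, step=10):
        self.priority = base_priority  # hypothetical VRRP priority
        self.step = step               # hypothetical change per link event
        self.state = "BACKUP"
        self.dhcpd_running = False

    def on_vrrp_state(self, new_state):
        # Stand-in for vrrpd.script: dhcpd runs only on the Master.
        self.state = new_state
        self.dhcpd_running = new_state == "MASTER"

    def on_link_change(self, up):
        # Stand-in for zlmd.script: adjust the priority reported to vrrpd.
        self.priority += self.step if up else -self.step


switch = SurvivingPartnerModel()
switch.on_vrrp_state("MASTER")
switch.on_link_change(up=False)  # a monitored link goes down
print(switch.state, switch.dhcpd_running, switch.priority)
```

A lower priority does not change the state by itself in this model; it is the input that may cause the VRRP Master role to move to a sibling switch.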
Example HA Switch Configuration
The following walks through a basic Surviving Partner configuration typical for an HA setup.
Assume an HA chassis with multiple hosts, such as single-board CPUs, and two switches
configured for Surviving Partner. Each of the hosts has two Base Ethernet ports providing a link
to each of the Base switches and up to four fabric Ethernet ports providing links to each of the
Fabric switches.
Each host runs Linux bonding drivers (or ZNYX OA Node software with embedded RAINlink)
with the ports configured for failover. An interlink provides communication between the Base
switches. Another interlink provides communication between the Fabric switches.