HP bh5700 User Manual

HP bh5700 ATCA 14-Slot Blade Server
Ethernet Switch Blade
First Edition
Manufacturing Part Number: AD171-9603A
June 2006
Ethernet Switch Blade User's Guide release 3.2.2j page ii
Legal Notices
Hewlett-Packard makes no warranty of any kind with regard to this manual, including, but not limited to, the implied warranties of merchantability and fitness for a particular purpose. Hewlett-Packard shall not be held liable for errors contained herein or direct, indirect, special, incidental or consequential damages in connection with the furnishing, performance, or use of this material.
Restricted Rights Legend. Use, duplication or disclosure by the U.S. Government is subject to restrictions as set forth in subparagraph (c) (1) (ii) of the Rights in Technical Data and Computer Software clause at DFARS 252.227-7013 for DOD agencies, and subparagraphs (c) (1) and (c) (2) of the Commercial Computer Software Restricted Rights clause at FAR 52.227-19 for other agencies.
Information in this document is provided in connection with Intel® products. No license, express or implied, by estoppel or otherwise, to any intellectual property rights is granted by this document. Except as provided in Intel’s Terms and Conditions of Sale for such products, Intel assumes no liability whatsoever, and Intel disclaims any express or implied warranty, relating to sale and/or use of Intel products including liability or warranties relating to fitness for a particular purpose, merchantability, or infringement of any patent, copyright or other intellectual property right. Intel products are not intended for use in medical, life saving, or life sustaining applications. Intel may make changes to specifications and product descriptions at any time, without notice.
HEWLETT-PACKARD COMPANY 3000 Hanover Street Palo Alto, California 94304 U.S.A.
Copyright Notice. Copyright ©2003 Hewlett-Packard Development Company, L.P. Reproduction, adaptation, or translation of this document without prior written permission is prohibited, except as allowed under the copyright laws.
Additional Copyright Notices. AdvancedTCA® is a registered trademark of the PCI Industrial Computer Manufacturers Group. Linux® is a registered trademark of Linus Torvalds. ZNYX Networks, RAIN, RAINlink, OpenArchitect®, CarrierClass and HotSwap are trademarks or registered trademarks of ZNYX Networks in the United States and/or other countries. All other marks, trademarks or service marks are the property of their respective owners.
About the Ethernet Switch Blade Manual
This manual includes everything you need to begin using the HP Ethernet Switch Blade with OpenArchitect software, Release 3.2.2j.
Table of Contents
Chapter 1 Overview of the Ethernet Switch Blade ...........................................................17
High Performance Embedded Switching...................................................................... 17
AdvancedTCA® Compliant.........................................................................................17
OpenArchitect Switch Management............................................................................. 18
Extensible Customization of Routing Policies..............................................................18
Powerful CarrierClass Features.....................................................................................18
Ethernet Port Layout..................................................................................................... 18
Ethernet Switch Blade Port Configuration..................................................................19
Base Switch Quick Reference.............................................................................. 19
Fabric Switch Quick Reference........................................................................... 19
OpenArchitect Switch Environment............................................................................. 20
OpenArchitect Software Structure................................................................................ 20
Chapter 2 Port Cabling and LED Indicators...................................................................... 23
Connecting the Cables...................................................................................................23
Console Port Cabling................................................................................................23
Connecting to the Console Port................................................................................23
Out of Band Ports (OOB Ports).................................................................................... 24
LED Reference......................................................................................................... 24
Chapter 3 High Availability Networking...........................................................................27
Surviving Partner.......................................................................................................... 27
VRRP........................................................................................................................28
zlmd.......................................................................................................................... 28
Switch Replacement and Reconfiguration............................................................... 29
zspconfig...................................................................................................................29
Example HA Switch Configuration..........................................................................30
Modifying zsp.conf on the Base Switch...............................................................31
Modifying zsp_vlan.conf on the Fabric Switch...................................................35
Configuring Surviving Partner......................................................................................42
Central Authority.......................................................................................................... 43
Chapter 4 Fabric Switch Configuration ............................................................................ 46
Two switches, two consoles..........................................................................................46
Connecting to the Fabric Switch Console.....................................................................46
OpenArchitect Configuration Procedure.......................................................................46
Changing the Shell Prompt........................................................................................... 47
Default Configuration Scripts...................................................................................47
Example Configuration Scripts................................................................................ 47
Overview of OpenArchitect VLAN Interfaces.........................................................48
Tagging and Untagging VLANs..........................................................................48
Switch Port Interfaces..........................................................................................49
Layer 2 Switch Configuration.......................................................................................49
Using the S50layer2 Script.................................................................................. 50
Rapid Spanning Tree................................................................................................ 50
To Enable Rapid Spanning Tree:.........................................................................51
Port Path Cost...................................................................................................... 51
Layer 3 Switch Configuration............................................................................. 52
Using the S50layer3 Script.................................................................................. 52
Layer 3 Routing Protocols with GateD ........................................................................54
Using the S55gatedRip1 Script................................................................................ 54
To Modify the GateD Scripts: ................................................................................. 56
Class of Service (COS) ................................................................................................ 57
Egress Queues.......................................................................................................... 57
Ingress Classification................................................................................................57
Marking and Re-marking......................................................................................... 58
Scheduling................................................................................................................ 58
ztmd Explained.................................................................................................... 58
zfilterd Explained..................................................................................................... 58
Running zfilterd........................................................................................................58
Restrictions on Implementation................................................................................59
Conflict Resolution..............................................................................................59
iptables and filtering............................................................................................ 60
Introduction..........................................................................................................60
Packet Walk......................................................................................................... 61
Filter Rules Specifications...................................................................................62
Specifying Source and Destination IP Addresses.................................................... 62
Specifying Protocol............................................................................................. 62
Specifying an ICMP Message Type.................................................................... 62
Specifying TCP or UDP ports............................................................................. 63
Specifying TCP flags...........................................................................................63
Specifying an Interface........................................................................................ 63
Filter Rule Targets............................................................................................... 63
Supported Targets................................................................................................63
Classical Targets..................................................................................................63
ZNYX Targets..................................................................................................... 63
ZACTION Examples........................................................................................... 64
Extensions to the default matches........................................................................64
tc and zqosd ............................................................................................................. 65
FIFO Queues (pfifo and bfifo disciplines)...........................................................65
PRIO and WRR queues....................................................................................... 67
The U32 Filter.......................................................................................................... 69
Combining Queuing Disciplines.............................................................................. 69
Handle Semantics................................................................................................ 70
COPS: Common Open Policy Service..........................................................................70
Protocol Architecture................................................................................................71
OpenArchitect PEP...................................................................................................71
Using pepd................................................................................................................72
Chapter 5 Fabric Switch Administration........................................................................... 73
Setting the Root Password............................................................................................ 73
Adding Additional Users...............................................................................................73
Setting up a Default Route............................................................................................ 74
Name Service Resolution..............................................................................................74
DHCP Client Configuration..........................................................................................74
DHCP Server Configuration......................................................................................... 74
Network Time Protocol (NTP) Client Configuration................................................... 75
Network File System (NFS) Client Configuration........................................................75
NFS Server Configuration.............................................................................................76
Connecting to the Switch Using FTP............................................................................77
ftpd Server Configuration............................................................................................. 77
Connecting to the Switch Using TFTP......................................................................... 77
TFTPD Server Configuration........................................................................................77
SNMP Agent................................................................................................................. 78
Supported MIBS.......................................................................................................78
Supported Traps........................................................................................................79
SNMP and OpenArchitect Interface Definitions......................................................80
ifStackTable Entries.............................................................................................81
SNMP Configuration................................................................................................81
SNMP Applications..................................................................................................82
Port Mirroring............................................................................................................... 82
Link and LED Control.................................................................................................. 83
Link Event Monitoring..................................................................................................83
Chapter 6 Fabric Switch Maintenance...............................................................................84
Overview of the OpenArchitect switch boot process....................................................84
Saving Changes.............................................................................................................86
Modifying Files and Updating the Switch.................................................................... 86
Recovering from a System Failure................................................................................86
System Boots with a Console Cable............................................................................. 86
Booting with the –i option.............................................................................................87
System Hangs During Boot...........................................................................................88
Booting the Duplicate Flash Image...............................................................................88
Upgrading the OpenArchitect Image............................................................................ 88
Upgrading or Adding Files............................................................................................89
Excluding Saving Files to Flash............................................................................... 89
Upgrading the Switch Driver........................................................................................ 89
Using apt-get................................................................................................................. 90
Chapter 7 Base Switch Configuration................................................................................91
Two switches, two consoles..........................................................................................91
Connecting to the Base Switch Console....................................................................... 91
OpenArchitect Configuration Procedure..................................................................91
Changing the Shell Prompt.......................................................................................92
Default Configuration Scripts..............................................................................92
Example Configuration Scripts............................................................................92
Overview of OpenArchitect VLAN Interfaces....................................................93
Tagging and Untagging VLANs..........................................................................94
Switch Port Interfaces..........................................................................................94
Layer 2 Switch Configuration.................................................................................. 94
Using the S50layer2 Script.................................................................................. 96
Rapid Spanning Tree................................................................................................ 96
To Enable Rapid Spanning Tree:.........................................................................96
Port Path Cost...................................................................................................... 97
Layer 3 Switch Configuration.................................................................................. 97
Using the S50layer3 Script.................................................................................. 98
Layer 3 Switch Using Multiple VLANs............................................................100
Using the S50multivlan Script...........................................................................100
To Modify the Layer 3 Multivlan Script ......................................................... 102
Modify the example script you copied into the /etc/rcZ.d directory. Adjust and assign the number of IP addresses as applicable. In the example below, the IP address is changed for the interface in the ifconfig command line of the script. ........................................................................................................................... 102
Layer 3 Routing Protocols with GateD ................................................................. 102
Using the Provided S55gatedRip1 Script.......................................................... 102
To Modify the GateD Scripts: .......................................................................... 104
Class of Service (COS) ..........................................................................................105
Egress Queues....................................................................................................105
Ingress Classification.........................................................................................105
Marking and Re-marking...................................................................................106
Scheduling......................................................................................................... 106
zcos......................................................................................................................... 106
zfilterd.....................................................................................................................106
ztmd................................................................................................................... 106
Running zfilterd................................................................................................. 107
Restrictions on Implementation.........................................................................107
Conflict Resolution............................................................................................107
iptables and filtering............................................................................................... 108
Introduction........................................................................................................109
Packet Walk....................................................................................................... 110
Filter Rules Specifications.................................................................................110
Specifying Source and Destination IP Addresses..............................................110
Specifying Protocol........................................................................................... 110
Specifying an ICMP Message Type.................................................................. 110
Specifying TCP or UDP ports........................................................................... 111
Specifying TCP flags.........................................................................................111
Specifying an Interface...................................................................................... 111
Filter Rule Targets............................................................................................. 111
Supported Targets..............................................................................................111
Classical Targets................................................................................................111
ZNYX Targets................................................................................................... 112
ZACTION Examples......................................................................................... 112
Extensions to the default matches......................................................................113
tc: Traffic Control..................................................................................................113
Strict Priority Qdisc................................................................................................113
Weighted Round Robin Qdisc................................................................................114
FIFO Queues (pfifo and bfifo disciplines).........................................................114
Fifo Qdiscs..............................................................................................................115
Using Filters to Direct Packets to a COS Queue....................................................115
Protocol ip.............................................................................................................. 115
Protocol arp............................................................................................................ 116
Protocol all..............................................................................................................116
Matching Specific Ingress Ports.............................................................................116
Advanced Filtering – Policing................................................................................117
Examples............................................................................................................118
Policing Actions..................................................................................................... 118
u32 match selectors used in filters.........................................................................119
zqosd.......................................................................................................................120
PRIO and WRR queues..................................................................................... 121
The U32 Filter........................................................................................................ 123
Combining Queuing Disciplines............................................................................ 124
Handle Semantics.............................................................................................. 124
COPS: Common Open Policy Service................................................................... 124
Protocol Architecture.........................................................................................125
OpenArchitect PEP............................................................................................126
Using pepd......................................................................................................... 126
Chapter 8 Base Switch Administration............................................................................128
Setting the Root Password......................................................................................128
Adding Additional Users........................................................................................128
Setting up a Default Route..................................................................................... 129
Name Service Resolution....................................................................................... 129
DHCP Client Configuration................................................................................... 129
DHCP Server Configuration...................................................................................129
Network Time Protocol (NTP) Client Configuration.............................................130
Network File System (NFS) Client Configuration.................................................130
NFS Server Configuration......................................................................................131
Connecting to the Switch Using FTP..................................................................... 131
ftpd Server Configuration.......................................................................................132
Connecting to the Switch Using TFTP...................................................................132
TFTPD Server Configuration................................................................................. 132
SNMP Agent.......................................................................................................... 132
Supported MIBS................................................................................................ 132
Supported Traps.................................................................................................134
SNMP and OpenArchitect Interface Definitions............................................... 134
ifStackTable Entries...........................................................................................135
SNMP Configuration......................................................................................... 135
SNMP Applications........................................................................................... 136
Port Mirroring.........................................................................................................136
Link and LED Control............................................................................................137
Link Event Monitoring........................................................................................... 137
Chapter 9 Base Switch Maintenance............................................................................... 138
Overview of the OpenArchitect switch boot process............................................. 138
Saving Changes...................................................................................................... 140
Modifying Files and Updating the Switch..............................................................140
Recovering from a System Failure......................................................................... 140
System Boots with a Console Cable.......................................................................140
Booting with the –i option......................................................................................141
System Hangs During Boot.................................................................................... 142
Booting the Duplicate Flash Image........................................................................ 142
Upgrading the OpenArchitect Image......................................................................142
Upgrading or Adding Files.....................................................................................143
Excluding Saving Files to Flash............................................................................. 143
Upgrading the Switch Driver..................................................................................143
Using apt-get.......................................................................................................... 144
Chapter 10 Connecting to the Ethernet Switch Blade..................................................... 145
Base Interface Hub System:........................................................................................ 145
Ethernet Interfaces: ................................................................................................145
Management Interfaces: ........................................................................................ 145
Fabric Interface Hub System: .....................................................................................146
Ethernet Interfaces: ................................................................................................146
Management Interfaces: ........................................................................................ 146
Connecting to the Base Interface................................................................................ 146
Base Interface Serial Port Connection....................................................................146
Base Interface Out-of-Band Ethernet Connection .................................................147
Connecting to the Fabric Interface .............................................................................148
Fabric Interface Serial Port Connection ................................................................ 148
Fabric Interface Out of Band Ethernet Connection ...............................................149
Chapter 11 Diagnosing a Failed Ethernet Switch Blade Activation ..............................150
Accessing the ShMM.................................................................................................. 152
Verifying Communications Between the ShMM and Switch................................ 152
Critical Threshold Error Reported..................................................................... 152
Analyzing Mstate information for the switch............................................................. 153
Checking the ekey Status From the Shelf Manager.................................................... 153
Chapter 12 Troubleshooting a Failed OpenArchitect Load.............................................155
Recovering from a System Failure .............................................................................157
Booting Without the Overlay File...............................................................................158
Booting the Duplicate Flash Image ............................................................................159
Chapter 13 Network Configuration Problems ............................................................... 160
Interface Overview......................................................................................................160
Physical Interfaces..................................................................................................160
Default Base Interface Configuration.....................................................................161
24 port, Layer 2 Switching, single VLAN.........................................................161
Default Fabric Interface Configuration.................................................................. 163
Editing the S50layer2 script can change the Ethernet Switch Blade Fabric Interface default configuration. The S50Layer2 script and included example scripts (/etc/rcZ.d/examples) can be used as templates to create custom scripts. The default
S50layer2 script configures the switch accordingly:..............................................163
Configuration Troubleshooting...................................................................................165
Determining ekey status for a specific slot................................................................. 165
Querying Base Interface ekey Status......................................................................167
Querying Fabric Interface ekey Status................................................................... 168
Network Connectivity Troubleshooting......................................................................170
No Connection........................................................................................................170
Diminished Network Throughput...........................................................................170
Connecting to Devices with Fixed Port Speeds ......................................................... 170
External Fault LED..................................................................................................... 170
Network Tests............................................................................................................. 171
Ping Test ................................................................................................................171
Traceroute Test.......................................................................................................172
Chapter 14 Isolating Hardware Failures.......................................................................... 173
Hardware Subsystem...................................................................................................176
Testing the FlashROMs...............................................................................................177
Testing the Switch Fabric............................................................................................178
Link Status for a single port................................................................................... 178
Link Status for a range of ports.............................................................................. 178
Testing the onboard RAM...........................................................................................179
Testing the Control Processor..................................................................................... 180
Hardware Fault....................................................................................................... 181
Software Error................................................................................................... 181
Chapter 15 High Availability Troubleshooting............................................................... 183
Spontaneous Failover Activity....................................................................................183
Unexpected Fail-back Activity...............................................................................183
Chapter 16 Switch Firmware Overview.......................................................................... 184
Checking the switch firmware version........................................................................184
3.1 Fabric Interface............................................................................................185
Updating the Switch Firmware................................................................................... 186
BootLoader Firmware Upgrade:.............................................................................186
OpenArchitect Firmware Upgrade:........................................................................ 186
IPMC Firmware Upgrade:......................................................................................187
Chapter 17 Restoring the Factory Default Configuration................................................188
Chapter 18 Before Calling Support..................................................................................189
Appendix A Fabric Switch Command Man Pages........................................................ 191
vrrpconfig ...................................................................................................................192
vrrpd ........................................................................................................................... 194
zbootcfg ......................................................................................................................197
zconfig ........................................................................................................................199
zcos .............................................................................................................................207
zdog ............................................................................................................................211
zfilterd ........................................................................................................................ 213
zflash........................................................................................................................... 214
zl2, zl2mc, zl3host, zl3net, zvlan................................................................................ 216
zgvrpd .........................................................................................................................219
zl2d .............................................................................................................................221
zl3d .............................................................................................................................223
zlc ............................................................................................................................... 225
zlmd ............................................................................................................................228
zlogrotate ....................................................................................................................230
zmirror ........................................................................................................................231
zmnt.............................................................................................................................233
zpeer ........................................................................................................................... 235
zqosd .......................................................................................................................... 238
zrc ...............................................................................................................................240
zreg..............................................................................................................................241
zrld ............................................................................................................................. 243
zsnoopd ...................................................................................................................... 244
zspconfig .................................................................................................................... 246
zstack ..........................................................................................................................253
zstats............................................................................................................................ 258
zsync............................................................................................................................259
ztmd ............................................................................................................................261
brctl(8) ........................................................................................................................263
Appendix B Base Switch Command Man Pages...........................................................266
vrrpconfig ...................................................................................................................267
vrrpd ........................................................................................................................... 269
zbootcfg ......................................................................................................................272
zconfig ........................................................................................................................274
zcos .............................................................................................................................282
zdog ............................................................................................................................286
zffpcounter ................................................................................................................. 288
zfilterd......................................................................................................................... 292
zflash........................................................................................................................... 293
zgmrpd........................................................................................................................ 295
zgr................................................................................................................................297
zgvrpd..........................................................................................................................300
zl2d..............................................................................................................................302
zl3d..............................................................................................................................304
zlc ............................................................................................................................... 306
zlmd ............................................................................................................................308
zlogrotate ....................................................................................................................310
zmirror ........................................................................................................................311
zmnt.............................................................................................................................314
zpeer ........................................................................................................................... 316
zqosd........................................................................................................................... 319
zrc ...............................................................................................................................321
zreg..............................................................................................................................322
zrld ............................................................................................................................. 324
zsnoopd ...................................................................................................................... 325
zspconfig .................................................................................................................... 328
zstack ..........................................................................................................................336
zstats............................................................................................................................ 340
zsync............................................................................................................................341
ztmd.............................................................................................................................343
brctl(8).........................................................................................................................345
Appendix C Intelligent Platform Management Interface ..............................................348
ISwitch-ShMC Interaction.......................................................................................... 348
Peripheral Management Controller Functional Support............................................. 349
Sensor Reading Example........................................................................................350
Structure of Standard IPMI Commands: From BMC to PMC....................................352
Structure of Standard IPMI Responses: From PMC to BMC..................................... 352
Event Generator ........................................................................................................ 353
IPMB Event message format............................................................................. 353
IPMI Event Message Definitions.......................................................................353
Field Replaceable Unit Inventory Device.............................................................. 353
IPMB Override/Local Status - Event Data 3 for the IPMB link........................354
Table of Figures
Figure 1.1: Fabric Switch Elements...................................................................................20
Figure 1.2: OpenArchitect Software Structure.................................................................. 22
Figure 2.1: LED Reference................................................................................................ 25
Figure 3.1: Host HA Architecture......................................................................................27
Figure 4.1: Fabric VLANs................................................................................................. 48
Figure 4.2: Firewall Flow ................................................................................................ 61
Figure 4.3: COPS Network Architecture........................................................................... 70
Figure 6.1: ROM Devices in Open Architect.................................................................... 84
Figure 6.2: Boot Flow Chart.............................................................................................. 85
Figure 6.3: Init Script Flow................................................................................................86
Figure 7.1: Multiple VLANs..............................................................................................94
Figure 7.2: Layer 2 Switch ................................................................................................95
Figure 7.3: Layer 3 Switch ................................................................................................99
Figure 7.4: Multiple VLAN Configuration......................................................................101
Figure 7.5: Firewall Flow ............................................................................................... 109
Figure 7.6: COPS Network Architecture........................................................................ 125
Figure 9.1: ROM Devices in OpenArchitect................................................................... 138
Figure 9.2: Booting up Process Flow..............................................................................139
Figure 9.3: Init Script Flow..............................................................................................140
Figure 10.1: Fabric and Base .......................................................................................... 145
Figure 10.2: Base Interface Serial Port............................................................................ 147
Figure 10.3: Fabric Interface Serial Ports........................................................................ 148
Figure 11.1: Ethernet Switch Blade Activation States.....................................................150
Figure 12.1: OpenArchitect Boot Process....................................................................... 156
Figure 12.2: ROM Devices in OpenArchitect................................................................. 157
Figure 18.1: ROM Devices in OpenArchitect................................................................. 190
Index of Tables
Table 5.1: Supported MIBs................................................................................................79
Table 5.2: Supported Traps................................................................................................80
Table 5.3: Link and SNMP Status..................................................................................... 81
Table 7.1: Port Path Cost................................................................................................... 97
Table 7.2: Policing Actions..............................................................................................119
Table 7.3: U Match Selectors...........................................................................................120
Table 8.1: Supported MIBs..............................................................................................134
Table 8.2: Supported Traps..............................................................................................134
Table 8.3: Physical Link Status on Base Switch..............................................................135
Table 11.1: Troubleshooting States................................................................................. 152
Table 13.1: Ethernet Switch Blade Backplane Interfaces (zre Ports).............................. 160
Table 13.2: Additional Interfaces.................................................................................... 161
Table C.1.: IPMI M States............................................................................................... 349
Table C.2: PMC Controller Support................................................................................ 349
Table C.3: GetSensorReading..........................................................................................350
Table C.4: GetSensorResponse.........................................................................................351
Table C.5: Standard IPMI Commands.............................................................................352
Table C.6: Standard IPMI Responses.............................................................................. 352
Table C.7: Event Message Format...................................................................................353
Table C.8: SEEPROM Space...........................................................................................354
Table C.9.: IPMB Override Status Data.......................................................................... 355
Chapter 1 Overview of the Ethernet Switch Blade
The Ethernet Switch Blade is a 72-port AdvancedTCA® hub that provides Gigabit Ethernet switching. Up to 14 ATCA node boards may be addressed via the PICMG 3.0 Base Interface and via the PICMG 3.1 Fabric Interface. The Base and Fabric switching domains are kept completely separate, at both the physical layer and the software layer. The Ethernet Switch Blade provides a tightly integrated modular switching platform that enables high-density solutions.
The Ethernet Switch Blade is actually two separate switches, one for the Base ports and one for the fabric ports. There are two OpenArchitect® operating system images, one for each switch, which maximizes the separation between control signaling and data. The modular design provides great flexibility and control.
Ethernet Switch Blades can support a 10 Gigabit Ethernet Inter-Switch Link (ISL) for the Fabric Interfaces, and a Gigabit Ethernet ISL for the Base Interface switches. Depending on the version of OpenArchitect used, the ISL for the Fabric Interface switches may be operated at 10 Gigabits per second and provide stacking features.
Linux-based OpenArchitect 3 runs on the embedded processors, providing a comprehensive package for the management of Layer 2 and Layer 3 packet switching. VLAN management and Layer 2-7 packet classification are also included with a user-friendly interface. OpenArchitect can be used with a variety of IP routing protocols.
As part of Advanced TCA, the switch incorporates the PICMG 3.0 Intelligent Platform Management Interface (IPMI) standard for Field Replaceable Unit (FRU) management by the Shelf Manager.
High Performance Embedded Switching
The Ethernet Switch Blade with OpenArchitect combines the performance of a silicon-based switching fabric with the flexibility of software-managed routing policies. It provides Base fabric PICMG 3.0 (1 Gigabit Ethernet) links to each of the payload slots, plus two to four PICMG 3.1 in-band GigE ports to each node card, and GigE links to management ports and the second switch. The Ethernet Switch Blade maintains the forwarding table in silicon, so it can switch and route at full line rate on every port.
Advanced TCA® Compliant
The Advanced TCA® standard, developed by the PCI Industrial Computer Manufacturers Group (PICMG), defines an embedded Ethernet environment for high availability chassis. This environment includes two switch fabric slots that create a dual star Ethernet network to the 14 Base node slots. Placing the Ethernet Switch Blade in a hub slot provides embedded Ethernet services to each node card of the chassis. A standard HA configuration places one Ethernet Switch Blade in each of the two hub slots in a chassis to create a redundant, high availability system.
OpenArchitect Switch Management
The OpenArchitect software component (open source Linux, IP protocol stack, control applications and the OA Engine) runs on two embedded PowerPC microprocessors. OpenArchitect provides extensive managed IP routing protocols and other open standards for switch management. Examples include network services; Virtual Router Redundancy Protocol (VRRP); Routing Information Protocol (RIP); Open Shortest Path First (OSPF); Border Gateway Protocol (BGP); Quality of Service and Class of Service; access control lists; Simple Network Management Protocol (SNMP) MIBs; Common Open Policy Service (COPS); and web.
Extensible Customization of Routing Policies
The OpenArchitect software environment enables rapid porting of other UNIX/Linux-based protocols, including open source software conforming to RFCs and other standards. It also enables the development of application-specific protocol configuration scripts.
Powerful CarrierClass Features
The Ethernet Switch Blade has High Availability hardware features for advanced telecommunication applications. The switch implements full PICMG 3.0 hot-swap support. This makes the switch field replaceable, so a failed switch can be replaced without impacting the operational performance of the chassis.
The PICMG 3.0 Intelligent Platform Management Interface (IPMI) standard is also supported. IPMI uses message-based interfaces that monitor the physical health characteristics of the Ethernet Switch Blade. The switch provides operational status information to an IPMI management application. End customers benefit from advance notice of potential problems.
The Ethernet Switch Blade also implements the Media Dependent Interface feature called Auto MDI-X. Auto MDI-X allows connections to any device (switches, hubs, or systems) using either a regular straight-through or a crossover Cat 5 cable. The RJ-45 port automatically detects the cable type and switches between MDI and MDI-X modes. This IEEE standard makes cabling, especially between switches, faster and less error prone.
E-Keying is supported by the Ethernet Switch Blade.
Ethernet Port Layout
The Ethernet Switch Blade has a total of 72 switched Gigabit Ethernet ports. The base fabric is connected via 24 Gigabit Ethernet ports and the data fabric is connected via 48 Gigabit Ethernet ports. The Ethernet Switch Blade is actually composed of two separate switches, one for Base port activity and another for fabric port activity. The Base ports (control and signaling) are switched on the Base switch, and the fabric ports (data) are switched on the fabric switch. This provides total separation between system management or control packets and customer data packets.
Ethernet Switch Blade Port Configuration
Base Switch Quick Reference
Connection                    zre numbers
ShelfManager1                 zre22
ShelfManager2                 zre13
ISL channel (Base node 2)     zre23
Base nodes 3-14               zre0-11
Base nodes 15, 16             zre20-21
Front panel                   zre12, zre14, zre15
Fabric Switch Quick Reference
Slot                          zre numbers
3                             zre0-3
4                             zre4-7
5                             zre8-11
6                             zre12-15
7                             zre16-19
8                             zre24-27
9                             zre28-29
10                            zre30-31
11                            zre32-33
12                            zre34-35
13                            zre36-37
14                            zre38, zre39
15                            zre40-41
16                            zre42-43
Inter-switch Link (ISL)       zre51
Front panel                   zre20-23
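For scripting against the switch, the Fabric Switch quick reference above can be captured as a simple lookup. The following Python sketch is illustrative only: the names FABRIC_SLOT_PORTS, zre_range and fabric_slot_for_port are hypothetical and are not part of OpenArchitect, and the values simply transcribe the table.

```python
"""Illustrative lookup of Fabric switch zre ports, transcribed from the
quick-reference table. All names here are hypothetical helpers."""


def zre_range(first, last):
    """Return the port names zre<first> .. zre<last>, inclusive."""
    return [f"zre{n}" for n in range(first, last + 1)]


# Slot number -> zre ports wired to that slot on the Fabric switch.
FABRIC_SLOT_PORTS = {
    3: zre_range(0, 3),
    4: zre_range(4, 7),
    5: zre_range(8, 11),
    6: zre_range(12, 15),
    7: zre_range(16, 19),
    8: zre_range(24, 27),
    9: zre_range(28, 29),
    10: zre_range(30, 31),
    11: zre_range(32, 33),
    12: zre_range(34, 35),
    13: zre_range(36, 37),
    14: zre_range(38, 39),
    15: zre_range(40, 41),
    16: zre_range(42, 43),
}

# Ports that do not belong to a node slot.
FABRIC_ISL = ["zre51"]
FABRIC_FRONT_PANEL = zre_range(20, 23)


def fabric_slot_for_port(port):
    """Return the slot a zre port serves, or None for ISL/front-panel/unknown ports."""
    for slot, ports in FABRIC_SLOT_PORTS.items():
        if port in ports:
            return slot
    return None
```

A call such as fabric_slot_for_port("zre25") then answers which slot a given port serves without consulting the table by hand.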
Installing and configuring the Ethernet Switch Blade is straightforward, but it requires UNIX or Linux system management skills and some understanding of network protocols. Configure the Ethernet Switch Blades for your networking application before you begin using the OpenArchitect switch.
OpenArchitect Switch Environment
The key elements of the OpenArchitect environment include two embedded Linux operating systems, OpenArchitect-specific applications and libraries, and an innovative switch hardware design.
OpenArchitect hardware is in many ways similar to typical switch architectures. The primary difference is that the PCI bus connecting the embedded processor to the switch fabric performs at a higher level than in a typical switch (see Figure 1.1: Fabric Switch Elements). The use of PCI creates a pipe of significant bandwidth between the processor and the switch fabric.
The embedded processors, running Linux and the OpenArchitect processes, control the flow of all traffic by maintaining the switch forwarding tables. These tables define the flow of the switch traffic. Because the tables reside on the switching chips, packets are forwarded at line rate.
OpenArchitect Software Structure
Figure 1.1: Fabric Switch Elements
OpenArchitect is based on an embedded Linux operating system and includes a number of ZNYX Networks-supplied modules. The key element is the Linux routing table, which is crucial in a network-enabled Linux implementation.
The purpose of the routing table is to tell the packet forwarding software where to forward the data packets. In standard Linux, the packet-forwarding algorithm is implemented in software. Normally, the routing tables are maintained by operator configuration and by the various routing protocols that run in the application environment of Linux.
OpenArchitect takes a different approach to forwarding packets. It provides embedded software daemons that replicate (shadow) the Linux routing tables in the silicon-based forwarding tables (see Figure 1.1: Fabric Switch Elements). In the OpenArchitect switching environment, the switching chips do the real-time work of switching network packets. The switch fabric consults its own forwarding tables for each incoming packet and either filters the packet or forwards it to any egress port, to the embedded CPU, or to any combination of these. The Linux routing tables, maintained in software, are used to update the silicon-based tables. This provides both the flexibility and control of the Linux software environment and the speed of dedicated switching silicon.
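The shadowing idea can be illustrated with a small model. The Python sketch below is a conceptual illustration only and does not reflect how the OpenArchitect daemons are implemented: the class ShadowedTables, its attributes, and the choice of returning None for an unmatched destination are all assumptions made for the example. It keeps a software table and a separate "silicon" copy, pushes changes from one to the other in a sync step, and forwards using the silicon copy alone.

```python
import ipaddress


class ShadowedTables:
    """Toy model: a software routing table shadowed into a hardware copy."""

    def __init__(self):
        self.linux_routes = {}    # prefix string -> egress port (software side)
        self.silicon_routes = {}  # shadow copy consulted for every packet

    def sync(self):
        """Push software-table changes into the shadow copy.

        Returns (added_or_changed_prefixes, removed_prefixes), both sorted.
        """
        added = {
            prefix: port
            for prefix, port in self.linux_routes.items()
            if self.silicon_routes.get(prefix) != port
        }
        removed = [p for p in self.silicon_routes if p not in self.linux_routes]
        self.silicon_routes.update(added)
        for prefix in removed:
            del self.silicon_routes[prefix]
        return sorted(added), sorted(removed)

    def forward(self, destination):
        """Longest-prefix match against the shadow copy only.

        Returns the egress port, or None when no entry matches
        (an assumption of this sketch, not documented switch behavior).
        """
        addr = ipaddress.ip_address(destination)
        best_net, best_port = None, None
        for prefix, port in self.silicon_routes.items():
            net = ipaddress.ip_network(prefix)
            if addr in net and (best_net is None or net.prefixlen > best_net.prefixlen):
                best_net, best_port = net, port
        return best_port
```

In this model a route added to linux_routes has no effect on forwarding until sync() runs, which mirrors the division of labor described above: software decides, silicon forwards.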
The OpenArchitect environment includes additional features. For example, installing the OpenArchitect switch gives you immediate implementation of Linux routing protocols. Also, you have complete support of routing table updates and a standardized method for configuration. Finally, you can quickly integrate bug fixes, protocol enhancements and additional protocol implementations from the Linux community. You can also integrate OpenArchitect into other Linux applications including VPN software, voice over IP protocols, Quality of Service, and HTML configuration.
RAIN Management API (RMAPI) is a generic interface for passing control data. The OpenArchitect libraries are implemented completely above RMAPI. The libraries provide a front-end to RMAPI to simplify application writing. Currently one library is implemented, a general library called zlxlib. As the OpenArchitect application requirements grow, the existing library will be expanded and additional libraries will be created.
[Figure: block diagram. Linux Application Environment: Linux application-level software (routed, gated); OpenArchitect application-level software (i.e., zconfig, zl3d, zl2d, zsync); OpenArchitect libraries (zlxlib and ztlib); ZNYX RAIN Management API (RMAPI). Linux Kernel: Linux protocol stack; Linux routing tables; OpenArchitect driver. Hardware: PCI bus; switch fabric.]
Figure 1.2: OpenArchitect Software Structure
OpenArchitect applications are used to program and configure the Ethernet Switch Blade. These applications are implemented above the libraries and RMAPI.
Chapter 2 Port Cabling and LED Indicators
The PICMG 3.1 standard defines an embedded Ethernet environment for Telco chassis. This environment includes two switch fabric slots that create a dual star Ethernet network to the fourteen node slots. Placing the Ethernet Switch Blade in a hub slot provides embedded Ethernet services to each node card across the Packet Switching Backplane of the chassis. A standard configuration is to place an Ethernet Switch Blade in each hub slot, creating a redundant, high availability system. This chapter provides information on the Ethernet Switch Blade port connectors and LED indicators.
Connecting the Cables
Your switch setup may require some or all of the following types of cables.
10/100/1000 Port Cabling
Category 5 cabling is required for all external ports. Be sure that your cable length is within the minimum and maximum length restrictions for Ethernet; otherwise you could experience signal or data loss. All copper GigE ports on the Ethernet Switch Blade are auto-MDI sensing and automatically determine whether an MDI (straight-through) or MDI-X (crossover) cable is attached.
Console Port Cabling
The switch console can be accessed via one RJ-45 10/100 service port located on the front panel of the Ethernet Switch Blade.
NOTE: There are two switch portions that make up an Ethernet Switch Blade unit. Each switch portion, Base and fabric, has its own console ports, and requires its own console cable or OOB Ethernet cable.
The RS-232 configured RJ-45 console port on the front panel can be used to recover from a system failure. It is used for maintenance only, and is generally not connected. Use an HP console cable (P/N A6900-63006), provided with the HP bh5700 ATCA 14-Slot Blade Server, in combination with a Modem Eliminator cable, to access the switch software through the console port. Refer to the HP bh5700 ATCA 14-Slot Blade Server Installation Guide for additional information.
Connecting to the Console Port
To attach the console cable to the OpenArchitect Base or fabric switch:
1. Plug the RJ-45 end of the console cable into the RJ-45 Console Port on the front.
2. Connect the Modem Eliminator cable to the DB-9 connector on the console cable.
3. Connect the other end of the Modem Eliminator cable to a standard COM port (9600, n, 8, 1).
4. Reinsert the switch into the shelf chassis and power up.
Use a terminal emulation program to access the switch console.
Out of Band Ports (OOB Ports)
Each switch, fabric and Base, in an Ethernet Switch Blade unit has out-of-band (OOB) Ethernet ports on the front panel. An OOB port is an alternative maintenance port that supplies Ethernet connectivity instead of serial connectivity, and is connected only when performing switch maintenance activities. Use ifconfig to bring up and configure the OOB ports. The OOB ports are 100 Mbps full duplex, not auto-sensing. The front OOB port is eth0, and the rear port (not implemented in this release) is eth1.
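As a minimal sketch of the ifconfig step, the following Python helper assembles a command line for the front OOB port. The address 192.168.1.10 and netmask 255.255.255.0 are placeholders chosen for illustration, and the helper name oob_ifconfig_command is hypothetical; substitute values from your own maintenance network.

```python
import ipaddress


def oob_ifconfig_command(interface, address, netmask):
    """Build an ifconfig command that assigns an address and brings the port up."""
    # Validate the address/netmask pair; malformed input raises ValueError.
    ipaddress.IPv4Interface(f"{address}/{netmask}")
    return f"ifconfig {interface} {address} netmask {netmask} up"


# Placeholder values for illustration only (not taken from the manual).
print(oob_ifconfig_command("eth0", "192.168.1.10", "255.255.255.0"))
```

Because the OOB ports do not auto-sense, the device at the other end of the cable would presumably also need to be set to 100 Mbps full duplex.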
LED Reference
See Figure 2.1 for a schematic view of the front of a typical Ethernet Switch Blade board. Note that there are out-of-band ports, RS232 ports, a USB port, and 10 Gig egress ports (not implemented in this release). In-band ports from the Base and fabric switches have LED status lights controlled from the LED Mode button. Press the button successively to display the Base switch ports, fabric switch ports 0-23, and finally the fabric switch ports 24-47. There are separate LEDs for the out-of-band ports, and the ATCA status functions.
Figure 2.1: LED Reference
Chapter 3 High Availability Networking
Architecture
High availability networking is achieved by eliminating any single point of failure through redundant connectivity: redundant cables, switches, and network interfaces in hardware, combined with HA software solutions on both the hosts and the switches to control the HA hardware and maintain connectivity. An HA solution called Surviving Partner is provided on the switch.
For host-side HA, the most common solution is to use the Linux bonding driver. HA solutions like the Linux bonding driver present a single, virtual interface to the protocol stack while managing multiple physical links. Figure 3.1: Host HA Architecture shows the relation of the protocol stack, a bonding driver and physical ports.
Figure 3.1: Host HA
A failover between physical links can be made very quickly without requiring a change to the IP or MAC address of the virtual interface, so it is effectively transparent from the application's point of view. With redundant links from a switch (or switches) to the host, one link is maintained as the ACTIVE link and the other as STANDBY. If the ACTIVE link goes down, the STANDBY link becomes the new ACTIVE link, while presenting the same virtual interface to the host.
NOTE: It is important that the bonding solution provide an active-backup mode. For the Linux bonding driver, set mode=1; see the documentation at http://sourceforge.net/projects/bonding/ for more information. Use the recommendations for Linux kernel 2.4.x, not 2.6.x.
Redundant connections provide an ACTIVE and STANDBY link to a switch, or provide redundant links between more than one switch. In the case of more than one switch, a complete HA solution requires a switch-based HA solution.
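The ACTIVE/STANDBY behavior described above can be illustrated with a toy model. This is a sketch only, not the bonding driver's implementation: the class name ActiveBackupBond, the MAC address, and the link names are all hypothetical.

```python
class ActiveBackupBond:
    """Toy model of an active-backup bond: one virtual interface, several links."""

    def __init__(self, mac, links):
        self.mac = mac                                # address seen by the protocol stack
        self.link_up = {name: True for name in links}
        self.active = links[0]                        # first link starts as ACTIVE

    def set_link(self, name, up):
        """Record a link state change; fail over if the ACTIVE link went down."""
        self.link_up[name] = up
        if not self.link_up[self.active]:
            standby = [n for n, ok in self.link_up.items() if ok]
            if standby:
                self.active = standby[0]              # STANDBY becomes the new ACTIVE


bond = ActiveBackupBond("00:11:22:33:44:55", ["eth0", "eth1"])
bond.set_link("eth0", False)      # the ACTIVE link fails
print(bond.active, bond.mac)      # traffic moves to eth1; the MAC is unchanged
```

The point of the model is that only the choice of physical link changes; the address presented to the protocol stack does not.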
Surviving Partner
Surviving Partner is a switch-based HA solution. It runs on the switches to provide transition of Layer 2 and Layer 3 switching functionality between two or more switches. Surviving Partner comprises many interacting protocols and processes, including VRRP, zlmd, zlc, and others.
VRRP
Since most end nodes use a default router address, a change of that address during a switch failover would require the end nodes to reconfigure. For failover to be transparent to the end nodes at the IP level, Layer 3 switches that fail over must keep the default router address. The Virtual Router Redundancy Protocol (VRRP, RFC 2338) running in the Surviving Partner switches provides transparent movement of the default router address. VRRP maintains the notion of a Master switch and one or more Backup switches. This group of switches presents a virtual router IP address that hosts on that network can use as their default route.
If a Backup switch determines that the Master switch is no longer available, one of the Backup switches assumes the role of Master. Physically, each switch maintains a link to the local network. Only the Master switch answers for the default gateway address, so the hosts on that network have no need to relearn the router address.
In an HA configuration, the goal is to avoid any single point of failure. VRRP provides a good mechanism to provide a static route for a local network, but a true HA configuration must also provide redundant connections for the host. Providing a virtual router for the local network is not enough. Take the simple case of two hosts on the local network with a connection to the virtual router. Each host needs a connection to each physical switch participating in VRRP. In the simplest configuration, each host would have one connection to the network. An HA solution would include redundant connections from each host to each switch in the virtual router.
Combining the features of Surviving Partner on the switches and HA bonding drivers on the hosts allows implementation of this true HA configuration.
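How a Backup takes over as Master can be sketched as a small election function. The rule used here (highest priority wins, ties go to the numerically higher primary IP address) follows RFC 2338; whether the ZNYX vrrpd applies exactly this rule is an assumption, and the switch names and priority values are illustrative.

```python
import ipaddress


def elect_master(routers):
    """Pick the Master among the routers that are still alive.

    routers: list of (name, priority, primary_ip, alive) tuples.
    """
    candidates = [r for r in routers if r[3]]
    if not candidates:
        return None
    # Highest priority first; a tie goes to the higher primary IP address.
    best = max(candidates, key=lambda r: (r[1], int(ipaddress.IPv4Address(r[2]))))
    return best[0]


group = [("switch_a", 254, "10.0.0.30", True), ("switch_b", 250, "10.0.0.31", True)]
print(elect_master(group))                        # switch_a holds the virtual address

group[0] = ("switch_a", 254, "10.0.0.30", False)  # switch_a becomes unavailable
print(elect_master(group))                        # switch_b takes over
```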
zlmd
In addition to complete switch failover, single link failure must be properly handled. The Link Monitor Daemon, zlmd, monitors the link status of each port. If a link goes down, zlmd communicates with the VRRP daemon (vrrpd) to change its priority. Changing the VRRP priority results in movement of switching functionality. By combining zlmd with the zlc application, links connected to hosts that have not failed can be deterministically moved to the new Master switch if desired. Supported modes include:
switch - The switch with the greatest number of UP links becomes the Master for all VLANs under HA management.
vlan - The switch with the greatest number of UP links in a particular VLAN becomes the Master for that VLAN. If the switch has additional VLANs, they each change independently.
port - The Master remains the Master for a particular VLAN until all ports in that VLAN are down. The Backup then becomes the new Master for that VLAN. Failed links move their connectivity through the Backup switch and the switch interconnect to reach the Master switch. This option alleviates the need to move all nodes to a new switch just because a single link goes down.
NOTE: All modes require inclusion of the interconnect in the VLAN. The ISL connection between the two Base switches is port 23 for the Ethernet Switch Blade. The ISL connection between the two fabric slots is port 51.
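The difference between the three modes can be seen by applying them to the same link counts. The sketch below is a simplified model, not the zlmd implementation: the function names are hypothetical, and the tie rule (the current Master keeps the role when counts are equal) is an assumption.

```python
def pick(counts, incumbent):
    """Switch with the most UP links; the incumbent keeps the role on a tie."""
    return max(counts, key=lambda s: (counts[s], s == incumbent))


def select_masters(mode, up_links, current):
    """Return {vlan: Master switch} under one of the three modes.

    up_links: {switch: {vlan: number of UP links}}
    current:  {vlan: switch that is Master now}
    """
    result = {}
    for vlan, incumbent in current.items():
        per_vlan = {s: links[vlan] for s, links in up_links.items()}
        if mode == "switch":
            # One Master for every VLAN: compare total UP links per switch.
            totals = {s: sum(links.values()) for s, links in up_links.items()}
            result[vlan] = pick(totals, incumbent)
        elif mode == "vlan":
            result[vlan] = pick(per_vlan, incumbent)
        elif mode == "port":
            # The Master keeps the VLAN until all of its ports in it are down.
            result[vlan] = incumbent if per_vlan[incumbent] > 0 else pick(per_vlan, incumbent)
        else:
            raise ValueError(f"unknown mode: {mode}")
    return result


up_links = {"A": {"vlan1": 3, "vlan2": 1}, "B": {"vlan1": 2, "vlan2": 4}}
current = {"vlan1": "A", "vlan2": "A"}
for mode in ("switch", "vlan", "port"):
    print(mode, select_masters(mode, up_links, current))
```

With these counts, switch mode moves both VLANs to B (6 UP links against 4), vlan mode moves only vlan2, and port mode leaves both VLANs on A because A still has at least one UP link in each.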
Switch Replacement and Reconfiguration
When a switch fails, it must be replaced. The replacement switch will likely require proper configuration. For transparent switch replacement, the newly replaced switch must learn its configuration from its Surviving Partner.
In a simple failover scenario, Host A and Host B are configured with failover between two host ports, one port connected to Switch A and the other connected to Switch B. Assume Switch A provides connectivity between Host A and Host B. If Switch A fails, the active link on each host moves over to the port connected to Switch B. Surviving Partner software on Switch B recognizes that Switch A has failed, and assumes the role of switching traffic between Host A and Host B. When the failed Switch A is replaced with a new Switch A', Switch A' will learn its network configuration from the surviving partner Switch B. Switch A' is now ready as a backup to Switch B in case of failure of Switch B.
This is achieved through the use of DHCP. When a switch becomes a VRRP Master, a DHCP server is started with a pointer to a configuration file that contains configuration information for its partners. The replacement switch comes up running a DHCP client to retrieve its configuration.
Proper configuration of Surviving Partner requires coordinated configuration of many different processes, including vrrpd, zlmd, zlc, and dhcpd. The daemon processes run scripts to perform their actions. Because these scripts are complex and inter-dependent, a configuration application called zspconfig is used to build them.
The basic steps to configuring Surviving Partner are:
1. Determine your desired configuration.
2. Modify the configuration file (the default is /etc/rcZ.d/surviving_partner/zsp_DC.conf) to use as input to the configuration utility (zspconfig).
3. Configure startup scripts or other scripts such as gated routing scripts and vrrp configuration scripts.
4. Run zspconfig on the Master system.
5. Run zspconfig -u on the Backup/Sibling system(s).
zspconfig
zspconfig performs the job of building the scripts based on a provided input file locally, or from a remote machine. A text-based configuration file provides input to zspconfig. Example configuration files are included on the switch in /etc/rcZ.d/surviving_partner. The result of zspconfig is to create several configuration files and runtime shell scripts, and optionally start the Surviving Partner processes. Scripts are generated for configuring VLANs, starting the network, and starting the vrrpd and zlmd daemons.
zspconfig can also be used by sibling backup switches to retrieve configuration from the Surviving Partner and start the vrrpd and zlmd daemons. zspconfig is generally only run once to configure Surviving Partner.
The configuration and runtime scripts created are as follows:
S70Surviving_partner - Switch initialization script that is run at boot time. This script will restart the switch with the original configuration given to zspconfig. Optionally, zspconfig will run this script from the initial invocation.
zsp.conf.<n> - zspconfig configuration file that contains the configuration of the sibling backup switches. The <n> is used to distinguish potentially more than one backup switch. This configuration file is placed in /tftpboot, and is retrieved via DHCP during configuration of the backup switch by zspconfig with the “-u” option or, by a replacement switch on boot up.
vrrpd.conf - Configuration script for the VRRP daemon. This configuration is used when the S70Surviving_partner script launches vrrpd. There is a line in this file for each virtual router address vrrpd will manage.
dhcpd.conf - Configuration script used by dhcpd when the switch becomes master. dhcpd is also used to give replacement switches their configuration scripts. Namely a zsp_.conf<n> file that can be input to zspconfig with the -u flag.
dhclient.conf - If zspconfig is executed with the -u flag, a dhclient.conf file is created, and then dhclient is used to retrieve a zspconfig configuration file from the /tftpboot area of the Master switch.
vrrpd.script - Runtime script that executes each time the vrrpd changes state. This script starts and stops dhcpd, and toggles down RAINlink ports to force the RAINlink nodes to a new Master switch.
zlmd.script - Runtime script executed by zlmd when a link goes up or down. This script modifies the priority of the vrrpd that in turn may cause the VRRP Master to move from one sibling switch to another.
After the scripts are created, zspconfig may run the S70Surviving_partner script to start the Surviving Partner tasks. The tasks started are vrrpd, zlmd, and dhcpd.
The vrrpd and zlmd daemons run scripts to perform their actions. When vrrpd changes state between Master and Backup, it runs a script that starts and stops dhcpd. When zlmd sees a link go up or down, it runs a script that communicates with vrrpd via vrrpconfig.
Example HA Switch Configuration
The following walks through a basic Surviving Partner configuration typical for an HA setup. Assume an HA chassis with multiple hosts, such as single-board CPUs, and two switches configured for Surviving Partner. Each of the hosts has two Base Ethernet ports providing a link to each of the Base switches and up to four fabric Ethernet ports providing links to each of the Fabric switches.
Each host runs Linux bonding drivers (or ZNYX OA Node software with embedded RAINlink) with the ports configured for failover. An interlink provides communication between the Base switches. Another interlink provides communication between the Fabric switches.
When using a Linux Bonding driver on the node card, the bonding driver should be configured for Mode 1 (active/standby). See the Linux Bonding documentation at http://sourceforge.net/projects/bonding/ for complete information.
The two Base switches will be configured as Surviving Partners, using VRRP to form a single virtual interface to the hosts, as will the two Fabric switches. The ports can be configured many different ways, with blocks of ports configured as VLANs. The configuration is set up in the zsp configuration file, zsp.conf.
NOTE: The actual name on the system may change slightly from zsp.conf, depending on current release requirements.
Modifying zsp.conf on the Base switch
An example file for setting up zspconfig on an Ethernet Switch Blade is /etc/rcZ.d/surviving_partner/zsp.conf. The following will document the default settings.
NOTE: It is unlikely that any installation will use this default script in production. You will have to modify it to suit your network design.
On Switch A (Master), make a backup copy of zsp.conf, and edit zsp.conf:
cd /etc/rcZ.d/surviving_partner/
cp zsp.conf zsp.conf.save
vi zsp.conf
The first section uses zconfig to create the VLANs. Many of these choices are determined by the physical configuration of the switch and ATCA backplane. For instance, the Base switch interconnect will always be port 23, and the shelf managers will be ports 22 and 13.
zconfig zhp0: vlan100 = zre23;
zconfig zhp1: vlan1 = zre0..11, zre20..21, zre23;
zconfig zre0..11, zre20..21 = untag1;
zconfig zre23 = untag100;
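The port lists in the zconfig lines above use a range shorthand such as zre0..11. The following sketch expands such a list so that VLAN membership can be checked, for example to confirm that the interconnect port zre23 is included. It assumes the range is inclusive at both ends, and the function name expand_ports is hypothetical; it is not part of the zconfig tool.

```python
import re


def expand_ports(spec):
    """Expand a port list such as 'zre0..11, zre20..21, zre23' into port names."""
    ports = []
    for item in spec.split(","):
        match = re.fullmatch(r"\s*zre(\d+)(?:\.\.(\d+))?\s*", item)
        if match is None:
            raise ValueError(f"unrecognised port item: {item!r}")
        first = int(match.group(1))
        last = int(match.group(2)) if match.group(2) else first
        # Assumption: 'zreA..B' names every port from A through B inclusive.
        ports.extend(f"zre{n}" for n in range(first, last + 1))
    return ports


vlan1 = expand_ports("zre0..11, zre20..21, zre23")
print(len(vlan1), "zre23" in vlan1)
```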
The next section sets up the physical IP addresses to use for the Master and the Backup switch. The Master provides the addresses to the Backup on a first-come, first-served basis. Note that the physical IP address should be different from the virtual IP address that spans the pair of switches. Once configured, the pair appears as one connection point to other hosts on the VLAN. You need to supply an IP address for each interface on each switch. The first IP address on each line is the Master and the second is the Backup.
sibling_addresses: zhp0 = 100.0.0.30, 100.0.0.31 netmask 255.0.0.0;
sibling_addresses: zhp1 = 10.0.0.30, 10.0.0.31 netmask 255.0.0.0;
Now configure the virtual address for each sibling group. We are going to create a virtual interface across one VLAN, but not for the interconnect. This provides a single point to connect/route to the VLANs.
vrrp_virtual_address: zhp1 = 10.0.0.42 netmask 255.0.0.0;
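Before running zspconfig, it can help to sanity-check the address plan for each interface. The sketch below checks that the virtual address differs from the physical sibling addresses, as required above. It also checks that all addresses fall within one subnet, which is an assumption drawn from the sample values rather than a stated rule. The function name check_addresses is hypothetical.

```python
import ipaddress


def check_addresses(siblings, virtual, netmask):
    """Return a list of problems found in one interface's address plan."""
    problems = []
    network = ipaddress.IPv4Network(f"{virtual}/{netmask}", strict=False)
    if virtual in siblings:
        problems.append("virtual address duplicates a physical address")
    if len(set(siblings)) != len(siblings):
        problems.append("sibling addresses are not unique")
    for addr in siblings:
        # Assumption: physical and virtual addresses share one subnet.
        if ipaddress.IPv4Address(addr) not in network:
            problems.append(f"{addr} is outside {network}")
    return problems


# Values from the zhp1 lines of the sample file; an empty list means no problems.
print(check_addresses(["10.0.0.30", "10.0.0.31"], "10.0.0.42", "255.0.0.0"))
```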
Next come port definitions, as defined on the zspconfig man page. Since our hosts are connected using the Linux bonding driver (or RAINlink), we choose RAINlink for each of the ports in VLANs on each switch, and interconnect for the interconnect port on each switch.
The port definitions are:
interconnect - Ports connected between groups of Surviving Partner switches. VRRP heartbeat messages are sent on the interconnect ports.
Crossconnect - Ports that are connected to other Surviving Partner switches that are not part of this Surviving Partner group. Crossconnect ports behave differently than bonding driver/RAINlink ports. The links are not brought down temporarily, and VRRP runs with the native MAC addresses to avoid MAC address duplication with the other VRRP group.
RAINlink - Ports connected to bonding driver/RAINlink enabled nodes. These ports contain virtual addresses managed by VRRP. During a failover event, the links are toggled down to force failover to the Master switch.
Route - Ports connected to upstream routers. VRRP does not manage virtual IP addresses for these links. Routing protocols must be used to inform upstream routers of a different path to the VRRP managed networks.
monitor_only - Ports that are monitored but do not have a virtual address managed on them. Their links are not brought down temporarily during a failover scenario. If a problem occurs on this type of link, it will cause a failover scenario.
configure_only - Ports that are configured as per the zconfig commands, but do not participate in the high availability network. Problems on these links will not cause a switch failover.
interconnect: zhp0;
RAINlink: zre0..11, zre20..21;
Next come special modes for VRRP, for use when one pair of Ethernet Switch Blades is connected to another pair of Ethernet Switch Blades in a redundant configuration. The intent of these modes is to provide Spanning Tree-like capabilities that eliminate network loops between pairs of Surviving Partner configurations, and to expedite address learning between the two pairs of switches:
vrrp_mode: RAINlink_xmit_on_failover;
#vrrp_mode: block_crossconnect;
The next section determines the failover mode between the Surviving Partner switches. There are three modes:
switch - Failover by switch. Failover from Master switch to Backup on any port failure. The switch with the most links becomes the new Master. One port failure will cause the switch to fail over.
vlan - Failover by VLAN. The switch with the most up links in the VLAN becomes the Master of that VLAN. When VLANs fail over and the VLAN Masters are not all located on a common switch, the interconnect link is used to carry data traffic, and could become saturated. The use of the interconnect for data traffic in a failover situation depends on the VLAN design, from one extreme where one VLAN contains all ports to the other extreme of one port per VLAN.
port - Failover by port. The Master switch will remain the Master until all ports in the VLAN are down. The Backup then becomes the new Master for that VLAN. Similar to VLAN failover, the interconnect link carries data traffic in this mode when ports fail over.
failover_mode: switch;
Next, you can set vrrp_msg_rate and the default priority. vrrp_msg_rate is the time in milliseconds between VRRP message transmissions over the interconnect link. vrrp_def_priority is the default priority for both switches. The value is set to 254 and should not require change.
vrrp_msg_rate: 100; # In milliseconds
vrrp_def_priority: 254;
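The message rate determines how quickly a failed partner is noticed. The comments in the sample fabric configuration file state that three consecutive VRRP messages must be absent before the remote switch is considered failed, and that rates other than whole seconds do not conform to the VRRP specification. The sketch below works through that arithmetic. It gives an estimate that ignores any skew or processing delay, and the function names are hypothetical.

```python
def detection_time_ms(msg_rate_ms, missed_messages=3):
    """Approximate time before a silent partner is declared failed."""
    return msg_rate_ms * missed_messages


def is_spec_conformant(msg_rate_ms):
    """True when the rate is a whole number of seconds."""
    return msg_rate_ms > 0 and msg_rate_ms % 1000 == 0


print(detection_time_ms(100))    # 300 ms with the sample vrrp_msg_rate of 100
print(is_spec_conformant(100))   # False: sub-second rates need the ZNYX vrrpd
```

A smaller vrrp_msg_rate shortens detection time at the cost of more heartbeat traffic on the interconnect; the value must match on all siblings.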
The following optional entries provide a mechanism to propagate files and/or startup scripts to sibling switches. An example might be startup scripts or scripts to configure gated. Example scripts are included to start gated with RIP1, RIP2, or OSPF setup. You must use absolute path names.
# start_script: Allows the user to add files and scripts that are moved
# to the slave switches when they do a zspconfig -u. An example might
# be the gated configuration script S55... Absolute path names are
# required. Multiple start_script commands can be used to move more than
# one file.
#start_script:/etc/rcZ.d/SxxScript;
#start_script:/etc/rcZ.d/SyyScript;
# vrrpd_script: Allows the user to add scripts to be executed during
# vrrpd state transitions. These scripts are run from the end of the
# /etc/rcZ.d/surviving_partner/vrrpd.script file. The user provided
# script must be well behaved. If it crashes, or hangs or delays it will
# affect the SurvivingPartner performance. The script is not run in
# background. If this is needed, have your script background itself.
#vrrpd_script: /etc/rcZ.d/surviving_partner/my_vrrpd_script;
#vrrpd_script: /etc/rcZ.d/surviving_partner/my_vrrpd_script2;
# gated_template: Allows the user to provide a template for the
# gated.conf file to be used by the sibling group.
#gated_template: /etc/rcZ.d/surviving_partner/gated.template
These entries are optional:
If you use the special failover modes vlan or port (see above for details), you can also designate an individual address as the default master, meaning that a port or VLAN should run on a specific switch when the VRRP priorities are equal between switches.
NOTE: VLAN or port mastering is not appropriate for switch mode and should not be attempted.
When addresses designated 'master' fail over, they return to their Master switch once the link is repaired. If they are not designated 'master', they remain at the backup switch after repairs.
If both switches are equal in priority for a VLAN, then the switch with the IP address designated 'master' will become Master for that VLAN.
Add the keyword “(master)” after one of the sibling_addresses. The local address comes first.
sibling_addresses: zhp1=10.0.0.30(master), 10.0.0.31 netmask 255.0.0.0;
Once the configuration files are complete, run the zspconfig utility on the Master to configure all the scripts:
NOTE: This command can take 60 seconds or more with no screen output.
zspconfig -f zsp.conf
You will see output similar to this:
zspconfig -f zsp.conf ….
Would you like to install the Surviving Partner startup script[y,n,?] y
Would you like to start the Surviving Partner daemons without rebooting [y,n,?] y
Once configuration is complete, ensure there are no superfluous S-type startup scripts in /etc/rcZ.d, and zsync your switch to save your configuration.
Now go to the backup switch and run zspconfig -u to get the appropriate configuration information from the Master:
zspconfig -u zhp0
Modifying zsp_vlan.conf on the Fabric Switch
An example file for setting up zspconfig on an Ethernet Switch Blade Fabric board is /etc/rcZ.d/surviving_partner/zsp_vlan.conf. Refer to the previous section for descriptions of each configuration section.
# Sample configuration is based on the idea that there are separate VLANs
# for the multiple connections to a slot.
#
# zhp0: Interconnect VLAN
# zhp1..4: Data interface VLANs, configured such that Option 2
# slots have 2 VLANs connected to them and Option 3 slots have
# 4 VLANs connected to them.
#
# This script will likely need modification for your particular
# network setup.
#
# In this example the Egress ports, zre20..23 and zre48..50 are
# not managed by HA since how, or if, these ports are managed by HA is
# dependent on the external devices they are connected to. Non-HA
# egress ports can be brought up through conventional means by adding
# an S-script to /etc/rcZ.d. If the ports are to be managed by HA, they
# can be added to an existing VLAN(zhp) or a new VLAN(zhp) can be created
# If a new VLAN(zhp) is to be managed by HA, add a zconfig, sibling_address,
# and vrrp_virtual_address configuration line and define the port type as
# appropriate.
#
# The interconnect port is needed in the VLANs connected to the
# RAINlink ports.
#
zconfig zhp0: vlan100 = zre51;
zconfig zhp1: vlan1 = zre0, zre4, zre8, zre12, zre16, zre24, zre28, zre30, zre32, zre34, zre36, zre38, zre40, zre42, zre51;
zconfig zhp2: vlan2 = zre1, zre5, zre9, zre13, zre17, zre25, zre29, zre31, zre33, zre35, zre37, zre39, zre41, zre43, zre51;
zconfig zhp3: vlan3 = zre2, zre6, zre10, zre14, zre18, zre26, zre51;
zconfig zhp4: vlan4 = zre3, zre7, zre11, zre15, zre19, zre27, zre51;
zconfig zre0, zre4, zre8, zre12, zre16, zre24, zre28, zre30, zre32, zre34, zre36, zre38, zre40, zre42 = untag1;
zconfig zre1, zre5, zre9, zre13, zre17, zre25, zre29, zre31, zre33, zre35, zre37, zre39, zre41, zre43 = untag2;
zconfig zre2, zre6, zre10, zre14, zre18, zre26 = untag3;
zconfig zre3, zre7, zre11, zre15, zre19, zre27 = untag4;
zconfig zre51 = untag100;
# Recommend using vrrp_mode RAINlink_xmit_on_failover.
zl3d zhp1 zhp2 zhp3 zhp4;
# First address is our address. The remaining addresses
# are handed out to the siblings on a first come first serve
# basis in the order specified. Each zconfig'ured interface
# should have sibling addresses specified.
sibling_addresses: zhp0 = 100.0.0.30, 100.0.0.31 netmask 255.0.0.0;
sibling_addresses: zhp1 = 10.0.0.30, 10.0.0.31 netmask 255.0.0.0;
sibling_addresses: zhp2 = 11.0.0.30, 11.0.0.31 netmask 255.0.0.0;
sibling_addresses: zhp3 = 12.0.0.30, 12.0.0.31 netmask 255.0.0.0;
sibling_addresses: zhp4 = 13.0.0.30, 13.0.0.31 netmask 255.0.0.0;
# The virtual address spans the sibling group, giving hosts and routers
# a single point to connect to or a single point to use as a router. A
# virtual address should not be specified for the interconnect interface.
vrrp_virtual_address: zhp1 = 10.0.0.42 netmask 255.0.0.0;
vrrp_virtual_address: zhp2 = 11.0.0.42 netmask 255.0.0.0;
vrrp_virtual_address: zhp3 = 12.0.0.42 netmask 255.0.0.0;
vrrp_virtual_address: zhp4 = 13.0.0.42 netmask 255.0.0.0;
# Port definitions
# Define to what the ports are connected. Specifications can be
# by zhp or zre name. The zhp name is a shortcut to specify the
# entire port group associated with that interface. In the end
# these definitions are on a port by port basis. Note: zhp and
# zre names cannot be mixed on the same line.
#
# Shelf manager ports should be defined as monitor_only. monitor_only ports
# are used in failover calculations, but the failover mechanism is left
# to the software running on the shelf managers cards. The use of the
# term "crossconnect" in these HA scripts is not the same as the use
# in ATCA shelf managers.
interconnect: zhp0;
RAINlink: zre0..19, zre24..43;
############################# Special Modes #############################
# VRRP modes
# The block_crossconnect mode causes the equivalent of STP blocking on the
# crossconnect ports of the VRRP Backup. The block_crossconnect mode is
# meant as a replacement for STP, however, the switches connected to the
# crossconnect ports must be Ethernet Switch switches running Surviving Partner.
#
# The RAINlink_xmit_on_failover mode requires that the OpenNode blades
# connected to RAINlink ports transmit a packet when failing over, so that
# The Layer 2 tables will learn the new port/MACaddress relationship. An
# example is the SNAP_BCAST_MODE in RAINlink or a gratuitous ARP.
vrrp_mode: RAINlink_xmit_on_failover;
#vrrp_mode: block_crossconnect;
# failover modes
# switch-failover, VLAN-failover or port-failover are mutually exclusive. They
# describe what occurs if a port fails. For switch-failover, if any port
# fails, all functionality of the current switch is moved to the backup.
# For vlan-failover, if a port fails in the vlan then all the ports that
# are a member of that VLAN are failed over. For port-failover, each port
# can failover independently. For vlan and port failover the
# interconnect will need to be used to maintain connectivity, requiring
# all VLANs to include the interconnect ports.
failover_mode: port;
# VRRP_msg_rate is the time in milliseconds between transmissions of
# VRRP messages on the interconnect. The VRRP protocol requires the
# absence of 3 VRRP messages before concluding that the remote switch
# has failed. The msg_rate must match the msg_rate of all siblings.
# Anything other than multiples of seconds is non-conformant
# with the VRRP specification and will only run with ZNYX supplied
# vrrpd.
vrrp_msg_rate: 100; # In milliseconds
vrrp_def_priority: 254;
# start_script: Allows the user to add files and scripts that are moved
# to the slave switches when they do a zspconfig -u. An example might
# be the gated configuration script S55... Absolute path names are
# required. Multiple start_script commands can be used to move more than
# one file.
#start_script:/etc/rcZ.d/SxxScript;
#start_script:/etc/rcZ.d/SyyScript;
# board_synchronization_mode: Coordinate the HA events between the Base and
# Fabric portions of the 7100 switch. The actual coordination is dependent on the
# setting of the board_synchronization_mode and the failover_mode. In
# switch failover_mode the number of up links in both switch planes is
# considered. In vlan and port failover mode they are not. In all
# failover_modes, if the data plane or fabric plane switch reboots or
# power cycles, the HA partner will take mastership for all VLANs in
# both planes. board_synchronization is off by default. "basic" is the
# only supported mode at this time. The same mode must be set in both
# the base and fabric switches.
#board_synchronization_mode: basic;
# vrrpd_script: Allows the user to add scripts to be executed during
# vrrpd state transitions. These scripts are run from the end of the
# /etc/rcZ.d/surviving_partner/vrrpd.script file. The user provided
# script must be well behaved. If it crashes, or hangs or delays it will
# affect the SurvivingPartner performance. The script is not run in
# background. If this is needed, have your script background itself.
#vrrpd_script: /etc/rcZ.d/surviving_partner/my_vrrpd_script;
#vrrpd_script: /etc/rcZ.d/surviving_partner/my_vrrpd_script2;
# gated_template: Allows the user to provide a template for the
# gated.conf file to be used by the sibling group.
#gated_template: /etc/rcZ.d/surviving_partner/gated.template
Once the configuration files are complete, run the zspconfig utility on the Master to configure all the scripts:
NOTE: This command can take 60 seconds or more with no screen output.
zspconfig -f zsp.conf
You will see output similar to this:
zspconfig -f zsp.conf ….
Would you like to install the Surviving Partner startup script[y,n,?] y
Would you like to start the Surviving Partner daemons without rebooting [y,n,?] y
Once configuration is complete, ensure there are no superfluous S-type startup scripts in /etc/rcZ.d, and zsync your switch to save your configuration.
Now go to the backup switch and run zspconfig -u to get the appropriate configuration information from the Master:
zspconfig -u zhp0
Configuring Surviving Partner
The S60SP_startup script is useful in setting up proper switch replacement. By factory installing the S60SP_startup script in replacement switches, the replacement switches will boot looking for a Master switch configuration. The S60SP_startup script works as follows:
It first looks for a local file /etc/rcZ.d/surviving_partner/zsp.primary.conf. If this file exists, it is used to configure the switch. Only the originally configured switch or Central Authority should contain this file. See Central Authority later in this Chapter for more information.
Next it uses zspconfig -u to attempt to contact a running Master switch to retrieve the proper configuration. This is the normal case for a replacement switch.
Finally, it lets the currently saved S70Surviving_Partner script execute. This would be the case of a power up of an already configured backup switch when the other HA switch is unavailable. This case could occur after losing power to the entire chassis.
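The lookup order described above can be sketched as a small shell function. This is only an illustration of the decision order, not the shipped S60SP_startup script: the function name sp_config_source is hypothetical, and the check for a reachable Master is passed in as a command so the sketch does not depend on zspconfig being installed.

```shell
#!/bin/sh
# Sketch of the S60SP_startup decision order (hypothetical helper, not the shipped script).
# $1:   path of the local primary configuration file
# $2..: command that succeeds when a running Master supplied a configuration
sp_config_source() {
    primary_conf=$1
    shift
    if [ -f "$primary_conf" ]; then
        echo "local-file"     # step 1: a local zsp.primary.conf exists
    elif "$@" >/dev/null 2>&1; then
        echo "master"         # step 2: a running Master answered
    else
        echo "saved"          # step 3: fall back to the saved S70Surviving_Partner script
    fi
}

# Example: no local file, and the Master lookup (stubbed here with "false") fails.
sp_config_source /nonexistent/zsp.primary.conf false
```

On a switch, the stubbed command would presumably be the zspconfig -u invocation that the startup script already uses.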
Central Authority
Modifications can be made to the S60SP_startup script to use a third machine running DHCP that is not part of the Surviving Partner pair. The third machine is referred to as the Central Authority.
Setup requires a DHCP daemon configuration file on the Central Authority and a dhclient configuration file for each of the two Surviving Partner switches in the pair. The format of the DHCP daemon configuration file is dependent on the machine and operating system being used. An example can be obtained from the Surviving Partner primary switch in the location /etc/rcZ.d/surviving_partner/dhcpd.conf.
This file contains configuration for only one of the two Surviving Partner switches. It must be edited. For example:
subnet 100.0.0.0 netmask 255.0.0.0 {
option broadcast-address 100.255.255.255;
host ZNYX1 {
fixed-address 100.0.0.31;
option dhcp-client-identifier "ZNYX";
option vendor-encapsulated-options "zsp_conf.1";
}
}
A second host entry must be added with unique information.
subnet 100.0.0.0 netmask 255.0.0.0 {
option broadcast-address 100.255.255.255;
host PRIMARY {
fixed-address 100.0.0.30;
option dhcp-client-identifier "PRIMARY";
option vendor-encapsulated-options "zsp.primary.conf";
}
host SECONDARY {
fixed-address 100.0.0.31;
option dhcp-client-identifier "SECONDARY";
option vendor-encapsulated-options "zsp.secondary.conf";
}
}
The zsp.primary.conf and zsp.secondary.conf files must be placed in the tftp location on the machine, often /tftpboot. The zsp.primary.conf and zsp.secondary.conf files can be retrieved from the Surviving Partner switches. This is the configuration that will be given to the switches. It is recommended that the zsp.conf be taken from the primary as follows:
The zsp.conf file created by hand on the primary is moved to /tftpboot/zsp.primary.conf on the Central Authority.
Move the /tftpboot/zsp_DC.conf.1 file on the primary created by zspconfig to /tftpboot/zsp.secondary.conf on the Central Authority.
Create dhclient.conf files on the Surviving Partner switches. Examples can be found in /etc/rcZ.d/surviving_partner/dhclient.conf. As an example:
send dhcp-client-identifier "ZNYX";
request vendor-encapsulated-options;
require vendor-encapsulated-options;
Modify the dhclient.conf file on the primary switch as follows:
send dhcp-client-identifier "PRIMARY";
request vendor-encapsulated-options;
require vendor-encapsulated-options;
Modify the dhclient.conf file on the secondary switch as follows:
send dhcp-client-identifier "SECONDARY";
request vendor-encapsulated-options;
require vendor-encapsulated-options;
The last step is to modify the startup scripts that run zspconfig to use the -c option. The -c option allows you to provide a dhclient.conf script rather than having zspconfig create a default. For example, the S60SP_startup script line that reads:
echo y n | zspconfig -t 10 -su zhp0 > /dev/null 2>&1
Can be modified to
echo y n | zspconfig -c /etc/rcZ.d/surviving_partner/dhclient.new.conf -t 10 -su zhp0 > /dev/null 2>&1
If you use S60SP_startup, the /etc/rcZ.d/surviving_partner/zsp.primary.conf file should not exist. This way the S60SP_startup script will first look at the Central Authority. If the Central Authority is down, then it will use its current configuration.
Chapter 4 Fabric Switch Configuration
Two switches, two consoles
There are two separate switch portions in the Ethernet Switch Blade units, the base switch and the fabric switch. The fabric switch handles the data traffic for the ATCA rack over ports 0-47. It runs the Ethernet Switch Blade software. Two or four GigE connections are provided to node cards using the ATCA backplane.
Connecting to the Fabric Switch Console
You can connect to the fabric switch console using a telnet connection or with a console cable. Use the procedure below for a telnet connection. For a console cable, see Connecting to the Console Port.
Connect an Ethernet cable to the host and the switch. The OOB port is not active in the default configuration. You can connect to the fabric OOB port on the front panel.
Work from a host on the 10.0.0.0 network.
The OpenArchitect switch is pre-configured with address 10.0.0.43. Telnet to 10.0.0.43.
telnet 10.0.0.43
After you are connected, enter the login name root. No password is required.
OpenArchitect
ZX7100-OA<release no.>#
login: root
OpenArchitect Configuration Procedure
Layer 2 and Layer 3 switch configurations can be accomplished with a few simple commands. Once you have configured your switch, the commands should be placed into a start up configuration script. Like most Linux systems, the OpenArchitect switch boot process runs initialization commands and scripts in /etc/init.d/. In particular, OpenArchitect runs /etc/init.d/rcS which in turn executes all scripts located in /etc/rcZ.d starting with an uppercase "S" in alphabetical order. Any configuration scripts you create should be named in the standard Linux/Unix manner, starting with an uppercase "S" and numbered in the sequence you would like them executed. The final step once the switch has been properly configured is to use the zsync command to save all files into flash for reloading.
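The naming rule can be illustrated with a short loop of the kind an rcS script typically uses. This is a sketch under assumptions, not the actual contents of /etc/init.d/rcS: the function name run_s_scripts is hypothetical, and the real script may invoke the files differently. It relies on the shell expanding the S* glob in sorted order, which is why the number after the "S" controls the sequence.

```shell
#!/bin/sh
# Sketch of an rcS-style loop: run every file whose name starts with "S"
# in the directory given as $1, in the sorted order the glob returns them.
run_s_scripts() {
    for script in "$1"/S*; do
        [ -f "$script" ] || continue   # nothing matched, or not a regular file
        sh "$script"
    done
}

# Demonstration in a scratch directory (the file names only mimic the defaults).
demo=$(mktemp -d)
echo 'echo layer2'  > "$demo/S50layer2"
echo 'echo stack'   > "$demo/S20stack"
echo 'echo skipped' > "$demo/README"
run_s_scripts "$demo"    # runs S20stack first, then S50layer2; README is ignored
rm -rf "$demo"
```

A script added as S60something would therefore run after the default S50layer2 script.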
Changing the Shell Prompt
You may use standard bash shell procedures to change the prompts on your base switches. Many sites choose a system that distinguishes among the individual switches at their location. The same rules apply for saving your choice (zsync) as for all other configuration changes.
Default Configuration Scripts
As shipped the following scripts are run from /etc/rcZ.d as the switch boots up:
NOTE: These default scripts will change in later releases. Use them as examples.
S20stack - Script that calls zstack to combine the two BCM56504 24-port switch fabric chips into a single 48 port virtual switch. zstack must be run before any other switch configuration.
S50layer2 - Script that sets up a basic Layer 2 switch. All 48 ports are set up on one VLAN. This configuration script is appropriate for an Ethernet Switch Blade. It may need to be modified for other models.
Example Configuration Scripts
Example scripts are provided that can be used as templates. Use one of the scripts located in the /etc/rcZ.d/examples directory to help you configure the switch. The default configuration for the switch is located in the script file /etc/rcZ.d/S50layer2.
The following scripts are included (each is examined in more detail later in the appropriate section describing common Layer 2 and Layer 3 configurations):
S50layer2 - Script which sets up a basic Layer 2 switch. All 48 ports are set up on one VLAN. This is a copy of the script in /etc/rcZ.d that is loaded in the default configuration.
S50layer3 - Script which sets up a basic Layer 3 switch. All 48 ports are set up on individual IP networks (VLANs). Layer 3 switching is enabled.
S50multivlan - Script which sets up multiple untagged VLANs. (See Using the S50layer3 Script) Layer 3 switching is enabled.
S55gatedRip1 - Script which is used with a Layer 3 switch and calls the GateD daemon to enable RIP 1 routing protocol.
S55gatedRip2 - Script which is used with a Layer 3 switch and calls the GateD daemon to enable RIP 2 routing protocol.
S55gatedOspf - Script which is used with a Layer 3 switch and calls the GateD daemon to enable OSPF routing protocol.
Overview of OpenArchitect VLAN Interfaces
A zhp device is associated with one VLAN. A zhp may have one or more physical ports and their associated zre devices. A VLAN from the viewpoint of the switch is a logical mapping of ports based on intended use. The primary purpose of a VLAN is to isolate traffic and enable communication to flow more efficiently within groups of mutual interest. The switch is used to bridge from one VLAN to another. Figure 4.1: Fabric VLANs is an example of a custom Layer 2 VLAN network structure in a fabric switch.
In Figure 4.1, four VLANs for each fabric switch are used to organize traffic. This is just one example of how a Layer 2 switch could be configured with the fabric switch.
Tagging and Untagging VLANs
The OpenArchitect switch is capable of switching VLAN tagged and untagged data packets. VLAN tagged packets conform to the 802.1q specification and the packet header contains an additional four bytes of VLAN tag information. A given port can be specified to accept VLAN tagged or untagged traffic. Internally, all traffic for a particular VLAN is treated as tagged traffic.
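Under IEEE 802.1Q, the four bytes of tag information consist of a 16-bit tag protocol identifier (0x8100) followed by 16 bits of tag control information: a 3-bit priority (the 802.1p value), a 1-bit flag, and a 12-bit VLAN ID. As an illustration only, the sketch below assembles those four bytes in hexadecimal. The helper name dot1q_tag is hypothetical and is not part of OpenArchitect, and the flag bit is assumed to be zero.

```shell
#!/bin/sh
# Print the 4-byte 802.1Q tag in hex for a priority (0-7) and a VLAN ID (0-4095).
# Layout: TPID 0x8100, then priority << 13 | flag << 12 | VLAN ID (flag assumed 0).
dot1q_tag() {
    prio=$1
    vid=$2
    tci=$(( (prio << 13) | (vid & 0xFFF) ))
    printf '8100%04x\n' "$tci"
}

dot1q_tag 0 1      # VLAN 1, priority 0   -> 81000001
dot1q_tag 5 100    # VLAN 100, priority 5 -> 8100a064
```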
Figure 4.1: Fabric VLANs
Switch Port Interfaces
For each switch port, OpenArchitect creates a separate interface with its own MAC address called a ZNYX raw Ethernet (zre). After the initial power up, 48 zre interfaces are created, one for each in band port. You cannot directly access or modify the zre interfaces.
During the initial power up of the switch, the default configuration creates a Layer 2 switch. The Layer 2 configuration places the zre interfaces in one zhp interface. See Figure 4.1: Fabric VLANs. The number after zre represents the corresponding switch port number (that is, zre1 represents port 1 on the switch).
Layer 2 Switch Configuration
The steps to build a Layer 2 switch involve creating groups of switch ports in VLANs (Layer 2 switching domains) and bringing the interfaces up. zconfig creates the VLAN group of switch ports as well as a network interface. Use ifconfig(1M) on the network interface to bring up the VLAN group.
A startup script called S50layer2 is executed at boot time creating one untagged VLAN (zhp0) for all ports. The ISL is assigned its own VLAN. The interface to the host is then assigned the IP address of 10.0.0.43 to allow access to the switch. The VLAN is assigned an IP address. The /etc/rcZ.d/S50layer2 script does the following:
## Create a single untagged vlan (i.e. interface), consisting
# of the 48 Gigabit Ethernet ports Layer 2 forwarding enabled
# Put the ISL in its own vlan to avoid loops
#
/usr/sbin/zconfig zhp0: vlan1=zre0..50
/usr/sbin/zconfig zre0..50=untag1
/usr/sbin/zconfig zhp1: vlan2=zre51
/usr/sbin/zconfig zre51=untag2
sleep 1
#
# Assign the ZNYX default IP address 10.0.0.43 to the
# zhp0 interface and start it
#
ifconfig zhp0 10.0.0.43 netmask 255.255.255.0 broadcast 10.0.0.255 up
ifconfig zhp1 0.0.0.0
#
# At this point the system will act as a Layer 2 switch
# across all ports. Also, the system will accept telnet()
# connections on 10.0.0.43 on any port. Script(s) may then
# be run to reinitialize the system and modify its
# configuration.
Using the S50layer2 Script
The S50layer2 script can be used as an example, and edited to customize your Layer 2 setup.
The default script may not match your physical port configuration. In that case you will have to alter the script to suit your circumstances. For example, to reconfigure the IP address on your Layer 2 switch:
Open the S50layer2 file in the Linux vi editor.
Change the IP address value listed under the Linux ifconfig(1M) command line.
Save your changes by running OpenArchitect zsync.
zsync
Reboot the switch.
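If you prefer a scripted edit to vi, the address change in the steps above can be sketched with sed. This is an illustration, not a supplied tool: the helper name set_zhp0_address is hypothetical, and it assumes the script contains a line beginning with "ifconfig zhp0" followed by the address, as in the default S50layer2. It changes only the address, so the netmask and broadcast values on that line still need to be checked by hand.

```shell
#!/bin/sh
# Replace the address on the "ifconfig zhp0 <address> ..." line of a script.
# $1: script path, $2: new IP address. The file is rewritten in place with
# cat so that its existing permissions are kept.
set_zhp0_address() {
    sed "s/^\(ifconfig zhp0 \)[0-9.]*/\1$2/" "$1" > "$1.new" &&
        cat "$1.new" > "$1" &&
        rm -f "$1.new"
}

# Demonstration on a scratch copy rather than the live /etc/rcZ.d/S50layer2.
script=$(mktemp)
echo 'ifconfig zhp0 10.0.0.43 netmask 255.255.255.0 broadcast 10.0.0.255 up' > "$script"
set_zhp0_address "$script" 10.0.0.50
cat "$script"
rm -f "$script"
```

As with a manual edit, the change only survives a reboot after zsync has been run.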
Rapid Spanning Tree
The Rapid Spanning Tree Protocol (RSTP) configures a simply connected active topology from the arbitrarily connected components of a Bridged Local Area Network. RSTP participants use a simple dialog carried in packets called Bridge Protocol Data Units (BPDUs) for finding the shortest path between two networks and for eliminating loops from the topology. If nodes attached to ports fail or are added or deleted, the topology dynamically changes to accommodate the new configuration. If your network topology is such that there is no real redundancy or chance for loops, you do not need to turn on Spanning Tree.
zl2d is a shell script used to create Linux bridges consisting of the name of the previously created zhp device or devices preceded with a "b" (for example, if you are creating a Bridge device from zhp0, the resulting device would be bzhp0). zl2d then starts a background task that monitors the port information of the Linux bridge at a specified interval and updates the Spanning Tree state fields in the hardware when necessary.
brctl(8) is called by zl2d for configuring certain RSTP parameters. For an explanation of these parameters, see the IEEE 802.1d specification, or reference the brctl(8) man page in Appendix A. The following demonstrates a simple example of setting up a Layer 2 switch and starting RSTP.
To Enable Rapid Spanning Tree:
Create a VLAN containing the ports that will be a part of the Linux bridge running Rapid Spanning Tree. This example will use ports 0-3 (untagged):
zconfig zhp0: vlan1=zre0..3
zconfig zre0..3=untag1
Create a bridge device from the zhp device,
zl2d start zhp0
A Bridge device named bzhp0 should now exist consisting of ports zre0 through zre3 with Spanning Tree enabled. To view the bridge device, use the brctl command,
brctl show
brctl showbr bzhp0
Port Path Cost
Each port has an associated cost that contributes to the total cost of the path to the Root Bridge when the port is the root port. The smaller the cost, the better the path. The Ethernet Switch Blade uses the following IEEE 802.1D recommendations based on the connection speed of your port:
Port Path Cost
Link Speed    Recommended Value    Recommended Range
10 Mb/s       100                  50-600
100 Mb/s      19                   10-60
1 Gb/s        4                    3-10
10 Gb/s       TBD                  TBD
To change the port path cost, use the brctl setpathcost option. For example, to set the port path cost to a value consistent with a gigabit interface,
brctl setpathcost bzhp0 zre1 4
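When several ports need adjusting, the table lookup can be scripted. The sketch below is a hypothetical helper (recommended_path_cost is not an OpenArchitect command) that returns the recommended value from the table for a link speed given in Mb/s, and prints the resulting brctl command rather than running it.

```shell
#!/bin/sh
# Recommended path cost for a link speed in Mb/s, taken from the table above.
# Speeds without a listed value (such as 10 Gb/s, shown as TBD) return an error.
recommended_path_cost() {
    case $1 in
        10)   echo 100 ;;
        100)  echo 19 ;;
        1000) echo 4 ;;
        *)    echo "no recommended value listed for $1 Mb/s" >&2; return 1 ;;
    esac
}

# Dry run: print the command for a gigabit port instead of executing it.
echo "brctl setpathcost bzhp0 zre1 $(recommended_path_cost 1000)"
```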
Layer 3 Switch Configuration
The previous section outlines the Layer 2 switch configuration that is automatically configured when you initially bring up the OpenArchitect switch. In order to communicate between Layer 2 interfaces, you must set up routing.
The steps to build a Layer 2 switch involve creating a group of switch ports in a VLAN (or Layer 2 switching domain) and bringing that interface up. zconfig creates the VLAN group of switch ports as well as a network interface. Use ifconfig(1M) on the network interface to bring up the VLAN group with Layer 2 switching. Layer 3 routing information is then used to route between the Layer 2 network devices.
Take a simple example of two VLANs configured on the switch, each with four ports. First teardown any existing configuration,
zconfig -t
Use zconfig to create two new VLANs, each with four ports, and untag them,
zconfig zhp0: vlan1=zre1..4
zconfig zre1..4=untag1
zconfig zhp1: vlan2=zre5..8
zconfig zre5..8=untag2
Now, use ifconfig to assign each zhp interface an IP address,
ifconfig zhp0 10.0.0.1
ifconfig zhp1 11.0.0.1
At this point, the Linux host has enough information to route between the networks of the directly attached interfaces, 10.0.0.0 via zhp0, and 11.0.0.0 via zhp1.
The next step is to enable the zl3d daemon to move that routing information from the host to the Ethernet Switch Blade switching tables in silicon. Once enabled, zl3d will monitor the Linux routing tables for changes in configuration and update the switch silicon tables. Start zl3d to update the switch tables:
zl3d zhp0 zhp1
The Ethernet Switch Blade switch is now configured as a Layer3 switch that can route between two Layer2 devices in silicon.
Using the S50layer3 Script
To modify the configuration to a Layer 3 switch, remove the S50layer2 file from the /etc/rcZ.d directory, and replace it with the example script file, S50layer3.
In the S50layer3 script separate VLANs are set up for each port. The VLANs are labeled as zhp0..zhpn. Each VLAN is associated with an individual zre interface. There is always a one to one connection between VLANs and zhp interfaces. Remember, zre and zhp interfaces can begin with a zero value but a VLAN cannot (that is, zhp0 has zre0 on vlan1, zhp1 has zre1 on vlan2). Each zhp interface is assigned a separate IP address in the example script.
The S50layer3 script executes the following commands:
Runs zconfig command to create 48 untagged VLANs (one for each switch port).
/usr/sbin/zconfig zhp0..47: vlan1..48=zre0+
/usr/sbin/zconfig zre0..47=untag1+
NOTE: Double periods (..) indicate a range of values, as in zhp0..47 and vlan1..48. The plus (+) sign after zre0 and untag1 is a wildcard character that means auto-incremented and causes each zhp interface to hold only one zre (that is, zhp0 has zre0 on vlan1, zhp1 has zre1 on vlan2).
Runs the Linux ifconfig(1M) command for each interface to assign default IP addresses (10.0.0.43-10.0.47.43), sets the netmask and brings up the interfaces.
ifconfig zhp0 10.0.00.42 netmask 255.255.255.0 up
ifconfig zhp1 10.0.01.42 netmask 255.255.255.0 up
ifconfig zhp2 10.0.02.42 netmask 255.255.255.0 up
.
.
.
ifconfig zhp45 10.0.45.42 netmask 255.255.255.0 up
ifconfig zhp46 10.0.46.42 netmask 255.255.255.0 up
ifconfig zhp47 10.0.47.42 netmask 255.255.255.0 up
Runs the OpenArchitect zl3d. The zl3d application monitors the Linux routing tables and updates the switch routing tables for each interface configured above.
/usr/sbin/zl3d zhp0..47
zl3d initially creates and adds each zhp interface (VLAN) to the switch routing tables. The zhp0..47 argument is shorthand for the list of interfaces (zhp0, zhp1, …, zhp47) to monitor with zl3d.
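To make the range shorthand and the elided ifconfig lines concrete, the sketch below prints one set of commands per port instead of running them. It is an illustration under assumptions: the function names are hypothetical, the single-port zconfig form is inferred from the pairing described for S50layer3 (zhp0 has zre0 on vlan1, zhp1 has zre1 on vlan2), and the host octet is passed as a parameter because only the first lines of the ifconfig list establish the 10.0.<port>.<host> pattern.

```shell
#!/bin/sh
# Dry-run sketch: print per-port commands instead of executing them.

# Spell out "zhp0..N: vlan1..N+1=zre0+" one port at a time.
# $1: highest port number (47 for the 48-port fabric switch).
expand_per_port_vlans() {
    n=0
    while [ "$n" -le "$1" ]; do
        vlan=$((n + 1))               # VLAN numbering starts at 1, ports at 0
        echo "zconfig zhp$n: vlan$vlan=zre$n"
        echo "zconfig zre$n=untag$vlan"
        n=$((n + 1))
    done
}

# Print one ifconfig line per port, using 10.0.<port>.<host octet>.
# $1: highest port number, $2: host octet.
layer3_ifconfig_lines() {
    n=0
    while [ "$n" -le "$1" ]; do
        echo "ifconfig zhp$n 10.0.$n.$2 netmask 255.255.255.0 up"
        n=$((n + 1))
    done
}

expand_per_port_vlans 1
layer3_ifconfig_lines 47 42 | tail -n 1
```

Printing the commands first makes it easy to review the addressing plan before placing the real commands in a start up script.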
To Modify the Layer 3 Script
Modify the example script you copied into the /etc/rcZ.d directory. Adjust and assign the number of IP addresses as applicable. In the example below, the IP address is changed for the interface in the ifconfig command line of the script.
From:
ifconfig zhp0 10.0.0.43 netmask 255.255.255.0 broadcast 10.0.0.255 up
To:
ifconfig zhp0 193.08.1.1 netmask 255.255.255.0 broadcast 193.08.1.255 up
Adjust the number of zhp interfaces that are added to the routing tables, depending on the number of VLANs you are adding for your network. Include any other details, as applicable.
Run the OpenArchitect zsync command to save your changes.
zsync
Reboot the switch.
After rebooting, your switch works from your customized Layer 3 configuration.
Layer 3 Routing Protocols with GateD
An advanced networking configuration may require using the GateD software platform for deployment of Routing Information Protocols (RIP 1 or RIP 2) and Open Shortest Path First (OSPF) protocols. Once you’ve configured your Layer2 and Layer3 devices, start gated.
Using the S55gatedRip1 Script
To use GateD protocol with the switch, you need to copy two files into the same directory as your Layer 3 configuration file. From the /etc/rcZ.d/examples folder, copy the example script file and its corresponding GateD configuration file (for example, S55gatedRip1 and gated.conf.rip1).
The example startup script executes the following commands (S55gatedRip1 is used as an example):
Starts GateD with Rip1 using gated.conf.rip1 as the configuration file:
/usr/sbin/gated -f /etc/rcZ.d/gated.conf.rip1
The GateD conf file specifies the following configuration commands:
Implements the passive function so GateD is prevented from rerouting information to a
different interface if insufficient information is received.
interface 10.0.0.43 passive
interface 10.0.1.42 passive
interface 10.0.2.42 passive
.
.
.
interface 10.0.13.42 passive
interface 10.0.14.42 passive
interface 10.0.15.42 passive
Defines the netmask used in the interface.
define 10.0.0.43 netmask 255.255.255.0;
define 10.0.1.42 netmask 255.255.255.0;
define 10.0.2.42 netmask 255.255.255.0;
.
.
.
define 10.0.13.42 netmask 255.255.255.0;
define 10.0.14.42 netmask 255.255.255.0;
define 10.0.15.42 netmask 255.255.255.0;
Sets the RIP1 protocol to open.
};
rip1 yes{
Shuts off sending and receiving packets from all interfaces.
interface all noripin noripout
Opens sending and receiving packets for selected interfaces.
interface 10.0.0.43 ripin ripout version 1;
interface 10.0.1.43 ripin ripout version 1;
interface 10.0.2.43 ripin ripout version 1;
.
.
.
interface 10.0.13.43 ripin ripout version 1;
interface 10.0.14.43 ripin ripout version 1;
interface 10.0.15.43 ripin ripout version 1;
Imports routes learned through the RIP protocol.
import proto rip {
all;
};
Exports all directly connected routes and routes learned from the RIP protocol.
export proto rip {
proto direct {
all;
};
proto rip {
all;
};
To Modify the GateD Scripts:
Copy two GateD files, the OpenArchitect "S" file and its corresponding conf file, into the rcZ.d directory (that is, S55gatedRip1 and gated.conf.rip1). Notice the files are placed in the same directory as the Layer 3 configuration file.
For RIP1:
cp /etc/rcZ.d/examples/S55gatedRip1 /etc/rcZ.d
cp /etc/rcZ.d/examples/gated.conf.rip1 /etc/rcZ.d
Or for RIP2:
cp /etc/rcZ.d/examples/S55gatedRip2 /etc/rcZ.d
cp /etc/rcZ.d/examples/gated.conf.rip2 /etc/rcZ.d
Or for OSPF:
cp /etc/rcZ.d/examples/S55gatedOspf /etc/rcZ.d
cp /etc/rcZ.d/examples/gated.conf.ospf /etc/rcZ.d
Open and make configuration changes to the listed conf file to coincide with the current Layer 3 configuration (that is, adjust IP addresses and number of interfaces available). See GateD documentation if you have questions regarding the conf file.
Run the OpenArchitect zsync command to save your changes. Be sure your changes are correct:
zsync
Reboot the switch. After rebooting, your switch operates as a Layer 3 switch with GateD routing.
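Because the start up script and its conf file must always be copied as a pair, the copy step lends itself to a small wrapper. The sketch below is a hypothetical helper (install_gated_example is not shipped with OpenArchitect) that selects the pair by protocol name, using the file names listed above.

```shell
#!/bin/sh
# Copy the start up script and matching GateD conf file for one protocol.
# $1: protocol (rip1, rip2 or ospf)
# $2: examples directory, $3: destination directory
install_gated_example() {
    case $1 in
        rip1) script=S55gatedRip1 ;;
        rip2) script=S55gatedRip2 ;;
        ospf) script=S55gatedOspf ;;
        *)    echo "unknown protocol: $1" >&2; return 1 ;;
    esac
    cp "$2/$script" "$2/gated.conf.$1" "$3"
}

# On the switch this would presumably be called as:
#   install_gated_example rip2 /etc/rcZ.d/examples /etc/rcZ.d
```

The conf file still has to be edited to match the Layer 3 configuration, and zsync run, before rebooting.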
Class of Service (COS)
The following section provides information on using the OpenArchitect switch to provide Class of Service (COS) support. The switching fabric architecture defines the scope of the COS parameters. Some apply to an individual port, and others apply to the whole switch. It is important for the user to understand the scope of the parameters to ensure that the expected behavior occurs.
Egress Queues
The Ethernet Switch Blade fabric switch provides 1 to 8 COS queues per egress port, and for packets destined to the CPU from the switching fabric. By default, a freshly booted OpenArchitect switch has a single queue per egress port (and the CPU).
Ingress Classification
Incoming packets are mapped to queues based on their priority tags. The built-in behavior of the Ethernet Switch Blade uses the 802.1p tag within a packet as the queue selector. There is one COS to queue selector map per port.
By using the Linux iptables utility and zfilterd with ztmd, the queue selection can be based on any information in the first 64 bytes of the IP packet header. The default OpenArchitect switch behavior has all COS values mapping to a single queue on each of the egress ports.
A default priority for an untagged packet can be assigned for each port. By default, these incoming priority values are all mapped to COS queue 0. To change the default priority for untagged packets, or to define the mapping from priority values to COS queues, use the zcos command (refer to Appendix A).
Marking and Re-marking
The OpenArchitect switch can mark or remark packets using the TOS field or 802.1p tag. This is also controlled through the Linux iptables utility.
Scheduling
The servicing of configured queues by the switching fabric is referred to as scheduling. The OpenArchitect switch has three built-in scheduling algorithms. The type of scheduling algorithm used is implied, rather than being explicitly specified, based on the number of queues and which options are configured. The following scheduling algorithms are provided:
First In First Out (FIFO) – When only one queue is configured per port, packets are serviced in the order in which they arrive. This is the default for the OpenArchitect switch.
Strict Priority – This algorithm is used when more than one queue is provisioned on the port. The highest priority queue, which is also the highest numbered one, is always serviced first (Example: If four queues are configured, queue three is of higher priority than queue zero). As long as there are packets in the highest priority queue, the lower priority queues are not serviced. The danger is that higher priority traffic could block lower priority traffic.
Weighted Round Robin (WRR) – This algorithm is similar to Strict Priority scheduling, but it provides fairness with quanta for each queue. Each queue is assigned a number of packets, known as weight, that it is allowed to transmit before it yields to a lower priority queue. Note that with WRR, the priorities of the queues are dependent on the weights allocated. A higher priority queue with a smaller weight will get less wire-time than a lower priority queue configured with a larger weight. The relative weights used for priority queues on a port can be set using the zcos command (this is a switch-wide parameter).
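The effect of weights on service order can be seen in a small simulation. This is a generic weighted round robin sketch, not a model of the switch hardware: the function name wrr_order is hypothetical, queues are visited from the highest number down in each round as described above, and each queue sends up to its weight in packets per visit. Both argument lists must have the same number of entries, and every weight must be at least 1.

```shell
#!/bin/sh
# Simulate weighted round robin over queues 0..N-1.
# $1: space-separated weights (packets per visit), queue 0 first
# $2: space-separated backlogs (packets waiting), queue 0 first
# Prints the queue number of each packet in the order it would be sent.
wrr_order() {
    n=0
    for w in $1; do eval "weight_$n=$w"; n=$((n + 1)); done
    i=0
    for b in $2; do eval "backlog_$i=$b"; i=$((i + 1)); done
    remaining=1
    while [ "$remaining" -gt 0 ]; do
        remaining=0
        q=$((n - 1))                      # start each round at the highest queue
        while [ "$q" -ge 0 ]; do
            eval "w=\$weight_$q; b=\$backlog_$q"
            sent=0
            while [ "$sent" -lt "$w" ] && [ "$b" -gt 0 ]; do
                printf '%s ' "$q"
                b=$((b - 1))
                sent=$((sent + 1))
            done
            eval "backlog_$q=$b"
            remaining=$((remaining + b))
            q=$((q - 1))
        done
    done
    echo
}

# Queue 1 has weight 3, queue 0 has weight 1; four packets wait in each.
wrr_order "1 3" "4 4"    # prints: 1 1 1 0 1 0 0 0
```

With Strict Priority, the same backlog would drain all four packets from queue 1 before any packet from queue 0; here queue 0 gets a turn after every three packets.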
ztmd Explained
ztmd is a traffic management daemon which accepts messages from traffic filtering and quality of service applications and sets up the hardware.
zfilterd Explained
zfilterd is a daemon that intercepts filtering rules entered by the user via iptables, checks them for validity and then passes them on to ztmd for entry in the switch.
Running zfilterd
Before starting zfilterd, ztmd must be running. You can start both from within a script, or directly from the command line. For example,
ztmd
zfilterd
iptables rules can be entered at any time. If your iptables filtering rules set is extensive, you may want to move your set of iptables commands to a start up script to run upon initialization. This could be accomplished by creating a standalone "S" script and placing that script into /etc/rcZ.d.
Restrictions on Implementation
Several restrictions exist on the rules that can be implemented on the FFP hardware. These include:
Actions
DROP the packet. ACCEPT the packet.
Output Port
Should be specified if the action is ACCEPT. If no output port is specified, an IRULE table entry is generated for every port.
Field values
If specified as ranges, they must be on power of two boundaries.
Negation
Can only be used for icmp, tcp, or udp fields.
Fields supported are: Source IP address, destination IP address, IP protocol, TCP or UDP source port or destination port, ICMP type, and TCP flags bits (such as SYN).
The input port and output port may also be specified as either zre<n>, where <n> is one of the 48 physical ports, or as zhp<n>, where the zhp interface used must be previously defined using zconfig.
A restriction on the fields supported is the size of the IMASK table. There are only 16 entries per port available, which means only 16 combinations of fields can be used at any time.
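The power of two restriction on ranges follows from matching a field against a value and a mask: a single value and mask pair can only describe a block whose size is a power of two and whose start is a multiple of that size. The sketch below checks a range against that reading of the rule. The interpretation is an assumption, and the helper name is_pow2_range is hypothetical.

```shell
#!/bin/sh
# Succeed if the inclusive range lo..hi is a power-of-two sized block that
# starts on a multiple of its own size (the reading of "power of two
# boundaries" assumed here).
is_pow2_range() {
    lo=$1
    hi=$2
    size=$((hi - lo + 1))
    [ "$size" -gt 0 ] || return 1
    [ $((size & (size - 1))) -eq 0 ] || return 1   # size is a power of two
    [ $((lo % size)) -eq 0 ]                        # lo is aligned to size
}

for range in "1024 2047" "80 87" "20 25"; do
    set -- $range
    if is_pow2_range "$1" "$2"; then
        echo "$1-$2 ok"
    else
        echo "$1-$2 not aligned"
    fi
done
```

Under this reading, ports 1024-2047 and 80-87 could be expressed as one range, while 20-25 would have to be split into smaller aligned blocks.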
Conflict Resolution
There are differences from the expected behavior of implementing iptables in a host: Although the rules are taken from the FORWARD and INPUT chains, they are applied to all packets, including those destined for the local CPU. The order of application of the rules is not necessarily the order in which they appear in the chains. If a rule uses a mask that is less restrictive than another rule, it will be applied first. The last rule that is matched determines the action that will take place. For example, the rules:
iptables -A FORWARD -i zhp3 -j DROP
iptables -A FORWARD -i zhp3 -o zhp1 -p tcp --dport smtp -j ACCEPT
result in SMTP packets received on any port in zhp3 being forwarded to any port in zhp1; all other packets from zhp3 are dropped. The order of the two rules in the FORWARD chain does not matter.
On the other hand, in the following sequence of rules, the position of the rule that drops SYN packets is important. Since the set of fields it examines is not a subset of the fields examined by the ACCEPT rules, and vice versa, the ordering rule given above does not apply. In this case, the order in which it is applied will be the same as its position in the FORWARD chain, and all packets which are TCP SYN packets from zhp5 for zhp3 will be DROPPED, even if they also match one of the ACCEPT rules.
iptables -A FORWARD -i zhp5 -o zhp3 -j DROP
iptables -A FORWARD -i zhp5 -o zhp3 -p tcp --sport smtp -j ACCEPT
iptables -A FORWARD -i zhp5 -o zhp3 -p udp --sport domain -j ACCEPT
iptables -A FORWARD -i zhp5 -o zhp3 -p tcp --sport domain -j ACCEPT
iptables -A FORWARD -i zhp5 -o zhp3 -p tcp --sport www -j ACCEPT
iptables -A FORWARD -i zhp5 -o zhp3 -p tcp --sport 23 -j ACCEPT # telnet
iptables -A FORWARD -i zhp5 -o zhp3 -p tcp --syn -j DROP
iptables and filtering
iptables is a firewall management user-space utility used in conjunction with the Linux 2.4 kernels, and takes advantage of the netfilter 2.4 kernel code. iptables is extended with a few more targets to support the hardware filtering functionality used in the chips on the Ethernet Switch Blade (fabric board). Generally, all of the iptables functionality is usable with a few minor extensions.
A more detailed source on iptables can be found at:
http://www.netfilter.org/
Almost all the contents described here are derived from there.
There are also many tutorials and iptables manipulation tools, both graphical and command line. This is expressive of the OpenArchitect concept. A good place to start is:
http://freshmeat.net/search/?q=iptables
Introduction
Firewall rules are stored in tables. These tables are sometimes also known as firewall chains or just chains. Tables normally store rules for what are known as hooks, which can be viewed as packet-path junctions. There are five defined hooks: PRE-ROUTE, POST-ROUTE, INPUT, OUTPUT and FORWARDING. The example below illustrates the default chains on boot up.
By default, INPUT, FORWARD and OUTPUT chains are installed on boot up. Additional rules can be installed for the other chains. Additionally, one can write software extensions to add more chains. Figure 4.2 provides an illustration of the Firewall Flow.
Figure 4.2: Firewall Flow (diagram elements: Incoming, Preroute, Routing Decision, Input, Local Process, Forward, Output, Post Route, Outgoing)
When a packet reaches a circle in the diagram, that chain is examined to decide the fate of the packet. Two basic fates of a packet are defined as DROP and ACCEPT. If the chain says to DROP the packet, it is killed there; however, if the chain says to ACCEPT the packet, it continues traversing the diagram, ultimately terminating at an application or getting forwarded out of the box. There are additional actions which may be applied to packets. These are described in the "Supported Targets" section.
A chain is a checklist of rules. Each rule is checked against the packet header and if a rule matches, action is taken. If the rule doesn't match the packet, then the next rule in the chain is consulted. Finally, if there are no more rules to consult, then the kernel looks at the chain default policy to decide what to do. In a security-conscious system, this policy usually tells the kernel to DROP the packet.
In the Ethernet Switch Blade product, both the FORWARD chain hook, and the INPUT chain hook (packets destined for the CPU) are implemented in hardware. The rest of the hooks are in software in the Linux kernel. An extension of the FORWARD hook also resides in software. It is important to note that this is in sync with routing being implemented in hardware with software assist for exception handling. Under general circumstances, when routing happens in hardware, only the FORWARD chain is traversed. Under exceptional handling of an incoming packet, one can force the full software traversal. As a router you do not really care about the other hooks except in the situation where you have some special handling, in which case a policy would force the packet to be sent to the CPU for further processing.
NOTE: This is also how one would extend the OA packet munging capabilities (for example, introduce NAT).
Packet Walk
When a packet comes in via one of the interface ports, the Ethernet Switch Blade makes a routing decision. If the packet was destined for the Ethernet Switch Blade fabric switch itself or if the
send to CPU action is specified, it is sent to the INPUT chain for further processing. If there is no valid way to forward the packet, it is dropped. If the switch is configured to forward the packet, it is sent to the FORWARD chain.
Next the hardware FORWARD chain is walked. If an inserted rule matches the packet headers, that rule is consulted, and its policy decides the packet's fate.
In essence, a filter rule will be used to scan the packet data for certain characteristics. Upon a match a selected 'target' is executed. The target decides what should happen to the packet.
Filter Rules Specifications
A rule can be added (-A) to a chain, deleted (-D) from a chain, replaced (-R) in a chain or inserted (-I) at a specific position in a chain. Each rule specifies a set of conditions the packet must meet, and what to do if it meets them ('what to do' is referred to as a 'target').
Here's an example filter rule:
iptables -A FORWARD -p UDP -s 0/0 -d 10.0.0.1/32 --source-port 53 -j DROP
This adds to the FORWARD chain the rule: "If you see UDP packets (-p UDP) from anywhere (-s 0/0) going to host 10.0.0.1 (-d 10.0.0.1/32) with a source port number 53 (--source-port 53) then the target is to DROP (-j DROP)." More details on rule specifications follow.
Specifying Source and Destination IP Addresses
Source (-s, --source or --src) and destination (-d, --destination or --dst) IP addresses can be specified in four ways. The most common way is to use the full name, such as localhost or www.linuxhq.com. The second way is to specify the IP address such as 127.0.0.1.
Netmasks can be applied to IP addresses to specify ranges, like 199.95.207.0/24 or 199.95.207.0/255.255.255.0. Both specify any IP address from 199.95.207.0 to 199.95.207.255 inclusive. To specify an all-inclusive IP address /0 can be used, like: -s 0/0 or -d 0/0. The example rule we use above applies this trick. Note however that the effect above is the same as not specifying the -s option at all.
Specifying Protocol
The protocol can be specified with the -p (or --protocol) flag. Protocol can be a number (if you know the numeric protocol values for IP) or a name for the special cases of TCP, UDP or ICMP. Case does not matter, so tcp works as well as TCP.
Specifying an ICMP Message Type
If the protocol is ICMP, the --icmp-type option can be used to match a specific message type, for example, --icmp-type ping.
The type can be preceded by ! to match any message except the type listed, for example, --icmp-type ! 1
Specifying TCP or UDP ports
If the protocol is TCP or UDP, the --source-port (or --sport) and --destination-port (or --dport) options specify the TCP or UDP ports to match.
A range of ports can be specified by giving the first and last ports separated by a :, as in --dport 0:1023. It is also possible to precede the port specification with a ! to match all ports which are not included in the range, for example, --sport ! 0:1023. However, the number of ports in a range must be a power of two, and the range must start with a port number which is a multiple of that number.
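The port range restriction can be checked before a rule is written. The following is a minimal sketch, assuming that "power of two" refers to the number of ports in the range; hw_port_range_ok is a hypothetical helper name, not a utility shipped with the switch.

```shell
# Sketch: check a port range "first:last" against the restriction above.
# hw_port_range_ok is a hypothetical helper, not part of the switch software.
hw_port_range_ok() {
    first=${1%%:*}
    last=${1##*:}
    size=$((last - first + 1))
    # The number of ports must be positive and a power of two.
    [ "$size" -gt 0 ] || return 1
    [ $((size & (size - 1))) -eq 0 ] || return 1
    # The first port must be a multiple of the number of ports.
    [ $((first % size)) -eq 0 ]
}

for range in 0:1023 1024:2047 53:53 1000:1999 512:1535; do
    if hw_port_range_ok "$range"; then
        echo "$range satisfies the restriction"
    else
        echo "$range does not satisfy the restriction"
    fi
done
```

Under this reading, 0:1023 and 1024:2047 pass, 1000:1999 fails because 1000 is not a power of two, and 512:1535 fails because 512 is not a multiple of 1024.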
Specifying TCP flags
If the protocol is TCP, a match on particular TCP flags is specified by listing the flag names; for example, -p tcp --syn.
Specifying an Interface
The -i (or --in-interface) and -o (or --out-interface) options specify the name of an interface to match. An interface is the physical device the packet came in on (-i) or is going out on (-o). You can use the ifconfig command to list the interfaces that are 'up' (that is, working at the moment).
As a special case, an interface name ending with a + will match all interfaces, whether they currently exist or not, which begin with that string. For example, to specify a rule which matches all zhp interfaces, the -i zhp+ option would be used.
Filter Rule Targets
As mentioned above, the -j option within a filter rule specifies which target is to be used.
Supported Targets
The following are the supported targets. The switch has many additional targets that are software based (for example, Network Address Translation or generic connection tracking).
Classical Targets
DROP This drops the packet.
ACCEPT Accepts the packet
ZNYX Targets
ZACTION This is the ZNYX Action target.
Parameters for ZACTION:
--drop Drops the packet
--accept Accepts the packet
--set-prio <val> Set the 802.1p priority to <val>
--use-prio <val> Use queue priority <val>
--copy-cpu Send the packet to the CPU. This will force the full installed chains traversal in software
--set-eport <val> Redirect the packet to port <val>
--set-mport <val> Mirror the packet to port <val>
--set-tos <val> Set the IP-Precedence bits in the TOS field of the IP header to <val>
--set-dscp <val> Set the 6-bit DSCP in the TOS field of the IP header to <val>
Options with any of these ZACTION parameters:
--counter <val> Increment classifier hit counter <val>
--arp Not an action, match only ARP packets.
-i option can be used to specify ingress port or VLAN,
-d specifies target IP address,
-p specifies arp operation as request (1) or response (2).
For arp response, the -o field can be used to specify the egress port.
ZACTION Examples
Send all tcp packets arriving on zhp5 out port 2:
iptables -A FORWARD -i zhp5 -p tcp -j ZACTION --set-eport 2
Send all tcp packets arriving on zhp5 to the CPU (software):
iptables -A FORWARD -i zhp5 -p tcp -j ZACTION --copy-cpu
Set the 802.1p priority to 3 on all tcp packets arriving on zhp5:
iptables -A FORWARD -i zhp5 -p tcp -j ZACTION --set-prio 3
Extensions to the default matches
These are described in the Linux packet filtering HOWTO at:
http://netfilter.org/documentation/index.html#documentation-howto
The FORWARD chain supports all of them.
tc and zqosd
tc, which stands for Traffic Control, is a mechanism for enabling Quality of Service on Linux. tc uses three functional objects: queuing disciplines, which comprise queuing and scheduling algorithms such as FIFO queues, priority queues, RED queues, and token buckets; classes, which are leaves in queuing discipline hierarchies; and filters, such as u32 filters and route filters. In addition to these three building blocks, tc also includes policers and meters, which may be associated with filters.
The functional elements of tc may be combined to produce complex QoS rules. For example, a packet may be matched to a filter, metered, policed as in-profile or out-of-profile, remarked, mapped to a FIFO queue, and transmitted by a priority scheduler. tc is very flexible in the data paths that it allows.
The utility zqosd is a daemon that monitors Linux QoS policy and shadows the policy rules into a hardware configuration. When zqosd is running, tc rules are translated into hardware rules.
NOTE: This document does not detail all of the capabilities of the tc command; rather, it explicitly mentions only features that are supported by OpenArchitect-based switches.
The examples that follow assume that the switch is running the standard Layer 2 start-up script, /etc/rcZ.d/examples/S50layer2, with all ports placed in a single VLAN, zhp0. Note that this assumption is implied only by the fact that changes to zhp0 are shown to configure all ports. Neither tc nor zqosd is limited by the interface setup. Each utility works on either VLANs (zhp) or ports (zre).
FIFO Queues (pfifo and bfifo disciplines)
The simplest configuration for tc involves no classes or filters, and only a single FIFO queue. With tc, queue sizes may be specified in bytes or packets. The first example defines a packet-limited FIFO. This example begins with only tc and then illustrates tc in conjunction with zqosd.
As a first step, confirm that no tc configuration is active on the switch, by listing any queue disciplines:
tc qdisc ls
The command should return nothing. Now, add a single packet-limited FIFO queue to zhp0 and confirm that it has been installed to software:
tc qdisc add dev zhp0 handle 100:0 root pfifo limit 32
tc qdisc ls
The output should display the following,
qdisc pfifo 100: dev zhp0 limit 32p
The tc command is applied to a device, so dev zhp0 must be specified. Note that a VLAN, such as zhp0, and a port, such as zre0, are each treated as devices. Breakdown of the options:
handle 100:0
Defines the handle for the queuing discipline. This handle may be used to reference the pfifo queue. Note that the handle is included with the output of the qdisc ls command. (100:0 and 100: are equivalent in tc.) The choice of handle is significant for zqosd.
root
Tells tc that this is the base queuing discipline for the device, not a child of another queuing discipline.
pfifo limit 32
Specifies a packet-limited FIFO queue with an upper bound of 32 packets.
Now, delete the queuing discipline from zhp0 and confirm that it has been removed:
tc qdisc del dev zhp0 root
tc qdisc ls
Thus far, tc has been used without zqosd. It is not sufficient to install software rules on the OpenArchitect switch though, because the normal case is for packets to be switched in hardware.
For that reason, zqosd must be used to shadow tc configuration into hardware. Like zfilterd, zqosd works with ztmd, which provides the actual hardware interaction.
If ztmd is not already running, start it, then initiate the zqosd daemon with no parameters:
ztmd
zqosd
Now, repeat the same tc command as before, to install a packet-limited FIFO queue:
tc qdisc add dev zhp0 handle 100:0 root pfifo limit 32
When this command is processed, zqosd detects the state change and generates output.
For each port belonging to zhp0, the queue size has changed to 32 packets. Under the default switch configuration, all ports other than the CPU port belong to zhp0; so all queues other than the CPU queue are affected.
As before, remove the tc configuration with the command:
tc qdisc del dev zhp0 root
Note that zqosd detects this state change. In fact, examining the CoS configuration on the switch reveals that the queue sizes have reverted to their default values.
The byte-limited FIFO queue case differs only slightly from the packet-limited FIFO case. The syntax is almost identical. In hardware the limit is based on 128-byte cells. The specified byte limit is divided by 128 to determine the cell limit. Always specify a byte limit of at least 128 bytes to avoid setting the queue length to zero.
For example, to set the byte limit for zhp0 to 4096,
tc qdisc add dev zhp0 handle 100:0 root bfifo limit 4096
Tear down any installed rules before proceeding with the next example:
tc qdisc del dev zhp0 root
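The byte-to-cell conversion described for the bfifo case can be worked through with shell arithmetic. This is a sketch only: it assumes the division rounds down, which is consistent with the warning that limits under 128 bytes give a queue length of zero, and bytes_to_cells is a hypothetical helper name.

```shell
# Sketch: convert a bfifo byte limit into 128-byte hardware cells.
# Assumption: the result is rounded down (integer division).
CELL_BYTES=128

bytes_to_cells() {
    echo $(( $1 / CELL_BYTES ))
}

bytes_to_cells 4096   # 32 cells, the limit used in the example above
bytes_to_cells 100    # 0 cells, which is why limits under 128 bytes should be avoided
```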
PRIO and WRR queues
The FIFO examples used a single queue for each interface. In fact, the Ethernet Switch Blade fabric switch is capable of attaching 1 to 8 queues to each port, with either priority or weighted round robin (WRR) scheduling, and classification based on a priority map.
In tc, the prio queuing discipline establishes multiple queues and specifies their associated priority map. Although WRR support is not part of the standard tc distribution, it has been added to the prio discipline.
The final example in this document illustrates WRR. A strict priority scheduler is a simpler case that can be constructed easily from this example.
Examine the existing CoS settings on the switch, noting the number of queues per port, queue sizes, scheduling parameters, and priority map. Each of these values changes with this test.
The full set of commands to install four queues, a priority map, and weights is as follows:
tc qdisc add dev zhp0 handle 100:0 root prio bands 4 priomap 1 2 2 2 3 3 3 3 1 1 1 1 1 1 1 1 wrr 1 2 4 6
tc qdisc add dev zhp0 parent 100:1 pfifo limit 120
tc qdisc add dev zhp0 parent 100:2 pfifo limit 100
tc qdisc add dev zhp0 parent 100:3 pfifo limit 80
tc qdisc add dev zhp0 parent 100:4 pfifo limit 60
The first command attaches a queuing discipline as the root discipline for zhp0, with a handle of “100:0,” as in the FIFO cases. The “prio” option identifies the type of queuing discipline.
Priority scheduling implies multiple queues, and the “bands 4” parameter specifies that there are four queues.
The priority map may be read from left to right as Priority n maps to Queue q, where n is the index of the list element (numbering from 0) and q is the value specified by that element. So, this example would read:
Priority 0 maps to Queue 1
Priority 1 maps to Queue 2
Priority 2 maps to Queue 2
Priority 3 maps to Queue 2
Priority 4 maps to Queue 3
Priority 5 maps to Queue 3
Priority 6 maps to Queue 3
Priority 7 maps to Queue 3
Note that the tc priority map applies to a 4-bit field. With the Ethernet Switch Blade, the priority map refers to the 802.1p tag, which is a 3-bit field. When translating this tc rule to hardware, only Priorities 0 through 7 are significant; the other eight priorities are ignored.
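The left-to-right reading of a priority map can be reproduced with a short loop. This is an illustrative sketch; priomap_read is a hypothetical helper name, and it stops after eight entries because only Priorities 0 through 7 are significant in hardware.

```shell
# Sketch: print "Priority n maps to Queue q" for a priomap list.
# Only the first eight entries are printed (3-bit 802.1p tag).
priomap_read() {
    prio=0
    for queue in "$@"; do
        [ "$prio" -lt 8 ] || break
        echo "Priority $prio maps to Queue $queue"
        prio=$((prio + 1))
    done
}

# The priomap from the prio example in this section.
priomap_read 1 2 2 2 3 3 3 3 1 1 1 1 1 1 1 1
```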
The parameters wrr 1 2 4 6 specify that WRR scheduling is being used and assign a relative weight to each queue. The weights are treated as numbers of packets to be sent from each queue. In this example, if the queues have sufficient packets, queue 1 will have twice as many packets sent as queue 0, queue 2 will have four times as many, and queue 3 will have six times as many. The wrr parameters are scaled such that the maximum value is no more than 15. Values which would be 0 are set to 1:
Queue 0 has a weight of 1000 bytes
Queue 1 has a weight of 2000 bytes
Queue 2 has a weight of 4000 bytes
Queue 3 has a weight of 6000 bytes
The remaining commands each define a packet-limited FIFO queue. As with all previous tc examples, these queues are created on device zhp0. However, unlike all previous examples, they are not created as root disciplines for the device. Instead, the “parent” option identifies them as child queues of the prio discipline.
For example, “parent 100:1” identifies that queue as the first child of the prio discipline (Queue 0), because the prio discipline’s handle is 100:0.
After running each of those commands, again examine the CoS parameters. As with the simple FIFO example, the queue sizes change, here to the limits configured for each child queue. In addition, though, the number of queues changes to 4 for each port in zhp0. Furthermore, the weights have changed for each queue, as have the queue mappings.
To test the strict priority case, simply remove the wrr 1 2 4 6 options from the first tc command. Note that all queue disciplines in this test may be cleared by deleting the root discipline, as before:
tc qdisc del dev zhp0 root
The U32 Filter
The U32 filter provides the capability to match on fields in the L2, L3 or L4 header of a packet. Each match rule gives the location of the field to be tested, which is always a 32-bit word, a mask selecting the bits to be tested, and a value which is to be matched by the packet field. Many matches can be specified in one tc filter command. Only if all matches succeed does the filter match. In that case, the flowid field identifies the classid of the class this packet belongs in.
The following tc commands put all icmp packets in class 100:10, packets from IP address 1.2.3.4 in class 100:20, packets for IP address 1.2.3.4 in class 100:20, and arp reply packets in class 100:30. The last filter illustrates using an offset from the beginning of the protocol header, along with a mask, to locate the field to be matched:
tc filter add dev zhp0 protocol ip parent 100:0 u32 match ip protocol 1 0xff flowid 100:10
tc filter add dev zhp0 protocol ip parent 100:0 u32 match ip src 1.2.3.4/32 flowid 100:20
tc filter add dev zhp0 protocol ip parent 100:0 u32 match ip dst 1.2.3.4/32 flowid 100:20
tc filter add dev zhp0 protocol arp parent 100:0 u32 match u32 2 0xffff at +4 flowid 100:30
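The comparison performed by a single u32 match can be sketched as "word AND mask equals value". The example below applies it to the 32-bit word at offset 4 of an ARP header, which holds the hardware address length, the protocol address length and the 16-bit operation code; the sample words assume Ethernet and IPv4 (lengths 6 and 4). u32_match is a hypothetical helper name used only for illustration.

```shell
# Sketch: the test applied by one u32 match.
# $1 = 32-bit word taken from the packet, $2 = mask, $3 = value to match.
u32_match() {
    [ $(( $1 & $2 )) -eq $(( $3 )) ]
}

# ARP header word at offset 4: hlen=6, plen=4, operation in the low 16 bits.
arp_request_word=0x06040001
arp_reply_word=0x06040002

if u32_match "$arp_reply_word" 0xffff 2; then
    echo "ARP reply matches 'match u32 2 0xffff at +4'"
fi
if ! u32_match "$arp_request_word" 0xffff 2; then
    echo "ARP request does not match"
fi
```

The mask 0xffff discards the two length bytes, so only the operation code takes part in the comparison.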
Combining Queuing Disciplines
Any of the queue length limiting disciplines can be used with the bandwidth management queuing disciplines, by defining them with the handle of one of the classes as their parent. For the htb queuing discipline, each class has an explicit handle specified when it is defined. For the prio queuing discipline, including wrr, each band is a class; their handles are formed from the handle of the prio qdisc by appending a minor number of 1 to n for the n bands. For example, the following commands define two strict priority queues for port zre5, with each queue limited to 32 kb:
tc qdisc add dev zre5 root handle 100:0 prio bands 2 priomap 0 0 0 0 1 1 1 1
tc qdisc add dev zre5 parent 100:1 handle 110:0 bfifo limit 32kb
tc qdisc add dev zre5 parent 100:2 handle 120:0 bfifo limit 32kb
These translation rules handle conversions of individual rules from tc entries into hardware entries. They do not explain the results of creating rules that are individually supported but do not make sense in conjunction.
Although the translation rules handle some inconsistency between software and hardware, a user must define a combination of rules that is reasonable in hardware, to ensure predictable results.
Handle Semantics
All examples have illustrated zqosd copying tc rules into hardware. In fact, the zqosd utility also enables the user to add tc rules that remain only in software. This selection is based on handles. zqosd processes all supported queue disciplines and filters with handles between 100:0 and 200:FFFF.
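A handle can be tested against this range before a rule is installed. The sketch below assumes that handles are hexadecimal major:minor pairs, as tc displays them, and that both bounds are inclusive; in_zqosd_range is a hypothetical helper name.

```shell
# Sketch: report whether a tc handle lies between 100:0 and 200:FFFF.
# Assumptions: hexadecimal major:minor notation, inclusive bounds.
in_zqosd_range() {
    major=${1%%:*}
    minor=${1#*:}
    [ -n "$minor" ] || minor=0     # "100:" is equivalent to "100:0"
    handle=$(( (0x$major << 16) | 0x$minor ))
    [ "$handle" -ge $(( 0x01000000 )) ] && [ "$handle" -le $(( 0x0200FFFF )) ]
}

for h in 100:0 150:1 200:FFFF 300:0 10:0; do
    if in_zqosd_range "$h"; then
        echo "$h: in the zqosd range"
    else
        echo "$h: outside the zqosd range"
    fi
done
```

Under these assumptions, a rule added with a handle such as 300:0 would fall outside the range that zqosd processes.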
COPS: Common Open Policy Service
The Common Open Policy Service (COPS) is a protocol for distributing networking policy to devices such as switches and routers. COPS allows a single Policy Decision Point (PDP) to distribute policy to multiple Policy Enforcement Points (PEPs). A PDP acts as a server for PEP clients. Figure 4.3 provides an illustration of the COPS Network Architecture.
Figure 4.3: COPS Network Architecture
A PDP contains all of the policy rules for its associated PEPs. A PDP typically stores rules in a database and is a dedicated server, not a forwarding device.
A PEP is any network device that has to enforce policy decisions. For example, a switch that restricts network access or prioritizes traffic fits the definition of a Policy Enforcement Point. A PEP makes no policy decision. It simply applies the policy that it receives from its PDP.
COPS uses a connection-based query and response mechanism. The following scenario illustrates PEP-PDP communication:
A PEP comes online and opens a connection to its PDP.
After a connection has been established, the PEP transmits state information to the PDP.
The PDP uses that state information to determine what policy is applicable for the PEP.
The PDP sends that policy to the PEP.
The PEP installs the policy and applies it to future traffic.
As long as COPS is running, a connection between the PEP and PDP should stay open. A PEP could query a PDP at any time asking for a policy decision. Alternatively, an administrator could modify the policy on a PDP, which would then push any policy changes to its PEPs.
Protocol Architecture
The COPS protocol is broken into several components. The base layer is the COPS protocol itself, which defines the messaging format. This protocol defines how communication is handled without specifying the details of the message data.
The base COPS protocol is then used by different client types. These client types apply the COPS messaging scheme to particular types of data. The currently standardized client types deal with the RSVP model (COPS-RSVP) and provisioning model (COPS-PR).
The COPS-RSVP scheme is designed around the requirement that a PEP will have to query a PDP in response to events. An RSVP PEP is constantly listening for resource reservation requests and relaying those requests to its PDP.
By contrast, the provisioning model is based on longer lasting policy. The expectation is that policy should be administratively defined at the PDP and pushed to the PEPs as needed. OpenArchitect is a COPS-PR client.
The most common use of COPS-PR is for distributing Differentiated Services (Diffserv) policy. Diffserv is concerned with such Quality of Service elements as queues and schedulers.
OpenArchitect PEP
The OpenArchitect PEP implementation is known as pepd. The pepd utility is based on:
RFC 2748: Common Open Policy Service (COPS)
RFC 3084: COPS Usage for Policy Provisioning
RFC 3159: Structure of Policy Provisioning Information
RFC 3289: Management Information Base (MIB) for the Differentiated Services Architecture
Internet Draft: Differentiated Services Quality of Service Policy Information Base (latest version draft-ietf-diffserv-pib-09)
Internet Draft: Framework Policy Information Base (latest version draft-ietf-rap-frameworkpib-09)
A Policy Information Base (PIB) defines the representation of a particular data set. For example, the Diffserv PIB specifies the structures used to represent all Diffserv elements. PIBs are functionally equivalent to Management Information Bases (MIBs) such as those used by SNMP. The OA PEP has implemented those portions of the Diffserv and Framework PIBs that are supported by the underlying switch architecture.
The pepd utility requires a PDP that has implemented the above RFCs and drafts. Until all draft standards are approved, certain COPS-PR data types will not be assigned OIDs. pepd uses non-standard OIDs for the unassigned values.
Using pepd
The pepd utility works by connecting to a PDP, informing the PDP of its roles, and installing any rules that the PDP has for those roles. Configuration information should be placed in a configuration file, which is named on the command line with the -f option.
pepd -f <full_path_and_filename>
A sample configuration file is listed below:
PDP address: 10.0.0.11
PDP port: 3288
PEPID: some-id
Role-If: a zre1,zre2,zre3,zre4
where,
PDP address: The IP address of the PDP. Default is loopback (127.0.0.1)
PDP port: The destination port on which to open a COPS connection. Default is 3288.
PEPID: The PEP Identifier
Role-If: A mapping of roles to interfaces. The name of the role is followed by a comma-delineated list of interfaces. Multiple role-interface mappings are defined through multiple Role-If declarations.
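As an illustration of multiple Role-If declarations, the sketch below writes a configuration file that maps two roles to different ports. The role names edge and core, the PEPID value and the file location are hypothetical, and the one-setting-per-line layout is an assumption about the file format.

```shell
# Sketch: write a pepd configuration file with two role-interface mappings.
# Hypothetical values: role names "edge" and "core", PEPID, and file path.
write_pepd_conf() {
    cat > "$1" <<'EOF'
PDP address: 10.0.0.11
PDP port: 3288
PEPID: some-id
Role-If: edge zre1,zre2
Role-If: core zre3,zre4
EOF
}

conf=${TMPDIR:-/tmp}/pepd.conf.example
write_pepd_conf "$conf"
cat "$conf"
# On the switch, pepd would then be started with:
#   pepd -f "$conf"
```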
Chapter 5 Fabric Switch Administration
One of the main benefits of the OpenArchitect switch is that it runs Linux, so much of the switch administration is already familiar to most network or system administrators. It is a good idea to complement these instructions with a standard Linux reference guide, such as Linux Network Administrator’s Guide available from O’Reilly. Below are brief descriptions of some of the more routine administrative tasks pertinent to the switch.
Setting the Root Password
The switch is shipped with a default user root and no password. To set the root password, use the passwd command:
ZX7100-OA<release no.># passwd
Changing password for root
Enter the new password (minimum of 5, maximum of 8 characters)
Please use a combination of upper and lower case letters and numbers.
Enter new password:
Re-enter new password:
Password changed.
ZX7100-OA<release no.>#
NOTE: Even when just changing the password, you need to save the file system overlay with the zsync command, or you will lose your changes upon reboot.
Adding Additional Users
Additional users can be added with the adduser command. Additional users are desirable for connecting to the switch via ftpd and other daemons that require a login other than root and a password. To create a user named guest, run adduser:
ZX7100-OA<release no.># adduser guest
Changing password for guest
Enter the new password (minimum of 5, maximum of 8 characters)
Please use a combination of upper and lower case letters and numbers.
Enter new password:
Re-enter new password:
Password changed.
ZX7100-OA<release no.># zsync
ZX7100-OA<release no.>#
Setting up a Default Route
If you wish to access the switch from some place other than a directly attached network, you may want to set up a default route. Use the route command to set a default gateway.
route add default gw 10.0.0.254
Put the entry into the /etc/init.d/rcS startup script to automatically set a default route upon reboot.
Name Service Resolution
Name service lookups will be done locally using /etc/hosts. You can also tell the switch which name server to use by including an entry in /etc/resolv.conf.
DHCP Client Configuration
A utility is included to dynamically determine the IP address of the OpenArchitect switch interfaces. To set the IP address dynamically, execute the command,
dhclient zhp0
The default device name, zhp0, works with the default configuration of the OpenArchitect switch and will attempt to obtain an IP address from the local DHCP server. To use DHCP to set your IP addresses automatically on boot up, uncomment the following line in /etc/init.d/rcS by removing the # sign:
/usr/sbin/dhclient zhp0
DHCP Server Configuration
The OpenArchitect switch includes a DHCP server. To start the DHCP server, configure /etc/dhcpd.conf for your network, and run
dhcpd
Consult Linux Network administration manuals for more information on DHCP and configuration options.
To start the DHCP server automatically on boot up, uncomment the following line in /etc/init.d/rcS by removing the # sign:
dhcpd
Network Time Protocol (NTP) Client Configuration
NTP is a protocol for setting the real time clock on a system. There are numerous primary and secondary servers available on the network. For more NTP information, and a list of available NTP servers, see the following URL:
http://www.ntp.org/
You will need to have your network settings properly configured to reach an available NTP server on your local network or the internet. To set the time and date, execute ntpdate with the server of your choice. For example,
ntpdate -u ntp.ucsd.edu
The -u is required if the OpenArchitect switch is operating behind some types of firewalls.
If you wish for ntpdate to set your date and time automatically each time you boot, uncomment the example ntpdate command line in /etc/init.d/rcS by removing the # sign.
ntpdate returns the Universal Time (UTC, formerly Greenwich Mean Time, or GMT). To display the local time, set the TZ variable to the appropriate name and the number of hours offset from UTC. For instance,
export TZ=PST8
for Pacific Standard Time offset from UTC by 8 hours. To set an environment variable, add the entry to /etc/profile. Remember to zsync to make your changes permanent.
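The effect of the TZ variable can be seen by printing one fixed instant under two settings. This is a sketch that assumes a date command accepting the -d @seconds form (as GNU and BusyBox date do); whether the date utility on the switch supports it is not confirmed here. show_epoch is a hypothetical helper name.

```shell
# Sketch: TZ changes how a time is displayed, not the clock itself.
# Prints the Unix epoch (0 seconds) as seen in the given time zone.
# Assumption: date supports "-d @seconds".
show_epoch() {
    TZ=$1 date -d @0 '+%Y-%m-%d %H:%M'
}

show_epoch UTC0    # 1970-01-01 00:00
show_epoch PST8    # 1969-12-31 16:00, eight hours behind UTC
```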
Network File System (NFS) Client Configuration
The OpenArchitect switch includes an NFS client for mounting remote file systems. To use NFS, you will need to start the following server processes:
/sbin/portmap
/sbin/rpc.statd
/usr/sbin/rpc.mountd -r
Once the above servers are started, you can mount a remote NFS file system.
mount rhost:nfs_file_system local_mount_point
If the remote NFS file system you’re mounting is on an OA switch, you should mount with caching disabled.
mount rhost:nfs_file_system -o noac local_mount_point
All the necessary servers are included in /etc/init.d/rcS but are commented out by default. To automatically start all NFS client services each time you boot, uncomment the NFS Client servers. Go to the /etc/init.d/rcS file. Uncomment the following command lines by removing the # sign.
/sbin/portmap
/sbin/rpc.statd
/usr/sbin/rpc.mountd -r
You can also include commands to mount remote NFS file systems at boot time. There is an example line included at the appropriate location in /etc/init.d/rcS. Uncomment and alter the mount command included for your particular configuration.
NOTE: A “sleep” of 5 seconds is included to allow time for the links to come up prior to attempting the mount.
sleep 5
mount 10.0.0.1:/nfs -t nfs -o noac /mnt
NFS Server Configuration
The switch also contains an NFS server so that you can mount the switch file system from other systems. To enable the NFS server, first follow the steps to enable the NFS client. Then, edit /etc/exports to include the file systems you wish to export. Consult a standard Linux Network Administrator’s Guide (or man pages) regarding options for exported file systems. Generally, an entry in /etc/exports looks like the following:
/nfs *.localdomain.com(ro)
Now start nfsd to export the mount points and begin answering requests from remote clients.
/sbin/rpc.nfsd -r
To export file systems automatically on boot, edit /etc/init.d/rcS and uncomment the /sbin/rpc.nfsd command line by removing the #:
/sbin/rpc.nfsd -r
Connecting to the Switch Using FTP
Use ftp to transfer files to or from the switch. See the Linux Reference Guide for details of the ftp command. In general, you can use ftp to connect to any system running an ftp server, including other OpenArchitect switches, to either get (transfer files from the remote host to the switch) or put (transfer files from the switch to the remote host) files.
ftp <remote_host>
ftpd Server Configuration
The switch itself can also be configured to run an FTP server (ftpd). See the Linux Reference Guide for details of the ftpd command. You will need to add a user to the switch in order to connect via ftp from a remote host, since root is not allowed ftp access. See the earlier section in this chapter regarding how to add a user. The ftp daemon is started by default. If you wish to shut down the ftp daemon, comment out the betaftpd line in /etc/init.d/rcS.
Connecting to the Switch Using TFTP
Trivial File Transfer Protocol, or tftp, is a very simple protocol used to transfer files. It is designed to be small and easy to implement. Therefore, it lacks most of the features of regular FTP, such as user authentication. You can use tftp to connect to any system running a tftp server (tftpd), including other OpenArchitect switches.
tftp <remote_host>
TFTPD Server Configuration
The tftp server is started by inetd(8) using the configuration set up in /etc/inetd.conf. The use of tftp(1) does not require an account or password on the remote system. Due to the lack of authentication information, tftpd will allow only publicly readable files to be accessed. The default location of these files is /tftpboot.
SNMP Agent
Simple Network Management Protocol (SNMP) is the de facto standard for network management. An SNMP agent maintains a structure of data for a network device in a virtual information database, called a Management Information Base (MIB). A network management station can access the MIB of the network device to monitor and configure that device.
The OpenArchitect switch utilizes the NET-SNMP (formerly UCD-SNMP) agent core. Additional information on the agent can be found at: http://www.net-snmp.com. The OpenArchitect switch agent will respond to SNMPv1, SNMPv2, and SNMPv3 requests.
Protocols supported on the OpenArchitect switch by gated, such as RIP and OSPF, communicate with the SNMP agent via the SNMP Multiplexing (SMUX) protocol.
Supported MIBs
OpenArchitect includes MIB support as documented by each of the RFCs listed. The MIBs themselves are located on the switch in the /usr/share/snmp/mibs directory.
RFC 1155: Structure and Identification of Management Information for TCP/IP-based Internets
RFC 1213: Management Information Base for Network Management of TCP/IP-based internets: MIB-II (obsoletes RFC 1158)
RFC 1227: SNMP MUX Protocol and MIB
RFC 1493: Definitions of Managed Objects for Bridges (obsoletes RFC 1286)
RFC 1657: Definitions of Managed Objects for the Fourth Version of the Border Gateway Protocol (BGP-4) using SMIv2
RFC 1724: RIP Version 2 MIB Extension (obsoletes RFC 1389)
RFC 1850: OSPF Version 2 Management Information Base (obsoletes RFC 1253, which obsoletes RFC 1252, which obsoletes RFC 1248)
RFC 2011: SNMPv2 Management Information Base for the Internet Protocol Using SMIv2
RFC 2012: SNMPv2 Management Information Base for the Transmission Control Protocol Using SMIv2
RFC 2013: SNMPv2 Management Information Base for the User Datagram Protocol Using SMIv2
RFC 2021: Remote Network Monitoring Management Information Base Version 2
RFC 2096: IP Forwarding Table MIB
RFC 2571: An Architecture for Describing SNMP Management Frameworks
RFC 2572: Message Processing and Dispatching for the Simple Network Management Protocol (SNMP)
RFC 2573: SNMP Applications
RFC 2574: User-based Security Model (USM) for version 3 of the Simple Network Management Protocol (SNMPv3)
RFC 2575: View-based Access Control Model (VACM) for the Simple Network Management Protocol (SNMP)
RFC 2576: Coexistence between Version 1, Version 2 and Version 3 of the Internet-standard Network Management Framework
RFC 2665: Definitions of Managed Objects for Ethernet-like Interfaces
RFC 2674: Definitions of Managed Objects for Bridges with Traffic Classes, Multicast Filtering and Virtual LAN Extensions
RFC 2742: Definitions of Managed Objects for Extensible SNMP Agents
RFC 2787: Definitions of Managed Objects for the Virtual Router Redundancy Protocol
RFC 2819: Remote Network Monitoring Management Information Base
RFC 2863: The Interfaces Group MIB (obsoletes RFC 2233, which obsoletes RFC 1573, which obsoletes RFC 1229)
RFC 2932: IPv4 Multicast Routing MIB
RFC 3165: Definitions of Managed Objects for the Delegation of Management Scripts
RFC 3231: Definitions of Managed Objects for Scheduling Management Operations
ZNYX Networks Private MIB: Custom ZNYX MIB to support software and hardware features not covered by standard MIBs. The private MIBs are ZX7100BASE.MIB and ZX7100FABRIC.MIB, pointed to by ZNYX-H.MIB.
UCD-SNMP Enterprise MIB: UCD-SNMP MIB related to management and monitoring of the Linux host
Table 5.1: Supported MIBs
Supported Traps
Upon certain events, the OpenArchitect switch can be configured to send a notification of the event, called an SNMP trap, to a defined recipient/manager or managers. Traps are not issued in real time. OpenArchitect will send SNMP traps for the following conditions:
SNMPv2-MIB: coldStart
SNMPv2-MIB: authenticationFailure
IF-MIB: linkUp
IF-MIB: linkDown
UCD-SNMP-MIB: ucdShutdown
RMON-MIB: risingAlarm
RMON-MIB: fallingAlarm
VRRP: vrrpTrapNewMaster
VRRP: vrrpTrapAuthFailure
EGP (rfc1213): egpNeighborLoss
BGP4-MIB: bgpEstablished
BGP4-MIB: bgpBackwardTransition
Table 5.2: Supported Traps
SNMP and OpenArchitect Interface Definitions
OpenArchitect defines three types of devices:
zre physical port
zrl trunk of ports
zhp interface consisting of ports (zres) and trunks of ports (zrls)
A zrl (trunk device) is treated as an aggregate of its constituent zres (ports). A zhp is an aggregate of its immediately contributing sub-interfaces (zres and zrls). The ports that make up a trunk do not contribute to the zhp.
The administrative statuses of a zre and a zhp are independent of each other. If the administrative status is down, then the operational status will be down independent of the underlying link state. You must ifconfig up the zres to see the operational link status for a zre. When the administrative status is up, the operational status is dependent on the underlying physical state. For example, if zhp0 contains zre1 and zre2, the following would be true for the operational status, given the administrative status is up on zre1, zre2, and zhp0:
Physical Link Status    SNMP Operational Status
zre1    zre2            zre1    zre2    zhp0
down    down            down    down    down
down    up              down    up      up
up      down            up      down    up
up      up              up      up      up
Table 5.3: Link and SNMP Status
The administrative status is directly controlled by ifconfig up/down. The administrative statuses of the zhps and zres do not affect each other.
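The rule behind Table 5.3 can be sketched as a small shell function. This is only an illustration of the behavior described above. zhp_oper_status is a hypothetical helper, not a command shipped with OpenArchitect, and it assumes a zhp is operationally up when it is administratively up and at least one member link is up.

```shell
# Sketch: derive the operational status of a zhp from its administrative
# status and the link states of its member ports, following Table 5.3.
# zhp_oper_status is a hypothetical helper, not a switch command.
zhp_oper_status() {
    admin="$1"; shift
    # Administratively down always means operationally down.
    [ "$admin" = "up" ] || { echo down; return; }
    # Administratively up: any member link that is up brings the zhp up.
    for link in "$@"; do
        if [ "$link" = "up" ]; then echo up; return; fi
    done
    echo down
}

zhp_oper_status up down up     # prints: up
zhp_oper_status up down down   # prints: down
```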
ifStackTable Entries
In the actual ifStackTable, as shown in the MIB walk, the following two OIDs (whose last two components denote ifIndexes) show the relationships.
ifMIB.ifMIBObjects.ifStackTable.ifStackEntry.ifStackStatus.0.1 = active(1)
zhp0
ifMIB.ifMIBObjects.ifStackTable.ifStackEntry.ifStackStatus.0.2 = active(1)
If the last two components of the OID are X.Y, then:
if X = 0, there is nothing above interface Y
if Y = 0, there is nothing below interface X
otherwise, interface X has interface Y as a logical constituent.
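As a minimal sketch of the X.Y rule, the following shell function turns an index pair into a one-line description. describe_stack_entry is a hypothetical name used only for illustration. It expects the two trailing ifIndex values of an ifStackStatus OID, joined by a dot.

```shell
# Sketch: interpret the trailing X.Y index of an ifStackStatus OID.
# describe_stack_entry is a hypothetical helper for illustration only.
describe_stack_entry() {
    higher="${1%%.*}"   # X: the higher-layer ifIndex
    lower="${1#*.}"     # Y: the lower-layer ifIndex
    if [ "$higher" = "0" ]; then
        echo "nothing above interface $lower"
    elif [ "$lower" = "0" ]; then
        echo "nothing below interface $higher"
    else
        echo "interface $higher has interface $lower as a logical constituent"
    fi
}

describe_stack_entry 0.1   # prints: nothing above interface 1
```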
SNMP Configuration
The SNMP agent is called snmpd and is started by default from the Linux boot up script /etc/rcZ.d/S75snmpd. If you do not wish to start snmpd, remove /etc/rcZ.d/S75snmpd.
Configuration of the OpenArchitect switch SNMP agent is the same as configuration of any standard Linux host that uses the NET-SNMP agent. Configuration information for persistent data and security information is kept in snmpd.conf under the default SNMP configuration location, which for the OpenArchitect switch is /usr/share/snmp. snmpd.conf is the location to change sys information such as the syslocation and syscontact, as well as permissions such as the rocommunity or rwcommunity.
NOTE: For NET-SNMP agents, these objects (sysLocation.0, sysContact.0 and sysName.0) ordinarily are read-write. However, specifying the value for one of these objects by giving the appropriate token in snmpd.conf makes the corresponding object read-only, and attempts to set the value of the object will result in a notWritable error response.
The processing for link up and link down traps is user configurable. By default, traps conform to RFC 2863, meaning the trap contents will include:
ifIndex, ifAdminStatus and ifOperStatus
You can alter this behavior by specifying:
cisco_link_traps on
If cisco_link_traps are turned on as described then link up and link down traps will have a cisco-like format and the trap contents will include:
ifDescr and ifType
Examine and edit /usr/share/snmp/snmpd.conf appropriately for your configuration. Information in /usr/share/snmp/snmpd.conf is only read at startup, or when the daemon is forced to reread its configuration. See the standard Linux man page for snmpd.conf for more details.
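The tokens mentioned above can be pictured with a short sketch that prints a few snmpd.conf lines. snmpd_conf_fragment is a hypothetical helper, and the location, contact, and community values are placeholders rather than shipped defaults. Keep in mind that setting syslocation or syscontact this way makes the corresponding object read-only, as the NOTE explains.

```shell
# Sketch: print snmpd.conf lines for the tokens discussed in this section.
# snmpd_conf_fragment is a hypothetical helper; all values are placeholders.
snmpd_conf_fragment() {
    printf 'syslocation %s\n' "$1"   # system location string
    printf 'syscontact %s\n' "$2"    # administrative contact
    printf 'rocommunity %s\n' "$3"   # read-only community name
}

snmpd_conf_fragment "Lab rack 4" "admin@example.com" "public"
```

The output could be reviewed and then merged by hand into /usr/share/snmp/snmpd.conf. The agent has to be restarted, or told to reread its configuration, before the change takes effect.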
SNMP Applications
The OpenArchitect switch includes the snmpget, snmpwalk, and snmpset applications. You can use these standard Linux utilities to test your SNMP agent. For example,
snmpwalk localhost -c public
walks the entire MIB of the localhost (that is, the OpenArchitect switch) starting at the top of the MIB. See the Linux Reference Man Pages for the usage of the SNMP utilities.
MIB values are decoded from their numerical representations into readable text by parsing MIBs located in the /usr/share/snmp/mibs/ directory. If you need to add a MIB, add it to that directory and zsync to save across reboots.
Port Mirroring
zmirror sets packet mirroring from a given set of ports to a given port. Turning on packet mirroring causes a copy of the packet to be sent to the mirror_to port. There is only one mirror_to port, and no limitation on mirror_from ports. Use the zmirror command in the following way,
zmirror mirror_from mirror_to
After executing the following three commands, packets received on ports 0, 1 and 2 would be mirrored (copied and transmitted) to port 12. This mirroring would be in addition to any Layer 3 or Layer 2 switching.
zmirror zre0 zre12
zmirror zre1 zre12
zmirror zre2 zre12
To clear the current mirroring use the -t option. The -e option can be used to indicate that packets being sent on a given port should be copied to the mirror_to port. For example, if the -e option is used as follows, the packets transmitted, as opposed to received, on ports 0, 1 or 2 would be mirrored to port 12.
zmirror -e zre0 zre12
zmirror -e zre1 zre12
zmirror -e zre2 zre12
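When several ports share one mirror_to port, the repeated commands can be generated with a loop. The sketch below only prints the zmirror command lines (a dry run), so it can be checked before anything is executed. mirror_cmds is a hypothetical helper, and the zmirror syntax it prints is taken from the examples above.

```shell
# Sketch: print one zmirror command per source port (dry run only).
# Usage: mirror_cmds [-e] <mirror_to port number> <mirror_from port number>...
# mirror_cmds is a hypothetical helper, not part of OpenArchitect.
mirror_cmds() {
    opt=""
    if [ "$1" = "-e" ]; then opt="-e "; shift; fi
    to="$1"; shift
    for from in "$@"; do
        echo "zmirror ${opt}zre${from} zre${to}"
    done
}

mirror_cmds 12 0 1 2
```

On the switch, the printed lines could be reviewed and then run by hand or piped to the shell.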
Link and LED Control
The zlc application sets the link speed and state of individual ports of the switch, or displays their current state. It can also set or clear the extract LED or the internal fault LED, or set a port down or up. To force the link on port 1 down,
zlc zre1 down
To check the status of a link,
zlc zre1 query
To check the status of all links,
zlc zre0..51 query
Link Event Monitoring
The zlmd application is intended to run as a daemon, waiting for a configured event to occur and then running the program configured for that event. The events monitored are changes in the link status at any of the in-band ports of the switch, the start of removal of the switch from the ATCA backplane, or the cancellation of the removal before it actually takes place. The program can be a shell script that initiates appropriate actions to respond to the event.
Chapter 6 Fabric Switch Maintenance
[Figure 6.1 diagram: Boot ROM on Device 0 (zmon, free space, device bootstring at offset 7f000); Application Flash 1 on Device 1 (initrd with Linux and its file system at offset 0, free space, overlay file system); Application Flash 2 on Device 2 (initrd at offset 0, an exact copy of the initrd in Application Flash 1, with Linux and its file system, free space)]
This chapter includes basic information about the OpenArchitect switch environment including an overview of the file system structure, modifying and updating switch files, upgrading the switch driver and kernel, and implementing a system recovery.
Overview of the OpenArchitect switch boot process
The OpenArchitect switch is equipped with a Random Access Memory (RAM) disk and three Read-Only Memory (ROM) devices: a boot ROM and two application flash devices.
Figure 6.1: ROM Devices in OpenArchitect
The boot ROM is located on device 0 and contains the OpenArchitect zmon application that operates as a boot loader and includes a device bootstring. Device 1 contains the application
flash 1 image of the Linux operating system and the OpenArchitect overlay file system. Application flash 1 is the primary working image for the switch. Device 2 contains the application flash 2 that is an exact copy of application flash 1. You would only boot from this device if application flash 1 is corrupted and you need to restore the switch to the factory-shipped configuration.
Under normal circumstances, the boot process follows the flow outlined in Figure 6.2. During boot up, the zmon bootloader reads the device bootstring to locate and validate the correct application image to load.
[Figure 6.2 flow: the bootloader examines the bootstring in the boot ROM. If the bootstring is dev1, it loads the image from Flash 1 to RAM. Otherwise, if the bootstring is dev2, it loads the image from Flash 2 to RAM. Otherwise, it boots into the zmon bootloader. Once an image is loaded, execution of the RAM image begins.]
Figure 6.2: Boot Flow Chart
The bootstring command is in the following format:
boot : X | [<options>]
X represents the device value 0, 1 or 2.
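The decision made in the boot flow chart can be summarized as a simple mapping from the device value in the bootstring to an action. This is a sketch of the flow in Figure 6.2 only. boot_action is a hypothetical helper, not the logic zmon actually runs.

```shell
# Sketch: map the bootstring device value to the action in Figure 6.2.
# boot_action is a hypothetical helper for illustration only.
boot_action() {
    case "$1" in
        1) echo "load image from Flash 1 to RAM" ;;
        2) echo "load image from Flash 2 to RAM" ;;
        *) echo "boot into zmon bootloader" ;;
    esac
}

boot_action 1   # prints: load image from Flash 1 to RAM
```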
The boot process opens and uncompresses the initrd image onto the RAM disk. Then zmon begins booting the Linux image. After Linux boots, the init process executes the /etc/init.d/rcS script which, in turn, executes /etc/rcZ.d/rc (see Figure 6.3: Init Script Flow). The /etc/rcZ.d/rc script runs S* files in /etc/rcZ.d, with the start parameter. The S* files are the switch configuration files (for example, S50layer2).
[Figure 6.3 diagram: /etc/init.d/rcS calls /etc/rcZ.d/rc, which runs each S* file in turn]
Figure 6.3: Init Script Flow
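The init script flow can be sketched as a loop that runs every S* file in a directory with the start parameter. This is an illustration of the behavior described above, not the contents of the shipped /etc/rcZ.d/rc script. run_start_scripts is a hypothetical name, and the sketch relies on the shell expanding S* in sorted order.

```shell
# Sketch: run every S* script in a directory with the "start" parameter,
# in the sorted order produced by the shell's glob expansion.
# run_start_scripts is a hypothetical helper, not the shipped rc script.
run_start_scripts() {
    dir="$1"
    for script in "$dir"/S*; do
        # Skip anything that is not a regular file, including the literal
        # pattern left behind when nothing matches S*.
        [ -f "$script" ] || continue
        sh "$script" start
    done
}

# Hypothetical usage on the switch: run_start_scripts /etc/rcZ.d
```

Because scripts run in sorted order, the numeric prefix in a name such as S50layer2 decides when it runs relative to the other scripts.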
Saving Changes
Any modifications made to the scripts for your particular configuration must be properly saved or your changes are lost when you reboot. The file system for the switch only exists in memory. A rewritable overlay is contained within the upper four megabytes of the first application flash.
Modifying Files and Updating the Switch
Any file in OpenArchitect can be added, deleted or modified, with the exception of /usr/sbin/zmnt, /lib/modules/zfm_c.o, /sbin/init, and the /tmp directory. Files are saved across a system reboot by running the script zsync.
A directory /.zsync contains database files used by zsync for managing the file system overlaying process. The user should not modify the files in this directory or unpredictable results may occur.
Recovering from a System Failure
If the switch does not function after you initially change or reconfigure the image, you have several options for recovering from an error. First, try to telnet into the switch. If you are successful, remember to run zsync after fixing your problem.
If you cannot telnet, attach a console cable to the switch. Bring down the system and properly attach the console cable; see Connecting to the Console Port.
System Boots with a Console Cable
After attaching the system console cable, if the system boots, fix the problem that does not allow you to telnet to the box, run zsync, and reboot. The problem is likely to be in the configuration files contained in /etc/rcZ.d. In order to telnet into the box, there must be a configured interface with a proper IP address. For example, zhp0 is configured with the IP address 10.0.0.43 in the factory default configuration.
Booting with the -i option
If you cannot telnet into the switch and Linux fails to boot, it is likely that a change saved by zsync has left the switch in an inaccessible state. To allow users to recover from mistakes saved in the overlay file system, a boot argument of -i passed to the init process will stop the untarring of the saved overlay files. As a result, the system boots to the factory-shipped configuration.
Connect through the console port. During boot up, the system displays the Linux boot string, Linux/PPC load:, for 5 seconds. During the 5 second pause, enter the boot option -i and press Return.
Linux/PPC load: root=/dev/ram init=/sbin/init -i
The -i option can also be initiated with zbootcfg.
zbootcfg -d 1 -i
Reboot the system. After the reboot, clear the -i option from the boot string. Enter the following command:
zbootcfg -d 1
The reboot command will also take -i as an option and pass it to the Linux boot,
reboot -i
When the system boots, the overlay file system is returned to the factory-installed
configuration. At this point, you have a few options.
Run zsync and the factory-installed system will be restored to your flash.
CAUTION: All changes you have made and saved prior to the zsync command will be lost.
Restore particular files from the existing overlay. Use the zmnt command to mount the overlay in a designated directory and copy back just the changes you want to keep from the existing overlay. For example, if you wanted to recover your existing /etc/hosts file from the overlay, use zmnt to mount the overlay in a designated directory, like /tmp, then copy /tmp/etc/hosts to /etc/hosts. Lastly, use zsync to save your changes.
zmnt /tmp
cp /tmp/etc/hosts /etc/hosts
zsync /etc/hosts
Reboot the system.
System Hangs During Boot
After attaching the system console cable, if the system hangs during boot, try booting with the -i option as described in the previous section. It is possible that important Linux system files became corrupted and were incorrectly saved in the flash overlay. Use zmnt as described in the previous section to fix or remove the problem files from the overlay. If the system will not boot with the -i option, refer to the Booting the Duplicate Flash Image section in this chapter.
Booting the Duplicate Flash Image
Another recovery method, if Linux fails to boot, is to temporarily boot the factory-installed duplicate image located in the second flash device.
Connect through the console port.
When you see the number counter appear after the zmonitor … banner, press any key on the console keyboard to enter the zmon application.
At the monitor prompt, type
boot:2
You should see the counter again, but the system should boot into the secondary kernel. If you have difficulties booting, contact Hewlett-Packard technical support.
At this point, follow the Upgrading the OpenArchitect Image section to put a new RAM disk image in the application flash 1.
IMPORTANT: Be sure not to program flash 2, since currently this is your only bootable image.
The command to program flash 1 should be similar to the following command. The image name may be slightly different depending on the model of switch and version of the image:
zflash -d 1 rdr7100.zImage.initrd
Upgrading the OpenArchitect Image
Use telnet, or preferably, attach a console cable to the switch, and log in to the switch. If you are connecting via telnet, be aware that the upgrade process will reset the switch to the default IP address of 10.0.0.43, so you will have to be able to reach 10.0.0.43.
Download the OpenArchitect image to a local system.
The OpenArchitect image is very close to the limit of free space available on a default system so you may need to clear some space prior to downloading the OpenArchitect image to the switch. Check for free space with the df command.
One of the easiest ways to create free space is to remove /usr/sbin/gated. The application will be replaced during the update procedure. Once you have enough free space, proceed.
From the switch console, ftp the new OpenArchitect (rdr) image from the local system to your switch.
The switch has two flash devices available: device 1 and device 2. Use the zflash command to write the new OpenArchitect image into the first flash device.
NOTE: Make sure that Surviving Partner is not running before using zflash. The delays incurred while zflash writes the flash can cause the Surviving Partner daemons to think there is a failure, resulting in link oscillation.
zflash -d 1 <image_file>
The image file will be named something similar to the following,
zflash -d 1 rdr7000.zImage.initrd
Upgrading or Adding Files
Follow the procedure below to upgrade or add a new file to the switch. Place the file you are adding or upgrading into the appropriate location in the file system. Save the file in the overlay directory area on the application flash by running zsync.
zsync
After running zsync, the file is saved to the flash for future reboots.
Excluding Saving Files to Flash
Specific files or directories can be excluded from saving to flash by zsync by including an entry in /etc/exclude. Likewise, existing entries in /etc/exclude such as /tmp can be removed in order to save those files to flash with zsync.
Upgrading the Switch Driver
The switch driver upgrade process is the same as a file upgrade. However, more caution should be taken since the driver module is likely to be the method by which you are logging into the system. If the switch driver has a problem, you will need to have a console cable to recover. To upgrade a switch driver, replace the file /lib/modules/if_zxe.o, run zsync and reboot.
Using apt-get
apt-get is a utility created by the Debian Linux community to allow remote fetching and installation of software stored in a repository in Debian package format. It allows users to keep
their software up-to-date with the latest binaries, and install new software without the need to recompile.
Users may create their own repositories and add entries in /etc/apt/sources.list (empty by default) for their private access methods to their private repository. See http://www.debian.org for complete APT documentation.
Chapter 7 Base Switch Configuration
At this point, the OpenArchitect Ethernet Switch Blade should be installed and powered up for the first time. This chapter helps you connect and configure the base switch by presenting command line examples as well as a discussion of the example configuration scripts. You may configure the fabric switch independently from the base switch.
Two switches, two consoles
There are two separate switches in the Ethernet Switch Blade. The base switch handles traffic among base ports 0-23. These ports are reserved for control functions on the ATCA rack such as connecting to IPMI (shelf managers), and connecting each node card to control and monitoring devices.
Connecting to the Base Switch Console
You can connect to the switch console using a telnet connection or with a console cable. Use the procedure below for a telnet connection. See Connecting to the Console Port for instructions on using a console cable.
Connect an Ethernet cable to the host and the switch.
Configure a host on the 10.0.0.0 network.
The OpenArchitect switch is pre-configured with address 10.0.0.42. telnet to 10.0.0.42.
telnet 10.0.0.42
After you are connected, enter the login name root. No password is required.
ZX6000-OA login: root
ZX6000-OA<release no.>#
OpenArchitect Configuration Procedure
Switch configurations can be accomplished with a few simple commands. Once you have configured your switch, the commands should be placed into a start up configuration script. Like most Linux systems, the OpenArchitect switch boot process runs initialization commands and scripts in /etc/init.d/. In particular, OpenArchitect runs /etc/init.d/rcS, which in turn executes all scripts located in /etc/rcZ.d starting with an uppercase "S" in alphabetical order. Any configuration scripts you create should be named in the standard Linux/Unix manner, starting with an uppercase "S" and numbered in the sequence you would like them executed. The final step once the switch has been properly configured is to use the zsync command to save all files into flash for reloading.
Changing the Shell Prompt
You may use standard bash shell procedures to change the prompts on your base switches. Many sites choose a system that distinguishes among the individual switches at their location. The same rules apply for saving your choice (zsync) as for all other configuration changes.
Default Configuration Scripts
As shipped, the following scripts are run from /etc/rcZ.d as the switch boots up:
S20stack - Script that calls zstack to combine the two BCM5695 twelve-port switch fabric chips into a single 24 port virtual switch. zstack must be run before any other switch configuration.
S30e1000 - Script that loads the e1000 driver module for the Out-of-Band Ethernet ports.
S40vpd - Script that checks the current OA version, and loads into the Vital Product Data (VPD) area if necessary.
S50layer2 - Script that sets up a basic Layer 2 switch. All 24 10/100/1000 ports are set up on one IP network (VLAN). The ISL is set up in its own VLAN.
Example Configuration Scripts
Example scripts are supplied that can be used as templates. Use one of the scripts located in the /etc/rcZ.d/examples directory to help you configure the switch. The default configuration for the switch is located in the script file /etc/rcZ.d/S50layer2.
The following scripts are included. Each is examined in more detail later in the appropriate section describing common Layer 2 and Layer 3 configurations:
S50layer2 - Script which sets up a basic Layer 2 switch. All 24 10/100/1000 ports are set up on one IP network (VLAN). This is a copy of the script in /etc/rcZ.d that is loaded in the default configuration.
S50layer2sp - Script which sets up a basic Layer 2 switch. All 24 10/100/1000 ports are set up on one IP network (VLAN), and bridge support for Spanning Tree is turned on.
S50layer3 - Script which sets up a basic Layer 3 switch. All 24 10/100/1000 ports are set up on individual IP networks (VLANs). Layer 3 switching is enabled.
S50multivlan - Script which sets up multiple untagged VLANs. The first VLAN includes the first ten 10/100/1000 ports, the next contains the last ten 10/100/1000 ports, the third VLAN contains two 10/100/1000 ports, the last VLAN contains the last two 10/100/1000 ports. Layer 3 switching is enabled.
S55gatedRip1 - Script which is used with a Layer 3 switch and calls the GateD daemon to enable RIP 1 routing protocol.
S55gatedRip2 - Script which is used with a Layer 3 switch and calls the GateD daemon to enable RIP 2 routing protocol.
S55gatedOspf - Script which is used with a Layer 3 switch and calls the GateD daemon to enable OSPF routing protocol.
Overview of OpenArchitect VLAN Interfaces
When you initially boot up the switch, one virtual host port is automatically created by OpenArchitect to enable interaction between the software and hardware. This initial host port, called ZNYX Host Port (zhp), is a network interface that provides communication between all 24 in-band ports. Therefore, linking to any port on the switch enables you to connect with OpenArchitect.
A zhp device is associated with one Virtual Local Area Network (VLAN). A VLAN is a logical mapping of workstations and network devices on some basis other than geographic location (for example, by department, type of user, or primary application). The primary purpose of a VLAN is to isolate traffic and enable communication to flow more efficiently within groups of mutual interest. VLANs reduce the time it takes to implement workstation and network moves, adds and changes. The switch is used to bridge from one VLAN to another. Figure 7.1 is an illustration of multiple VLANs.
Figure 7.1: Multiple VLANs
Tagging and Untagging VLANs
The OpenArchitect switch is capable of switching VLAN tagged and untagged data packets. VLAN tagged packets conform to the 802.1q specification and the packet header contains an additional four bytes of VLAN tag information. A given port can be specified to accept VLAN tagged or untagged traffic. Internally, all traffic for a particular VLAN is treated as tagged traffic.
Switch Port Interfaces
For each switch port, OpenArchitect creates a separate interface with its own MAC address called a ZNYX raw Ethernet (zre). After the initial power up, 24 zre interfaces are created, one for each in-band port. You cannot directly access or modify the zre interfaces.
During the initial power up of the switch, the default configuration creates a Layer 2 switch. The Layer 2 configuration places all of the zre interfaces in the same zhp interface. The number after zre represents the corresponding switch port number (that is, zre1 represents port 1 on the switch).
Layer 2 Switch Configuration
The steps to build a Layer 2 switch involve creating a group of switch ports in a VLAN (or Layer 2 switching domain) and bringing that interface up. zconfig creates the VLAN group of switch ports as well as a network interface. Use ifconfig(1M) on the network interface to bring up the VLAN group. Figure 7.2 provides an illustration of a Layer 2 Switch connection.
Figure 7.2: Layer 2 Switch
[Figure 7.2 diagram: Linux IP address 10.0.0.42 on zhp0, which maps to VLAN 1 and contains the 24 10/100/1000 ports zre0 through zre23]
During the initial power up, a startup script called /etc/rcZ.d/S50layer2 is executed at boot time, creating a single untagged VLAN (IP interface labeled as zhp0) which includes all Ethernet and gigabit ports as one Layer 2 switch. The interface to the host is then assigned the IP address of 10.0.0.42 to allow access to the switch. The S50layer2 script does the following:
Uses zconfig to create and configure a single, untagged VLAN that contains all 24 switch ports.
/usr/sbin/zconfig zhp0: vlan1=zre0..23
/usr/sbin/zconfig zre0..23=untag1
Uses ifconfig(1M) to assign the IP address 10.0.0.42 to the interface.
/usr/sbin/ifconfig zhp0 10.0.0.42 up
To create another VLAN that contains only two ports, first use zconfig from the command line to build the VLAN and create a network interface for the host.
zconfig zhp1: vlan2=zre20,zre21
Then, bring up the interface with ifconfig(1M):
ifconfig zhp1 193.08.1.1 up
Note that ports zre20 and zre21 are members of both vlan1 and vlan2, and that they are tagged for vlan2. A port cannot be untagged for more than one VLAN. You can view the configured VLANs with zconfig.
zconfig -a
Using the S50layer2 Script
The S50layer2 script can be used as an example, or edited to customize your Layer 2 setup. For example, to reconfigure the IP address on your Layer 2 switch,
Open the S50layer2 file in the Linux vi editor.
Change the IP address value listed under the Linux ifconfig(1M) command line.
Save your changes by running OpenArchitect zsync.
Reboot the switch.
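The address change in the steps above can also be scripted instead of made in vi. The sketch below is one possible approach under stated assumptions. set_zhp0_ip is a hypothetical helper, it assumes sed is available on the switch, and it assumes the script contains a line of the form "ifconfig zhp0 <address> up" as in the shipped S50layer2.

```shell
# Sketch: replace the IP address on the "ifconfig zhp0 <address> up" line.
# set_zhp0_ip is a hypothetical helper; sed availability is an assumption.
set_zhp0_ip() {
    file="$1"; new_ip="$2"
    tmp="$file.new"
    # Rewrite into a temporary file, then copy the contents back so the
    # original file keeps its permissions (including the execute bit).
    sed "s|\(ifconfig zhp0 \)[0-9.]*\( up\)|\1${new_ip}\2|" "$file" > "$tmp" &&
        cat "$tmp" > "$file" &&
        rm -f "$tmp"
}

# Hypothetical usage on the switch:
#   set_zhp0_ip /etc/rcZ.d/S50layer2 10.0.0.50
```

As with a manual edit, the change still has to be saved with zsync and takes effect after a reboot.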
Rapid Spanning Tree
The Rapid Spanning Tree Protocol (RSTP) configures a simply connected active topology from the arbitrarily connected components of a Bridged Local Area Network. RSTP participants use a simple dialog carried in packets called Bridge Protocol Data Units (BPDUs) for finding the shortest path between two networks and for eliminating loops from the topology. If nodes attached to ports fail or are added or deleted, the topology dynamically changes to accommodate the new configuration. If your network topology is such that there is no real redundancy or chance for loops, you do not need to turn on Spanning Tree.
zl2d is a shell script used to create Linux bridges consisting of the name of the previously created zhp device or devices preceded with a "b" (for example, if you are creating a Bridge device from zhp0, the resulting device would be bzhp0). zl2d then starts a background task that monitors the port information of the Linux bridge at a specified interval and updates the Spanning Tree state fields in the hardware when necessary.
brctl(8) is called by zl2d for configuring certain RSTP parameters. For an explanation of these parameters, see the IEEE 802.1d specification, or reference the brctl(8) man page in Appendix A. The following demonstrates a simple example of setting up a Layer 2 switch and starting RSTP.
To Enable Rapid Spanning Tree:
Create a VLAN containing the ports that will be a part of the Linux bridge running Rapid Spanning Tree. This example will use ports 0-3 (untagged):
zconfig zhp0: vlan1=zre0..3
zconfig zre0..3=untag1
Create a bridge device from the zhp device,
zl2d start zhp0
A Bridge device named bzhp0 should now exist consisting of ports zre0 through zre3 with Spanning Tree enabled. To view the bridge device, use the brctl command,
brctl show
brctl showbr bzhp0
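The same steps can be parameterized for a different port range or VLAN. The following sketch only prints the commands so they can be reviewed before being entered on the switch; the function name rstp_commands and its argument order are hypothetical, and the bridge name is derived using the "b" prefix rule described above.

```shell
# rstp_commands ZHP FIRST LAST VLAN
# Prints (does not run) the commands for bridging ports zreFIRST..zreLAST,
# untagged on VLAN, through the zhp device ZHP.
rstp_commands() {
    zhp=$1
    first=$2
    last=$3
    vlan=$4
    echo "zconfig $zhp: vlan$vlan=zre$first..$last"
    echo "zconfig zre$first..$last=untag$vlan"
    echo "zl2d start $zhp"
    echo "brctl show"
    # zl2d names the bridge after the zhp device with a "b" prefix.
    echo "brctl showbr b$zhp"
}

# Reproduces the example in the text: ports 0-3 on vlan1 via zhp0.
rstp_commands zhp0 0 3 1
```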
Port Path Cost
Each port has an associated cost that contributes to the total cost of the path to the Root Bridge when the port is the root port. The smaller the cost, the better the path. The Ethernet Switch Blade uses the following IEEE 802.1D recommendations based on the connection speed of your port:
Link Speed    Recommended Value    Recommended Range
10 Mb/s       100                  50-600
100 Mb/s      19                   10-60
1 Gb/s        4                    3-10
Table 7.1: Port Path Cost
To change the port path cost, use the brctl setpathcost option. For example, to set the path cost to a value consistent with a gigabit interface,
brctl setpathcost bzhp0 zre1 4
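The recommended values in Table 7.1 can be captured in a small lookup. This is a sketch only: the function name path_cost_for_speed is hypothetical, speeds are given in Mb/s, and speeds not listed in the table are rejected rather than guessed.

```shell
# path_cost_for_speed MBPS
# Prints the recommended path cost from Table 7.1 for a link speed in Mb/s.
path_cost_for_speed() {
    case $1 in
        10)   echo 100 ;;
        100)  echo 19 ;;
        1000) echo 4 ;;
        *)    echo "no recommended value in Table 7.1 for $1 Mb/s" >&2
              return 1 ;;
    esac
}

# Print (not run) the brctl command for a gigabit port on bridge bzhp0.
echo "brctl setpathcost bzhp0 zre1 $(path_cost_for_speed 1000)"
```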
Layer 3 Switch Configuration
The previous section outlines the Layer 2 switch configuration that is automatically configured when you initially bring up the OpenArchitect switch. To communicate between Layer 2 interfaces, you must set up routing.
The steps to build a Layer 2 switch involve creating a group of switch ports in a VLAN (or Layer 2 switching domain) and bringing that interface up. zconfig creates the VLAN group of switch ports as well as a network interface. Use ifconfig(1M) on the network interface to bring up the VLAN group with Layer 2 switching. Layer 3 routing information is then used to route between the Layer 2 network devices.
Take a simple example of two VLANs configured on the switch, each with four ports. First, tear down any existing configuration,
zconfig -t
Use zconfig to create two new VLANs, each with four ports, and untag them,
zconfig zhp0: vlan1=zre1..4
zconfig zre1..4=untag1
zconfig zhp1: vlan2=zre5..8
zconfig zre5..8=untag2
Now, use ifconfig to assign each zhp interface an IP address,
ifconfig zhp0 10.0.0.1
ifconfig zhp1 11.0.0.1
At this point, the Linux host has enough information to route between the networks of the directly attached interfaces, 10.0.0.0 via zhp0, and 11.0.0.0 via zhp1.
The next step is to enable the ZNYX zl3d daemon to move that routing information from the host to the base switch switching tables in silicon. Once enabled, zl3d will monitor the Linux routing tables for changes in configuration and update the switch silicon tables. Start zl3d to update the switch tables:
zl3d zhp0 zhp1
The base switch is now configured as a Layer 3 switch that can route between two Layer 2 devices in silicon.
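Because the first step tears down the existing configuration, it can help to review the whole sequence before running it. The following is a minimal sketch of a dry-run wrapper around the commands from this section. The run and configure_l3 functions and the DRY_RUN variable are hypothetical conveniences, not part of OpenArchitect, and how zl3d behaves when launched from such a wrapper has not been verified here.

```shell
# DRY_RUN defaults to 1, so nothing is executed unless explicitly disabled.
DRY_RUN=${DRY_RUN:-1}

# run CMD [ARGS...]
# Prints the command in dry-run mode; otherwise executes it unchanged.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# The two-VLAN Layer 3 example, in the same order as the text.
configure_l3() {
    run zconfig -t
    run zconfig zhp0: vlan1=zre1..4
    run zconfig zre1..4=untag1
    run zconfig zhp1: vlan2=zre5..8
    run zconfig zre5..8=untag2
    run ifconfig zhp0 10.0.0.1
    run ifconfig zhp1 11.0.0.1
    run zl3d zhp0 zhp1
}

configure_l3
```

With the default setting the script prints eight command lines and changes nothing on the switch.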
Using the S50layer3 Script
To modify the configuration to a Layer 3 switch, remove the S50layer2 file from the /etc/rcZ.d directory, and replace it with the example script file, S50layer3.
In the S50layer3 file, each port is assigned its own Virtual Local Area Network (VLAN) interface (port interfaces are labeled as zhpN, where N is an integer). Each VLAN is associated with an individual zhp interface. Remember, zre and zhp interfaces can begin with a zero value but a VLAN cannot. Each zre interface is assigned a separate IP address in the example script (see Figure 7.3).
[Diagram: Linux IP interfaces zhp0 - zhp23, with VLAN 1 through VLAN 24 assigned to switch ports zre0 through zre23. Each VLAN interface (zhp) has only one switch port (zre).]
Figure 7.3: Layer 3 Switch
The S50layer3 script executes the following commands:
Runs zconfig command to create 24 untagged VLANs (one for each switch port).
/usr/sbin/zconfig zhp0..23: vlan1..24=zre0+
/usr/sbin/zconfig zre0..23=untag1+
NOTE: Double periods (..), as in zhp0..23 and vlan1..24, indicate a range of values. The plus (+) sign after zre0 and untag1 means auto-increment and causes each zhp interface to hold only one zre (that is, zhp0 has zre0 on vlan1, zhp1 has zre1 on vlan2).
Runs the Linux ifconfig(1M) command for each interface to assign default IP addresses (10.0.0.42-10.0.23.42), set the netmask, and bring up the interfaces.
ifconfig zhp0 10.0.0.42 netmask 255.255.255.0 up
ifconfig zhp1 10.0.1.42 netmask 255.255.255.0 up
ifconfig zhp2 10.0.2.42 netmask 255.255.255.0 up
.
.
.
ifconfig zhp21 10.0.21.42 netmask 255.255.255.0 up
ifconfig zhp22 10.0.22.42 netmask 255.255.255.0 up
ifconfig zhp23 10.0.23.42 netmask 255.255.255.0 up
Runs the OpenArchitect zl3d. The zl3d application monitors the Linux routing tables and updates the switch routing tables for each interface configured above.
/usr/sbin/zl3d zhp0..23
zl3d initially creates and adds each zhp interface (VLAN) to the switch routing tables. The zhp0..23 argument is shorthand for the list of interfaces (zhp0, zhp1, …, zhp23) to monitor with zl3d.
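To make the shorthand concrete, the following sketch writes out what the range and auto-increment notation and the elided ifconfig lines stand for, based on the description in the NOTE and the default address range given above. The function names are hypothetical, and the output is printed for inspection only; nothing is sent to the switch.

```shell
# expand_l3_mapping COUNT
# Lists the one-to-one mapping described in the NOTE: zhpN carries
# vlan(N+1) with the single switch port zreN.
expand_l3_mapping() {
    count=$1
    n=0
    while [ "$n" -lt "$count" ]; do
        echo "zhp$n: vlan$((n + 1))=zre$n"
        n=$((n + 1))
    done
}

# print_ifconfig_lines COUNT
# Prints the ifconfig line for each interface, using the default
# addressing 10.0.N.42 with a 255.255.255.0 netmask.
print_ifconfig_lines() {
    count=$1
    n=0
    while [ "$n" -lt "$count" ]; do
        echo "ifconfig zhp$n 10.0.$n.42 netmask 255.255.255.0 up"
        n=$((n + 1))
    done
}

expand_l3_mapping 24
print_ifconfig_lines 24
```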
To Modify the Layer 3 Script
Modify the example script you copied into the /etc/rcZ.d directory. Adjust the number of IP addresses and assign them as applicable. In the example below, the IP address for the interface is changed in the ifconfig command line of the script.
From:
ifconfig zhp0 10.0.0.42 netmask 255.255.255.0 broadcast 10.0.0.255 up
To:
ifconfig zhp0 193.8.1.1 netmask 255.255.255.0 broadcast 193.8.1.255 up
Adjust the number of zhp interfaces that are added to the routing tables, depending on the number of VLANs you are adding for your network. Include any other details, as applicable.
Run the OpenArchitect zsync command to save your changes.
zsync
Reboot the switch.
After rebooting, your switch works from your customized Layer 3 configuration.
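When the address on an ifconfig line changes, the broadcast argument must change with it. The following sketch computes the broadcast address from an address and netmask so the two stay consistent. The function name broadcast_addr is hypothetical, and octets are assumed to be written without leading zeros, because shell arithmetic treats a leading 0 as octal.

```shell
# broadcast_addr IP NETMASK
# Prints the broadcast address: each octet is (IP octet) OR (255 - mask octet).
# The body runs in a subshell so the IFS change does not leak to the caller.
broadcast_addr() (
    IFS=.
    # Unquoted on purpose: splits both dotted quads into $1..$8.
    set -- $1 $2
    echo "$(( $1 | (255 - $5) )).$(( $2 | (255 - $6) )).$(( $3 | (255 - $7) )).$(( $4 | (255 - $8) ))"
)

# The default address from the example script.
broadcast_addr 10.0.0.42 255.255.255.0
```

For the default address 10.0.0.42 with netmask 255.255.255.0 this prints 10.0.0.255, matching the "From:" line in the steps above.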
Layer 3 Switch Using Multiple VLANs
An example script is also provided for setting up multiple VLANs each with multiple ports.
Using the S50multivlan Script
The Layer 3 switch example file, S50multivlan, is included to help you configure multiple VLANs to a Layer 3 switch. A VLAN can include one or more switch ports. In the S50multivlan file, four VLANs are created (see 4):
VLAN 1, zhp0: for the first set of six ports, zre0-zre5
VLAN 2, zhp1: for the second set of six ports, zre6-zre11
VLAN 3, zhp2: for the third set of six ports, zre12-zre17