Allied Telesis AlliedWare OS User Manual

AlliedWare™ OS
How To |
Configure EPSR (Ethernet Protection Switching Ring) to Protect a Ring from Loops

Introduction

Putting a ring of Ethernet switches at the core of a network is a simple way to increase the network’s resilience—such a network is no longer susceptible to a single point of failure. However, the ring must be protected from Layer 2 loops. Traditionally, STP-based technologies are used to protect rings, but they are relatively slow to recover from link failure. This can create problems for applications that have strict loss requirements, such as voice and video traffic, where the speed of recovery is highly significant.
This How To Note describes a fast alternative to STP: Ethernet Protection Switching Ring (EPSR). EPSR enables rings to recover rapidly from link or node failures, within as little as 50 ms, depending on port type and configuration. This is much faster than STP at 30 seconds or even RSTP at 1 to 3 seconds.

What information will you find in this document?

This How To Note begins by describing EPSR in the following sections:
"How EPSR Works"
"Establishing a Ring"
"Detecting a Fault"
"Recovering from a Fault"
"Restoring Normal Operation"
Next it gives step-by-step configuration details and examples in the following sections:
"How To Configure EPSR"
"Example 1: A Basic Ring"
"Example 2: A Double Ring"
"Example 3: EPSR and RSTP"
"Example 4: EPSR with Nested VLANs"
"Example 5: EPSR with management stacking"
"Example 6: EPSR with an iMAP"
Next, it discusses important implementation details in the following sections:
"Classifiers and Hardware Filters"
"Ports and Recovery Times"
"IGMP Snooping and Recovery Times"
"Health Message Priority"
Finally, it ends with troubleshooting information in the following sections:
"EPSR State and Settings"
"SNMP Traps"
"Counters"
"Debugging"
C613-16092-00 REV D
www.alliedtelesis.com

Which products and software versions does it apply to?

This How To Note applies to the following Allied Telesis switches:
AT-8948, x900-48FE, x900-48FE-N, AT-9924T, AT-9924SP, and AT-9924T/4SP switches, running software version 2.8.1 or later
AT-9924Ts, x900-24XT, and x900-24XT-N switches running software version 3.1.1 or later
EPSR is also available on the following Allied Telesis switches, running the AlliedWare Plus OS software version 5.2.1 or later:
SwitchBlade x908
x900 series
For information about using the AlliedWare Plus OS, see the AlliedWare Plus Note, How To Configure EPSR (Ethernet Protection Switching Ring) to Protect a Ring from Loops. This Note is available from www.alliedtelesis.com/resources/literature/howto_plus.aspx.
The implementation on the above switches is also compatible with EPSR on Allied Telesis’ Multiservice Access Platforms (iMAPs).
Page 2 | AlliedWare™ OS How To Note: EPSR
EPSR Components
EPSR domain:
A protection scheme for an Ethernet ring that consists of one or more data VLANs and a control VLAN.
Master node:
The controlling node for a domain, responsible for polling the ring state, collecting error messages, and controlling the flow of traffic in the domain.
Transit node:
Other nodes in the domain.
Ring port:
A port that connects the node to the ring. On the master node, each ring port is either the primary port or the secondary port. On transit nodes, ring ports do not have roles.
Primary port:
A ring port on the master node. This port determines the direction of the traffic flow, and is always operational.
Secondary port:
A second ring port on the master node. This port remains active, but blocks all protected VLANs from operating unless the ring fails. Similar to the blocking port in an STP/RSTP instance.
Control VLAN:
The VLAN over which all control messages are sent and received. EPSR never blocks this VLAN.
Data VLAN:
A VLAN that needs to be protected from loops. Each EPSR domain has one or more data VLANs.
[Figure: epsr-basic-ring. A master node and transit nodes 1 to 4 form a ring, each with end user ports. The ring carries a control VLAN and two data VLANs (Data VLAN_1 and Data VLAN_2). On the master node's primary port (P), the control VLAN and the data VLANs are forwarding. On its secondary port (S), the control VLAN is forwarding and the data VLANs are blocked.]

How EPSR Works

EPSR operates on physical rings of switches (note, not on meshed networks). When all nodes and links in the ring are up, EPSR prevents a loop by blocking data transmission across one port. When a node or link fails, EPSR detects the failure rapidly and responds by unblocking the blocked port so that data can flow around the ring.
In EPSR, each ring of switches forms an EPSR domain. One of the domain’s switches is the master node and the others are transit nodes. Each node connects to the ring via two ports.
One or more data VLANs send data around the ring, and a control VLAN sends EPSR messages. A physical ring can have more than one EPSR domain, but each domain operates as a separate logical group of VLANs and has its own control VLAN and master node.
On the master node, one port is the primary port and the other is the secondary port. When all the nodes in the ring are up, EPSR prevents loops by blocking the data VLAN on the secondary port.
The master node does not need to block any port on the control VLAN because loops never form on the control VLAN. This is because the master node never forwards any EPSR messages that it receives.
The epsr-basic-ring diagram, which precedes this section, shows a basic ring with all the switches in the ring up.

Establishing a Ring

Once you have configured EPSR on the switches, the following steps complete the EPSR ring:
1. The master node creates an EPSR Health message and sends it out the primary port. This increments the master node’s Transmit: Health counter in the show epsr count command.
2. The first transit node receives the Health message on one of its two ring ports and, using a hardware filter, sends the message out its other ring port.
Note that transit nodes never generate Health messages, only receive them and forward them with their switching hardware. This does not increment the transit node’s Transmit: Health counter. However, it does increment the Transmit counter in the show switch port command.
The hardware filter also copies the Health message to the CPU. This increments the transit node’s Receive: Health counter. The CPU processes this message as required by the state machines, but does not send the message anywhere because the switching hardware has already done this.
3. The Health message continues around the rest of the transit nodes, being copied to the CPU and forwarded in the switching hardware.
4. The master node eventually receives the Health message on its secondary port. The master node's hardware filter copies the packet to the CPU (which increments the master node’s Receive: Health counter). Because the master received the Health message on its secondary port, it knows that all links and nodes in the ring are up.
When the master node receives the Health message back on its secondary port, it resets the Failover timer. If the Failover timer expires before the master node receives the Health message back, it concludes that the ring must be broken.
Note that the master node does not send that particular Health message out again. If it did, the packet would be continuously flooded around the ring. Instead, the master node generates a new Health message when the Hello timer expires.
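The steps above can be sketched as a small simulation. This is only an illustration of the counter behaviour described in the text, not the switch's implementation: the Node class, the send_health function, and the broken_link parameter are hypothetical names introduced here.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    tx_health: int = 0  # Health messages this node generated (Transmit: Health)
    rx_health: int = 0  # Health messages copied to this node's CPU (Receive: Health)


def send_health(ring, broken_link=None):
    """Send one Health message around the ring and report whether it returned.

    ring[0] is the master node. The message leaves the master's primary port,
    visits ring[1], ring[2], ... and arrives back on the master's secondary port.
    broken_link is the index i of a failed link between ring[i] and the next
    node, or None if every link is up (hypothetical parameter).
    """
    master = ring[0]
    master.tx_health += 1  # only the master generates Health messages
    for i in range(len(ring)):
        if broken_link == i:
            return False  # the message is lost at the failed link
        receiver = ring[(i + 1) % len(ring)]
        # The hardware filter copies the message to the CPU. Transit nodes
        # forward it in hardware, so their tx_health counter does not change.
        receiver.rx_health += 1
    return True
```

With a ring of one master and two transit nodes, send_health returns True and the master's receive counter increments. With broken_link set, it returns False, which corresponds to the Failover timer expiring on the master node.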

Detecting a Fault

Master Node States
Complete:
The state when there are no link or node failures on the ring.
Failed:
The state when there is a link or node failure on the ring. This state indicates that the master node received a Link-Down message or that the failover timer expired before the master node’s secondary port received a Health message.
Transit Node States
Idle:
The state when EPSR is first configured, before the master node determines that all links in the ring are up. In this state, both ports on the node are blocked for the data VLAN. From this state, the node can move to Links Up or Links Down.
Links Up:
The state when both the node’s ring ports are up and forwarding. From this state, the node can move to Links Down.
Links Down:
The state when one or both of the node’s ring ports are down. From this state, the node can move to Pre-forwarding.
Pre-forwarding:
The state when both ring ports are up, but one has only just come up and is still blocked to prevent loops. From this state, the transit node can move to Links Up if the master node blocks its secondary port, or to Links Down if another port goes down.
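The transit node transitions listed above can be written as a lookup table. The following is a minimal sketch that encodes only the moves named in these definitions; the TransitState enum, the ALLOWED table, and the move function are hypothetical names for illustration.

```python
from enum import Enum


class TransitState(Enum):
    IDLE = "Idle"
    LINKS_UP = "Links Up"
    LINKS_DOWN = "Links Down"
    PRE_FORWARDING = "Pre-forwarding"


# Each state maps to the states the definitions above say it can move to.
ALLOWED = {
    TransitState.IDLE: {TransitState.LINKS_UP, TransitState.LINKS_DOWN},
    TransitState.LINKS_UP: {TransitState.LINKS_DOWN},
    TransitState.LINKS_DOWN: {TransitState.PRE_FORWARDING},
    TransitState.PRE_FORWARDING: {TransitState.LINKS_UP, TransitState.LINKS_DOWN},
}


def move(current, target):
    """Return target if the transition is listed, otherwise raise ValueError."""
    if target not in ALLOWED[current]:
        raise ValueError(f"{current.value} -> {target.value} is not a listed transition")
    return target
```

For example, move(TransitState.LINKS_DOWN, TransitState.PRE_FORWARDING) succeeds, while a direct move from Links Up to Pre-forwarding raises an error because a node must pass through Links Down first.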
EPSR uses a fault detection scheme that alerts the ring when a break occurs, instead of using a spanning-tree-like calculation to determine the best path. The ring then automatically heals itself by sending traffic over a protected reverse path.
EPSR uses the following two methods to detect when a transit node or a link goes down:
Master node polling fault detection
To check the condition of the ring, the master node regularly sends Health messages out its primary port, as described in "Establishing a Ring". If all links and nodes in the ring are up, the messages arrive back at the master node on its secondary port.
This can be a relatively slow detection method, because it depends on how often the node sends Health messages.
Note that the master node only ever sends Health messages out its primary port. If its primary port goes down, it does not send Health messages.
Transit node unsolicited fault detection
To speed up fault detection, EPSR transit nodes directly communicate when one of their interfaces goes down. When a transit node detects a fault at one of its interfaces, it immediately sends a Link-Down message over the link that remains up. This notifies the master node that the ring is broken and causes it to respond immediately.
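The two detection methods can be combined into one decision on the master node. The sketch below is an assumption-laden illustration: the function name, its parameters, and the timer value used in the usage note are hypothetical and are not taken from the switch software.

```python
def master_detects_fault(now, last_health_rx, failover_time, link_down_received):
    """Return why the master considers the ring broken, or None if it does not.

    now and last_health_rx are timestamps in seconds; last_health_rx is when a
    Health message last arrived on the secondary port. failover_time is the
    Failover timer length in seconds (hypothetical parameter).
    """
    if link_down_received:
        # Unsolicited detection: a transit node reported a failed interface.
        return "link-down"
    if now - last_health_rx >= failover_time:
        # Polling detection: no Health message returned before the timer expired.
        return "failover-timer"
    return None
```

With an illustrative failover_time of 2.0 seconds, a Link-Down message triggers the response immediately, whereas polling alone takes up to the full timer length to notice the break. This is why the text calls polling the slower method.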

Recovering from a Fault

Fault in a link or a transit node

When the master node detects an outage somewhere in the ring, using either detection method, it restores traffic flow by:
1. declaring the ring to be in a Failed state
2. unblocking its secondary port, which enables data VLAN traffic to pass between its primary and secondary ports
3. flushing its own forwarding database (FDB) for the two ring ports
4. sending an EPSR Ring-Down-Flush-FDB control message to all the transit nodes, via both its primary and secondary ports
The transit nodes respond to the Ring-Down-Flush-FDB message by flushing their forwarding databases for each of their ring ports. As the data starts to flow in the ring’s new configuration, the nodes (master and transit) re-learn their layer 2 addresses. During this period, the master node continues to send Health messages over the control VLAN. This situation continues until the faulty link or node is repaired.
For a multidomain ring, this process occurs separately for each domain within the ring.
The following figure shows the flow of control frames when a link breaks.
[Figure: epsr-broken-ring. A master node and transit nodes 1 to 4 in a ring with one broken link. The figure marks three kinds of control frame: the Master Node Health Message, the Transit Node Link-Down Message, and the Ring-Down-Flush-FDB Message. At the break, data ports move from forwarding to blocking. On the master node's primary port (P), the control VLAN and the data VLANs are forwarding. On its secondary port (S), the control VLAN is forwarding and the data VLANs move from blocking to forwarding.]
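The master node's four recovery actions can be sketched as follows. This is a simplified model for illustration only: the MasterNode class, its attributes, and the sample MAC address are hypothetical.

```python
class MasterNode:
    def __init__(self):
        self.state = "Complete"
        # Applies to data VLANs only; the control VLAN is never blocked.
        self.secondary_blocked = True
        # Learned addresses per ring port (illustrative content).
        self.fdb = {"primary": {"00:00:00:00:00:01"}, "secondary": set()}
        self.sent = []  # (port, message) pairs, in transmission order

    def handle_fault(self):
        """Apply the four recovery steps in the order the text lists them."""
        self.state = "Failed"                        # 1. declare the ring Failed
        self.secondary_blocked = False               # 2. unblock the secondary port
        for port in ("primary", "secondary"):        # 3. flush the FDB for both ring ports
            self.fdb[port].clear()
        for port in ("primary", "secondary"):        # 4. notify transit nodes via both ports
            self.sent.append((port, "Ring-Down-Flush-FDB"))
```

The flush message goes out both ring ports because, with a break in the ring, neither direction alone reaches every transit node.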

Fault in the master node

If the master node goes down, the transit nodes simply continue forwarding traffic around the ring—their operation does not change.
The only observable effects on the transit nodes are that:
They stop receiving Health messages and other messages from the master node.
The transit nodes connected to the master node experience a broken link, so they send Link-Down messages. If the master node is down these messages are simply dropped.
Neither of these symptoms affects how the transit nodes forward traffic.
Once the master node recovers, it continues its function as the master node.

Restoring Normal Operation

Master Node

Once the fault has been fixed, the master node’s Health messages traverse the whole ring and arrive at the master node’s secondary port. The master node then restores normal conditions by:
1. declaring the ring to be in a state of Complete
2. blocking its secondary port for data VLAN traffic (but not for the control VLAN)
3. flushing its forwarding database for its two ring ports
4. sending a Ring-Up-Flush-FDB message from its primary port, to all transit nodes.

Transit Nodes with One Port Down

As soon as the fault has been fixed, the transit nodes on each side of the (previously) faulty link detect that link connectivity has returned. They change their ring port state from Links Down to Pre-forwarding, and wait for the master node to send a Ring-Up-Flush-FDB control message.
Once these transit nodes receive the Ring-Up-Flush-FDB message, they:
flush the forwarding databases for both their ring ports
change the state of their ports from blocking to forwarding for the data VLAN, which allows data to flow through their previously-blocked ring ports
The transit nodes do not start forwarding traffic on the previously-down ports until after they receive the Ring-Up-Flush-FDB message. This makes sure the previously-down transit node ports stay blocked until after the master node blocks its secondary port. Otherwise, the ring could form a loop because it had no blocked ports.
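The ordering described above can be sketched for a single transit node. This is a minimal illustration, assuming a node whose second ring port was the failed one; the TransitNode class, its port names, and the sample address are hypothetical.

```python
class TransitNode:
    def __init__(self):
        self.state = "Links Down"
        # Data VLAN forwarding per ring port; ring2 is the port whose link failed.
        self.forwarding = {"ring1": True, "ring2": False}
        self.fdb = {"ring1": {"00:00:00:00:00:02"}, "ring2": set()}

    def link_restored(self, port):
        """The link is back, but the port stays blocked for the data VLAN."""
        self.state = "Pre-forwarding"
        self.forwarding[port] = False

    def ring_up_flush_fdb(self):
        """React to the master's Ring-Up-Flush-FDB message."""
        for entries in self.fdb.values():
            entries.clear()                 # flush both ring ports
        for port in self.forwarding:
            self.forwarding[port] = True    # move from blocking to forwarding
        self.state = "Links Up"
```

Because ring_up_flush_fdb only runs after the master node has blocked its secondary port, there is never a moment when every port in the ring forwards the data VLAN.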

Transit Nodes with Both Ports Down

The Allied Telesis implementation includes an extra feature to improve handling of double link failures. If both ports on a transit node are down and one port comes up, the node:
1. puts the port immediately into the forwarding state and starts forwarding data out that port. It does not need to wait, because the other ring port on the node is down, so the node knows there is no loop in the ring
2. remains in the Links Down state
3. starts a DoubleFailRecovery timer with a timeout of four seconds
4. waits for the timer to expire. At that time, if one port is still up and one is still down, the transit node sends a Ring-Up-Flush-FDB message out the port that is up. This message is usually called a “Fake Ring Up message”.
Sending this message allows any ports on other transit nodes that are blocking or in the Pre-forwarding state to move to forwarding traffic in the Links Up state. The timer delay lets the device at the other end of the link that came up configure its port appropriately, so that it is ready to receive the transmitted message.
Note that the master node would not send a Ring-Up-Flush-FDB message in these circumstances, because the ring is not in a state of Complete. The master node’s secondary port remains unblocked.
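The double-failure handling can be sketched with an explicit clock. The four-second timeout comes from the text; everything else (the DoubleFailNode class, the tick method, the port names) is a hypothetical model. The sketch does not cover what happens when the second port also comes up before the timer expires.

```python
DOUBLE_FAIL_RECOVERY_SECONDS = 4  # timeout stated in the text


class DoubleFailNode:
    def __init__(self):
        self.link_up = {"ring1": False, "ring2": False}
        self.forwarding = {"ring1": False, "ring2": False}
        self.state = "Links Down"
        self.timer_deadline = None
        self.sent = []  # (port, message) pairs

    def port_came_up(self, port, now):
        self.link_up[port] = True
        if sum(self.link_up.values()) == 1:
            # The other ring port is still down, so no loop is possible:
            # forward immediately, stay in Links Down, and start the timer.
            self.forwarding[port] = True
            self.timer_deadline = now + DOUBLE_FAIL_RECOVERY_SECONDS

    def tick(self, now):
        """Call periodically; sends the Fake Ring Up message once on expiry."""
        if self.timer_deadline is None or now < self.timer_deadline:
            return
        self.timer_deadline = None
        up_ports = [p for p, is_up in self.link_up.items() if is_up]
        if len(up_ports) == 1:  # one port still up, one still down
            self.sent.append((up_ports[0], "Ring-Up-Flush-FDB"))
```

Calling port_came_up("ring1", now=10) and then tick(14) sends the message out ring1. Calling tick at an earlier time sends nothing.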

How To Configure EPSR

This section first outlines, step-by-step, how to configure EPSR. Then it discusses changing
the settings for the control VLAN, if you need to do this after initial configuration.

Configuring EPSR

1. Connect your switches into a ring
EPSR does not in itself limit the number of nodes that can exist on any given ring. Each switch can participate in up to 16 rings.
If you already have a ring in a live network, disconnect the cable between any two of the nodes before you start configuring EPSR, to prevent a loop.
2. On each switch, configure EPSR
On each switch, perform the following configuration steps. Configuration of the master node and each transit node is very similar.
i. Configure the control VLAN
This step creates the control VLAN and adds the ring ports to it as tagged ports.
Enter the commands:
create vlan=control-vlan-name vid=control-vid
add vlan=control-vid port=ring-ports frame=tagged
Note that you can use trunk groups for the ring ports.
ii. Configure the data VLAN
This step creates the data VLAN (or VLANs—you can have as many as you want) and adds the ring ports as tagged ports.
Enter the commands:
create vlan=data-vlan-name vid=data-vid
add vlan=data-vid port=ring-ports frame=tagged
The two ring ports must belong to the control VLAN and all data VLANs.
iii. Remove the ring ports from the default VLAN
If you leave all the ring ports in the default VLAN (vlan1), they will create a loop, unless vlan1 is part of the EPSR domain. To avoid loops, you need to do one of the following:
make vlan1 a data VLAN, or
remove the ring ports from vlan1, or
remove at least one of the ring ports from vlan1 on at least one of the switches.
We do not recommend this option, because the action you have taken is less obvious when maintaining the network later.
In this How To Note, we remove the ring ports from the default VLAN. Use the command:
delete vlan=1 port=ring-ports
iv. Configure the EPSR domain
This step creates the domain, specifying whether the switch is the master node or a transit node. It also specifies which VLAN is the control VLAN, and on the master node which port is the primary port.
Enter one of the following commands:
On the master node:
create epsr=name mode=master controlvlan=control-vlan-name primaryport=port-number
On each transit node:
create epsr=name mode=transit controlvlan=control-vlan-name
This step also adds the data VLAN to the domain. Enter the command:
add epsr=name datavlan=data-vlan-name
v. Enable EPSR
This step enables the domain on each switch. Enter the command:
enable epsr=name
3. Configure other ports and protocols as required
On each switch, configure the other ports and protocols that are required for your network.
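Because the per-switch steps differ only in a few values, the command sequence can be generated from a template. The helper below is a hypothetical convenience script, not an Allied Telesis tool. It only assembles the command strings shown in step 2, in the same order, and removes the ring ports from the default VLAN as this Note does.

```python
def epsr_commands(name, mode, control_vlan, control_vid, data_vlans,
                  ring_ports, primary_port=None):
    """Build the step 2 command list for one switch.

    data_vlans is a list of (vlan_name, vid) pairs; ring_ports is a port
    string such as "1-2". All parameter names are hypothetical.
    """
    if mode not in ("master", "transit"):
        raise ValueError("mode must be 'master' or 'transit'")
    if mode == "master" and primary_port is None:
        raise ValueError("the master node needs a primary port")
    cmds = [
        f"create vlan={control_vlan} vid={control_vid}",
        f"add vlan={control_vid} port={ring_ports} frame=tagged",
    ]
    for vlan_name, vid in data_vlans:
        cmds.append(f"create vlan={vlan_name} vid={vid}")
        cmds.append(f"add vlan={vid} port={ring_ports} frame=tagged")
    cmds.append(f"delete vlan=1 port={ring_ports}")
    create = f"create epsr={name} mode={mode} controlvlan={control_vlan}"
    if mode == "master":
        create += f" primaryport={primary_port}"
    cmds.append(create)
    for vlan_name, _ in data_vlans:
        cmds.append(f"add epsr={name} datavlan={vlan_name}")
    cmds.append(f"enable epsr={name}")
    return cmds
```

Calling epsr_commands("test", "master", "vlan1000", 1000, [("vlan2", 2)], "1-2", primary_port=1) yields the same eight commands that Example 1 enters on its master node.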

Modifying the Control VLAN

You cannot modify the control VLAN while EPSR is enabled. If you try to remove or add ports to the control VLAN, the switch generates an error message as follows:
Manager> delete vlan=1000 port=1
Error (3089409): VLAN 1000 is a control VLAN in EPSR and cannot be modified
Disable the EPSR domain and then make the required changes. Note that disabling EPSR will create a loop, so we do not recommend it on a network with live data. In a live network, you can manually prevent a loop by disconnecting the cable between any two of the nodes.

Example 1: A Basic Ring

This example builds a simple 3-switch ring with one data VLAN, as shown in the following diagram. Control packets are transmitted around the ring on vlan1000 and data packets on vlan2.
[Figure: epsr-example-basic-ring. Master node A (port 1: primary, port 2: secondary), transit node B (ports 1 and 2: ring), and transit node C (ports 1 and 2: ring) are connected in a ring, each with end user ports.]

Configure the Master Node (A)

1. Create the control VLAN
create vlan=vlan1000 vid=1000
2. Add the ring ports to the control VLAN
add vlan=1000 port=1-2 frame=tagged
3. Create the data VLAN
create vlan=vlan2 vid=2
4. Add the ring ports to the data VLAN
The two ring ports must belong to the control VLAN and all data VLANs.
add vlan=2 port=1-2 frame=tagged
5. Remove the ring ports from the default VLAN
delete vlan=1 port=1-2
6. Create the EPSR domain
This step creates the domain, specifying that this switch is the master node. It also specifies which VLAN is the control VLAN and which port is the primary port.
create epsr=test mode=master controlvlan=vlan1000 primaryport=1
7. Add the data VLAN to the domain
add epsr=test datavlan=vlan2
8. Enable EPSR
enable epsr=test

Configure the Transit Nodes (B and C)

Each of the transit nodes has the same EPSR configuration in this example.
1. Create the control VLAN
create vlan=vlan1000 vid=1000
2. Add the ring ports to the control VLAN
add vlan=1000 port=1-2 frame=tagged
3. Create the data VLAN
create vlan=vlan2 vid=2
4. Add the ring ports to the data VLAN
The two ring ports must belong to both the control VLAN and all data VLANs.
add vlan=2 port=1-2 frame=tagged
5. Remove the ring ports from the default VLAN
delete vlan=1 port=1-2
6. Create the EPSR domain
This step creates the domain, specifying that this switch is a transit node. It also specifies which VLAN is the control VLAN.
create epsr=test mode=transit controlvlan=vlan1000
7. Add the data VLAN to the domain
add epsr=test datavlan=vlan2
8. Enable EPSR
enable epsr=test

Example 2: A Double Ring

This example adds to the previous ring by making two domains, as shown in the following diagram.
[Figure: epsr-example-double-ring. Domain 1 (control VLAN: 1000, data VLAN: 2) contains master node A (port 1: primary, port 2: secondary) and transit node B (ports 1 and 2). Domain 2 (control VLAN: 40, data VLAN: 50) contains master node C (port 4: primary, port 5: secondary) and transit node D (ports 4 and 5). Transit node E belongs to both domains, using ports 1 and 2 for domain 1 and ports 4 and 5 for domain 2.]
1. Configure the master node (switch A) for domain 1
The master node for domain 1 is the same as in the previous example (except that the domain has been renamed).
create vlan=vlan1000 vid=1000
add vlan=1000 port=1-2 frame=tagged
create vlan=vlan2 vid=2
add vlan=2 port=1-2 frame=tagged
delete vlan=1 port=1-2
create epsr=domain1 mode=master controlvlan=vlan1000 primaryport=1
add epsr=domain1 datavlan=vlan2
enable epsr=domain1
2. Configure the transit node (switch B) that belongs just to domain 1
This transit node is the same as in the previous example (except that the domain has been renamed).
create vlan=vlan1000 vid=1000
add vlan=1000 port=1-2 frame=tagged
create vlan=vlan2 vid=2
add vlan=2 port=1-2 frame=tagged
delete vlan=1 port=1-2
create epsr=domain1 mode=transit controlvlan=vlan1000
add epsr=domain1 datavlan=vlan2
enable epsr=domain1
3. Configure the master node (switch C) for domain 2
Configure the control VLAN:
create vlan=vlan40 vid=40
add vlan=40 port=4-5 frame=tagged
Configure the data VLAN:
create vlan=vlan50 vid=50
add vlan=50 port=4-5 frame=tagged
Remove the ring ports from the default VLAN:
delete vlan=1 port=4-5
Configure EPSR:
create epsr=domain2 mode=master controlvlan=vlan40 primaryport=4
add epsr=domain2 datavlan=vlan50
enable epsr=domain2
4. Configure the transit node (switch D) that belongs just to domain 2
Configure the control VLAN:
create vlan=vlan40 vid=40
add vlan=40 port=4-5 frame=tagged
Configure the data VLAN:
create vlan=vlan50 vid=50
add vlan=50 port=4-5 frame=tagged
Remove the ring ports from the default VLAN:
delete vlan=1 port=4-5
Configure EPSR:
create epsr=domain2 mode=transit controlvlan=vlan40
add epsr=domain2 datavlan=vlan50
enable epsr=domain2
5. Configure the transit node (switch E) that belongs to both domains
Two separate EPSR domains are configured on this switch.
Configure the control VLAN for domain 1:
create vlan=vlan1000 vid=1000
add vlan=1000 port=1-2 frame=tagged
Configure the control VLAN for domain 2:
create vlan=vlan40 vid=40
add vlan=40 port=4-5 frame=tagged
Configure the data VLAN for domain 1:
create vlan=vlan2 vid=2
add vlan=2 port=1-2 frame=tagged
Configure the data VLAN for domain 2:
create vlan=vlan50 vid=50
add vlan=50 port=4-5 frame=tagged
Remove the ring ports from the default VLAN:
delete vlan=1 port=1-2,4-5
Configure EPSR for domain 1. This switch is a transit node:
create epsr=domain1 mode=transit controlvlan=vlan1000
add epsr=domain1 datavlan=vlan2
enable epsr=domain1
Configure EPSR for domain 2. This switch is a transit node:
create epsr=domain2 mode=transit controlvlan=vlan40
add epsr=domain2 datavlan=vlan50
enable epsr=domain2
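When one switch carries several domains, it is easy to miss a VLAN membership. The following sketch checks two rules stated earlier in this Note: each domain has its own control VLAN, and the ring ports must belong to the control VLAN and all data VLANs. The check_switch function and its data layout are hypothetical. It works on a description of the configuration that you supply, not on a live switch.

```python
def check_switch(domains, vlan_ports):
    """Return a list of problems found (an empty list means none).

    domains maps a domain name to
        {"control": vid, "data": [vid, ...], "ring_ports": {port, ...}}
    vlan_ports maps a VLAN ID to the set of ports tagged in that VLAN.
    """
    problems = []
    control_owner = {}
    for name, domain in domains.items():
        control = domain["control"]
        if control in control_owner:
            problems.append(
                f"{name}: control VLAN {control} is already used by {control_owner[control]}")
        control_owner[control] = name
        # Ring ports must be members of the control VLAN and every data VLAN.
        for vid in [control, *domain["data"]]:
            missing = domain["ring_ports"] - vlan_ports.get(vid, set())
            if missing:
                problems.append(f"{name}: VLAN {vid} is missing ring ports {sorted(missing)}")
    return problems
```

For switch E as configured above, the domains are domain1 (control 1000, data 2, ring ports 1 and 2) and domain2 (control 40, data 50, ring ports 4 and 5), and the check returns an empty list.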

Example 3: EPSR and RSTP

This example uses EPSR to protect one ring and RSTP to protect another overlapping ring.
[Figure: epsr-example-rstp. The EPSR ring (Domain 1, control VLAN: 1000, data VLAN: 2) includes master node A (port 1: primary, port 2: secondary) and transit node B (ports 1 and 2). The RSTP ring (STP VLAN: 10) includes RSTP switches C and D, connected through ports 10 and 11. Switch E also appears in the diagram.]
1. Configure the master node (switch A) for the EPSR domain
The master node is the same as in the previous example.
create vlan=vlan1000 vid=1000
add vlan=1000 port=1-2 frame=tagged
create vlan=vlan2 vid=2
add vlan=2 port=1-2 frame=tagged
delete vlan=1 port=1-2
create epsr=domain1 mode=master controlvlan=vlan1000 primaryport=1
add epsr=domain1 datavlan=vlan2
enable epsr=domain1