This document details SAN topologies supported by Brocade SilkWorm 2x00 switch fabrics and provides guidance on the number of end user ports that can reliably be deployed based on testing done to date. Exceeding these
port count guidelines may have unpredictable results on fabric stability.
BROCADE is providing this document as a starting point for users interested in implementing a Storage Area Network (SAN). The target users of this document are individuals who are responsible for developing a SAN architecture on behalf of a client user or an end
user who desires information to aid in developing a SAN architecture. Section 1 describes SAN topologies and maximum size configurations supported as of the publication of this document. Section 2 presents information that is related to SAN design and that should
prove helpful when designing and implementing a SAN. These relate to cabling, inter-switch links, switch counts, and fabric management options. BROCADE is working with OEMs and integrators exploring fabric solutions of 15, 20, and 30 or more switches. This
paper is designed to help in developing fabric solutions that use a large number of switches with hundreds of end nodes in tested and
proven topologies. This is a work in progress and as additional information is developed and larger SAN designs are tested this document will be updated.
Brocade provides SAN design guidance via our sales force to partners and end users. If you need to exceed the limits presented in
this guide, please contact a Brocade sales representative to receive help and guidance in designing your fabric.
53-0001575-01 BROCADE Technical Note Page 2 of 31
SAN Design: March 29, 2001 3:18 pm
2.0 Fabric Topologies
This section explores a variety of fabric topologies and provides some specific network examples for SAN fabrics. Topologies fall into the
following general categories:
•Meshed Topology-- a network of switches that has at least one link to each adjacent switch. Fully meshed designs will have a
connection from each switch in the fabric to all other switches in the fabric. Other topologies are a specific instance of a mesh design.
•Star Topology-- central switch(es) with some or all ports used to connect to other switches; edge switches connect only to the center
switches
•Tier Architecture -- a switch hierarchy of two or more levels with inter switch connections that assume data paths go from one side
(hosts) to the other side (targets).
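The port cost of these topologies can be compared with some simple arithmetic. The following is a minimal sketch, not a Brocade sizing tool: the function names are hypothetical, and it assumes one ISL per switch pair, identical port counts on every switch, and core switches in a star that carry only ISLs.

```python
def full_mesh_isls(switches: int) -> int:
    """ISLs needed so that every switch links directly to every other switch."""
    return switches * (switches - 1) // 2


def full_mesh_user_ports(switches: int, ports_per_switch: int) -> int:
    """Ports left for hosts and storage once each switch has one ISL to every peer."""
    isl_ports = switches - 1
    if isl_ports > ports_per_switch:
        raise ValueError("not enough ports to fully mesh this many switches")
    return switches * (ports_per_switch - isl_ports)


def star_user_ports(edge_switches: int, core_switches: int, ports_per_switch: int) -> int:
    """User ports on the edge switches of a star (assumption: cores carry only ISLs)."""
    if core_switches > ports_per_switch:
        raise ValueError("an edge switch cannot reach every core")
    if edge_switches > ports_per_switch:
        raise ValueError("a core switch cannot reach every edge switch")
    return edge_switches * (ports_per_switch - core_switches)


if __name__ == "__main__":
    # Four 16-port switches, fully meshed.
    print("full mesh ISLs:", full_mesh_isls(4))
    print("full mesh user ports:", full_mesh_user_ports(4, 16))
    # Four 16-port edge switches around two core switches.
    print("star user ports:", star_user_ports(4, 2, 16))
```

The ISL count of a full mesh grows with the square of the switch count, which is one reason larger fabrics tend toward star or tiered layouts.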
Each of these topologies has advantages and disadvantages. The SAN designer should be aware of the features and benefits of each design
when building a solution for a specific customer environment. Some advantages and disadvantages are detailed here:
Meshed Topology Designs
•Provide any-to-any connectivity for devices
•Good for designs where locality of data is known and hosts and targets can be located on the same switch but where some amount
of any-to-any connectivity is needed
•Provide for resiliency on switch failure with the fabric able to re-route traffic via other switches in the mesh
•Allows for expansion at the edges without disruption of the fabric and attached devices
•Allows for scaling in size as port count demands increase (see SAN building block in the sample configuration section)
•Host and storage devices can be placed anywhere in the fabric
Star Topology Designs
•Two hops maximum, consistent latency
•Multiple equal cost paths allowing for load sharing at time of configuration of fabric
•Easy to start small and scale
•Two paths through the core from edge switches allows for failover
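The two-hop and equal-cost-path properties of a star can be checked on a small model. This is an illustrative sketch only: the switch names are made up, each edge switch is assumed to have one ISL to every core switch, and a hop is counted as one ISL traversed.

```python
from collections import deque
from itertools import combinations


def build_star(cores, edges):
    """Return an adjacency map in which every edge switch links to every core switch."""
    links = {switch: set() for switch in list(cores) + list(edges)}
    for edge in edges:
        for core in cores:
            links[edge].add(core)
            links[core].add(edge)
    return links


def hops(links, src, dst):
    """Breadth-first search: fewest ISLs traversed from src to dst, or None if unreachable."""
    seen = {src}
    queue = deque([(src, 0)])
    while queue:
        node, distance = queue.popleft()
        if node == dst:
            return distance
        for neighbor in links[node]:
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, distance + 1))
    return None


def shared_cores(links, edge_a, edge_b):
    """Number of core switches both edges attach to, i.e. equal-cost two-hop paths."""
    return len(links[edge_a] & links[edge_b])


if __name__ == "__main__":
    edges = ["edge1", "edge2", "edge3", "edge4"]
    fabric = build_star(["core1", "core2"], edges)
    worst = max(hops(fabric, a, b) for a, b in combinations(edges, 2))
    print("worst-case hops between edge switches:", worst)
    print("equal-cost paths edge1 -> edge2:", shared_cores(fabric, "edge1", "edge2"))
```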
Tier Architecture Designs
•Typically three layers of switches, a host layer, core layer, and storage layer
•Natural extension to star design (in some cases, Tier designs are Stars)
•Core switches are used to provide connectivity between host and storage layer switches, includes redundant switching elements
•Each layer can be scaled independently
•Cores can be simple to more complex, easily replaced with higher port count switches
•Can still use knowledge of data locality in placing devices in fabric
•Allows for bandwidth improvement by using multiple ISLs where needed
•Multiple paths in fabric allowing for redundant path selection should a failure occur
2.1 REDUNDANCY AND FAILOVER
One of the major benefits of a Fibre Channel fabric is the ability of the fabric to recover from failures of individual fabric elements as long
as the design includes redundant paths. The BROCADE SilkWorm design supports auto re-configuration of the fabric when switches are
added or removed from the fabric. This allows for auto discovery of alternate routes between fabric nodes with the routing algorithm determining the most efficient route between nodes based on the currently available switches. Obviously to take advantage of this feature the
basic fabric design should have built in redundancy to allow for alternate paths to end node devices. A single switch fabric will have no
alternate paths between devices should the switch fail. However, a simple two switch fabric can be designed, along with redundant elements in the host and storage nodes, to allow for failure of a single switch and to use a route through an alternate switch.
A number of factors should be considered when designing a fabric and there is no one answer or single topology that addresses all problems. Each user will have unique system elements and design needs that will need to be factored into the fabric design. The later portion of
this document provides for a number of design topologies that can be used as templates for fabric designs. Key elements to consider are:
•How much redundancy is required? Hosts with key applications should have redundant paths to required storage (via the fabric),
meaning multiple HBAs per host should be considered so a single HBA failure will not result in loss of host access to data
•Storage considerations. RAID devices provide more reliability and resilience to failure of a single drive and allow for automatic recovery on failure. High availability designs should use RAID storage devices as the building blocks for storage; these devices
have built-in recovery when using RAID 1 (mirroring) or RAID 5 (striping with parity). A greater level of reliability can be achieved
by mirroring the storage device remotely using the switch support for devices at 10 km distance (or more using devices that support
extended distance optical signaling). Some RAID subsystems include the ability to mirror writes to another disk system as a feature
of the disk controller; software support for this feature (e.g. Veritas) also exists. A critical storage node can be mirrored locally within
a fabric or mirrored across an extended fabric link. BROCADE provides a licensed software option (Extended Switch, available in
release 2.1.3) that allows for increasing the E-port buffer credits for extended links. [Buffer credits allow the sending device to
continue to send data up to the credit limit without having to wait for acknowledgment, improving performance. More credits allow
for a greater pipeline of data on a link, which is particularly useful when transmitting over extended distances.] The extended fabric option is
useful when combined with a link extender that can allow from 30 to 120 kilometers distance between switch elements. There is a
latency penalty for extended links that needs to be considered where performance is a concern. Shorter links have lower latency: the
delay is roughly 100 microseconds per 10 km of distance for round trip traffic.
• RAID devices have the added benefit of requiring only one switch port and an intelligent RAID controller can support multiple SCSI
or Fibre Channel drives behind it. RAID controllers will also off-load hosts from dedicating CPU cycles to supporting software RAID.
The trade off is cost and performance. A loop of disks contained in a JBOD can also be attached to a single switch port and managed
via software RAID. Redundant loops can be used to provide for high availability to stored data.
•Host systems can be designed to be passive fabric elements and only activated when a primary host system fails. Designs that use
two active hosts sharing the same data can also be achieved. An example of a remote mirrored high availability design is detailed
later in this paper.
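The extended-link figures in the storage bullet above can be turned into a rough estimate of delay and of the buffer credits needed to keep a long link busy. This is a back-of-the-envelope sketch, not a Brocade formula. The 100 microseconds per 10 km round trip comes from the text; the full-size frame of 2148 bytes and the link rate of 100 MB/s are assumptions introduced here, and the function names are hypothetical.

```python
import math

ROUND_TRIP_US_PER_10KM = 100.0  # figure quoted in the text


def round_trip_delay_us(distance_km: float) -> float:
    """Round-trip propagation delay in microseconds for a link of the given length."""
    if distance_km < 0:
        raise ValueError("distance must not be negative")
    return distance_km / 10.0 * ROUND_TRIP_US_PER_10KM


def credits_needed(distance_km: float,
                   frame_bytes: int = 2148,          # assumption: full-size frame
                   link_mb_per_s: float = 100.0) -> int:  # assumption: link data rate
    """Frames that must be in flight to cover one round trip without stalling."""
    frame_time_us = frame_bytes / link_mb_per_s  # bytes / (bytes per microsecond)
    return math.ceil(round_trip_delay_us(distance_km) / frame_time_us)


if __name__ == "__main__":
    for km in (10, 40, 120):
        print(f"{km:>4} km: {round_trip_delay_us(km):7.1f} us round trip, "
              f"about {credits_needed(km)} credits")
```

Smaller frames occupy the link for less time each, so a workload dominated by small frames would need more credits than this estimate suggests.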
2.2 REDUNDANT FABRICS
The previous section discussed redundant elements within a fabric design. Another design approach is to use redundant fabrics. Two independent switch fabrics are used. The simplest version of this is a two switch solution where the switches are not connected. [See the first
example in section 2.0]. This solution allows for redundant fabrics and should a single switch fail in the case of the simplest design, data is
routed via the second switch in the alternate fabric. Recovery to the alternate switch occurs at the host/device driver level where failure of a
data path can be noted and an alternate path to storage can be selected.
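Host-level path selection across two independent fabrics can be pictured with a small model. This sketch is purely illustrative: the `Path` and `MultipathDevice` names are hypothetical and do not correspond to any particular driver or multipathing product. It assumes a host with one HBA attached to each fabric.

```python
from dataclasses import dataclass


@dataclass
class Path:
    """One route from a host HBA through a fabric to the storage device."""
    hba: str
    fabric: str
    healthy: bool = True


class MultipathDevice:
    """Selects the first healthy path; a stand-in for host/driver failover logic."""

    def __init__(self, paths):
        self.paths = list(paths)
        if not self.paths:
            raise ValueError("at least one path is required")

    def mark_failed(self, fabric: str) -> None:
        """Record that every path through the named fabric has failed."""
        for path in self.paths:
            if path.fabric == fabric:
                path.healthy = False

    def active_path(self) -> Path:
        """Return the path I/O should use, preferring earlier entries."""
        for path in self.paths:
            if path.healthy:
                return path
        raise RuntimeError("no healthy path to storage")


if __name__ == "__main__":
    disk = MultipathDevice([Path("hba0", "fabric-A"), Path("hba1", "fabric-B")])
    print("before failure:", disk.active_path().fabric)
    disk.mark_failed("fabric-A")  # e.g. the switch in fabric A fails
    print("after failure: ", disk.active_path().fabric)
```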
There are four levels of redundancy possible within a fabric design. From least reliable to most reliable, they are:
•Single fabric, non-resilient
All switches are connected in such a way as to form a single fabric, and this fabric contains at least one single point of failure.
•Single fabric, resilient
All switches are connected in such a way as to form a single fabric, but no single point of failure exists which could cause the fabric
to segment.
•Dual fabric, non-resilient
Half of the switches are connected to form one fabric, and the other half form an identical fabric, which is completely unconnected
to the first fabric. Within each fabric, at least one single point of failure exists. This can be used in combination with dual attach
hosts and storage to keep a solution up even when one entire fabric fails.
•Dual fabric, resilient
Half of the switches are connected to form one fabric, and the other half form an identical fabric, which is completely unconnected
to the first fabric. There is no single point of failure in either fabric which could cause the fabric to segment. This can be used in
combination with dual attach hosts and storage to keep a solution up even when one entire fabric fails. This is generally the best
approach to take to SAN design for high availability.
FIGURE 1. Levels of Redundancy. This figure shows an example of each of the four levels of redundancy.
[Figure panels: single fabric, non-resilient (single point of failure labeled SPF); dual fabric, non-resilient; single fabric, resilient; dual fabric, resilient.]
The following discussion will be about single fabrics with resiliency. If a dual fabric, resilient design is desired (in fact, this is
recommended), simply pick the appropriate single fabric design and build two of them.
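Whether a single fabric counts as resilient in the sense used above can be tested by removing one element at a time and checking whether the remaining switches still form one fabric. The following is a minimal sketch with hypothetical names. It only detects failures that segment the fabric; devices attached to a failed switch still lose that path regardless.

```python
def is_connected(switches, links):
    """True if every switch can reach every other switch over the given ISLs."""
    nodes = set(switches)
    if not nodes:
        return True
    start = next(iter(nodes))
    seen = {start}
    stack = [start]
    while stack:
        node = stack.pop()
        for a, b in links:
            other = b if a == node else a if b == node else None
            if other is not None and other in nodes and other not in seen:
                seen.add(other)
                stack.append(other)
    return seen == nodes


def single_points_of_failure(switches, links):
    """List each switch or ISL whose loss would segment the remaining fabric."""
    found = []
    for switch in switches:
        remaining = [s for s in switches if s != switch]
        if not is_connected(remaining, links):
            found.append(("switch", switch))
    for index, link in enumerate(links):
        remaining_links = links[:index] + links[index + 1:]
        if not is_connected(switches, remaining_links):
            found.append(("link", link))
    return found


if __name__ == "__main__":
    switches = ["A", "B", "C"]
    chain = [("A", "B"), ("B", "C")]
    ring = [("A", "B"), ("B", "C"), ("C", "A")]
    print("chain:", single_points_of_failure(switches, chain))
    print("ring: ", single_points_of_failure(switches, ring))
```

An empty result corresponds to the "single fabric, resilient" level; building two such fabrics gives the dual fabric, resilient design.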
FIGURE 2. Two Core Switch, Two ISL, Star Design.
The above SAN is a single fabric, resilient design. To deploy a dual fabric, resilient SAN based upon this architecture, the following SAN
would be built:
FIGURE 3. Dual Fabric Design, Hosts and Storage Connected to Two Independent Fabrics
[Figure labels: hosts and storage have dual connections, one to each fabric; there is no connection between the two fabrics.]
Redundancy builds in the ability to allow for SAN management to take place on one SAN while the other SAN stays in operation. For a site
where high availability is mandatory and where significant down time cannot be tolerated this design approach is the most prudent.
•When designing a fabric solution, consider using two redundant fabrics to provide the maximum flexibility in terms of SAN
downtime and maintenance. This solution will allow for:
•Switch upgrades (firmware, hardware, or both) on one SAN while the redundant SAN stays in operation
•Failover to the redundant SAN when a switch fails in one SAN, while the failed switch is replaced
•Elimination of the single point of failure in the system. While highly meshed and redundant connections in a single
fabric are possible, a SAN installed as a redundant set of device interconnects is
simpler to maintain and provides insurance against failure of a single fabric
2.3 SAMPLE SWITCH CONFIGURATIONS
This section provides a number of switch fabric designs using the topologies defined above. Simple switch designs are shown along with
more complex meshed fabric designs, Tier Architecture designs, and Star topology designs. This section also provides guidance on the
overall size in the form of port counts that can reliably be deployed today. As testing is completed for larger port count topologies they will
be added to this document.
2.4 DUAL SWITCH HIGH AVAILABILITY CONFIGURATION - REDUNDANT FABRIC
FIGURE 4. Two Fabrics -- Simplest Redundant Fabric Configuration
[Figure 4 diagram labels: H, SW, H, SW, D]
•Can use multiple hosts sharing single disk device, with redundant
paths in two fabrics
•Redundant switches, not combined into fabric
•Dual HBAs in hosts, host level software provides for failover to
alternate HBA when failure is noted in one HBA
•Dual ported storage allows for either host to access the same data
•With 8-port switches, this design can support four hosts and two disk devices;
larger configurations are possible with 16-port switches
H - host
SW - switch
D - storage device
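A quick port budget shows how the example above fits on each switch. This sketch assumes that every dual-HBA host and every dual-ported disk uses exactly one port on each of the two switches; the function names are hypothetical. It only checks that a layout fits and does not explain any other limit on host or disk counts.

```python
def ports_used_per_switch(hosts: int, disks: int) -> int:
    """Ports consumed on each switch: one per dual-HBA host, one per dual-ported disk."""
    return hosts + disks


def fits(hosts: int, disks: int, ports_per_switch: int) -> bool:
    """True if the dual-attached devices fit on each of the two unconnected switches."""
    return ports_used_per_switch(hosts, disks) <= ports_per_switch


def spare_ports(hosts: int, disks: int, ports_per_switch: int) -> int:
    """Ports left free on each switch; raises if the layout does not fit."""
    if not fits(hosts, disks, ports_per_switch):
        raise ValueError("configuration needs more ports than the switch has")
    return ports_per_switch - ports_used_per_switch(hosts, disks)


if __name__ == "__main__":
    # The example from the text: four hosts and two disk devices on 8-port switches.
    print("fits on 8-port switches:", fits(4, 2, 8))
    print("spare ports per switch: ", spare_ports(4, 2, 8))
```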
2.5 TWO SWITCH FABRIC FOR MIRRORING AND DISASTER TOLERANCE
FIGURE 5. Extended Fabric Example
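[Figure 5 diagram: a host (H), switch (SW), and storage device (D) at each of two sites, with the two switches joined by a 10 km link]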
•Sample configuration showing only two hosts and two storage
devices, larger configurations can be deployed
•This example shows how data being used at the local site can be
mirrored at a remote site via an extended fabric link. Primary system
data is replicated at the remote site, where a backup failover system is
located. The remote site can be up to 10 km away with standard FC
components (long wavelength GBICs). Extended distances (20 to 120 km)
are possible using optical extender devices or DWDM devices.
•Starting with Fabric OS version 2.1.1, optional software to support
extended fabric is available. It allows an increase in buffer-to-buffer
credits on E-ports for maximum performance on links extended
over long distances. This option is recommended when links extend beyond
40 km.
•Mirroring across the link is accomplished by use of host based
mirroring software or by storage based mirroring options.
[Figure 5 site labels: Local Site (primary compute site); Remote Site (activated when primary site fails)]
•Note: this configuration is not a highly available solution; there are
single points of failure. It illustrates the concept of remotely
mirroring data for disaster tolerance. More highly available
solutions are possible at both sites.
•Alternative designs are possible where both sites mirror to each
other; sites can also consist of multi-switch fabrics connected over an
extended link