Sun Oracle Switch 72, Datacenter InfiniBand Switch 72 User Manual

Sun Datacenter InfiniBand
Switch 72
User’s Guide
Part No. 820-7751-12 May 2010, Revision A
Copyright © 2009, 2010, Oracle and/or its affiliates. All rights reserved.
This software and related documentation are provided under a license agreement containing restrictions on use and disclosure and are protected by intellectual property laws. Except as expressly permitted in your license agreement or allowed by law, you may not use, copy, reproduce, translate, broadcast, modify, license, transmit, distribute, exhibit, perform, publish, or display any part, in any form, or by any means. Reverse engineering, disassembly, or decompilation of this software, unless required by law for interoperability, is prohibited.
The information contained herein is subject to change without notice and is not warranted to be error-free. If you find any errors, please report them to us in writing.
If this is software or related software documentation that is delivered to the U.S. Government or anyone licensing it on behalf of the U.S. Government, the following notice is applicable:
U.S. GOVERNMENT RIGHTS Programs, software, databases, and related documentation and technical data delivered to U.S. Government customers are "commercial computer software" or "commercial technical data" pursuant to the applicable Federal Acquisition Regulation and agency-specific supplemental regulations. As such, the use, duplication, disclosure, modification, and adaptation shall be subject to the restrictions and license terms set forth in the applicable Government contract, and, to the extent applicable by the terms of the Government contract, the additional rights set forth in FAR 52.227-19, Commercial Computer Software License (December 2007). Oracle America, Inc., 500 Oracle Parkway, Redwood City, CA 94065.
This software or hardware is developed for general use in a variety of information management applications. It is not developed or intended for use in any inherently dangerous applications, including applications which may create a risk of personal injury. If you use this software or hardware in dangerous applications, then you shall be responsible to take all appropriate fail-safe, backup, redundancy, and other measures to ensure its safe use. Oracle Corporation and its affiliates disclaim any liability for any damages caused by use of this software or hardware in dangerous applications.
Oracle and Java are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of their respective owners.
AMD, Opteron, the AMD logo, and the AMD Opteron logo are trademarks or registered trademarks of Advanced Micro Devices. Intel and Intel Xeon are trademarks or registered trademarks of Intel Corporation. All SPARC trademarks are used under license and are trademarks or registered trademarks of SPARC International, Inc. UNIX is a registered trademark licensed through X/Open Company, Ltd.
This software or hardware and documentation may provide access to or information on content, products, and services from third parties. Oracle Corporation and its affiliates are not responsible for and expressly disclaim all warranties of any kind with respect to third-party content, products, and services. Oracle Corporation and its affiliates will not be responsible for any loss, costs, or damages incurred due to your access to or use of third-party content, products, or services.
Contents
Using This Documentation xi
Installing the Switch 1
Understanding Switch Specifications 1
Physical Specifications 2
Environmental Requirements 3
Acoustic Noise Emissions 3
Electrical Specifications 3
Network Management Connector and Pins 4
USB Management Connector and Pins 5
CXP Connector and Pins 5
Routing Service Cables 7
Power Cord Requirements 7
Management Cable Requirements 8
Understanding InfiniBand Cabling 9
InfiniBand Cable Cautions 10
InfiniBand Cable Guidelines 11
InfiniBand Cable Types 11
InfiniBand Cable Path Lengths 12
InfiniBand Cable Bundling 12
Floor and Underfloor Delivery of InfiniBand Cables 13
Overhead Delivery of InfiniBand Cables 13
Understanding the Installation 14
Installation Preparation 14
Suggested Tools 15
Antistatic Precautions 15
Installation Responsibilities 15
Installation Sequence 16
Shipping Carton Contents 16
Install the Switch in the Rack 18
Powering On the Switch 23
Attach the Management Cables 24
Attach the Power Cords 27
Accessing the Management Controller 28
Access the Management Controller From the Network Management Port 29
Access the Management Controller From the USB Management Port 30
Verify the Switch Status 31
Start the Subnet Manager 34
Connecting InfiniBand Cables 35
Attach the InfiniBand Cables 36
Check Link Status 41
Verifying the InfiniBand Fabric 41
Discover the InfiniBand Fabric Topology 42
Perform Diagnostics on the InfiniBand Fabric 43
Validate the InfiniBand Fabric and Report Errors 44
Administering the Switch 47
Troubleshooting the Switch 47
Switch Hardware Problems 48
InfiniBand Fabric Problems 51
vi Sun Datacenter InfiniBand Switch 72 User’s Guide • May 2010
Understanding the LEDs 54
Front Status LEDs 55
Rear Status LEDs 55
Check Chassis Status LEDs 56
Check Network Management Port Status LEDs 57
Check Link Status LEDs 57
Check Power Supply Status LEDs 58
Check Fan Status LEDs 59
Understanding Routing Through the Switch 59
CXP Connectors and Link LEDs to Switch Chip Port Routes 60
Switch Chip Port to Switch Chip Port Routes 61
Switch Chip Port to CXP Connectors and Link LED Routes 62
Signal Route Through the Switch 64
Switch GUIDs Overview 65
Administrative Command Overview 66
Hardware Command Overview 67
InfiniBand Command Overview 67
Monitoring the Hardware 67
Display Switch General Health 68
Display Power Supply Status 68
Check Board-Level Voltages 69
Display Internal Temperatures 70
Display Fan Status 71
Display Switch Environmental and Operational Data 71
Display Switch Firmware Versions 73
Locate a Switch Chip or Connector From the GUID 74
Display Switch Chip Boot Status 74
Display Link Status 75
Display Switch Chip Port Status 76
Monitoring the InfiniBand Fabric 76
Identify All Switches in the Fabric 77
Identify All HCAs in the Fabric 78
Display the InfiniBand Fabric Topology 78
Display a Route Through the Fabric 79
Display the Link Status of a Node 80
Display Counters for a Node 81
Display Data Counters for a Node 82
Display Low-Level Detailed Information About a Node 83
Display Low-Level Detailed Information About a Port 84
Map LIDs to GUIDs 86
Display Subnet Manager Status 87
Controlling the Hardware 87
Restart the Management Controller 88
Restart the Entire Switch 89
Reset the Switch Chip 89
Recover Ports After Switch Chip Reset 90
Set Link Speed 91
Disable a Switch Chip Port 92
Enable a Switch Chip Port 92
Change the Administrator Password 93
Controlling the InfiniBand Fabric 93
Perform Comprehensive Diagnostics for the Entire Fabric 94
Perform Comprehensive Diagnostics for a Route 95
Determine Changes to the InfiniBand Fabric Topology 95
Find 1x or SDR or DDR Links in the Fabric 97
Determine Which Links Are Experiencing Significant Errors 97
Clear Error Counters 98
Clear Data Counters 99
Check All Ports 99
Reset a Port 100
Set Port Speed 101
Disable a Port 102
Enable a Port 103
Controlling the Subnet Manager 104
Create the guid.txt File 105
Create the opensm.conf File 106
Enable the Subnet Manager 107
Disable the Subnet Manager 107
Set the Subnet Manager Priority 108
Start the Subnet Manager With the opensmd Daemon 109
Stop the Subnet Manager With the opensmd Daemon 109
Servicing the Switch 111
Understanding Service Procedures 111
Replaceable Components 112
Servicing the Power Supplies 113
Inspecting a Power Supply 114
Identify the Power Supply 114
Inspect the Power Supply Hardware 116
Inspect the Power Supply Connectors 116
Power Off a Power Supply 117
Remove a Power Supply 118
Install a Power Supply 120
Power On a Power Supply 122
Servicing the Fans 124
Inspecting a Fan 125
Identify the Fan 125
Inspect the Fan Hardware 126
Inspect the Fan Connector 127
Remove a Fan 127
Install a Fan 130
Servicing the InfiniBand Cables 132
Inspecting an InfiniBand Cable 132
Identify the InfiniBand Cable 133
Inspect the InfiniBand Cable Hardware 133
Inspect the InfiniBand Cable Connectors 134
Remove an InfiniBand Cable 134
Install an InfiniBand Cable 136
Servicing the Battery 139
Remove the Switch From the Rack 139
Replace the Battery 140
Upgrading the Firmware 145
Index 147
Using This Documentation
This user guide provides detailed procedures that describe preparation, installation, administration, and service for the Sun Datacenter InfiniBand Switch 72 from Oracle®. This document is written for technicians, system administrators, authorized service providers, and users who have advanced experience installing, administering, and servicing InfiniBand fabric hardware.
“Related Documentation” on page xi
“Documentation, Support, and Training” on page xii
“Documentation Feedback” on page xii
Related Documentation
The documents listed as online are available at:
(http://docs.sun.com/app/docs/prod/ib.switch.72)
Application                                Title                                                            Part Number  Format        Location
Getting started                            Sun Datacenter InfiniBand Switch 72 Getting Started Guide        820-7755     Printed, PDF  Shipping kit, Online
Last-minute information                    Sun Datacenter InfiniBand Switch 72 Product Notes                820-7753     PDF           Online
Installation, administration, and service  Sun Datacenter InfiniBand Switch 72 User’s Guide                 820-7751     PDF, HTML     Online
Command reference                          Sun Datacenter InfiniBand Switch 72 Command Reference            820-7752     PDF, HTML     Online
Compliance                                 Sun Datacenter InfiniBand Switch 72 Safety and Compliance Guide  820-7754     PDF           Online
Some commands reference the PICMG 3.0 specification, available at:
(http://www.picmg.org)
Some commands reference the IPMI specification, available at:
(http://www.intel.com/design/servers/ipmi)
Documentation, Support, and Training
These web sites provide additional resources:
Documentation (http://docs.sun.com)
Support (http://www.sun.com/support)
Training (http://www.sun.com/training)
Documentation Feedback
Submit comments about this document by clicking the Feedback[+] link at
(http://docs.sun.com). Include the title and part number of your document
with your feedback:
Sun Datacenter InfiniBand Switch 72 User’s Guide, part number 820-7751-12.
Installing the Switch
The following topics describe the installation of the switch. The topics are listed in the order of completion.
“Understanding Switch Specifications” on page 1
“Routing Service Cables” on page 7
“Understanding InfiniBand Cabling” on page 9
“Understanding the Installation” on page 14
“Shipping Carton Contents” on page 16
“Install the Switch in the Rack” on page 18
“Powering On the Switch” on page 23
“Connecting InfiniBand Cables” on page 35
“Verifying the InfiniBand Fabric” on page 41
Related Information
“Administering the Switch” on page 47
“Servicing the Switch” on page 111
Switch Remote Administration
Switch Reference
Understanding Switch Specifications
These topics describe the specifications of the switch and the connectors found on the switch chassis and the pinouts of those connectors.
“Physical Specifications” on page 2
“Environmental Requirements” on page 3
“Acoustic Noise Emissions” on page 3
“Electrical Specifications” on page 3
“Network Management Connector and Pins” on page 4
“USB Management Connector and Pins” on page 5
“CXP Connector and Pins” on page 5
Related Information
“Install the Switch in the Rack” on page 18
“Powering On the Switch” on page 23
Physical Specifications
Dimension Measurements
Width 17.52 in. (445.0 mm)
Depth 24 in. (609.6 mm)
Height 1.75 in. (44.5 mm)
Weight 23.0 lbs (11.4 kg)
Related Information
“Shipping Carton Contents” on page 16
“Install the Switch in the Rack” on page 18
Environmental Requirements
Parameter Operating
Ambient temperature 41˚F to 89.6˚F (5˚C to 32˚C)
Relative humidity 5% to 85% noncondensing, 80˚F (27˚C) maximum wet bulb
Elevation (Sun requirement) Maximum 9840 feet (3000 meters) at 104˚F (40˚C)
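The operating limits in the table above can be turned into a simple site-survey check. The following is a minimal sketch, not part of the switch software: the function name and message wording are illustrative assumptions, and the wet-bulb limit is not checked.

```python
# Operating limits copied from the Environmental Requirements table.
TEMP_RANGE_C = (5.0, 32.0)
HUMIDITY_RANGE_PCT = (5.0, 85.0)
MAX_ELEVATION_M = 3000.0


def check_environment(temp_c, humidity_pct, elevation_m):
    """Return a list of violations; an empty list means all readings are in range."""
    problems = []
    if not TEMP_RANGE_C[0] <= temp_c <= TEMP_RANGE_C[1]:
        problems.append(f"ambient temperature {temp_c} C is outside 5 C to 32 C")
    if not HUMIDITY_RANGE_PCT[0] <= humidity_pct <= HUMIDITY_RANGE_PCT[1]:
        problems.append(f"relative humidity {humidity_pct}% is outside 5% to 85%")
    if elevation_m > MAX_ELEVATION_M:
        problems.append(f"elevation {elevation_m} m exceeds 3000 m")
    return problems


# Hypothetical readings from a machine room.
for problem in check_environment(temp_c=35.0, humidity_pct=40.0, elevation_m=120.0):
    print(problem)
```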
Acoustic Noise Emissions
Parameter Operating Idling
Acoustic power LWAd (1B=10dB) 7.1 B 7.2 B
Acoustic pressure LpAm 58.9 dBA 59.0 dBA
Electrical Specifications
Parameter AC Version Requirement
Voltage 100 VAC to 240 VAC single phase, 47 to 63 Hz
Current (per input) 5.4 A maximum per input at 100 VAC
Current (total) 5.6 A maximum total for all inputs at 100 VAC
Power 550 Watts (Total input power is approximately equally
divided among the operating power supplies)
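As a rough planning aid, the power figures above can be worked through in a few lines. This is a sketch under stated assumptions: the function names are illustrative, the load is assumed to split exactly evenly, and the current estimate ignores power factor and supply efficiency, so treat the table values as authoritative.

```python
TOTAL_INPUT_POWER_W = 550.0  # from the Electrical Specifications table


def power_per_supply(operating_supplies):
    """Approximate input power drawn by each operating power supply, in watts."""
    if operating_supplies < 1:
        raise ValueError("at least one power supply must be operating")
    return TOTAL_INPUT_POWER_W / operating_supplies


def approx_input_current(power_w, voltage_vac):
    """Rough input current in amperes, ignoring power factor (an assumption)."""
    if voltage_vac <= 0:
        raise ValueError("voltage must be positive")
    return power_w / voltage_vac


print(power_per_supply(2))                               # each of two supplies
print(approx_input_current(TOTAL_INPUT_POWER_W, 100.0))  # total at 100 VAC
```

At 100 VAC the estimate for the total load comes out slightly below the 5.6 A maximum listed in the table, which is the expected relationship between a nominal estimate and a stated maximum.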
Related Information
“Power Cord Requirements” on page 7
“Attach the Power Cords” on page 27
Network Management Connector and Pins
The following table lists the pinout of the network management connector.
Pin Signal
1 TXD+
2 TXD-
3 RXD+
4 Not used
5 Not used
6 RXD-
7 Not used
8 Not used
Related Information
“Management Cable Requirements” on page 8
“Attach the Management Cables” on page 24
“Access the Management Controller From the Network Management Port” on page 29
USB Management Connector and Pins
The following table lists the pinout of the USB management connector.
Pin Signal
1 +5 VDC
2 - Data
3 + Data
4 GND
Related Information
“Management Cable Requirements” on page 8
“Attach the Management Cables” on page 24
“Access the Management Controller From the USB Management Port” on page 30
CXP Connector and Pins
The CXP connector has three InfiniBand port connections.
The following table lists the pinout for each connection.
Pin Signal Pin Signal Pin Signal Pin Signal
1 GND 22 GND 43 GND 64 GND
2 TX0p 23 TX1p 44 RX3p 65 RX2p
3 TX0n 24 TX1n 45 RX3n 66 RX2n
4 GND 25 GND 46 GND 67 GND
5 TX2p 26 TX3p 47 RX1p 68 RX0p
6 TX2n 27 TX3n 48 RX1n 69 RX0n
7 GND 28 GND 49 GND 70 GND
8 TX4p 29 TX5p 50 RX7p 71 RX6p
9 TX4n 30 TX5n 51 RX7n 72 RX6n
10 GND 31 GND 52 GND 73 GND
11 TX6p 32 TX7p 53 RX5p 74 RX4p
12 TX6n 33 TX7n 54 RX5n 75 RX4n
13 GND 34 GND 55 GND 76 GND
14 TX8p 35 TX9p 56 RX11p 77 RX10p
15 TX8n 36 TX9n 57 RX11n 78 RX10n
16 GND 37 GND 58 GND 79 GND
17 TX10p 38 TX11p 59 RX9p 80 RX8p
18 TX10n 39 TX11n 60 RX9n 81 RX8n
19 GND 40 GND 61 GND 82 GND
20 SCL 41 VCC3.3-Tx 62 PRSNT 83 VCC3.3-Rx
21 SDA 42 VCC12-Tx 63 Int_L/Reset_L 84 VCC12-Rx
The following table provides descriptions of the CXP signals.
Signal Description
GND Ground for both signal and power return
VCC3.3-Tx 3.3V standby power from the power supplies
VCC3.3-Rx 3.3V standby power return
VCC12-Tx 12V power for cable export and future enhancements
VCC12-Rx 12V power return
TXportdiff Differential InfiniBand transmit signal, where port is 0 to 11 and diff is n or p for negative or positive
RXportdiff Differential InfiniBand receive signal, where port is 0 to 11 and diff is n or p for negative or positive
SCL I²C clock
SDA I²C data I/O
PRSNT Connection present
Int_L/Reset_L Reset line
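The TXportdiff and RXportdiff naming scheme can be decoded mechanically, which is handy when cross-referencing the pinout table. The following is a minimal sketch; the function name and the returned tuple layout are illustrative assumptions, not part of any switch tool.

```python
import re

# Signal names follow the pattern TX<port><diff> or RX<port><diff>,
# where port is 0 to 11 and diff is n (negative) or p (positive).
_LANE_SIGNAL_RE = re.compile(r"^(TX|RX)(\d{1,2})([np])$")


def parse_lane_signal(name):
    """Split a differential signal name into (direction, port, polarity)."""
    match = _LANE_SIGNAL_RE.match(name)
    if match is None:
        raise ValueError(f"not a TX/RX differential signal name: {name!r}")
    direction = match.group(1)
    port = int(match.group(2))
    if port > 11:
        raise ValueError(f"port {port} is outside the range 0 to 11")
    polarity = "positive" if match.group(3) == "p" else "negative"
    return direction, port, polarity


print(parse_lane_signal("TX10p"))
print(parse_lane_signal("RX0n"))
```

Names such as GND, SCL, or PRSNT are not differential lane signals, so the sketch rejects them with a ValueError rather than guessing.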
Related Information
“InfiniBand Cable Types” on page 11
“Connecting InfiniBand Cables” on page 35
Routing Service Cables
These topics describe cable routing requirements:
“Power Cord Requirements” on page 7
“Management Cable Requirements” on page 8
Related Information
“Understanding InfiniBand Cabling” on page 9
Power Cord Requirements
The power supplies are configured for N+N redundancy. Line power is provided from two sources, A and B.
Your switch country kit should contain two power cords that are specific to your country or application. The following table describes the power cords available.
Cable Part Number Description
X311L (180-1097) North America/Asia, IEC 320 C13 to NEMA 5-15P - 15A/125V 2.5M Black, RoHS:Y
X312E (180-1982) China, IEC 320 C13 to GB 2099/GB 1002 - 10A/250V 2.0M, RoHS:Y
X312F (180-1999) Argentina, IEC 320 C13 to IRAM 2073 - 10A/250V 2.0M Black, RoHS:Y
X312G (180-1662) Korea, IEC 320 C13 to KSC 8305 - 15A/250V 2.0M Black, RoHS:Y
X312L (180-1993) Continental Europe, IEC 320 C13 to CEE 7/7 10A/250V 2.0M Black, RoHS:Y
X314L (180-1994) Swiss, IEC 320 C13 to SEV 1011 - 10A/250V 2.0M Black, RoHS:Y
X317L (180-1997) U.K., IEC 320 C13 to BS 1363 - 10A/250V 2.0M Black, RoHS:Y
X332A (180-2121) Taiwan, IEC 320 C13 to NEMA 5-15P - 10A/125V 2.5M Black, RoHS:Y
X383L (180-1995) Danish, IEC 320 C13 to Afsnit 107 - 10A/250V 2.0M Black, RoHS:Y
X384L (180-1996) Italian, IEC 320 C13 to CEI 23-16/VII - 10A/250V 2.0M Black, RoHS:Y
X386L (180-1998) Australian, IEC 320 C13 to AS 3112 - 10A/250V 2.0M Black, RoHS:Y
Caution – Install and route power cabling only in a manner that complies with federal, state, and local electrical codes.
Related Information
“Electrical Specifications” on page 3
“Attach the Power Cords” on page 27
Management Cable Requirements
Management of the switch is done at the management console, which is either a 10/100 Ethernet connection at the NET ports or a USB-to-serial device attached to the USB port.
Typically, the NET connection (network management) is the default means of communicating with the management controller. The controller has a DHCP client in operation and requires the Ethernet network to have a DHCP server. The DHCP server must be configured with the MAC address of the management controller, so the server can provide an IP address to the management controller upon boot. If a DHCP server is not available, then the USB connection is used.
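As an illustration of binding the management controller's MAC address to an IP address, the sketch below generates a host declaration in the style used by the ISC DHCP server's dhcpd.conf file. It assumes your site runs ISC dhcpd; other DHCP servers use different syntax. The host name, MAC address, and IP address shown are hypothetical placeholders.

```python
import ipaddress
import re

_MAC_RE = re.compile(r"^[0-9A-Fa-f]{2}(:[0-9A-Fa-f]{2}){5}$")


def dhcpd_host_entry(name, mac, ip):
    """Build an ISC dhcpd 'host' declaration that pins an IP address to a MAC address."""
    if not _MAC_RE.match(mac):
        raise ValueError(f"not a colon-separated MAC address: {mac!r}")
    address = ipaddress.ip_address(ip)  # raises ValueError on a malformed address
    return (
        f"host {name} {{\n"
        f"  hardware ethernet {mac.lower()};\n"
        f"  fixed-address {address};\n"
        f"}}\n"
    )


# Hypothetical values: substitute the MAC address of your management controller.
print(dhcpd_host_entry("ibswitch72-mc", "00:11:22:AA:BB:CC", "192.0.2.10"))
```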
The advantage of the NET connection over the USB connection is that administration of the switch can happen from anywhere on the network. There is no overall length constraint for the network management route, because the signal is re-amplified, filtered, and processed at each hub or switch within the Ethernet network. However, no individual network management cable should be longer than 100 meters.
The USB connection requires a USB-to-serial adapter. The adapter must be configured to communicate with your serial device management console. The serial device can be a serial terminal, a terminal server, or a serial connection running on a system or laptop. Because of the nature of the serial signal, a serial management cable cannot be used reliably if it is more than 10 meters long.
The USB-to-serial adapter is not included with your switch. You can purchase such an adapter from computer and electronics stores.
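The two management cable length limits described in this topic can be captured in a small planning check. This is a minimal sketch; the function name and the "network" and "serial" labels are illustrative assumptions.

```python
# Per-cable limits stated in this topic, in meters.
MAX_CABLE_LENGTH_M = {"network": 100.0, "serial": 10.0}


def cable_length_ok(kind, length_m):
    """Return True if a single management cable of this kind is within its limit."""
    if kind not in MAX_CABLE_LENGTH_M:
        raise ValueError(f"unknown management cable kind: {kind!r}")
    return 0 < length_m <= MAX_CABLE_LENGTH_M[kind]


print(cable_length_ok("network", 45.0))
print(cable_length_ok("serial", 15.0))
```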
Related Information
“Network Management Connector and Pins” on page 4
“USB Management Connector and Pins” on page 5
“Attach the Management Cables” on page 24
Understanding InfiniBand Cabling
These topics describe InfiniBand cabling:
“InfiniBand Cable Cautions” on page 10
“InfiniBand Cable Guidelines” on page 11
“InfiniBand Cable Types” on page 11
“InfiniBand Cable Path Lengths” on page 12
“InfiniBand Cable Bundling” on page 12
“Floor and Underfloor Delivery of InfiniBand Cables” on page 13
“Overhead Delivery of InfiniBand Cables” on page 13
Related Information
“Routing Service Cables” on page 7
“Connecting InfiniBand Cables” on page 35
InfiniBand Cable Cautions
To prevent InfiniBand cable damage, you must follow these cautions:
Do not uncoil the cable, as a kink might occur. Hold the coil closed as you unroll the cable, pausing to allow the cable to relax as it is unrolled.
Do not step on the cable or connectors. Plan cable paths away from foot traffic or rolling loads.
Do not pull the cable out of the shipping box, through any opening, or around any corners. Unroll the cable as you lay it down and move it through turns.
Do not twist the cable to open a kink. If it is not severe, open the kink by unlooping the cable.
Do not straighten the cable to correct a bend that is too tight. Leave the cable bend as is.
Do not drop the cable or connectors from any height. Gently set the cable down, resting the cable connectors on a stable surface.
Do not bend the cables to a radius tighter than 85 mm (3.4 inches). Ensure that cable turns are as wide as possible.
Do not pack the cable to fit a tight space. Use an alternative cable route.
Do not hang the cable for a length more than 2 meters (7 feet). Minimize the hanging weight with intermediate retention points.
Do not cinch the cable with hard fasteners or cable ties. Use soft hook-and-loop fastener for bundling and securing cables.
Do not drag the cable or its connectors over any surface. Carry the entire cable to and from the points of connection.
Do not force the cable connector into the receptacle by pushing on the cable. Apply connection or disconnection forces at the connector only.
Related Information
“Connecting InfiniBand Cables” on page 35
InfiniBand Cable Guidelines
Proper InfiniBand cable installation requires the following:
1. Plan the cable routes and cable length needs, identifying problematic cable route bends, minimizing the length of continuous vertical runs to no more than 2 meters (7 feet), and specifying hardware to support cable routing.
See “InfiniBand Cable Path Lengths” on page 12.
2. Carry the entire cable to the points of connection and unroll the cable from the first connection point to the second. Keep the coil closed and pause to enable the cable to relax as it is unrolled and moved through turns.
3. Ensure that cable route turns are larger than 85 mm (3.4 inches) radius for optical cables and 127 mm (5 inches) radius for copper cables. Find alternative routes for turns that are tighter.
4. Secure the cable to hard points and bundle it with soft, hook-and-loop fasteners.
See “InfiniBand Cable Bundling” on page 12.
5. Mediate the slack between securing points to maintain minimal cable tension and proper support.
See “Floor and Underfloor Delivery of InfiniBand Cables” on page 13 and
“Overhead Delivery of InfiniBand Cables” on page 13.
6. Label the ends of cables to identify their routes. Rest the cable connectors on a stable surface when they are not connected.
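The numeric limits in these guidelines (minimum bend radius by cable core and the maximum continuous vertical run) lend themselves to a quick check of a planned route. The sketch below is illustrative only: the function name and input format are assumptions, and a bend exactly at the minimum radius is treated as acceptable.

```python
# Limits taken from the guidelines above.
MIN_BEND_RADIUS_MM = {"optical": 85.0, "copper": 127.0}
MAX_VERTICAL_RUN_M = 2.0


def check_route(core, bend_radii_mm, vertical_runs_m):
    """Return a list of problems found in a planned cable route."""
    if core not in MIN_BEND_RADIUS_MM:
        raise ValueError(f"unknown cable core: {core!r}")
    minimum = MIN_BEND_RADIUS_MM[core]
    problems = []
    for index, radius in enumerate(bend_radii_mm, start=1):
        if radius < minimum:
            problems.append(f"bend {index}: {radius} mm radius is tighter than {minimum} mm")
    for index, run in enumerate(vertical_runs_m, start=1):
        if run > MAX_VERTICAL_RUN_M:
            problems.append(f"vertical run {index}: {run} m exceeds {MAX_VERTICAL_RUN_M} m")
    return problems


# Hypothetical route for a copper cable: two bends and one vertical drop.
for problem in check_route("copper", [100.0, 130.0], [2.5]):
    print(problem)
```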
Related Information
“Power Cord Requirements” on page 7
“Management Cable Requirements” on page 8
InfiniBand Cable Types
The following table lists the cables available for the switch, their lengths, and their data rates.
Cable Type Lengths Data Rate
Splitter, optical core, CXP - QSFP x3 10 m, 20 m QDR
Splitter, copper core, CXP - QSFP x3 1 m, 2 m, 3 m, 5 m QDR
Pass-through, optical core, CXP - CXP 10 m, 20 m QDR
Pass-through, copper core, CXP - CXP 0.4 m, 1 m, 2 m, 3 m, 5 m QDR
Related Information
“CXP Connector and Pins” on page 5
InfiniBand Cable Path Lengths
Cable paths should be as short as possible. When the length of a cable path has been calculated, select the shortest cable to satisfy the length requirement. When specifying a cable, consider the following:
Bends in the cable path increase the required length of the cable. Rarely does a cable travel in a straight line from connector to connector. Bends in the cable path are necessary, and each bend increases the total length.
Bundling increases the required length of the cables. Bundling causes one or more cables to follow a common path. However, the bend radius is different in different parts of the bundle. If the bundle is large and unorganized, and there are many bends, one cable might experience only the inner radius of bends, while another cable might experience the outer radius of bends. In this situation, the difference in the required lengths of the cables can be substantial.
If you are routing the InfiniBand cable under the floor, consider the height of the raised floor when calculating cable path length.
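Once a path length has been calculated, choosing the shortest cable that satisfies it is a simple lookup against the lengths listed in “InfiniBand Cable Types”. The following is a minimal sketch; the function name and the dictionary keys are illustrative assumptions.

```python
# Available lengths in meters, from the InfiniBand Cable Types table.
STOCK_LENGTHS_M = {
    ("splitter", "optical"): [10, 20],
    ("splitter", "copper"): [1, 2, 3, 5],
    ("pass-through", "optical"): [10, 20],
    ("pass-through", "copper"): [0.4, 1, 2, 3, 5],
}


def select_cable_length(cable_type, core, path_length_m):
    """Return the shortest listed length that covers the path, or None if none does."""
    lengths = STOCK_LENGTHS_M[(cable_type, core)]
    candidates = [length for length in lengths if length >= path_length_m]
    return min(candidates) if candidates else None


# Hypothetical path lengths that already include bends and floor height.
print(select_cable_length("pass-through", "copper", 0.9))
print(select_cable_length("splitter", "optical", 12))
print(select_cable_length("splitter", "copper", 6))
```

A result of None means that no listed cable of that type and core is long enough, so the path must be shortened or a different core chosen.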
Related Information
“InfiniBand Cable Types” on page 11
InfiniBand Cable Bundling
When bundling InfiniBand cables in groups, use hook and loop straps to keep cables organized. If possible, use color-coordinated straps to help identify cables and their routing. The InfiniBand splitter and 4X copper conductor cables are fairly thick and heavy for their length. Consider the retention strength of the hook and loop straps when supporting cables. Bundle as few cables as reasonably possible. If the InfiniBand cables break free of their straps and fall free, the cables might break internally when they strike the floor or are jerked from tension.
You can bundle the cables using many hook and loop straps. Do not bundle more than 12 cables together. A fully configured switch has 24 InfiniBand cables, so that is at least two bundles.
Place the hook and loop straps as close together as reasonably possible, for example, every 1 ft (0.3 m). If a cable breaks free from a strap, the cable cannot fall far before it is retained by another strap.
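The 12-cable limit per bundle can be applied when planning cable labels. The sketch below simply splits a list of labels into groups of at most 12; the function name and label format are illustrative assumptions, and because the guidance is to bundle as few cables as reasonably possible, smaller groups are often preferable.

```python
MAX_CABLES_PER_BUNDLE = 12  # upper limit stated in this topic


def plan_bundles(cable_labels):
    """Split cable labels into consecutive bundles of at most 12 cables each."""
    cables = list(cable_labels)
    return [
        cables[start:start + MAX_CABLES_PER_BUNDLE]
        for start in range(0, len(cables), MAX_CABLES_PER_BUNDLE)
    ]


# A fully configured switch has 24 InfiniBand cables (hypothetical labels).
labels = [f"IB-{number:02d}" for number in range(1, 25)]
for bundle in plan_bundles(labels):
    print(len(bundle), bundle[0], "to", bundle[-1])
```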
Related Information
“Connecting InfiniBand Cables” on page 35
Floor and Underfloor Delivery of InfiniBand Cables
The switch accepts InfiniBand cables from floor or underfloor delivery. The cable management hardware at the rear of the switch supports the weight of the InfiniBand cables.
Floor and underfloor delivery limits the tension in the InfiniBand cable to the weight of the cable for the rack height of the switch.
Related Information
“Connecting InfiniBand Cables” on page 35
Overhead Delivery of InfiniBand Cables
For overhead delivery, use cable shelves and lattices to support the InfiniBand cables.
If the overhead delivery has a large drop height, consider using an intermediate support for the InfiniBand cables. Use of the support can limit the tension in the InfiniBand cable to the weight of the cable for the distance between the supports and the switch.
Related Information
“Connecting InfiniBand Cables” on page 35
Understanding the Installation
This section provides you with a quick introduction to the installation process. Topics include:
“Installation Preparation” on page 14
“Suggested Tools” on page 15
“Antistatic Precautions” on page 15
“Installation Responsibilities” on page 15
“Installation Sequence” on page 16
Related Information
“Shipping Carton Contents” on page 16
“Install the Switch in the Rack” on page 18
“Powering On the Switch” on page 23
“Connecting InfiniBand Cables” on page 35
“Verifying the InfiniBand Fabric” on page 41
Installation Preparation
Before installing or servicing the switch, you must prepare the following:
The environment where the switch is to be installed must conform to the
requirements found in “Environmental Requirements” on page 3.
The rack to receive the switch must have proper power, management, and
InfiniBand fabric cabling brought to it.
The rack must have an available location for the switch.
There must be a clean, dry, stable work surface.
Related Information
“Suggested Tools” on page 15
“Antistatic Precautions” on page 15
“Installation Responsibilities” on page 15
“Installation Sequence” on page 16
Suggested Tools
The following tools are necessary or beneficial for installing the switch:
Antistatic mat
Antistatic wrist strap
No. 2 Phillips screwdriver
No. 1 Phillips screwdriver
Flashlight
Gloves
Magnifying glass
Related Information
“Antistatic Precautions” on page 15
Antistatic Precautions
When installing the switch chassis, take care to follow antistatic precautions:
Use an antistatic mat as a work surface.
Wear an antistatic wrist strap that is attached to either the mat or a metal portion
of the switch chassis.
Related Information
“Suggested Tools” on page 15
Installation Responsibilities
The personnel who install the switch must be fully capable of the following tasks:
Rackmount a heavy object
Perform line voltage verification
Connect delicate cables in tight spaces
Configure network hosts and serial terminals
Perform software tasks of an administrative nature
Interpret screen output as it pertains to InfiniBand fabrics
Related Information
“Installation Sequence” on page 16
Installation Sequence
The process of installing the switch has a specific sequence of tasks that must be performed in order. The following table describes the switch installation task sequence and provides links to those procedures.
Step Links
1 “Shipping Carton Contents” on page 16
2 “Install the Switch in the Rack” on page 18
3 “Attach the Management Cables” on page 24
4 “Attach the Power Cords” on page 27
5 “Accessing the Management Controller” on page 28
6 “Verify the Switch Status” on page 31
7 “Start the Subnet Manager” on page 34
8 “Attach the InfiniBand Cables” on page 36
9 “Check Link Status” on page 41
10 “Discover the InfiniBand Fabric Topology” on page 42
11 “Perform Diagnostics on the InfiniBand Fabric” on page 43
12 “Validate the InfiniBand Fabric and Report Errors” on page 44
Shipping Carton Contents
Sun Datacenter InfiniBand Switch 72
Cable bracket and rackmount kit
Cable management bracket and cover
Two rack-mounting rail assemblies
Assortment of screws and captive nuts
Sun Datacenter InfiniBand Switch 72 Getting Started Guide
Item Description
1 Switch
2 Front mounting brackets, long
3 Front mounting brackets, short
4 C-shaped brackets
5 Documentation
6 Hardware
7 Cable management extenders
8 Long rails
9 Cable management cover
10 Cable management assembly
11 Attachment brackets
12 Attachment plates
Power cords (not pictured)
After verifying the package contents, install the switch into the rack. See “Install the
Switch in the Rack” on page 18.
Related Information
“Install the Switch in the Rack” on page 18
Install the Switch in the Rack
Caution – The airflow through the switch is in from the fans, through the chassis,
and out at the connector panel. The front of the switch chassis (fan end) intakes from the cold aisle and the rear of the switch chassis (connector end) exhausts to the hot aisle. This flow direction requires you to install the switch in an orientation that is the opposite of what you might assume.
1. Identify the prerequisite and subsequent installation tasks that you must perform in conjunction with this procedure.
See “Installation Sequence” on page 16.
2. If installed, open the rack doors.
3. Assemble the cable management extenders.
a. Slide the attachment bracket over the extender, so that the tab on the bracket
is opposite the flange on the extender.
The open end of the tab is toward the flange. The flat end of the tab is toward the rear of the extender.
b. Place the attachment plate on the flange side of the extender, opposite the
attachment bracket.
c. Use two screws to sandwich the attachment bracket and plate to the extender,
in the position farthest from the flange.
d. Using a No. 2 Phillips screwdriver, tighten the two screws.
e. Repeat from Step a for the other cable management extender.
4. Attach the cable management extenders and long rails to the rear of the rack.
a. Place the long rail to the mounting location on the rear rack post.
b. Butt the flange of the extender to the flange of the long rail.
c. Secure the assembly to the post with two captive nuts and two screws.
d. Repeat from Step a for the other cable management extender and long rail.
5. Attach the long front brackets (with cutouts) to the switch with 4 screws on each side.
The flanges of the long front brackets point away from the switch.
6. Attach the C-shaped brackets to the switch with 4 screws on each side.
The edge of the C-shaped bracket is flush to the rear of the chassis.
7. Route the power cords through the rack with the female end at the front of the rack where the switch will install.
Ensure that there are 24 inches (610 mm) of power cord slack at the front of the rack to provide an adequate service loop for removing the switch from the rack.
8. Carefully lift the switch and slide it into the rack, from the front rearward.
Ensure that the ends of the long rails slide into the C-shaped brackets at the rear of the switch chassis and that the power cords lie in the cutouts of the long front mounting brackets.
9. Mount the front of the switch chassis to the front rack posts with two captive nuts and two screws at each side.
Tighten the screws securely.
10. Install the cable management bracket to the tabs of the attachment brackets at the rear of the switch, tightening the thumbscrews on each side of the cable management bracket.
11. Install the cable management bracket cover.
12. Tighten the thumbscrews on each side of the cover.
13. Attach the management cables.
See “Attach the Management Cables” on page 24.
Related Information
“Shipping Carton Contents” on page 16
“Powering On the Switch” on page 23
Powering On the Switch
After installing the components, power on the switch by performing these tasks.
“Attach the Management Cables” on page 24
“Attach the Power Cords” on page 27
“Accessing the Management Controller” on page 28
“Verify the Switch Status” on page 31
“Start the Subnet Manager” on page 34
Related Information
“Understanding Switch Specifications” on page 1
“Routing Service Cables” on page 7
“Connecting InfiniBand Cables” on page 35
Attach the Management Cables
The switch has two connectors for network communication or serial communication with the management controller.
The network management connector, labeled NET, is a 100BASE-T Ethernet interface. This connector is preferred because it permits remote management of the switch over the Ethernet network.
The USB management connector, labeled USB, is the second choice for communication with the management controller in the switch. The management console can be a serial terminal, a system running a TIP connection, or another serial device that communicates with the management controller through a USB-to-serial adapter. The serial parameters for communication with the USB-to-serial adapter are typically 115200, 8, N, 1.
1. Identify the prerequisite and subsequent installation tasks that you must perform in conjunction with this procedure.
See “Installation Sequence” on page 16.
2. Connect the network management cables from the management console to the connectors labeled NET.
3. Configure the DHCP server with the MAC address of the management controller.
The MAC address is printed on the customer information (yellow) sheet on the outside of the switch shipping carton and on the pull-out tab on the left side front of the switch chassis, adjacent to power supply 0.
4. (Optional) Connect the serial management cables from the management console to the USB-to-serial adapter, and from the adapter to the connector labeled with the USB symbol.
5. Route the management cables so that they do not interfere with other cables, with servicing the switch, or with other systems.
6. Prepare the management console for communication with the management controller.
7. Power on the switch.
See “Attach the Power Cords” on page 27.
Related Information
“Network Management Connector and Pins” on page 4
“USB Management Connector and Pins” on page 5
“Management Cable Requirements” on page 8
“Attach the InfiniBand Cables” on page 36
Attach the Power Cords
The power cords for the switch ship separately and are specific to the country of installation. The facility power receptacles for the power cords should be located such that the power cords are routed out of the way, either to the sides of the rack or under the floor.
When live power is delivered to the receptacles at the front of the chassis, the power supplies make standby and main power available. When standby power is distributed to the chassis, the management controller is powered on. The main power is supplied for the switch chips and fans.
1. Identify the prerequisite and subsequent installation tasks that you must perform in conjunction with this procedure.
See “Installation Sequence” on page 16.
2. Ensure that the circuit breakers for the power supplies are switched off.
3. Plug a power cord into each power receptacle at the front of the chassis.
4. Route each power cord to its respective facility power receptacle.
Use cable ties or hook and loop fastener straps to bundle and secure the cord.
5. Plug each power cord into the receptacle.
6. Energize the circuit breakers so that the power receptacles are live.
7. Verify that the status LEDs for each power supply indicate normal operation.
The AC LED lights and, after a moment, the OK LED should light. The Attention LED should be unlit. See “Check Power Supply Status LEDs” on page 58.
Note – At this time, power is being supplied to the management controller. The
controller is effectively on and booting up. You might see the boot sequence on the management console.
8. Verify that the fans spin up.
You should feel air going into the fans and the fan Attention LEDs should be unlit. See “Check Fan Status LEDs” on page 59.
9. Verify that the chassis status OK LED lights.
See “Check Chassis Status LEDs” on page 56.
10. Access the management controller.
See “Accessing the Management Controller” on page 28.
Related Information
“Power Cord Requirements” on page 7
“Electrical Specifications” on page 3
“Attach the InfiniBand Cables” on page 36
Accessing the Management Controller
With power applied, you can now access the management controller.
“Access the Management Controller From the Network Management Port” on
page 29
“Access the Management Controller From the USB Management Port” on page 30
Related Information
“Network Management Connector and Pins” on page 4
“USB Management Connector and Pins” on page 5
“Management Cable Requirements” on page 8
Access the Management Controller From the Network Management Port
Note – The administrator of the switch has the username of root.
1. Identify the prerequisite and subsequent installation tasks that you must perform in conjunction with this procedure.
See “Installation Sequence” on page 16.
2. If you have not already done so, configure the DHCP server with the MAC address and new host name of the management controller inside of your switch.
The MAC address is printed on the customer information (yellow) sheet on the outside of the switch shipping carton and on the pull-out tab on the left side front of the switch chassis, adjacent to power supply 0.
3. Open an SSH session and connect to the management controller by specifying the controller’s host name as configured with the DHCP server.
For example:
% ssh -l root nm2name
root@nm2name's password: password
#
where nm2name is the host name of the management controller. The name might be the word hostname. Initially, the password is changeme.
Note – You can change the password at a later time. See “Change the Administrator
Password” on page 93 for instructions on how to change the administrator password.
4. Verify the switch status.
See “Verify the Switch Status” on page 31.
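Step 2 leaves the DHCP configuration to your server's documentation. As a minimal sketch, assuming the site runs an ISC dhcpd server, the following Python helper formats a host entry that ties the management controller's MAC address to a host name. The function name and the MAC address used here are placeholders for illustration, not values from this guide; other DHCP servers use different syntax.

```python
import re


def dhcpd_host_entry(hostname: str, mac: str) -> str:
    """Return an ISC dhcpd 'host' declaration for one management controller.

    Assumption: the DHCP server is ISC dhcpd. The entry only binds the
    MAC address to a host name; address assignment is left to the site.
    """
    # Normalize separators and case, then require six hexadecimal octets.
    mac = mac.strip().lower().replace("-", ":")
    if not re.fullmatch(r"([0-9a-f]{2}:){5}[0-9a-f]{2}", mac):
        raise ValueError(f"not a MAC address: {mac!r}")
    return (
        f"host {hostname} {{\n"
        f"  hardware ethernet {mac};\n"
        f'  option host-name "{hostname}";\n'
        f"}}\n"
    )


if __name__ == "__main__":
    # Placeholder MAC address from the range reserved for documentation.
    print(dhcpd_host_entry("nm2name", "00-00-5E-00-53-01"))
```

Use the MAC address printed on the customer information sheet or the pull-out tab in place of the placeholder.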
Related Information
“Network Management Connector and Pins” on page 4
“Management Cable Requirements” on page 8
Access the Management Controller From the USB Management Port
Note – The administrator of the switch has the username of root.
1. Identify the prerequisite and subsequent installation tasks that you must perform in conjunction with this procedure.
See “Installation Sequence” on page 16.
2. If you have not already done so, connect a USB-to-serial adapter to the USB port of the switch.
3. Connect a serial terminal, terminal server, or workstation with a TIP connection to the USB-to-serial adapter.
Configure the terminal or terminal emulator with these settings:
115200 baud
8 bits
No parity
1 Stop bit
No handshaking
4. Press the Return or Enter key on the serial device several times to synchronize the connection.
You might see text similar to the following:
... CentOS release 5.2 (Final) Kernel 2.6.27.13-nm2 on an i686
nm2name login:
where nm2name is the host name of the management controller. The name might be the word hostname. Even if you do not see the text, go to Step 5.
5. Type root for the login name followed by the root password of changeme.
nm2name login: root
Password: password
#
The # prompt is displayed.
Note – You can change the password at a later time. See “Change the Administrator
Password” on page 93 for instructions on how to change the administrator password.
6. Verify the switch status.
See “Verify the Switch Status” on page 31.
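The terminal settings in Step 3 are often written in the shorthand 115200,8,N,1. The following Python sketch records those settings and works out what they imply for throughput: each byte on the wire costs one start bit, eight data bits, no parity bit, and one stop bit. The class and method names are illustrative only and do not correspond to any tool on the switch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SerialSettings:
    """Serial line settings; the defaults are the values listed in Step 3."""
    baud: int = 115200
    data_bits: int = 8
    parity: str = "N"   # "N" means no parity bit
    stop_bits: int = 1

    def shorthand(self) -> str:
        """Settings in the common baud,data,parity,stop notation."""
        return f"{self.baud},{self.data_bits},{self.parity},{self.stop_bits}"

    def bits_per_byte(self) -> int:
        """Bits sent per data byte: start + data + optional parity + stop."""
        parity_bits = 0 if self.parity == "N" else 1
        return 1 + self.data_bits + parity_bits + self.stop_bits

    def max_bytes_per_second(self) -> float:
        """Upper bound on throughput, ignoring any idle time on the line."""
        return self.baud / self.bits_per_byte()


if __name__ == "__main__":
    settings = SerialSettings()
    print(settings.shorthand())
    print(settings.bits_per_byte(), "bits per byte")
    print(settings.max_bytes_per_second(), "bytes per second at most")
```

If the console shows garbled characters, a mismatch in one of these values between the terminal and the adapter is a likely cause.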
Related Information
“USB Management Connector and Pins” on page 5
“Management Cable Requirements” on page 8
Verify the Switch Status
You can use the following commands on the management controller to check the status of the switch.
1. Identify the prerequisite and subsequent installation tasks that you must perform in conjunction with this procedure.
See “Installation Sequence” on page 16.
2. Check the overall health of the switch:
# showunhealthy
OK - No unhealthy sensors
#
Any output other than OK from the showunhealthy command indicates a hardware fault in the component that is reported.
3. Check the status of the power supplies:
# checkpower
PSU 0 present status: OK
PSU 1 present status: OK
#
A power supply output that is not OK from the checkpower command means that there is a problem with that power supply. See “Check Power Supply Status
LEDs” on page 58 for assistance.
4. Check the status of the fans:
# getfanspeed
Fan 0 rpm 12311
Fan 1 rpm 12311
Fan 2 rpm 12311
Fan 3 rpm 12433
Fan 4 rpm 12433
#
A stopped or low speed in the output of the getfanspeed command means
there is a problem with that particular fan.
If not present is in the output of the getfanspeed command, yet a fan is
installed at the particular slot, there is a problem with that particular fan.
For either condition, check the fan. See “Servicing the Fans” on page 124.
5. Check the status of the switch chips:
# checkboot
I4-A OK
I4-B OK
I4-C OK
I4-D OK
I4-E OK
I4-F OK
#
If the output of the checkboot command is not OK, there is a problem with the particular switch chip. Try resetting the switch chip. See “Reset the Switch Chip”
on page 89.
6. Alternatively, you can use the env_test command to perform the preceding checks and more:
# env_test
NM2 Environment test started:
Starting Voltage test:
Voltage ECB OK
Measured 3.3V Main = 3.28 V
Measured 3.3V Standby = 3.40 V
Measured 12V = 12.06 V
Measured 5V = 5.10 V
Measured VBAT = 3.17 V
Measured 1.8V = 1.78 V
Measured 1.2V Standby = 1.21 V
Measured 1.8V Standby = 1.80 V
Measured 2.5VA = 2.51 V
Measured 2.5VB = 2.51 V
Measured 1.2VA = 1.22 V
Measured 1.2VB = 1.22 V
Measured 1.2VC = 1.21 V
Measured 1.2VD = 1.21 V
Measured 1.2VB = 1.21 V
Measured 1.2VE = 1.21 V
Measured 1.2VF = 1.21 V
Voltage test returned OK
Starting PSU test:
PSU 0 present
PSU 1 present
PSU test returned OK
Starting Temperature test:
Back temperature 23.00
Front temperature 32.62
ComEx temperature 26.12
I4-A temperature 55, maxtemperature 56
I4-B temperature 48, maxtemperature 49
I4-C temperature 53, maxtemperature 53
I4-D temperature 48, maxtemperature 49
I4-E temperature 53, maxtemperature 54
I4-F temperature 53, maxtemperature 54
Temperature test returned OK
Starting FAN test:
Fan 0 running at rpm 12433
Fan 1 running at rpm 12311
Fan 2 running at rpm 12311
Fan 3 running at rpm 12433
Fan 4 running at rpm 12433
FAN test returned OK
Starting Connector test:
Connector test returned OK
Starting I4 test:
I4-A OK
I4-B OK
I4-C OK
I4-D OK
I4-E OK
I4-F OK
All I4s OK
I4 test returned OK
NM2 Environment test PASSED
#
7. Once the switch has an operational status, start the Subnet Manager.
See “Start the Subnet Manager” on page 34.
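If you capture the output of the status commands in this procedure, for example over SSH, the checks can be automated. The following Python sketch parses text in the format shown in the examples above for the checkpower, getfanspeed, and checkboot commands. The function names are hypothetical, and the sketch assumes that a failing component is reported with a status word other than OK in the same position. It does not recognize stopped or not present fan entries, because their exact wording is not shown in this guide; a fan without an rpm reading is simply reported as missing.

```python
import re


def failed_power_supplies(checkpower_output: str) -> list[int]:
    """Return PSU numbers whose reported status is anything other than OK."""
    pairs = re.findall(r"PSU (\d+) present status: (\S+)", checkpower_output)
    return [int(num) for num, status in pairs if status != "OK"]


def fan_speeds(getfanspeed_output: str) -> dict[int, int]:
    """Map fan number to rpm for every 'Fan N rpm R' entry."""
    pairs = re.findall(r"Fan (\d+) rpm (\d+)", getfanspeed_output)
    return {int(num): int(rpm) for num, rpm in pairs}


def missing_fans(speeds: dict[int, int], expected=range(5)) -> list[int]:
    """Fan numbers (0 to 4 in the example) that have no rpm reading."""
    return [num for num in expected if num not in speeds]


def failed_switch_chips(checkboot_output: str) -> list[str]:
    """Return switch chip names (I4-A to I4-F) whose status is not OK."""
    pairs = re.findall(r"(I4-[A-F]) (\S+)", checkboot_output)
    return [name for name, status in pairs if status != "OK"]


# Sample outputs in the format shown in this procedure.
CHECKPOWER = "PSU 0 present status: OK\nPSU 1 present status: OK\n"
GETFANSPEED = (
    "Fan 0 rpm 12311\nFan 1 rpm 12311\nFan 2 rpm 12311\n"
    "Fan 3 rpm 12433\nFan 4 rpm 12433\n"
)
CHECKBOOT = "I4-A OK\nI4-B OK\nI4-C OK\nI4-D OK\nI4-E OK\nI4-F OK\n"

if __name__ == "__main__":
    print("Failed PSUs:", failed_power_supplies(CHECKPOWER))
    print("Fan speeds:", fan_speeds(GETFANSPEED))
    print("Missing fans:", missing_fans(fan_speeds(GETFANSPEED)))
    print("Failed switch chips:", failed_switch_chips(CHECKBOOT))
```

With the sample output, every list is empty, which corresponds to a healthy switch.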
Related Information
Switch Reference, showunhealthy command
Switch Reference, checkpower command
Switch Reference, getfanspeed command
Switch Reference, checkboot command
Switch Reference, env_test command
Start the Subnet Manager
If you have no Subnet Managers other than the one within the management controller, you can set the Subnet Manager priority to the lowest value. If you have other Subnet Managers in your InfiniBand fabric, you must decide which is to be the master Subnet Manager. The master Subnet Manager has the highest priority. All other Subnet Managers are slave Subnet Managers, and must have a lower priority.
1. Identify the prerequisite and subsequent installation tasks that you must perform in conjunction with this procedure.
See “Installation Sequence” on page 16.
2. Create the root node GUID file.
See “Create the guid.txt File” on page 105.
3. Create the Subnet Manager configuration file.
See “Create the opensm.conf File” on page 106.
4. Determine the priority of the Subnet Manager within the management controller.
5. Set the priority of the Subnet Manager:
# setsmpriority priority
where priority is 0 (lowest) to 13 (highest). For example, to set the Subnet Manager to priority 13:
# setsmpriority 13
-------------------------------------------------
OpenSM 3.2.6_20090717
Reading Cached Option File: /etc/opensm/opensm.conf
Loading Cached Option:routing_engine = ftree
Loading Cached Option:sminfo_polling_timeout = 1000
Loading Cached Option:polling_retry_number = 3
Command Line Arguments:
Priority = 13
Creating config file template '/tmp/osm.conf'.
Log File: /var/log/opensm.log
-------------------------------------------------
#
6. Enable the Subnet Manager:
# enablesm
Starting IB Subnet Manager. [ OK ]
#
7. Attach the InfiniBand cables.
See “Attach the InfiniBand Cables” on page 36.
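When several Subnet Managers share a fabric, it helps to write down the planned priorities before running setsmpriority on each one. The following Python sketch validates a priority against the 0 to 13 range given in Step 5 and predicts which Subnet Manager becomes master. The names and GUID values are made up for illustration. Breaking a tie by the lowest GUID is an assumption based on general InfiniBand behavior, not something stated in this guide; avoid ties by giving each Subnet Manager a distinct priority.

```python
from typing import NamedTuple

MIN_PRIORITY, MAX_PRIORITY = 0, 13  # range given for setsmpriority in Step 5


class SubnetManager(NamedTuple):
    name: str      # hypothetical label for the host or switch running the SM
    priority: int  # value passed to setsmpriority
    guid: int      # port GUID, used here only to break ties (assumption)


def setsmpriority_command(priority: int) -> str:
    """Build the command line for Step 5 after checking the range."""
    if not MIN_PRIORITY <= priority <= MAX_PRIORITY:
        raise ValueError(
            f"priority must be {MIN_PRIORITY} to {MAX_PRIORITY}, got {priority}"
        )
    return f"setsmpriority {priority}"


def expected_master(managers: list[SubnetManager]) -> SubnetManager:
    """Highest priority wins; equal priorities fall back to the lowest GUID."""
    return min(managers, key=lambda sm: (-sm.priority, sm.guid))


if __name__ == "__main__":
    # Illustrative plan: the switch is master, two hosts stand by.
    plan = [
        SubnetManager("switch-nm2", 13, 0x0021283A8620B0F0),
        SubnetManager("host-a", 5, 0x0002C90300001234),
        SubnetManager("host-b", 5, 0x0002C90300005678),
    ]
    print(expected_master(plan).name)
    print(setsmpriority_command(13))
```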
Related Information
Switch Reference, setsmpriority command
Switch Reference, enablesm command
Switch Reference, opensm command
Connecting InfiniBand Cables
After verifying the switch operational status, you can begin attaching the InfiniBand cables.
“InfiniBand Cable Cautions” on page 10
“InfiniBand Cable Guidelines” on page 11
“Attach the InfiniBand Cables” on page 36
“Check Link Status” on page 41
Related Information
“Understanding InfiniBand Cabling” on page 9
“Powering On the Switch” on page 23
“Verifying the InfiniBand Fabric” on page 41
Attach the InfiniBand Cables
Caution – InfiniBand cables must never turn tighter than a 5-inch (127 mm) radius.
A tighter radius damages the wires and fibers inside the cable.
Note – When you install the InfiniBand cables, connect cables to the lower
connectors first, then connect cables to the upper connectors.
1. Identify the prerequisite and subsequent installation tasks that you must perform in conjunction with this procedure.
See “Installation Sequence” on page 16.
2. Loosen the two captive thumbscrews that secure the cover to the cable management bracket.
3. Lift the cover off.
4. Remove the protective cap from the connector and visually inspect the cable connector.
The shell should not be bent and should be parallel to the inner boards. If the connector is bent or damaged, use a different cable.
5. Ensure that the retraction strap is forward.
6. Orient the cable connector to the CXP receptacle squarely and horizontally.
Ensure that the upper shell just touches the underside of the top of the receptacle on the rear panel.
7. Slowly move the connector in.
As you slide the connector in, the top of the shell should scrape against the underside of the top of the CXP receptacle.
If the connector stops or binds after about 1/4 in. (5 mm) travel, back out and
repeat from Step 6.
If the connector stops or binds with about 1/8 in. (2 mm) still to go, back out
and repeat Step 7.
8. Continue to push the connector in until the hooks catch onto the top of the receptacle.
9. Place the cable into the open slot on the cable management bracket.
10. Repeat Step 4 through Step 9 for all cables to be installed.
11. Replace the cover for the cable management bracket and tighten the thumbscrews.
12. Route the InfiniBand cables so that they do not interfere with other cables, or with servicing the Sun Datacenter InfiniBand Switch 72 or other systems.
Use hook and loop fastener straps to bundle and secure the cables.
Note – Do not use cable zip ties to bundle or secure the cable, because the ties
damage the wires inside the cable.
13. Check that the Link LEDs for cabled links are lit green.
If the Link LED is unlit, the link is down. If the Link LED flashes, there are symbol errors. See “Check Link Status LEDs” on page 57.
14. If installed, close the rack doors to maintain EMI compliance.
15. Check the link status.
See “Check Link Status” on page 41.
Related Information
“Understanding InfiniBand Cabling” on page 9
“Attach the Management Cables” on page 24
“Attach the Power Cords” on page 27
Check Link Status
1. Identify the prerequisite and subsequent installation tasks that you must perform in conjunction with this procedure.
See “Installation Sequence” on page 16.
2. On the management controller, determine the state of the links:
# listlinkup
If the link for a connector is reported as Not present, there is no cable
attached or the link at either end of the cable is down.
If a port is down, use the enableswitchport command to bring the port up.
Alternatively, use the i4reset command to reset the entire switch chip. See
“Enable a Switch Chip Port” on page 92 and “Reset the Switch Chip” on page 89.
3. Verify the InfiniBand fabric.
See “Verifying the InfiniBand Fabric” on page 41.
Related Information
Switch Reference, listlinkup command
Switch Reference, enableswitchport command
Switch Reference, i4reset command
“Verify the Switch Status” on page 31
Verifying the InfiniBand Fabric
Use the ibnetdiscover, ibdiagnet, and ibcheckerrors commands to initially determine the operational status of your switch in the InfiniBand fabric.
“Discover the InfiniBand Fabric Topology” on page 42
“Perform Diagnostics on the InfiniBand Fabric” on page 43
“Validate the InfiniBand Fabric and Report Errors” on page 44
Related Information
“Powering On the Switch” on page 23
“Connecting InfiniBand Cables” on page 35
Discover the InfiniBand Fabric Topology
The ibnetdiscover command enables you to see the InfiniBand fabric topology and build a topology file which is used by the OpenSM Subnet Manager.
1. Identify the prerequisite and subsequent installation tasks that you must perform in conjunction with this procedure.
See “Installation Sequence” on page 16.
2. On the management controller, type:
# ibnetdiscover
#
# Topology file: generated on Thu Aug 13 18:16:02 2009
#
# Max of 2 hops discovered
# Initiated from node 0021283a8620b0f0 port 0021283a8620b0f0
vendid=0x2c9
devid=0xbd36
sysimgguid=0x21283a8620b0f3
switchguid=0x21283a8620b0f0(21283a8620b0f0)
Switch 36 "S-0021283a8620b0f0" # "Sun DCS 72 QDR FC switch o4nm2-72p-2" enhanced port 0 lid 25 lmc 0
[36] "S-0021283a8620b0d0"[4] # "Sun DCS 72 QDR switch 1.2(LC)" lid 23 4xQDR
[35] "S-0021283a8620b0b0"[5] # "Sun DCS 72 QDR switch 1.2(LC)" lid 21 4xQDR
[34] "S-0021283a8620b0d0"[6] # "Sun DCS 72 QDR switch 1.2(LC)" lid 23 4xQDR
. . .
vendid=0x2c9
devid=0xbd36
sysimgguid=0x21283a8620b0e3
switchguid=0x21283a8620b0e0(21283a8620b0e0)
Switch 36 "S-0021283a8620b0e0" # "Sun DCS 72 QDR switch 1.1(FC)" base port 0 lid 24 lmc 0
[22] "S-0021283a8620b0c0"[18] # "Sun DCS 72 QDR switch 1.2(LC)" lid 22 4xQDR
. . .
#
Note – The output for your InfiniBand fabric will differ from that in the example.
3. Perform InfiniBand fabric diagnostics.
See “Perform Diagnostics on the InfiniBand Fabric” on page 43.
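The topology output is regular enough to summarize with a short script. The following Python sketch reads text in the format of the example above, records each Switch line, and attaches the link lines that follow it. It handles only the line shapes that appear in the example (switch records and their links). The function name and dictionary keys are illustrative, and output from other versions of ibnetdiscover might need different patterns.

```python
import re

# Switch 36 "S-<guid>" # "<description>" ... lid <n> ...
SWITCH_RE = re.compile(
    r'^Switch\s+(\d+)\s+"(S-[0-9a-f]+)"\s+#\s+"([^"]*)".*\blid (\d+)'
)
# [<local port>] "<peer id>"[<peer port>] # "<description>" lid <n> <rate>
LINK_RE = re.compile(
    r'^\[(\d+)\]\s+"([SH]-[0-9a-f]+)"\[(\d+)\].*\blid (\d+)\s+(\S+)'
)


def parse_topology(text: str) -> dict[str, dict]:
    """Return {switch id: {ports, description, lid, links}} from the output."""
    switches: dict[str, dict] = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        match = SWITCH_RE.match(line)
        if match:
            ports, node_id, description, lid = match.groups()
            current = {"ports": int(ports), "description": description,
                       "lid": int(lid), "links": []}
            switches[node_id] = current
            continue
        match = LINK_RE.match(line)
        if match and current is not None:
            local, peer, peer_port, peer_lid, rate = match.groups()
            current["links"].append({"local_port": int(local), "peer": peer,
                                     "peer_port": int(peer_port),
                                     "peer_lid": int(peer_lid), "rate": rate})
    return switches


# Lines taken from the example output in Step 2.
SAMPLE = '''
Switch 36 "S-0021283a8620b0f0" # "Sun DCS 72 QDR FC switch o4nm2-72p-2" enhanced port 0 lid 25 lmc 0
[36] "S-0021283a8620b0d0"[4] # "Sun DCS 72 QDR switch 1.2(LC)" lid 23 4xQDR
[35] "S-0021283a8620b0b0"[5] # "Sun DCS 72 QDR switch 1.2(LC)" lid 21 4xQDR
[34] "S-0021283a8620b0d0"[6] # "Sun DCS 72 QDR switch 1.2(LC)" lid 23 4xQDR
Switch 36 "S-0021283a8620b0e0" # "Sun DCS 72 QDR switch 1.1(FC)" base port 0 lid 24 lmc 0
[22] "S-0021283a8620b0c0"[18] # "Sun DCS 72 QDR switch 1.2(LC)" lid 22 4xQDR
'''

if __name__ == "__main__":
    for node_id, info in parse_topology(SAMPLE).items():
        slow = [l for l in info["links"] if l["rate"] != "4xQDR"]
        print(node_id, "lid", info["lid"], "links", len(info["links"]),
              "links below 4xQDR", len(slow))
```

Filtering on the rate field, as in the usage lines, is a quick way to spot links that did not negotiate 4x QDR.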
Related Information
Switch Reference, ibnetdiscover command
“Validate the InfiniBand Fabric and Report Errors” on page 44
Perform Diagnostics on the InfiniBand Fabric
The ibdiagnet command performs a collection of tests on the InfiniBand fabric and generates several files that contain parameters and aspects of the InfiniBand fabric.
1. Identify the prerequisite and subsequent installation tasks that you must perform in conjunction with this procedure.
See “Installation Sequence” on page 16.
2. On the management controller, type:
# ibdiagnet
In the following example, the ibdiagnet command is restricted to link width and link speed checks to determine which links are underperforming:
# ibdiagnet -lw 4x -ls 10 -skip all
Loading IBDIAGNET from: /usr/lib/ibdiagnet1.2
-W- Topology file is not specified. Reports regarding cluster links will use direct routes.
Loading IBDM from: /usr/lib/ibdm1.2
-I- Using port 0 as the local port.
-I- Discovering ... 6 nodes (6 Switches & 0 CA-s) discovered.
. . .
-I- Links With links width != 4x (as set by -lw option)
-I---------------------------------------------------
-I- No unmatched Links (with width != 4x) were found
-I---------------------------------------------------
-I- Links With links speed != 10 (as set by -ls option)
-I---------------------------------------------------
-I- No unmatched Links (with speed != 10) were found
. . .
-I- Stages Status Report:
STAGE Errors Warnings
Bad GUIDs/LIDs Check 0 0
Link State Active Check 0 0
Performance Counters Report 0 0
Specific Link Width Check 0 0
Specific Link Speed Check 0 2
Partitions Check 0 0
IPoIB Subnets Check 0 0
Please see /tmp/ibdiagnet.log for complete log
----------------------------------------------------------------
-I- Done. Run time was 16 seconds.
#
Note – The output for your InfiniBand fabric will differ from that in the example.
3. Validate and check errors for the InfiniBand fabric.
See “Validate the InfiniBand Fabric and Report Errors” on page 44.
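The Stages Status Report at the end of the ibdiagnet output is the quickest place to see whether any stage needs attention. As a rough sketch, the following Python function extracts that report from captured output and lists the stages that have errors or warnings. It assumes the layout shown in the example above, with one stage per line followed by two counts. The function names are hypothetical.

```python
import re

# A stage row: a name followed by an error count and a warning count.
ROW_RE = re.compile(r"^(.+?)\s+(\d+)\s+(\d+)$")


def parse_stage_report(output: str) -> dict[str, tuple[int, int]]:
    """Return {stage name: (errors, warnings)} from ibdiagnet output."""
    stages: dict[str, tuple[int, int]] = {}
    in_report = False
    for raw in output.splitlines():
        line = raw.strip()
        if "Stages Status Report" in line:
            in_report = True          # rows start after this heading
            continue
        if not in_report:
            continue
        if line.startswith("Please see"):
            break                     # end of the report
        match = ROW_RE.match(line)
        if match:                     # the column heading has no counts
            stages[match.group(1)] = (int(match.group(2)), int(match.group(3)))
    return stages


def stages_needing_attention(stages: dict[str, tuple[int, int]]) -> list[str]:
    """Names of stages with at least one error or warning."""
    return [name for name, (errors, warnings) in stages.items()
            if errors or warnings]


# Report lines taken from the example output in Step 2.
SAMPLE = """-I- Stages Status Report:
STAGE Errors Warnings
Bad GUIDs/LIDs Check 0 0
Link State Active Check 0 0
Performance Counters Report 0 0
Specific Link Width Check 0 0
Specific Link Speed Check 0 2
Partitions Check 0 0
IPoIB Subnets Check 0 0
Please see /tmp/ibdiagnet.log for complete log
"""

if __name__ == "__main__":
    report = parse_stage_report(SAMPLE)
    print(stages_needing_attention(report))
```

For any stage that is listed, the log file named in the output, /tmp/ibdiagnet.log, holds the details.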
Related Information
Switch Reference, ibdiagnet command
“Discover the InfiniBand Fabric Topology” on page 42
Validate the InfiniBand Fabric and Report Errors
The ibcheckerrors command uses the topology file to scan the InfiniBand fabric and validate the connectivity as described in the topology file, and to report errors as indicated by the port counters.
1. Identify the prerequisite and subsequent installation tasks that you must perform in conjunction with this procedure.
See “Installation Sequence” on page 16.
2. On the management controller, type:
# ibcheckerrors
#warn: counter RcvSwRelayErrors = 48342 (threshold 100) lid 25 port 255
Error check on lid 25 (Sun DCS 72 QDR FC switch o4nm2-72p-2) port all: FAILED
#warn: counter RcvSwRelayErrors = 56839 (threshold 100) lid 25 port 28
Error check on lid 25 (Sun DCS 72 QDR FC switch o4nm2-72p-2) port 28: FAILED
#warn: counter RcvSwRelayErrors = 56839 (threshold 100) lid 25 port 9
Error check on lid 25 (Sun DCS 72 QDR FC switch o4nm2-72p-2) port 9: FAILED
#warn: counter SymbolErrors = 65535 (threshold 10) lid 20 port 255
Error check on lid 20 (Sun DCS 72 QDR switch 1.2(LC)) port all: FAILED
. . .
## Summary: 6 nodes checked, 0 bad nodes found
## 144 ports checked, 2 ports have errors beyond threshold
#
Note – The output for your InfiniBand fabric will differ from that in the example.
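On a large fabric, the list of warnings can be long. The following Python sketch, which assumes the #warn line format shown in the example above, collects each warning into a record and groups the affected counters by LID so that you can see which switch chips to investigate first. In the example, the port 255 warnings are the ones paired with a port all result line. The function names and dictionary keys are illustrative.

```python
import re
from collections import defaultdict

# #warn: counter <name> = <value> (threshold <n>) lid <n> port <n>
WARN_RE = re.compile(
    r"#warn: counter (\w+) = (\d+)\s+\(threshold (\d+)\) lid (\d+) port (\d+)"
)


def parse_warnings(output: str) -> list[dict]:
    """Return one record per #warn line in ibcheckerrors output."""
    return [
        {"counter": counter, "value": int(value), "threshold": int(threshold),
         "lid": int(lid), "port": int(port)}
        for counter, value, threshold, lid, port in WARN_RE.findall(output)
    ]


def counters_by_lid(warnings: list[dict]) -> dict[int, set[str]]:
    """Group the names of counters that exceeded their threshold by LID."""
    grouped: dict[int, set[str]] = defaultdict(set)
    for warning in warnings:
        grouped[warning["lid"]].add(warning["counter"])
    return dict(grouped)


# Warning lines taken from the example output in Step 2.
SAMPLE = """#warn: counter RcvSwRelayErrors = 48342 (threshold 100) lid 25 port 255
Error check on lid 25 (Sun DCS 72 QDR FC switch o4nm2-72p-2) port all: FAILED
#warn: counter RcvSwRelayErrors = 56839 (threshold 100) lid 25 port 28
Error check on lid 25 (Sun DCS 72 QDR FC switch o4nm2-72p-2) port 28: FAILED
#warn: counter RcvSwRelayErrors = 56839 (threshold 100) lid 25 port 9
Error check on lid 25 (Sun DCS 72 QDR FC switch o4nm2-72p-2) port 9: FAILED
#warn: counter SymbolErrors = 65535 (threshold 10) lid 20 port 255
Error check on lid 20 (Sun DCS 72 QDR switch 1.2(LC)) port all: FAILED
"""

if __name__ == "__main__":
    for lid, counters in sorted(counters_by_lid(parse_warnings(SAMPLE)).items()):
        print("lid", lid, sorted(counters))
```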
Related Information
Switch Reference, ibcheckerrors command
“Discover the InfiniBand Fabric Topology” on page 42
“Perform Diagnostics on the InfiniBand Fabric” on page 43
Administering the Switch
The following topics describe the administration of the switch.
“Troubleshooting the Switch” on page 47
“Administrative Command Overview” on page 66
“Monitoring the Hardware” on page 67
“Monitoring the InfiniBand Fabric” on page 76
“Controlling the Hardware” on page 87
“Controlling the InfiniBand Fabric” on page 93
Related Information
“Installing the Switch” on page 1
“Servicing the Switch” on page 111
Switch Remote Administration
Switch Reference
Troubleshooting the Switch
The following topics help you resolve some basic problems that might occur with your switch.
“Switch Hardware Problems” on page 48
“InfiniBand Fabric Problems” on page 51
“Understanding the LEDs” on page 54
“Understanding Routing Through the Switch” on page 59
“Switch GUIDs Overview” on page 65
Related Information
“Administrative Command Overview” on page 66
“Monitoring the Hardware” on page 67
“Monitoring the InfiniBand Fabric” on page 76
“Controlling the Hardware” on page 87
“Controlling the InfiniBand Fabric” on page 93
Switch Hardware Problems
The following table lists situations that might occur with switch hardware and corrective steps that can be taken to resolve the problem.
Situation: The Attention LED on a power supply is lit or the power supply seems dysfunctional.
Corrective Steps:
1. Check the power supply status.
See “Display Power Supply Status” on page 68.
2. Unplug the respective power cord, wait 15 minutes, then reattach the power cord.
3. If the previous steps do not rectify the situation, replace the power supply.
See “Servicing the Power Supplies” on page 113.
Situation: There is no network communication with the management controller.
Corrective Steps:
1. Verify that the management controller is powered without faults.
See “Check Chassis Status LEDs” on page 56.
2. Verify that there is a LinkUp condition at the network management port.
See “Check Network Management Port Status LEDs” on page 57.
3. Verify that the DHCP server is providing the IP address you are using to access the management controller.
See the DHCP server documentation.
4. If you are able to access the management controller through the USB connector, restart the management controller.
See “Access the Management Controller From the USB Management Port” on page 30 and “Restart the Management Controller” on page 88.
5. If you are unable to access the management controller through the USB connector, power cycle the switch.
See “Power Off a Power Supply” on page 117 and “Power On a Power Supply” on page 122.
6. If the previous steps do not rectify the situation, replace the switch.
See “Remove the Switch From the Rack” on page 139 and “Installing the Switch” on page 1.
Situation: The Attention LED on the switch chassis is lit or the management controller seems dysfunctional.
Corrective Steps:
1. If you are unable to access the management controller, power cycle the switch.
See “Power Off a Power Supply” on page 117 and “Power On a Power Supply” on page 122.
2. If you are able to access the management controller, restart the management controller.
See “Restart the Management Controller” on page 88.
3. Check overall switch health.
See “Display Switch General Health” on page 68.
4. Verify that the switch is within operating temperatures and voltages.
See “Display Switch Environmental and Operational Data” on page 71.
5. If the previous steps do not rectify the situation, replace the switch.
See “Remove the Switch From the Rack” on page 139 and “Installing the Switch” on page 1.
Situation: The Attention LED on a fan is lit or the fan seems dysfunctional.
Corrective Steps:
1. Check the fan speed.
See “Display Fan Status” on page 71.
2. If the previous step does not rectify the situation, replace the fan.
See “Servicing the Fans” on page 124.
3. If the previous step does not rectify the situation, install the fan into another available slot.
See “Servicing the Fans” on page 124.
4. If no other slots are available, replace the switch.
See “Remove the Switch From the Rack” on page 139 and “Installing the Switch” on page 1.
Situation: After installation, no links are operational.
Corrective Steps:
1. Verify that there is at least one Subnet Manager active on the InfiniBand fabric.
See “Display Subnet Manager Status” on page 87.
2. If no Subnet Manager is active, start the Subnet Manager within the switch.
See “Start the Subnet Manager” on page 34.
3. If the previous steps do not rectify the situation, restart the Subnet Manager.
See “Disable the Subnet Manager” on page 107 and “Enable the Subnet Manager” on page 107.
Situation: After installation, not all links are operational.
Corrective Steps:
1. Determine which links are non-operational.
See “Display Link Status” on page 75.
2. For links that are “Down”, disable and re-enable the respective ports.
See “Disable a Switch Chip Port” on page 92 and “Enable a Switch Chip Port” on page 92.
3. If the previous steps do not rectify the situation, disable the respective port.
See “Disable a Switch Chip Port” on page 92.
Administering the Switch 49
Situation Corrective Steps
There was a power outage during a firmware update.
1. If you are able to access the management controller, restart the management controller.
See “Restart the Management Controller” on page 88.
2. If you are unable to access the management controller, power cycle the switch.
See “Power Off a Power Supply” on page 117 and “Power On a Power Supply” on
page 122.
3. Repeat the firmware upgrade.
See “Upgrading the Firmware” on page 145.
Related Information
“InfiniBand Fabric Problems” on page 51
“Understanding the LEDs” on page 54
“Understanding Routing Through the Switch” on page 59
“Switch GUIDs Overview” on page 65
InfiniBand Fabric Problems
The following table lists situations that might occur with the InfiniBand fabric and corrective steps that can be taken to resolve the problem.
Situation Corrective Steps
Performance of the InfiniBand fabric seems diminished.
1. Determine if there are errors or problems with the InfiniBand fabric.
See:
“Perform Comprehensive Diagnostics for the Entire Fabric” on page 94
“Find 1x or SDR or DDR Links in the Fabric” on page 97
“Determine Which Links Are Experiencing Significant Errors” on page 97
2. Locate the affected nodes by the GUID provided in the output of the ibdiagnet
command.
See “Locate a Switch Chip or Connector From the GUID” on page 74.
3. If the problem is at a cable connection, swap the suspect cable with a known good cable
or reconnect the cable to a known good remote port and repeat Step 1.
See “Servicing the InfiniBand Cables” on page 132.
4. If the problem still remains at the cable connection, disable and re-enable the respective
port and repeat Step 1.
See “Disable a Port” on page 102 and “Enable a Port” on page 103.
Temporary solution:
• If the problem still remains, disable the affected port.
See “Disable a Port” on page 102.
Permanent solution:
• If the problem still remains, replace the affected component or the switch.
See “Servicing the InfiniBand Cables” on page 132.
See remote port’s documentation for replacement procedures.
See “Remove the Switch From the Rack” on page 139 and “Installing the Switch” on
page 1.
Situation Corrective Steps
An InfiniBand Link LED is blinking.
1. Disconnect and properly reconnect both ends of the respective InfiniBand cable.
See Switch Service, servicing an InfiniBand cable.
2. If the LED is still blinking, determine the significance of the errors through use of the ibdiagnet command.
See “Determine Which Links Are Experiencing Significant Errors” on page 97.
3. Determine which connectors map to the affected link by deconstructing the node’s GUID and port.
See “Locate a Switch Chip or Connector From the GUID” on page 74.
4. If some of the links are running at 1x or SDR, follow the steps for that situation elsewhere in this table to rectify the problem.
5. Disable and re-enable the respective ports.
See “Disable a Port” on page 102 and “Enable a Port” on page 103.
6. If the errors are still significant, swap the cable with a known good one or reconnect the cable to a known good remote port, and repeat from Step 2.
7. Depending upon what does or does not rectify the problem, replace that component.
See “Servicing the InfiniBand Cables” on page 132.
See remote port’s documentation for replacement procedures.
Situation Corrective Steps
Some InfiniBand links are running at 1x or SDR.
For a temporary solution:
1. Identify the suspect links using the ibdiagnet command.
See “Find 1x or SDR or DDR Links in the Fabric” on page 97. Look for text like the following:
-W- link with SPD=2.5 found at direct path "1,19"
From: a Switch PortGUID=0x00066a00d80001dd Port=19
To: a Switch PortGUID=0x00066a00d80001dd Port=24
2. Determine which connectors map to the affected link by deconstructing the node’s GUID and port.
See “Locate a Switch Chip or Connector From the GUID” on page 74.
3. Verify the cable connection at both ends.
See “Servicing the InfiniBand Cables” on page 132.
4. Disable and re-enable the respective ports.
See “Disable a Port” on page 102 and “Enable a Port” on page 103.
5. If the previous steps do not rectify the problem, disable the port.
See “Disable a Port” on page 102.
For a permanent solution:
1. Perform Step 1 through Step 4 of the temporary solution.
2. Swap the cable with a known good one or reconnect the cable to a known good remote port, and repeat from Step 1.
3. Depending upon what does or does not rectify the problem, replace that component or the switch.
See “Servicing the InfiniBand Cables” on page 132.
See remote port’s documentation for replacement procedures.
See “Remove the Switch From the Rack” on page 139 and “Installing the Switch” on
page 1.
There are errors on some InfiniBand links.
1. Clear the error counters.
See “Clear Error Counters” on page 98.
2. Start a fabric stress test.
3. Identify the suspect links using the ibdiagnet command.
See “Determine Which Links Are Experiencing Significant Errors” on page 97. Look for text like the following:
-W- lid=0x0006 guid=0x0021283a8816c0a0 dev=48438 Port=34
Performance Monitor counter : Value
link_recovery_error_counter : 0x1
symbol_error_counter : 0x25 (Increase by 3 during ibdiagnet)
4. For links that are experiencing recovery errors or substantial symbol errors, refer to other parts of this table to help identify the cause and rectify the problem.
Situation Corrective Steps
Output of InfiniBand commands provides only GUID and port, not switch chip or CXP connectors.
You can find the location of a node in the switch by deconstructing the node’s GUID and port. You can then cross-reference the node and port to a connector.
See “Locate a Switch Chip or Connector From the GUID” on page 74 and “Understanding
Routing Through the Switch” on page 59.
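The slow-link warning shown earlier in this table can be collected programmatically when the ibdiagnet output is long. The following Python sketch is only an illustration: it assumes the warning text has exactly the three-line shape of the sample in this table, and the names SlowLink and find_slow_links are hypothetical, not part of any switch software. Output from other ibdiagnet versions might differ.

```python
import re
from typing import List, NamedTuple


class SlowLink(NamedTuple):
    speed_gbps: float   # value reported after SPD=
    path: str           # direct path string, for example "1,19"
    from_guid: str
    from_port: int
    to_guid: str
    to_port: int


# Assumed layout: one "-W- link with SPD=" line followed by a From: and a To: line.
_SLOW_LINK = re.compile(
    r'-W- link with SPD=(?P<spd>[\d.]+) found at direct path "(?P<path>[^"]*)"\s+'
    r'From: .*?PortGUID=(?P<fg>0x[0-9a-fA-F]+) Port=(?P<fp>\d+)\s+'
    r'To: .*?PortGUID=(?P<tg>0x[0-9a-fA-F]+) Port=(?P<tp>\d+)'
)


def find_slow_links(output: str) -> List[SlowLink]:
    """Return every slow-link warning found in captured ibdiagnet output."""
    return [
        SlowLink(float(m["spd"]), m["path"], m["fg"], int(m["fp"]), m["tg"], int(m["tp"]))
        for m in _SLOW_LINK.finditer(output)
    ]
```

Each returned GUID and port pair can then be mapped to a connector with the procedure in “Locate a Switch Chip or Connector From the GUID”.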
Related Information
“Switch Hardware Problems” on page 48
“Understanding the LEDs” on page 54
“Understanding Routing Through the Switch” on page 59
“Switch GUIDs Overview” on page 65
Understanding the LEDs
Status LEDs are used on many components of the switch chassis to indicate the component’s state. You can check the state of the LEDs using the following procedures:
“Front Status LEDs” on page 55
“Rear Status LEDs” on page 55
“Check Chassis Status LEDs” on page 56
“Check Network Management Port Status LEDs” on page 57
“Check Link Status LEDs” on page 57
“Check Power Supply Status LEDs” on page 58
“Check Fan Status LEDs” on page 59
Related Information
“Switch Hardware Problems” on page 48
“InfiniBand Fabric Problems” on page 51
“Understanding Routing Through the Switch” on page 59
“Switch GUIDs Overview” on page 65
Front Status LEDs
The power supply status LEDs and fan status LEDs are located on the front of the switch chassis.
FIGURE: Front Status LEDs
Figure Legend
1 Power supply AC LED
2 Power supply Attention LED
3 Power supply OK LED
4 Fan status LEDs
Related Information
“Check Power Supply Status LEDs” on page 58
“Check Fan Status LEDs” on page 59
Rear Status LEDs
The chassis status LEDs, network management status LEDs, and link status LEDs are located on the rear of the switch chassis.
FIGURE: Rear Status LEDs
Figure Legend
1 Chassis status LEDs
2 NET status LEDs
3 Link status LEDs
Related Information
“Check Chassis Status LEDs” on page 56
“Check Network Management Port Status LEDs” on page 57
“Check Link Status LEDs” on page 57
Check Chassis Status LEDs
The chassis status LEDs are located on the left side of the rear panel. See “Rear Status
LEDs” on page 55.
1. Visually inspect the chassis status LEDs.
2. Compare what you see to the following table.
Glyph Location Name Color State and Meaning
Top Locator White On – No function.
Off – Disabled.
Flashing – The switch is identifying itself.
Middle Attention Amber On – Normal fault detected.
Off – No faults detected.
Flashing – No function.
Bottom OK Green On – Switch is functional without fault.
Off – Switch is off or initializing.
Flashing – No function.
Related Information
“Display Switch Environmental and Operational Data” on page 71
Check Network Management Port Status LEDs
The network management port status LEDs are located on the network management connector of the rear panel. See “Rear Status LEDs” on page 55.
1. Visually inspect the NET status LEDs.
2. Compare what you see to the following table.
Name Position Color State
Activity Left Green On – No function.
Off – No activity.
Flashing – Packet activity.
Link speed Right Amber or green Amber on – 100BASE-T
Green on – 1000BASE-T
Off – No link or link down.
Flashing – No function.
Check Link Status LEDs
The link status LEDs are located at the InfiniBand connectors of the rear panel. See
“Rear Status LEDs” on page 55.
1. Visually inspect the link status LEDs.
2. Compare what you see for a particular link to the following table.
Name Color State and Meaning
Link Green On – Link established.
Off – No link or link down.
Flashing – Symbol errors.
Related Information
“Display Link Status” on page 75
“Display the Link Status of a Node” on page 80
“Display Counters for a Node” on page 81
Check Power Supply Status LEDs
The power supply status LEDs are located on the power supply at the front of the chassis. See “Front Status LEDs” on page 55.
1. Visually inspect the power supply’s status LEDs.
2. Compare what you see on the power supply to the following table.
Glyph Location Name Color State and Meaning
Top OK Green On – 12 VDC is supplied.
Off – No DC voltage is present.
Flashing – No function.
Middle Attention Amber On – Fault detected, 12 VDC shut down.
Off – No faults detected.
Flashing – No function.
Bottom AC Green On – AC power present and good.
Off – AC power not present.
Flashing – No function.
Caution – If a power supply has shut down because of a thermal or overcurrent
condition, signified by the amber Attention LED lighting, remove the respective power cord from the chassis. Allow the power supply to completely cool for at least 15 minutes. A shorter cooling time might cause damage to the power supply when the power cord is reattached. If the Attention LED lights amber upon reattaching the power cord, replace the power supply.
Related Information
“Display Power Supply Status” on page 68
“Check Board-Level Voltages” on page 69
Check Fan Status LEDs
The fan status LEDs are located in the lower right corner of the fans at the front of the switch chassis. See “Front Status LEDs” on page 55.
1. Visually inspect the fan status LEDs.
2. If the LED is lit, there is a problem with that fan.
Related Information
“Display Fan Status” on page 71
Understanding Routing Through the Switch
The tables in the following topics describe the routing through the switch. The first table maps the CXP connector and link LED to a switch chip port. The second table provides a switch chip to switch chip map. The third table maps the switch chip port to a CXP connector and link LED. When a command’s output provides a switch chip port or CXP connector, you can use these tables to determine the route that link is following.
“CXP Connectors and Link LEDs to Switch Chip Port Routes” on page 60
“Switch Chip Port to Switch Chip Port Routes” on page 61
“Switch Chip Port to CXP Connectors and Link LED Routes” on page 62
“Signal Route Through the Switch” on page 64
Note – The tables have shaded cells which are used for the example described in
“Signal Route Through the Switch” on page 64.
Related Information
“Switch Hardware Problems” on page 48
“InfiniBand Fabric Problems” on page 51
“Understanding the LEDs” on page 54
“Switch GUIDs Overview” on page 65
CXP Connectors and Link LEDs to Switch Chip Port Routes
The following table provides a mapping of the CXP connector and its link LEDs to the respective switch chip and port. The switch chip and port are given as chip-port, where:
chip – The identifying letter of the switch chip (A–D).
port – The identifying number of the port (19–36).
Note – The shaded cell seen in the PDF file is used for the example described in
“Signal Route Through the Switch” on page 64. The shaded cell is not seen in the
HTML file.
Connector   CXP Connector A                              CXP Connector B
Group       Left LED P3    Center LED P2  Right LED P1   Left LED P3    Center LED P2  Right LED P1
0           A-20           A-21           A-22           A-19           A-23           A-24
1           A-26           A-27           A-28           A-25           A-29           A-30
2           A-35           A-34           A-33           A-36           A-32           A-31
3           B-20           B-21           B-22           B-19           B-23           B-24
4           B-26           B-27           B-28           B-25           B-29           B-30
5           B-35           B-34           B-33           B-36           B-32           B-31
6           C-20           C-21           C-22           C-19           C-23           C-24
7           C-26           C-27           C-28           C-25           C-29           C-30
8           C-35           C-34           C-33           C-36           C-32           C-31
9           D-20           D-21           D-22           D-19           D-23           D-24
10          D-26           D-27           D-28           D-25           D-29           D-30
11          D-35           D-34           D-33           D-36           D-32           D-31
Note – When using CXP to QSFP splitter cables, the P1, P2, and P3 QSFP connectors
are related to the right, center, and left link LEDs respectively.
Related Information
“Switch Chip Port to Switch Chip Port Routes” on page 61
“Switch Chip Port to CXP Connectors and Link LED Routes” on page 62
“Signal Route Through the Switch” on page 64
“Display Link Status” on page 75
“Display Switch Chip Port Status” on page 76
Switch Chip Port to Switch Chip Port Routes
The following table provides a mapping of switch chip (I4) and port to another switch chip (I4) and port.
Note – The shaded cell seen in the PDF file is used for the example described in
“Signal Route Through the Switch” on page 64. The shaded cell is not seen in the
HTML file.
I4 Port → I4 Port   I4 Port → I4 Port   I4 Port → I4 Port   I4 Port → I4 Port
A1 → E9             C1 → F9             E1 → A9             F1 → C9
A2 → F8             C2 → E8             E2 → C8             F2 → A8
A3 → F7             C3 → E7             E3 → C7             F3 → A7
A4 → E6             C4 → F6             E4 → A6             F4 → C6
A5 → F5             C5 → E5             E5 → C5             F5 → A5
A6 → E4             C6 → F4             E6 → A4             F6 → C4
A7 → F3             C7 → E3             E7 → C3             F7 → A3
A8 → F2             C8 → E2             E8 → C2             F8 → A2
A9 → E1             C9 → F1             E9 → A1             F9 → C1
A10 → F19           C10 → E19           E10 → D13           F10 → B13
A11 → E20           C11 → F20           E11 → B14           F11 → D14
A12 → E21           C12 → F21           E12 → B15           F12 → D15
A13 → F27           C13 → E27           E13 → D10           F13 → B10
A14 → E26           C14 → F26           E14 → B11           F14 → D11
A15 → E25           C15 → F25           E15 → B12           F15 → D12
A16 → F24           C16 → E24           E16 → D18           F16 → B18
A17 → E23           C17 → F23           E17 → B17           F17 → D17
A18 → F22           C18 → E22           E18 → D16           F18 → B16
B1 → E28            D1 → F28            E19 → C10           F19 → A10
B2 → F29            D2 → E29            E20 → A11           F20 → C11
B3 → F30            D3 → E30            E21 → A12           F21 → C12
B4 → E36            D4 → F36            E22 → C18           F22 → A18
B5 → F35            D5 → E35            E23 → A17           F23 → C17
B6 → E34            D6 → F34            E24 → C16           F24 → A16
B7 → F33            D7 → E33            E25 → A15           F25 → C15
B8 → F32            D8 → E32            E26 → A14           F26 → C14
B9 → E31            D9 → F31            E27 → C13           F27 → A13
B10 → F13           D10 → E13           E28 → B1            F28 → D1
B11 → E14           D11 → F14           E29 → D2            F29 → B2
B12 → E15           D12 → F15           E30 → D3            F30 → B3
B13 → F10           D13 → E10           E31 → B9            F31 → D9
B14 → E11           D14 → F11           E32 → D8            F32 → B8
B15 → E12           D15 → F12           E33 → D7            F33 → B7
B16 → F18           D16 → E18           E34 → B6            F34 → D6
B17 → E17           D17 → F17           E35 → D5            F35 → B5
B18 → F16           D18 → E16           E36 → B4            F36 → D4
Related Information
“CXP Connectors and Link LEDs to Switch Chip Port Routes” on page 60
“Switch Chip Port to CXP Connectors and Link LED Routes” on page 62
“Signal Route Through the Switch” on page 64
“Display Link Status” on page 75
“Display Switch Chip Port Status” on page 76
Switch Chip Port to CXP Connectors and Link LED Routes
A reverse-lookup mapping of the switch chip and port to CXP connector and its link LEDs is provided in the following table. The connectors and LEDs are given as connectorletter-location, where:
connector – The identifying number of the connector group (0–11).
letter – The letter identifying the upper (A) or lower (B) connector.
location – The location of the LED, left, center, or right.
Note – The shaded cell seen in the PDF file is used for the example described in
“Signal Route Through the Switch” on page 64. The shaded cell is not seen in the
HTML file.
Port Switch Chip A Switch Chip B Switch Chip C Switch Chip D
19 0B-left-P3 3B-left-P3 6B-left-P3 9B-left-P3
20 0A-left-P3 3A-left-P3 6A-left-P3 9A-left-P3
21 0A-center-P2 3A-center-P2 6A-center-P2 9A-center-P2
22 0A-right-P1 3A-right-P1 6A-right-P1 9A-right-P1
23 0B-center-P2 3B-center-P2 6B-center-P2 9B-center-P2
24 0B-right-P1 3B-right-P1 6B-right-P1 9B-right-P1
25 1B-left-P3 4B-left-P3 7B-left-P3 10B-left-P3
26 1A-left-P3 4A-left-P3 7A-left-P3 10A-left-P3
27 1A-center-P2 4A-center-P2 7A-center-P2 10A-center-P2
28 1A-right-P1 4A-right-P1 7A-right-P1 10A-right-P1
29 1B-center-P2 4B-center-P2 7B-center-P2 10B-center-P2
30 1B-right-P1 4B-right-P1 7B-right-P1 10B-right-P1
31 2B-right-P1 5B-right-P1 8B-right-P1 11B-right-P1
32 2B-center-P2 5B-center-P2 8B-center-P2 11B-center-P2
33 2A-right-P1 5A-right-P1 8A-right-P1 11A-right-P1
34 2A-center-P2 5A-center-P2 8A-center-P2 11A-center-P2
35 2A-left-P3 5A-left-P3 8A-left-P3 11A-left-P3
36 2B-left-P3 5B-left-P3 8B-left-P3 11B-left-P3
Note – When using CXP to QSFP splitter cables, the P1, P2, and P3 QSFP connectors
are related to the right, center, and left link LEDs respectively.
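The reverse-lookup table follows a regular pattern, so it can be expressed compactly in code. The following Python sketch is an illustration only: the function name port_to_connector is hypothetical, and the layout tuples were transcribed from the table above rather than taken from any switch software.

```python
# Connector letter, LED location, and QSFP leg for each port in a group of six.
# Ports 19-24 and 25-30 share one layout; ports 31-36 use a different one.
_LOW_LAYOUT = [
    ("B", "left", "P3"), ("A", "left", "P3"), ("A", "center", "P2"),
    ("A", "right", "P1"), ("B", "center", "P2"), ("B", "right", "P1"),
]
_HIGH_LAYOUT = [
    ("B", "right", "P1"), ("B", "center", "P2"), ("A", "right", "P1"),
    ("A", "center", "P2"), ("A", "left", "P3"), ("B", "left", "P3"),
]


def port_to_connector(chip: str, port: int) -> str:
    """Map a switch chip (A-D) and port (19-36) to connector-letter-location-leg."""
    chip = chip.upper()
    if len(chip) != 1 or chip not in "ABCD":
        raise ValueError("only switch chips A-D have external connectors")
    if not 19 <= port <= 36:
        raise ValueError("external ports are numbered 19-36")
    offset = port - 19
    # Each chip serves three connector groups: A -> 0-2, B -> 3-5, and so on.
    group = 3 * "ABCD".index(chip) + offset // 6
    layout = _HIGH_LAYOUT if offset // 6 == 2 else _LOW_LAYOUT
    letter, location, leg = layout[offset % 6]
    return f"{group}{letter}-{location}-{leg}"
```

For example, port_to_connector("C", 21) yields the same 6A-center-P2 entry that the table lists for switch chip C, port 21.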
Related Information
“CXP Connectors and Link LEDs to Switch Chip Port Routes” on page 60
“Switch Chip Port to Switch Chip Port Routes” on page 61
“Signal Route Through the Switch” on page 64
“Display Link Status” on page 75
“Display Switch Chip Port Status” on page 76
Signal Route Through the Switch
By combining the information from the tables in “Understanding Routing Through
the Switch” on page 59, it is possible to determine a route through the switch. This
topic describes a sample situation that might occur.
1. A route is initiated at connector 2B. The left LED (P3) blinks.
2. Using “CXP Connectors and Link LEDs to Switch Chip Port Routes” on page 60, it is determined that the link routes to switch chip A, through port 36.
3. The Subnet Manager instructs switch chip A to use port 4 to forward the link.
4. Using “Switch Chip Port to Switch Chip Port Routes” on page 61, it is determined that the link routes to I4 switch chip E, port 6.
5. The Subnet Manager instructs switch chip E to use port 12 to forward the link.
6. Using “Switch Chip Port to Switch Chip Port Routes” on page 61, it is determined that the link routes to switch chip B, port 15.
7. The Subnet Manager instructs switch chip B to use port 24 to forward the link.
8. Using “Switch Chip Port to CXP Connectors and Link LED Routes” on page 62, it is determined that the link exits the switch at connector 3B. The right LED (P1) blinks.
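The sample route above can be reproduced with three small lookup tables. The following Python sketch is a minimal illustration, not a tool provided with the switch: the dictionaries hold only the entries needed for this example, and the forwarding ports chosen by the Subnet Manager are passed in as an assumed input. The name trace_route is hypothetical.

```python
# Excerpts of the three routing tables, limited to the entries used in the example.
CONNECTOR_TO_PORT = {("2B", "left"): ("A", 36)}
CHIP_LINKS = {("A", 4): ("E", 6), ("E", 12): ("B", 15)}
PORT_TO_CONNECTOR = {("B", 24): ("3B", "right")}


def trace_route(entry, forwarding_ports):
    """Follow a link from an entry connector through the switch chips.

    entry is a (connector, LED location) pair. forwarding_ports lists the
    egress port that the Subnet Manager selects on each chip, in order.
    Returns the list of (chip, port) hops and the exit (connector, LED) pair.
    """
    chip, port = CONNECTOR_TO_PORT[entry]
    hops = [(chip, port)]
    for egress in forwarding_ports:
        hops.append((chip, egress))
        if (chip, egress) in CHIP_LINKS:
            # Internal link: continue on the neighboring switch chip.
            chip, port = CHIP_LINKS[(chip, egress)]
            hops.append((chip, port))
        else:
            # External port: the link leaves the switch here.
            return hops, PORT_TO_CONNECTOR[(chip, egress)]
    raise ValueError("route did not reach an external connector")
```

Calling trace_route(("2B", "left"), [4, 12, 24]) walks chips A, E, and B and ends at connector 3B, right LED, matching steps 1 through 8.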
Related Information
“CXP Connectors and Link LEDs to Switch Chip Port Routes” on page 60
“Switch Chip Port to Switch Chip Port Routes” on page 61
“Switch Chip Port to CXP Connectors and Link LED Routes” on page 62
“Display a Route Through the Fabric” on page 79
“Perform Comprehensive Diagnostics for a Route” on page 95
Switch GUIDs Overview
Globally unique identifiers (GUIDs) are unique 64-bit strings that identify nodes such as switches and channel adapters. For the Sun Datacenter InfiniBand switches, GUIDs are modified to identify the node’s role and location. The following table describes the GUID’s structure.
Bits   63–16          15–12  11–8  7–4     3–0
Field  MAC (48 bits)  Type   Pos   Device  Num
The five fields of the GUID are described as follows:
MAC – Bits 63 through 16 are the Media Access Control (MAC) address. A standard for network components, the MAC address is typically provided by manufacturers in a 6-byte, colon-delimited string. For example, 00:11:22:33:44:55.
Type – Bits 15 through 12 identify the type of board on which the node resides.
Pos – Bits 11 through 8 identify the position of the board within the switch.
Device – Bits 7 through 4 identify which device on the board has that node.
Num – Bits 3 through 0 are numbers reserved for the programs which modify the
GUID. In most occurrences, the value is 0x2.
The following table provides values for Type, Pos, Device.
Board        Type  Position                                 Device
Fabric card  0xF   0x0 (Fabric card 0)–0x8 (Fabric card 8)  0xA (I4 chip 0)–0xB (I4 chip 1)
Line card    0x1   0x0 (Line card 0)–0x8 (Line card 8)      0xA (I4 chip 0)–0xD (I4 chip 3)
Gateway      0x3   0x0                                      0xA, 0xB (I4 chips 0, 1); 0x0, 0x4 (GbE chips)
36-Port      0xA   0x0                                      0xA (I4 chip 0)
72-Port      0xB   0x0                                      0xA (I4 chip 0)–0xF (I4 chip 5)
For example, given the following output from the ibswitches command:
Switch : 0x0021283a8620b0e0 ports 36 "Sun DCS 72 QDR switch 1.1(FC)" base port 1 lid 24 lmc 0
The GUID is 0x0021283a8620b0e0 or 0021283a8620 B 0 E 0. Using the information provided in this topic:
The MAC address is 0x0021283a8620 or 00:21:28:3A:86:20.
The type is 0xB, or a 72-port board.
The position is 0x0.
The device is 0xE, or I4 switch chip E.
The number is 0.
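The field boundaries described in this topic translate directly into shifts and masks. The following Python sketch shows one way to split a GUID into its five fields. The names SwitchGuid and decode_guid are hypothetical and chosen only for this illustration.

```python
from typing import NamedTuple


class SwitchGuid(NamedTuple):
    mac: str          # bits 63-16, formatted as a colon-delimited string
    board_type: int   # bits 15-12
    position: int     # bits 11-8
    device: int       # bits 7-4
    number: int       # bits 3-0


def decode_guid(guid: str) -> SwitchGuid:
    """Split a GUID string such as '0x0021283a8620b0e0' into its fields."""
    value = int(guid, 16)
    if not 0 <= value < 1 << 64:
        raise ValueError("a GUID must fit in 64 bits")
    mac_bits = value >> 16
    # Emit the six MAC bytes from most significant to least significant.
    mac = ":".join(f"{(mac_bits >> shift) & 0xFF:02X}" for shift in range(40, -8, -8))
    return SwitchGuid(
        mac=mac,
        board_type=(value >> 12) & 0xF,
        position=(value >> 8) & 0xF,
        device=(value >> 4) & 0xF,
        number=value & 0xF,
    )
```

Applied to the GUID from the ibswitches example, the function returns MAC 00:21:28:3A:86:20, type 0xB, position 0x0, device 0xE, and number 0x0.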
Related Information
Switch Reference, ibnetdiscover command
Switch Reference, ibnodes command
Switch Reference, ibswitches command
Switch Reference, ibhosts command
“Identify All Switches in the Fabric” on page 77
“Switch Hardware Problems” on page 48
“InfiniBand Fabric Problems” on page 51
“Understanding the LEDs” on page 54
“Understanding Routing Through the Switch” on page 59
Administrative Command Overview
The following topics provide an overview of administrative tasks and the command sets to perform those tasks. Administering the switch requires accessing the management controller.
“Access the Management Controller From the Network Management Port” on
page 29
“Access the Management Controller From the USB Management Port” on page 30
“Hardware Command Overview” on page 67
“InfiniBand Command Overview” on page 67
Related Information
“Troubleshooting the Switch” on page 47
“Monitoring the Hardware” on page 67
“Monitoring the InfiniBand Fabric” on page 76
“Controlling the Hardware” on page 87
“Controlling the InfiniBand Fabric” on page 93
Hardware Command Overview
The management controller uses a simplified Linux OS and file system. From the # prompt on the management controller, you can type hardware commands to perform some administrative and management tasks. Hardware commands are user-friendly and can perform some testing upon the switch chips, enabling greater control of the switch and its operation.
After you log in to the root account, the shell prompt (#) appears, and you can enter shell commands. Enter the hardware commands in the following format:
# command [arguments][arguments]...
Related Information
Switch Reference, understanding hardware commands
“InfiniBand Command Overview” on page 67
InfiniBand Command Overview
The InfiniBand commands are a means of monitoring and controlling aspects of the InfiniBand fabric. These commands are also installed on and run from the management controller, which is also the host of the Subnet Manager. Use of these commands requires thorough knowledge of InfiniBand architecture and technology.
After you log in to the root account, the shell prompt (#) appears, and you can enter shell commands. Enter the InfiniBand commands in the following format:
# command [option][option] ...
Related Information
Switch Reference, understanding InfiniBand commands
“Hardware Command Overview” on page 67
Monitoring the Hardware
The following topics enable you to display and check the operation and status of the switch.
Note – To use the commands described in these topics, you must be the root user of
the management controller.
“Display Switch General Health” on page 68
“Display Power Supply Status” on page 68
“Check Board-Level Voltages” on page 69
“Display Internal Temperatures” on page 70
“Display Fan Status” on page 71
“Display Switch Environmental and Operational Data” on page 71
“Display Switch Firmware Versions” on page 73
“Display Switch Chip Boot Status” on page 74
“Display Link Status” on page 75
“Display Switch Chip Port Status” on page 76
Related Information
“Monitoring the InfiniBand Fabric” on page 76
“Controlling the Hardware” on page 87
Display Switch General Health
An easy way to perform a quick check of the switch’s operation is with the showunhealthy command.
On the management controller, type:
# showunhealthy
OK - No unhealthy sensors
#
Related Information
Switch Reference, showunhealthy command
“Display Switch Environmental and Operational Data” on page 71
Display Power Supply Status
The checkpower command performs a simple pass-fail test on the power supplies.
1. On the management controller, type:
# checkpower
PSU 0 present status: OK
PSU 1 present status: Alert
#
2. If you see the words Alert or Not Present in the command output, perform the following steps for the affected power supply.
a. Remove the power cord from the affected power supply.
See “Power Off a Power Supply” on page 117.
b. Remove the affected power supply from the switch chassis.
See “Remove a Power Supply” on page 118.
c. Wait one minute.
d. Reinstall the power supply.
See “Install a Power Supply” on page 120.
e. Re-attach the power cord.
See “Power On a Power Supply” on page 122.
f. If the power supply’s Attention LED lights or the checkpower command still
reports Alert or Not Present for the power supply, replace the power supply.
See “Servicing the Power Supplies” on page 113.
Related Information
Switch Reference, checkpower command
“Check Board-Level Voltages” on page 69
“Display Switch Environmental and Operational Data” on page 71
Check Board-Level Voltages
To check whether the voltages used by the various components within the switch are within nominal values, use the checkvoltages command. If a voltage deviates by more than 10% from its nominal value, there is a problem. This check also reports if a battery has failed.
On the management controller, type:
# checkvoltages
Voltage ECB OK
Measured 3.3V Main = 3.28 V
Measured 3.3V Standby = 3.37 V
Measured 12V = 12.00 V
Measured 5V = 5.10 V
Measured VBAT = 3.08 V
Measured 1.8V = 1.77 V
Measured 1.2V Standby = 1.21 V
Measured 1.8V Standby = 1.80 V
Measured 2.5VA = 2.51 V
Measured 2.5VB = 2.51 V
Measured 1.2VA = 1.22 V
Measured 1.2VB = 1.21 V
Measured 1.2VC = 1.21 V
Measured 1.2VD = 1.22 V
Measured 1.2VE = 1.21 V
Measured 1.2VF = 1.21 V
All voltages OK
#
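The 10% rule can also be applied to captured checkvoltages output outside the switch. The following Python sketch is an illustration under stated assumptions: it takes the nominal voltage from the number at the start of each label (for example, 3.3 from "3.3V Main"), and it skips labels such as VBAT that carry no nominal value. The function name out_of_tolerance is hypothetical.

```python
import re

# A label, an equals sign, and the measured value in volts.
_MEASURED = re.compile(r"Measured\s+(.+?)\s*=\s*(\d+(?:\.\d+)?)\s*V")
# Assumption: the nominal voltage is the number that starts the label.
_NOMINAL = re.compile(r"(\d+(?:\.\d+)?)V")


def out_of_tolerance(output, tolerance=0.10):
    """Return (label, nominal, measured) for rails deviating by more than tolerance."""
    suspects = []
    for label, measured in _MEASURED.findall(output):
        nominal_match = _NOMINAL.match(label)
        if nominal_match is None:
            continue  # for example VBAT, whose label does not state a nominal value
        nominal = float(nominal_match.group(1))
        value = float(measured)
        if abs(value - nominal) > tolerance * nominal:
            suspects.append((label, nominal, value))
    return suspects
```

For the sample output shown above, every rail is within 10% of its label value, so the function returns an empty list.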
Related Information
Switch Reference, checkvoltages command
“Display Power Supply Status” on page 68
“Display Switch Environmental and Operational Data” on page 71
Display Internal Temperatures
If you are concerned with the internal temperature of the switch, you can use the showtemps command to display nominal and measured temperatures. If there is a great deviation, there might not be enough cooling.
On the management controller, type:
# showtemps
Back temperature 22.88
Front temperature 31.75
Com-Express temperature 26.12
I4-A temperature 55 maxtemperature 55
I4-B temperature 47 maxtemperature 49
I4-C temperature 52 maxtemperature 52
I4-D temperature 48 maxtemperature 49
I4-E temperature 52 maxtemperature 52
I4-F temperature 52 maxtemperature 54
#
Related Information
Switch Reference, showtemps command
“Display Fan Status” on page 71
“Display Switch Environmental and Operational Data” on page 71
Display Fan Status
The output of the getfanspeed command can help you determine fan speed inconsistencies, which might indicate a future failure.
On the management controller, type:
# getfanspeed
Fan 0 rpm 17940
Fan 1 rpm 17940
Fan 2 rpm 17687
Fan 3 rpm 17940
Fan 4 rpm 17687
#
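One way to look for fan speed inconsistencies is to compare each fan against the median of all fans. The following Python sketch is an illustration only: the 10% threshold is an assumption chosen for the example, not a limit stated in this guide, and the function name fan_outliers is hypothetical.

```python
import re
from statistics import median


def fan_outliers(output, max_deviation=0.10):
    """Return {fan: rpm} for fans whose speed differs from the median speed.

    max_deviation is a fraction of the median. The default of 10% is an
    assumed threshold for illustration, not a documented limit.
    """
    speeds = {
        int(fan): int(rpm)
        for fan, rpm in re.findall(r"Fan\s+(\d+)\s+rpm\s+(\d+)", output)
    }
    if not speeds:
        return {}
    mid = median(speeds.values())
    return {
        fan: rpm for fan, rpm in speeds.items()
        if abs(rpm - mid) > max_deviation * mid
    }
```

With the sample output above, the speeds differ by less than 2% from the median, so no fan is reported.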
Related Information
Switch Reference, getfanspeed command
“Display Switch Environmental and Operational Data” on page 71
Display Switch Environmental and Operational Data
The env_test command enables you to perform several investigative tasks, helping you determine the internal environment and operational status of the switch.
On the management controller, type:
# env_test
NM2 Environment test started:
Starting Voltage test:
Voltage ECB OK
Measured 3.3V Main = 3.28 V
Measured 3.3V Standby = 3.40 V
Measured 12V = 12.06 V
Measured 5V = 5.10 V
Measured VBAT = 3.17 V
Measured 1.8V = 1.78 V
Measured 1.2V Standby = 1.21 V
Measured 1.8V Standby = 1.80 V
Measured 2.5VA = 2.51 V
Measured 2.5VB = 2.51 V
Measured 1.2VA = 1.22 V
Measured 1.2VB = 1.22 V
Measured 1.2VC = 1.21 V
Measured 1.2VD = 1.21 V
Measured 1.2VB = 1.21 V
Measured 1.2VE = 1.21 V
Measured 1.2VF = 1.21 V
Voltage test returned OK
Starting PSU test:
PSU 0 present
PSU 1 present
PSU test returned OK
Starting Temperature test:
Back temperature 23.00
Front temperature 32.62
ComEx temperature 26.12
I4-A temperature 55, maxtemperature 56
I4-B temperature 48, maxtemperature 49
I4-C temperature 53, maxtemperature 53
I4-D temperature 48, maxtemperature 49
I4-E temperature 53, maxtemperature 54
I4-F temperature 53, maxtemperature 54
Temperature test returned OK
Starting FAN test:
Fan 0 running at rpm 12433
Fan 1 running at rpm 12311
Fan 2 running at rpm 12311
Fan 3 running at rpm 12433
Fan 4 running at rpm 12433
FAN test returned OK
Starting Connector test:
Connector test returned OK
Starting I4 test:
I4-A OK
I4-B OK
I4-C OK
I4-D OK
I4-E OK
I4-F OK
All I4s OK
I4 test returned OK
NM2 Environment test PASSED
#
Related Information
Switch Reference, env_test command
Display Switch Firmware Versions
In time, newer or updated switch firmware might become available. The nm2version command displays the versions of the firmware within the switch chassis.
On the management controller, type:
# nm2version
NM2-72p version: 0.1.0-1
Build time: Aug 24 2009 16:41:03
FPGA version: 0x94
ComExpress info:
Board Name: "NOW1"
Manufacturer Name: "JUMP"
Manufacturing Date: 2009.02.19
Last Repair Date: 1980.01.01
Serial Number: "NCD2S0240"
Hardware Revision: 0x0100
Firmware Revision: 0x0102
Jida Revision: 0x0103
Feature Number: 0x0001
#
Note – The output of the nm2version command contains extraneous information.
In this example, only the output that is bold is relevant to the administration of the switch.
Related Information
Switch Reference, nm2version command
Locate a Switch Chip or Connector From the GUID
The output of some InfiniBand commands identifies a node by its GUID. You can determine the switch chip or InfiniBand cable connection associated with that GUID using this procedure.
1. In the output of the InfiniBand command, identify the GUID and port in question.
For example, given the following output from the ibdiagnet command:
-W- lid=0x0055 guid=0x0021283a8638b0c0 dev=48438 Port=21
In this example, the GUID is 0x0021283a8638b0c0 and the port is 21.
2. Truncate all digits of the GUID except the four digits on the right.
For example, GUID 0x0021283a8638b0c0 is truncated to b0c0.
The b0 means that the node is inside of a Sun Datacenter InfiniBand Switch 72. The c means that the node is located in I4 switch chip C. The port is 21.
3. Use the tables in “Understanding Routing Through the Switch” on page 59 to cross-reference the routing to other switch chips and ports and to connections.
For example, using “Switch Chip Port to CXP Connectors and Link LED Routes”
on page 62, I4 switch chip C, port 21 routes to connector 6A, the center LED, and
the link is carried by P2.
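Steps 1 and 2 of this procedure can be sketched in Python for a single line of ibdiagnet output. This is an illustration only: it assumes the line contains guid= and Port= fields in the form shown in Step 1, and the function name locate_chip is hypothetical.

```python
import re

# A 16-digit hexadecimal GUID followed later on the line by a port number.
_NODE = re.compile(r"guid=0x([0-9a-fA-F]{16})\b.*?Port=(\d+)")


def locate_chip(line):
    """Return (switch chip letter, port) for a Switch 72 node in an ibdiagnet line."""
    match = _NODE.search(line)
    if match is None:
        raise ValueError("no guid= and Port= fields found")
    tail = match.group(1)[-4:].lower()   # keep only the four rightmost digits
    if tail[:2] != "b0":
        raise ValueError("GUID does not identify a Switch 72 node")
    if tail[2] not in "abcdef":
        raise ValueError("device digit does not identify an I4 switch chip")
    return tail[2].upper(), int(match.group(2))
```

The returned chip letter and port number are the values to look up in the routing tables in Step 3.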
Related Information
“Switch GUIDs Overview” on page 65
Display Switch Chip Boot Status
If you believe there is a problem with a switch chip, the checkboot command provides a quick pass-fail check of the switch chips. Should a switch chip fail the test, you can use the i4reset command in an attempt to reboot the switch chip.
On the management controller, type:
# checkboot
I4-A OK
I4-B OK
I4-C OK
I4-D OK
I4-E OK
I4-F OK
#
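If you capture the checkboot output to a file or variable, a short script can flag any switch chip that does not report OK. The following Python sketch is illustrative only. The function name failed_chips is hypothetical, and because this guide does not show what a failing chip looks like, the sketch treats any status other than OK as a failure.

```python
import re

SAMPLE_OUTPUT = """\
I4-A OK
I4-B OK
I4-C OK
I4-D OK
I4-E OK
I4-F OK
"""


def failed_chips(output: str) -> list:
    """Return the switch chips whose checkboot status is anything other than OK."""
    pairs = re.findall(r"(I4-[A-F])\s+(\S+)", output)
    # Assumption: the failure text is unknown, so every non-OK status counts.
    return [chip for chip, status in pairs if status != "OK"]


print(failed_chips(SAMPLE_OUTPUT) or "all switch chips report OK")
```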
Related Information
Switch Reference, checkboot command
“Reset the Switch Chip” on page 89
“Display Switch Environmental and Operational Data” on page 71
Display Link Status
In some situations, you might need to know the status of each route through the switch. The listlinkup command displays the status of each link and also shows where InfiniBand cables are connected to the switch.
On the management controller, type:
# listlinkup
Connector 0A Present <-> I4-A Ports 22 up 21 up 20 up
Connector 1A Not present
Connector 2A Not present
Connector 3A Not present
Connector 4A Not present
. .
Connector 10B Not present
Connector 11B Not present
Link I4-A 01 <-> I4-E 09 up
Link I4-A 02 <-> I4-F 08 up
Link I4-A 03 <-> I4-F 07 up
. . .
Link I4-D 18 <-> I4-E 16 up
#
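To list only the connectors that have a cable present, you could filter saved listlinkup output. The Python sketch below assumes the line layout shown in the example, with one Connector or Link entry per line. The function name present_connectors is hypothetical.

```python
import re

SAMPLE_OUTPUT = """\
Connector 0A Present <-> I4-A Ports 22 up 21 up 20 up
Connector 1A Not present
Link I4-A 01 <-> I4-E 09 up
"""

CONNECTOR = re.compile(r"Connector (\S+) Present <-> (I4-[A-F]) Ports (.+)")


def present_connectors(output: str) -> dict:
    """Map each populated connector to its switch chip and per-port link states."""
    found = {}
    for line in output.splitlines():
        match = CONNECTOR.match(line.strip())
        if match:
            name, chip, port_text = match.groups()
            # port_text looks like "22 up 21 up 20 up"
            ports = {int(num): state
                     for num, state in re.findall(r"(\d+) (\w+)", port_text)}
            found[name] = {"chip": chip, "ports": ports}
    return found


print(present_connectors(SAMPLE_OUTPUT))
```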
Related Information
Switch Reference, listlinkup command
“Display the Link Status of a Node” on page 80
“Set Link Speed” on page 91
Display Switch Chip Port Status
If other procedures suggest that a particular port of a switch chip is problematic, you can use the getportstatus command to display the port state, width, and speed.
On the management controller, type:
# getportstatus switch_chip port
where:
switch_chip is the letter of the switch chip (A–F).
port is the number of the port on the switch chip (1–36).
For example:
# getportstatus A 7
Portstate 4
Portphystate 5
LinkWidthActive 2
LinkSpeedActive 4
#
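The getportstatus command reports numeric values. This guide does not define them, but they appear to follow the PortInfo encodings of the InfiniBand Architecture Specification. Under that assumption, the following Python sketch translates the values into the names that the ibportstate command prints. The table contents and the function name decode_port_status are not taken from this guide.

```python
import re

# Assumption: values follow the InfiniBand PortInfo attribute encodings.
FIELD_TABLES = {
    "Portstate": {1: "Down", 2: "Initialize", 3: "Armed", 4: "Active"},
    "Portphystate": {1: "Sleep", 2: "Polling", 3: "Disabled",
                     4: "PortConfigurationTraining", 5: "LinkUp",
                     6: "LinkErrorRecovery", 7: "PhyTest"},
    "LinkWidthActive": {1: "1X", 2: "4X", 4: "8X", 8: "12X"},
    "LinkSpeedActive": {1: "2.5 Gbps", 2: "5.0 Gbps", 4: "10.0 Gbps"},
}

SAMPLE_OUTPUT = """\
Portstate 4
Portphystate 5
LinkWidthActive 2
LinkSpeedActive 4
"""


def decode_port_status(output: str) -> dict:
    """Translate the numeric getportstatus fields into readable names."""
    decoded = {}
    for field, raw in re.findall(r"([A-Za-z]+)\s+(\d+)", output):
        table = FIELD_TABLES.get(field)
        if table is not None:
            decoded[field] = table.get(int(raw), "unknown (%s)" % raw)
    return decoded


print(decode_port_status(SAMPLE_OUTPUT))
```

Under this reading, the example output describes an active 4X link running at 10.0 Gbps, which matches the values shown by ibportstate for a healthy link.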
Related Information
Switch Reference, getportstatus command
“Display the Link Status of a Node” on page 80
Monitoring the InfiniBand Fabric
The following tasks enable you to display and check the operation and status of the InfiniBand fabric and components.
Note – To use the commands described in these topics, you must be the root user of
the management controller.
“Identify All Switches in the Fabric” on page 77
“Identify All HCAs in the Fabric” on page 78
“Display the InfiniBand Fabric Topology” on page 78
“Display a Route Through the Fabric” on page 79
“Display the Link Status of a Node” on page 80
“Display Counters for a Node” on page 81
“Display Data Counters for a Node” on page 82
“Display Low-Level Detailed Information About a Node” on page 83
“Display Low-Level Detailed Information About a Port” on page 84
“Map LIDs to GUIDs” on page 86
“Display Subnet Manager Status” on page 87
Related Information
“Monitoring the Hardware” on page 67
“Controlling the InfiniBand Fabric” on page 93
Identify All Switches in the Fabric
If you need to know identity information about the switches in the InfiniBand fabric, you can use the ibswitches command. This command displays the GUID, name, LID, and LMC for each switch. The output of the command is a mapping of GUID to LID for switches in the fabric.
On the management controller, type:
# ibswitches
Switch : 0x0021283a8620b0f0 ports 36 "Sun DCS 72 QDR FC switch o4nm2-72p-2" enhanced port 0 lid 25 lmc 0
Switch : 0x0021283a8620b0d0 ports 36 "Sun DCS 72 QDR switch 1.2(LC)" base port 0 lid 23 lmc 0
Switch : 0x0021283a8620b0b0 ports 36 "Sun DCS 72 QDR switch 1.2(LC)" base port 0 lid 21 lmc 0
Switch : 0x0021283a8620b0a0 ports 36 "Sun DCS 72 QDR switch 1.2(LC)" base port 0 lid 20 lmc 0
Switch : 0x0021283a8620b0c0 ports 36 "Sun DCS 72 QDR switch 1.2(LC)" base port 0 lid 22 lmc 0
Switch : 0x0021283a8620b0e0 ports 36 "Sun DCS 72 QDR switch 1.1(FC)" base port 0 lid 24 lmc 0
. . .
#
Note – The output in the example is just a portion of the full output.
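Because the ibswitches output is a mapping of GUID to LID, it lends itself to a lookup table. The following Python sketch builds one from saved output. It assumes the field order shown in the example, and the function name guid_to_lid is hypothetical.

```python
import re

SAMPLE_OUTPUT = '''\
Switch : 0x0021283a8620b0f0 ports 36 "Sun DCS 72 QDR FC switch o4nm2-72p-2" enhanced port 0 lid 25 lmc 0
Switch : 0x0021283a8620b0d0 ports 36 "Sun DCS 72 QDR switch 1.2(LC)" base port 0 lid 23 lmc 0
'''

SWITCH = re.compile(
    r'Switch\s*:\s*(0x[0-9a-f]+) ports (\d+) "([^"]*)" \w+ port \d+ lid (\d+) lmc (\d+)'
)


def guid_to_lid(output: str) -> dict:
    """Build a GUID-to-LID table from ibswitches output."""
    return {m.group(1): int(m.group(4)) for m in SWITCH.finditer(output)}


print(guid_to_lid(SAMPLE_OUTPUT))
```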
Related Information
Switch Reference, ibhosts command
Switch Reference, ibswitches command
Identify All HCAs in the Fabric
Similar to the ibswitches command, the ibhosts command displays identity information about the HCAs in the InfiniBand fabric. The output contains the GUID and name for each HCA.
Note – Unlike the ibswitches command, the ibhosts command does not display
the LIDs.
On the management controller, type:
# ibhosts
Ca : 0x5080020000911314 ports 1 "nsn32-50 HCA-1"
Ca : 0x5080020000911310 ports 1 "nsn32-20 HCA-1"
Ca : 0x50800200008e532c ports 1 "ib-71 HCA-1"
Ca : 0x50800200008e5328 ports 1 "ib-70 HCA-1"
Ca : 0x50800200008296a4 ports 2 "ib-90 HCA-1"
. . .
#
Note – The output in the example is just a portion of the full output.
Related Information
Switch Reference, ibhosts command
Switch Reference, ibswitches command
Display the InfiniBand Fabric Topology
To help you understand the routing within your InfiniBand fabric, the ibnetdiscover command displays the node-to-node connectivity. The output of the command depends on the size of your fabric.
On the management controller, type:
# ibnetdiscover
Related Information
Switch Reference, ibnetdiscover command
“Determine Changes to the InfiniBand Fabric Topology” on page 95
“Perform Comprehensive Diagnostics for the Entire Fabric” on page 94
Display a Route Through the Fabric
Often you need to know the route between two nodes. The ibtracert command can provide that information by displaying the GUIDs, ports, and LIDs of the nodes along the route.
On the management controller, type:
# ibtracert slid dlid
where:
slid is the LID of the source node.
dlid is the LID of the destination node.
For example:
# ibtracert 25 24
From switch {0x0021283a8620b0f0} portnum 0 lid 25-25 "Sun DCS 72 QDR FC switch o4nm2-72p-2"
[1] -> switch port {0x0021283a8620b0c0}[9] lid 22-22 "Sun DCS 72 QDR switch 1.2(LC)"
[2] -> switch port {0x0021283a8620b0e0}[8] lid 24-24 "Sun DCS 72 QDR switch 1.1(FC)"
To switch {0x0021283a8620b0e0} portnum 0 lid 24-24 "Sun DCS 72 QDR switch 1.1(FC)"
#
For this example:
The route starts at the switch with GUID 0x0021283a8620b0f0 and is using port 0. The switch is LID 25 and in the description, the switch host’s name is o4nm2-72p-2.
The route leaves that switch through port 1 and enters the switch with GUID 0x0021283a8620b0c0 at port 9. The switch is LID 22.
The route leaves that switch through port 2 and enters the switch with GUID 0x0021283a8620b0e0 at port 8. The switch is LID 24.
Note – The output for your InfiniBand fabric will differ from that in the example.
Related Information
Switch Reference, ibtracert command
“Perform Comprehensive Diagnostics for a Route” on page 95
Display the Link Status of a Node
If you want to know the link status of a node in the InfiniBand fabric, the ibportstate command displays the state, width, and speed of a port on that node.
On the management controller, type:
# ibportstate lid port
where:
lid is the LID of the node.
port is the port of the node.
For example:
# ibportstate 24 8
PortInfo:
# Port info: Lid 24 port 8
LinkState:.......................Active
PhysLinkState:...................LinkUp
LinkWidthSupported:..............1X or 4X
LinkWidthEnabled:................1X or 4X
LinkWidthActive:.................4X
LinkSpeedSupported:..............2.5 Gbps or 5.0 Gbps or 10.0 Gbps
LinkSpeedEnabled:................2.5 Gbps or 5.0 Gbps or 10.0 Gbps
LinkSpeedActive:.................10.0 Gbps
Peer PortInfo:
# Port info: Lid 24 DR path slid 25; dlid 65535; 0,8 port 2
LinkState:.......................Active
PhysLinkState:...................LinkUp
LinkWidthSupported:..............1X or 4X
LinkWidthEnabled:................1X or 4X
LinkWidthActive:.................4X
LinkSpeedSupported:..............2.5 Gbps or 5.0 Gbps or 10.0 Gbps
LinkSpeedEnabled:................2.5 Gbps or 5.0 Gbps or 10.0 Gbps
LinkSpeedActive:.................10.0 Gbps
#
In the output, the Active parameters are the current state of the port.
Related Information
Switch Reference, ibportstate command
“Display Switch Chip Port Status” on page 76
“Display Link Status” on page 75
“Find 1x or SDR or DDR Links in the Fabric” on page 97
“Set Port Speed” on page 101
Display Counters for a Node
To help ascertain the health of a node, the perfquery command displays the performance, error, and data counters for that node.
On the management controller, type:
# perfquery lid port
where:
lid is the LID of the node.
port is the port of the node.
Note – If a port value of 255 is specified for a switch node, the counters are the total
for all switch ports.
For example:
# perfquery 24 8
# Port counters: Lid 24 port 8
PortSelect:......................8
CounterSelect:...................0x1b01
SymbolErrors:....................0
. . .
VL15Dropped:.....................0
XmtData:.........................2925432
RcvData:.........................2966688
XmtPkts:.........................40631
RcvPkts:.........................41204
#
Note – The output in the example is just a portion of the full output.
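The dotted name and value layout of the perfquery output is straightforward to parse if you want to compare counters over time. The following Python sketch is a minimal illustration that works on saved output. The function name parse_counters is hypothetical, and the sample contains only the counters shown in the example.

```python
import re

SAMPLE_OUTPUT = """\
# Port counters: Lid 24 port 8
PortSelect:......................8
CounterSelect:...................0x1b01
SymbolErrors:....................0
VL15Dropped:.....................0
XmtData:.........................2925432
RcvData:.........................2966688
XmtPkts:.........................40631
RcvPkts:.........................41204
"""


def parse_counters(output: str) -> dict:
    """Turn 'Name:.....value' lines into a dictionary, ignoring other lines."""
    counters = {}
    for line in output.splitlines():
        match = re.match(r"(\w+):\.+(.+)$", line.strip())
        if not match:
            continue
        name, value = match.groups()
        try:
            counters[name] = int(value, 0)   # accepts decimal and 0x-prefixed hex
        except ValueError:
            counters[name] = value           # keep non-numeric values as text
    return counters


counters = parse_counters(SAMPLE_OUTPUT)
print(counters["XmtPkts"], counters["RcvPkts"])
```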
Related Information
Switch Reference, perfquery command
“Clear Error Counters” on page 98
Display Data Counters for a Node
If you want to know the data counters for a node, the ibdatacounts command provides that subset of the perfquery command output.
On the management controller, type:
# ibdatacounts lid port
where:
lid is the LID of the node.
port is the port of the node.
Note – If no port is specified for a switch node, the counters are the total for all switch ports. Specifying port 255 has the same effect.
For example:
# ibdatacounts 24
# Port counters: Lid 24 port 255
XmtData:.........................22834224
RcvData:.........................22835016
XmtPkts:.........................317142
RcvPkts:.........................317153
#
Related Information
Switch Reference, ibdatacounts command
“Clear Data Counters” on page 99
Display Low-Level Detailed Information About a Node
If intensive troubleshooting is necessary to resolve a problem, the smpquery command can provide very detailed information about a node.
On the management controller, type:
# smpquery switchinfo lid
where lid is the LID of the node.
For example, to see detailed information about a switch with LID 24, type:
# smpquery switchinfo 24
# Switch info: Lid 24
LinearFdbCap:....................49152
RandomFdbCap:....................0
McastFdbCap:.....................4096
LinearFdbTop:....................25
DefPort:.........................0
DefMcastPrimPort:................255
DefMcastNotPrimPort:.............255
LifeTime:........................18
StateChange:.....................0
LidsPerPort:.....................0
PartEnforceCap:..................32
InboundPartEnf:..................1
OutboundPartEnf:.................1
FilterRawInbound:................1
FilterRawOutbound:...............1
EnhancedPort0:...................0
#
Related Information
Switch Reference, smpquery command
Display Low-Level Detailed Information About a Port
The smpquery command can provide very detailed information about a port.
On the management controller, type:
# smpquery portinfo lid port
where:
lid is the LID of the node.
port is the port of the node.
For example, to see detailed information about port 8 on the switch with LID 24, type:
# smpquery portinfo 24 8
# Port info: Lid 24 port 8
Mkey:............................0x0000000000000000
GidPrefix:.......................0x0000000000000000
Lid:.............................0x0000
SMLid:...........................0x0000
CapMask:.........................0x0
DiagCode:........................0x0000
MkeyLeasePeriod:.................0
LocalPort:.......................8
LinkWidthEnabled:................1X or 4X
LinkWidthSupported:..............1X or 4X
LinkWidthActive:.................4X
LinkSpeedSupported:..............2.5 Gbps or 5.0 Gbps or 10.0 Gbps
LinkState:.......................Active
PhysLinkState:...................LinkUp
LinkDownDefState:................Polling
ProtectBits:.....................0
LMC:.............................0
. . .
SubnetTimeout:...................0
RespTimeVal:.....................0
LocalPhysErr:....................8
OverrunErr:......................8
MaxCreditHint:...................85
RoundTrip:.......................16777215
#
Note – The output in the example is just a portion of the full output.
Related Information
Switch Reference, smpquery command
Map LIDs to GUIDs
In an InfiniBand fabric, the Subnet Manager and Subnet Administrator assign subnet-specific LIDs to nodes. Often in the use of the InfiniBand commands, you must provide a LID to issue a command to a particular InfiniBand device. Alternatively, the output of a command might identify InfiniBand devices by their LID.
You can create a file that is a mapping of node LIDs to node GUIDs, which can help with administering your InfiniBand fabric. The following procedure creates a file that lists the LID in hexadecimal, the GUID in hexadecimal, and the node description.
Note – Creation of the mapping file is not a requirement for InfiniBand
administration.
1. Create an inventory file:
# osmtest -f c -i inventory.txt
Note – The inventory.txt file can also be used for purposes other than this procedure.
2. Create a mapping file:
# cat inventory.txt | grep -e '^lid' -e 'port_guid' -e 'desc' | sed 's/^lid/\nlid/' > mapping.txt
3. Edit the latter half of the mapping.txt file to remove the nonessential information.
The content of the mapping.txt file looks similar to the following:
lid 0x14 port_guid 0x0021283a8620b0a0 # node_desc Sun DCS 72 QDR switch 1.2(LC)
lid 0x15 port_guid 0x0021283a8620b0b0 # node_desc Sun DCS 72 QDR switch 1.2(LC)
lid 0x16 port_guid 0x0021283a8620b0c0 # node_desc Sun DCS 72 QDR switch 1.2(LC)
. . .
Note – The output in the example is just a portion of the entire file.
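The mapping file lists LIDs in hexadecimal, while the command examples in this guide take and display LIDs in decimal (for example, ibtracert 25 24). The following Python sketch loads mapping entries and converts each LID to decimal. It is only an illustration: the function name load_mapping is hypothetical, and the sample text reuses the three entries shown above.

```python
import re

SAMPLE_MAPPING = """\
lid 0x14 port_guid 0x0021283a8620b0a0 # node_desc Sun DCS 72 QDR switch 1.2(LC)
lid 0x15 port_guid 0x0021283a8620b0b0 # node_desc Sun DCS 72 QDR switch 1.2(LC)
lid 0x16 port_guid 0x0021283a8620b0c0 # node_desc Sun DCS 72 QDR switch 1.2(LC)
"""

# \s+ also spans line breaks, so entries split over several lines still match.
ENTRY = re.compile(
    r"\blid\s+(0x[0-9a-fA-F]+)\s+port_guid\s+(0x[0-9a-fA-F]+)\s+#\s*node_desc\s+(.*)"
)


def load_mapping(text: str) -> dict:
    """Return {decimal LID: (GUID, node description)} from mapping.txt content."""
    table = {}
    for lid_hex, guid, desc in ENTRY.findall(text):
        table[int(lid_hex, 16)] = (guid, desc.strip())
    return table


for lid, (guid, desc) in sorted(load_mapping(SAMPLE_MAPPING).items()):
    print(lid, guid, desc)
```

For example, the entry for LID 0x14 corresponds to decimal LID 20, which is the LID that the ibswitches example reports for GUID 0x0021283a8620b0a0.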
Related Information
Switch Reference, osmtest command
Display Subnet Manager Status
To quickly determine your Subnet Manager’s priority and state, use the sminfo command. The command also displays the LID and GUID of the hosting HCA.
On the management controller, type:
# sminfo
For example:
# sminfo
sminfo: sm lid 25 sm guid 0x21283a8620b0f0, activity count 44911 priority 13 state 3 SMINFO_MASTER
#
In the output, the Subnet Manager’s hosting HCA has LID 25 and GUID 0x21283a8620b0f0. The Subnet Manager has a priority of 13 (high) and its state is 3 (master).
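If you monitor the Subnet Manager from a script, the single sminfo line can be split into named fields. The following Python sketch assumes the exact field order shown in the example. The function name parse_sminfo is hypothetical.

```python
import re

SAMPLE_OUTPUT = (
    "sminfo: sm lid 25 sm guid 0x21283a8620b0f0, "
    "activity count 44911 priority 13 state 3 SMINFO_MASTER"
)

SMINFO = re.compile(
    r"sm lid (\d+) sm guid (0x[0-9a-f]+), activity count (\d+) "
    r"priority (\d+) state (\d+) (\S+)"
)


def parse_sminfo(output: str) -> dict:
    """Extract the LID, GUID, activity count, priority, and state from sminfo output."""
    match = SMINFO.search(output)
    if match is None:
        raise ValueError("unrecognized sminfo output")
    lid, guid, activity, priority, state, state_name = match.groups()
    return {
        "lid": int(lid),
        "guid": guid,
        "activity_count": int(activity),
        "priority": int(priority),
        "state": int(state),
        "state_name": state_name,
    }


print(parse_sminfo(SAMPLE_OUTPUT))
```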
Related Information
Switch Reference, sminfo command
“Controlling the Subnet Manager” on page 104
Controlling the Hardware
The following topics describe how you can manage the switch and its components.
Note – To use the commands described in these topics, you must be the root user of
the management controller.
“Restart the Management Controller” on page 88
“Restart the Entire Switch” on page 89
“Reset the Switch Chip” on page 89
“Recover Ports After Switch Chip Reset” on page 90
“Set Link Speed” on page 91
“Disable a Switch Chip Port” on page 92
“Enable a Switch Chip Port” on page 92
“Change the Administrator Password” on page 93
Related Information
“Monitoring the Hardware” on page 67
“Controlling the InfiniBand Fabric” on page 93
Restart the Management Controller
Should the management controller enter an indeterminate state, you can reboot it. The Subnet Manager and other services are also restarted; however, the InfiniBand fabric’s integrity is not affected.
On the management controller, type:
# reboot
Broadcast message from root (pts/0) (Mon Nov 23 12:19:17 2009):
The system is going down for reboot NOW!
# Connection to 123.45.67.89 closed by remote host.
Connection to 123.45.67.89 closed.
#
The management controller power cycles and reinitializes.
Note – The reboot command severs any management console link to the
management controller. You must re-access the management controller to regain administrative control.