The software described in this document is "commercial computer software" provided with restricted rights (except as to included open/free source) as specified
in the FAR 52.227-19 and/or the DFAR 227.7202, or successive sections. Use beyond license provisions is a violation of worldwide intellectual property laws,
treaties and conventions. This document is provided with limited rights as defined in 52.227-14.
The electronic (software) version of this document was developed at private expense; if acquired under an agreement with the USA government or any
contractor thereto, it is acquired as “commercial computer software” subject to the provisions of its applicable license agreement, as specified in (a) 48 CFR
12.212 of the FAR; or, if acquired for Department of Defense units, (b) 48 CFR 227.7202 of the DoD FAR Supplement; or sections succeeding thereto.
Contractor/manufacturer is SGI, 46600 Landing Parkway, Fremont, CA 94538.
TRADEMARKS AND ATTRIBUTIONS
Silicon Graphics, SGI, the SGI logo, NUMAlink and NUMAflex are trademarks or registered trademarks of Silicon Graphics International Corp. or its
subsidiaries in the United States and/or other countries worldwide.
Intel, Itanium and Xeon are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States and other countries.
UNIX is a registered trademark in the United States and other countries, licensed exclusively through X/Open Company, Ltd.
InfiniBand is a trademark of the InfiniBand Trade Association.
LSI, MegaRAID, and MegaRAID Storage Manager are trademarks or registered trademarks of LSI Corporation.
Linux is a registered trademark of Linus Torvalds in the U.S. and other countries.
Red Hat and all Red Hat-based trademarks are trademarks or registered trademarks of Red Hat, Inc. in the United States and other countries.
SUSE LINUX is a registered trademark of Novell Inc.
Windows is a registered trademark of Microsoft Corporation in the United States and other countries.
All other trademarks mentioned herein are the property of their respective owners.
Record of Revision
Version     Description
001         June, 2012
            First Release
002         May, 2013
            MIC/GPU Blade and RAID updates
Contents
List of Figures . . . . . . . . . . . . . . . . . . . . . . . . xi
List of Tables . . . . . . . . . . . . . . . . . . . . . . . . xiii
About This Guide
This guide provides an overview of the architecture, general operation and descriptions of the
major components that compose the SGI UV 2000 family of servers. It also provides the standard
procedures for powering on and powering off the system, basic troubleshooting and maintenance
information, and important safety and regulatory specifications.
Audience
This guide is written for owners, system administrators, and users of SGI UV 2000 computer
systems. It is written with the assumption that the reader has a good working knowledge of
computers and computer systems.
Important Information
Warning: To avoid problems that could void your warranty, your SGI or other approved
system support engineer (SSE) should perform all the set up, addition, or replacement of
parts, cabling, and service of your SGI UV 2000 system, with the exception of the following
items that you can perform yourself:
•Using your system console controller to enter commands and perform system functions such
as powering on and powering off, as described in this guide.
•Adding and replacing PCIe cards, as described in this guide.
•Adding and replacing disk drives in dual-disk enabled riser blades.
•Removing and replacing the IRU power supplies.
•Using the On/Off switch and other switches on the rack PDUs.
•Using the ESI/ops panel (operating panel) on optional mass storage bricks.
Chapter Descriptions
The following topics are covered in this guide:
•Chapter 1, “Operation Procedures,” provides instructions for powering on and powering off
your system.
•Chapter 2, “System Control,” describes the function of the overall system control network
interface and provides basic instructions for operating the controllers.
•Chapter 3, “System Overview,” provides technical overview information needed to
understand the basic functional architecture of the SGI UV 2000 systems.
•Chapter 4, “Rack Information,” describes the rack sizes and general features.
•Chapter 5, “Optional Octal Router Chassis Information,” describes the optional router
technology available in SGI UV 2000 systems consisting of two or more racks. This
router technology is available in an enclosure “package” known as the Octal Router Chassis.
•Chapter 6, “Add or Replace Procedures,” provides instructions for installing or removing the
customer-replaceable components of your system.
•Chapter 7, “Troubleshooting and Diagnostics,” provides recommended actions if problems
occur on your system.
•Appendix A, “Technical Specifications and Pinouts,” provides physical, environmental, and
power specifications for your system. Also included are the pinouts for the non-proprietary
connectors.
•Appendix B, “Safety Information and Regulatory Specifications,” lists regulatory
information related to use of the UV 2000 system in the United States and other countries. It
also provides a list of safety instructions to follow when installing, operating, or servicing
the product.
Related Publications
The following SGI documents are relevant to the UV 2000 series system:
•SGI UV CMC Software User Guide
(P/N 007-5636-00x)
This guide describes how to use the system console controller commands to monitor and
manage your SGI UV 2000 system via line commands. Coverage of control includes
descriptions of the interface and usage of the commands. These commands are primarily
used when a system management node is not present in the system. Note that it does not
cover controller command information for the SGI UV 10 or UV 20.
•SGI UV System Management Node Administrator's Guide
(P/N 007-5694-00x)
This guide covers the system management node (SMN) for SGI UV 2000 series systems. It
describes the software and hardware components used with the SMN as well as providing an
overview of the UV system control network. System network addressing is covered and a
chapter on how to use KVM to enable remote console access from the system management
node is included.
•SGI Management Center Quick Start Guide
(P/N 007-5672-00x)
This document may be helpful to users or administrators of SGI UV systems using the SGI
Management Center interface. The guide provides introductory information on
configuration, operation and monitoring of your UV system using the management center
software.
•SGI Management Center System Administrator’s Guide
(P/N 007-5642-00x)
This guide is intended for system administrators who work with the SGI Management
Center software GUI to manage and control SGI UV 2000 systems. Depending on your
system configuration and implementation, this guide may be optional. The manual is written
with the assumption the user has a good working knowledge of Linux.
•SGI UV Software Install Guide
(P/N 007-5675-00x)
For UV systems that come with a pre-installed Linux operating system, this document
describes how to re-install it when necessary.
•SGI UV Systems Linux Configuration and Operations Guide
(P/N 007-5629-00x)
This guide is a reference document for people who manage the operation of SGI UV 2000
systems. It explains how to perform general system configuration and operation under Linux
for SGI UV. For a list of manuals supporting SGI Linux releases and SGI online resources,
see the SGI Performance Suite documentation.
•SGI UV Systems Installation Guide
(P/N 007-5675-00x)
This guide covers software installation on UV 2000 systems and their SMNs.
•Linux Application Tuning Guide for SGI X86-64 Based Systems
(P/N 007-5646-00x)
This guide includes a chapter that covers advanced tuning strategies for applications running
on SGI UV systems as well as other SGI X86 based systems.
•MegaRAID SAS Software User’s Guide, publication number (860-0488-00x)
This document describes the LSI Corporation’s MegaRAID Storage Manager software.
•LSI Integrated SAS for RAID User’s Guide, publication number (860-0476-00x)
This user guide explains how to configure and use the software components of the LSI
Integrated RAID software product used with LSI SAS controllers.
•Man pages (online)
Man pages locate and print the titled entries from the online reference manuals.
You can obtain SGI documentation, release notes, or man pages in the following ways:
•See the SGI Technical Publications Library at http://docs.sgi.com
Various formats are available. This library contains the most recent and most comprehensive
set of online books, release notes, man pages, and other information.
•The release notes, which contain the latest information about software and documentation in
this release, are in a file named README.SGI in the root directory of the SGI ProPack for
Linux Documentation CD.
•You can also view man pages by typing man <title> on a command line.
SGI systems shipped with Linux include a set of Linux man pages, formatted in the standard
UNIX “man page” style. Important system configuration files and commands are documented on
man pages. These are found online on the internal system disk (or DVD) and are displayed using
the man command. References in the documentation to these pages include the name of the
command and the section number in which the command is found. For example, to display a man
page, type the request on a command line:
man commandx
For additional information about displaying man pages using the man command, see man(1). In
addition, the apropos command locates man pages based on keywords. For example, to display
a list of man pages that describe disks, type the following on a command line:
apropos disk
For information about setting up and using apropos, see apropos(1).
Conventions
The following conventions are used throughout this document:
Convention      Meaning
Command         This fixed-space font denotes literal items such as commands, files,
                routines, path names, signals, messages, and programming language
                structures.
variable        The italic typeface denotes variable entries and words or concepts being
                defined. Italic typeface is also used for book titles.
user input      This bold fixed-space font denotes literal items that the user enters in
                interactive sessions. Output is shown in nonbold, fixed-space font.
[ ]             Brackets enclose optional portions of a command or directive line.
...             Ellipses indicate that a preceding element can be repeated.
man page(x)     Man page section identifiers appear in parentheses after man page names.
GUI element     This font denotes the names of graphical user interface (GUI) elements such
                as windows, screens, dialog boxes, menus, toolbars, icons, buttons, boxes,
                fields, and lists.
Product Support
SGI provides a comprehensive product support and maintenance program for its products, as
follows:
•If you are in North America, contact the Technical Assistance Center at
+1 800 800 4SGI or contact your authorized service provider.
•If you are outside North America, contact the SGI subsidiary or authorized distributor in
your country. International customers can visit http://www.sgi.com/support/
Click on the “Support Centers” link under the “Online Support” heading for information on
how to contact your nearest SGI customer support center.
Reader Comments
If you have comments about the technical accuracy, content, or organization of this document,
contact SGI. Be sure to include the title and document number of the manual with your comments.
(Online, the document number is located in the front matter of the manual. In printed manuals, the
document number is located at the bottom of each page.)
You can contact SGI in any of the following ways:
•Send e-mail to the following address: techpubs@sgi.com
•Contact your customer service representative and ask that an incident be filed in the SGI
incident tracking system.
•Send mail to the following address:
Technical Publications
SGI
46600 Landing Parkway
Fremont, California 94538
SGI values your comments and will respond to them promptly.
Chapter 1
1.Operation Procedures
This chapter explains the basics of how to operate your new system in the following sections:
•“Precautions” on page 1
•“Power Connections Overview” on page 2
•“System Control Overview” on page 12
•“Using Embedded Support Partner (ESP)” on page 22
•“Optional Components” on page 23
Precautions
Before operating your system, familiarize yourself with the safety information in the following
sections:
•“ESD Precaution” on page 1
•“Safety Precautions” on page 2
ESD Precaution
Caution: Observe all ESD precautions. Failure to do so can result in damage to the equipment.
Wear a grounding wrist strap when you handle any ESD-sensitive device to eliminate possible
ESD damage to equipment. Connect the wrist strap cord directly to earth ground.
Safety Precautions
Warning: Before operating or servicing any part of this product, read the “Safety
Information” on page 89.
Danger: Keep fingers and conductive tools away from high-voltage areas. Failure to
follow these precautions will result in serious injury or death. The high-voltage areas of the
system are indicated with high-voltage warning labels.
Caution: Power off the system only after the system software has been shut down in an orderly
manner. If you power off the system before you halt the operating system, data may be corrupted.
Warning: If a lithium battery is installed in your system as a soldered part, only qualified
SGI service personnel should replace this lithium battery. For a battery of another type,
replace it only with the same type or an equivalent type recommended by the battery
manufacturer, or an explosion could occur. Discard used batteries according to the
manufacturer’s instructions.
Power Connections Overview
Prior to operation, your SGI UV 2000 system should be set up and connected by a professional
installer. If you are powering on the system for the first time or want to confirm proper power
connections, follow these steps:
1.Check to ensure that the power connectors on the cables between the rack’s power distribution
units (PDUs) and the wall power-plug receptacles are securely plugged in.
2.For each individual IRU that you want to power on, make sure that the power cables are
plugged into all the IRU power supplies correctly, see the example in Figure 1-1 on page 3.
Setting the circuit breakers on the PDUs to the “On” position will apply power to the IRUs
and will start the CMCs in the IRUs. Note that the CMC in each IRU stays powered on as
long as there is power coming into the unit. Turn off the PDU breaker switch on each of the
PDUs that supply voltage to the IRU’s power supplies if you want to remove all power from
the unit.
Important: In a system configuration using 2-outlet single-phase PDUs, each power supply
in an IRU should be connected to a different PDU within the rack. This will ensure the
maximum amperage output of a single PDU is not exceeded if a power supply fails.
Figure 1-1 IRU Power Supply Cable Location Example
3.If you plan to power on a server that includes optional mass storage enclosures, make sure
that the power switch on the rear of each PSU/cooling module (one or two per storage
enclosure) is in the 1 (on) position.
4.Make sure that all PDU circuit breaker switches (see the examples in the following three
figures) are turned on to provide power to the server when the system is powered on.
Figure 1-2 shows an example of a single-phase 2-plug PDU that can be used with the SGI UV
2000 system. This PDU can be used to distribute power to the IRUs when the system is configured
with single-phase power.
Figure 1-2 Single-Phase 2-Outlet PDU Example
Figure 1-3 on page 5 shows an example of an eight-plug single-phase PDU that can be used in the
SGI UV 2000 rack system. This unit is used to support auxiliary equipment in the rack.
Figure 1-3 Single-Phase 8-Outlet PDU
Figure 1-4 shows examples of the three-phase PDUs that can be used in the SGI UV 2000 system.
These PDUs are used to distribute power to the IRUs when the system is configured with
three-phase power. The enlarged section shows an optional PDU status interface panel.
Figure 1-4 Three-Phase PDU Examples
System Connections Overview
You can monitor and interact with your SGI UV 2000 server from the following sources:
•Using the SGI 1U rackmount console option you can connect directly to the system
management node (SMN) for basic monitoring and administration of the system. See “1U
Console Option” in Chapter 2 for more information; SLES 11 or later is required.
•A PC or workstation on the local area network can connect to the SMN’s external Ethernet
port and set up remote console sessions or display GUI objects from the SGI Management
Center interface.
•A serial console display can be plugged into the CMC at the rear of IRU 001. You can also
monitor IRU information and system operational status from other IRUs that are connected
to IRU 001.
These console connections enable you to view the status and error messages generated by
the chassis management controllers in your SGI UV 2000 rack. For example, you can
monitor error messages that warn of power or temperature values that are out of tolerance.
See the section “1U Console Option” in Chapter 2, for additional information.
The following subsections describe the options for establishing and using communication
connections to work with your SGI UV 2000 system.
Connecting to the UV System Control Network
The ethernet connection is the preferred method of accessing the system console.
Administrators can perform one of the following options for connectivity:
•If the SMN is plugged into the customer LAN, connect to the SMN (SSH w/ X11
forwarding) and start the SGI Management Center remotely (see the example following this list).
•An in-rack system console can be directly connected to the system management node via
VGA and PS2, see Figure 1-5 on page 8. You can then log into the SMN and perform
system administration either through CLI commands or via the SGI Management Center
interface.
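As a sketch of that remote option (the hostname uv-system-smn is a placeholder used elsewhere
in this guide; substitute the name or IP address of your SMN):
# log in to the SMN with X11 forwarding enabled
ssh -X root@uv-system-smn
# then start the SGI Management Center client interface
mgrclient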
Note that the CMC is factory set to DHCP mode and thus has no fixed IP address and cannot be
accessed until an IP address is established. See the subsection “Using DHCP to Establish an IP
Address” on page 10 for more information on this topic.
A micro-USB serial connection can be used to communicate directly with the CMC. This
connection is typically used for service purposes or for system controller and system console
access in small systems where an in-rack system console is not used or available.
System Controller Access
Access to the SGI UV 2000 system controller network is accomplished by the following
connection methods:
•A LAN connection to the system management node (running the SGI Management Center
software application). This can also be done using an optional VGA-connected console, see
Figure 1-5.
•A micro-USB serial cable connection to the “Console” port (see Figure 1-6) on the CMC
(see note below). See also “Serial Console Hardware Requirements” on page 9.
Note: Each IRU has two chassis management controller (CMC) slots located in the rear of the
IRU directly below the cooling fans. Only one CMC is supported in each IRU. The CMC slot on
the right is the slot that is populated.
Figure 1-5 System Management Node Rear Video Connections
Figure 1-6 UV CMC Connection Faceplate Example
Serial Console Hardware Requirements
The console type and how these console types are connected to the SGI UV 2000 servers is
determined by what console option is chosen. If you have an SGI UV 2000 server and wish to use
a serially-connected “dumb terminal”, you can connect the terminal via a micro-USB serial cable
to the console port connector on the CMC. The terminal should be set to the following functional
modes:
•Baud rate of 115,200
•8 data bits
•One stop bit, no parity
•No hardware flow control (RTS/CTS)
Note that a serial console is generally connected to the first (bottom) IRU in any single rack
configuration.
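For example, from a Linux workstation these settings can typically be applied with a serial
terminal program such as screen; the device name below is an assumption and depends on how
the micro-USB serial adapter enumerates on your workstation:
# open a session on the CMC console port at 115,200 baud
# (verify 8 data bits, one stop bit, no parity, and no flow control, per the list above)
screen /dev/ttyUSB0 115200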
Establishing a Serial Connection to the CMC on SGI UV 2000
If you have an SGI UV 2000 system and wish to use a serially-connected "dumb terminal", you
can connect the terminal via a micro-USB serial cable to the console port connector on the CMC
board of the IRU.
1.The terminal should be set to the operational modes described in the previous subsection.
Note that a serial console is generally connected to the CMC on the first (bottom) IRU in any
single rack configuration.
2.On the system management node (SMN) port, the CMC is configured to request an IP
address via dynamic host configuration protocol (DHCP).
3.If your system does not have an SMN, the CMC address cannot be directly obtained by
DHCP and will have to be assigned, see the following subsections for more information.
Establishing CMC IP Hardware Connections
For IP address configuration, there are two options: DHCP or static IP. The following subsections
provide information on the setup and use of both.
Note: Both options require the use of the CMC's serial port, refer to Figure 1-6 on page 8.
For DHCP, you must determine the IP address that the CMC has been assigned; for a static IP,
you must also configure the CMC to use the desired static IP address.
To use the serial port connection, you must attach and properly configure a micro-USB cable to
the CMC's "CONSOLE" port. Configure the serial port as described in “Serial Console Hardware
Requirements” on page 9.
When the serial port session is established, the console will show a CMC login, and the user can
login to the CMC as user "root" with password "root".
Using DHCP to Establish an IP Address
To determine the IP address assigned to the CMC, you must first establish a connection to the
CMC serial port (as indicated in the section “Serial Console Hardware Requirements” on page 9),
and run the command "ifconfig eth0". This will report the IP address that the CMC is
configured to use.
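A short sketch of the exchange follows; the address shown is only an illustration (CMC addresses
typically take the 172.17.<rack>.<slot> form described later in this chapter):
# after logging in to the CMC as root over the serial console
ifconfig eth0
eth0      Link encap:Ethernet  ...
          inet addr:172.17.1.1  ...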
Running the CMC with DHCP is not recommended as the preferred option for SGI UV 2000
systems. The nature of DHCP makes it difficult to determine the IP address of the CMC, and it
is possible for that IP address to change over time, depending on the DHCP configuration usage.
The exception would be a configuration where the system administrator is using DHCP to assign
a "permanent" IP address to the CMC.
To switch from a static IP back to DHCP, the configuration file
/etc/sysconfig/ifcfg-eth0 on the CMC must be modified (see additional instructions
in the “Using a Static IP Address” section). The file must contain the following line to enable use
of DHCP:
BOOTPROTO=dhcp
Using a Static IP Address
To configure the CMC to use a static IP address, the user/administrator must edit the configuration
file /etc/sysconfig/ifcfg-eth0 on the CMC. The user can use the "vi" command
(i.e. "vi /etc/sysconfig/ifcfg-eth0") to modify the file.
The configuration file should be modified to contain these lines:
BOOTPROTO=static
IPADDR=<IP address to use>
NETMASK=<netmask>
GATEWAY=<network gateway IP address>
HOSTNAME=<hostname to use>
Note that the "GATEWAY" and "HOSTNAME" lines are optional.
After modifying the file, save and write it using the vi command ":w!", and then exit vi using ":q".
Then reboot the CMC (using the "reboot" command); after it reboots, it will be configured with
the specified IP address.
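As an illustration only, a completed configuration file using made-up addresses (substitute values
appropriate to your own network) might look like the following:
BOOTPROTO=static
IPADDR=192.168.100.50
NETMASK=255.255.255.0
GATEWAY=192.168.100.1
HOSTNAME=uv-cmc-r001i01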
System Control Overview
All SGI UV 2000 system individual rack units (IRUs) use an embedded chassis management
controller (CMC). The CMC communicates with both the blade-level board management
controllers (BMCs) and the system management node (SMN), which runs the SGI Management
Center software. In concert with the SGI Management Center software, they are generically
known as the system control network.
The SGI UV 2000 system control network provides control and monitoring functionality for each
compute blade, power supply, and fan assembly in each individual rack unit (IRU) enclosure in
the system. The IRU is a 10U-high enclosure that supplies power, cooling, network fabric
switching and system control for up to eight compute blades. A single chassis management
controller blade is installed at the rear of each IRU.
The SGI Management Center System Administrator’s Guide (P/N 007-5642-00x) provides more
detailed information on using the GUI to administer your SGI UV 2000 system.
The SGI Management Center is an application that provides control over multiple IRUs, and
communication to other UV systems. Remote administration requires that the SMN be connected
by an Ethernet connection to a private or public Local Area Network (LAN).
The CMC network in concert with the SMN provides the following functionality:
•Powering the entire system on and off.
•Powering individual IRUs on and off.
•Powering individual blades in an IRU on and off.
•Monitoring the environmental state of the system.
•Partitioning the system.
•Entering controller commands to monitor or change particular system functions within a
particular IRU. See the SGI UV CMC Software User Guide (P/N 007-5636-00x) for a
complete list of command line interface (CLI) commands.
•Providing access to the system OS console, allowing you to run diagnostics and boot the
system.
Communicating with the System
The two primary ways to communicate with and administer the SGI UV 2000 system are through
the SGI Management Center interface or the UV command line interface (CLI).
The SGI Management Center Graphical User Interface
The SGI Management Center interface is a server monitoring and management system. The SGI
Management Center GUI provides status metrics on operational aspects for each node in a system.
The interface can also be customized to meet the specific needs of individual systems.
The SGI Management Center System Administrator’s Guide (P/N 007-5642-00x) provides
information on using the interface to monitor and maintain your SGI UV 2000 system. Also, see
Chapter 2 in this guide for additional reference information on the SGI Management Center
interface.
Powering-On and Off From the SGI Management Center Interface
Commands issued from the SGI Management Center interface are typically sent to all enclosures
and blades in the system (up to a maximum 128 blades per SSI) depending on set parameters. SGI
Management Center services are started and stopped from scripts that exist in /etc/init.d.
SGI Management Center is commonly installed in /opt/sgi/sgimc and is controlled by one of
these services; this allows you to manage SGI Management Center services using standard Linux
tools such as chkconfig and service.
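As an illustrative sketch only (the init script name mgr used here is an assumption; check
/etc/init.d on your SMN for the name actually installed with your release):
# list the available init scripts (the SGI Management Center script will be among them)
ls /etc/init.d
# query the service and enable it at boot with the standard tools
service mgr status
chkconfig mgr on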
If your SGI Management Center interface is not already running, or you are bringing it up for the
first time, use the following steps:
1.Power on the server running the SGI Management Center interface.
2.Open an ssh or other terminal session command line console to the SMN using a remote
workstation or local VGA terminal.
3.Use the information in the section “Power Connections Overview” on page 2 to ensure that
all system components are supplied with power and ready for bring up.
4.Log in to the SMN as root (the default password is sgisgi).
5.On the command line, enter mgrclient and press Enter.
The SGI Management Center Login dialog box is displayed.
6.Enter a user name (root by default) and password (root by default) and click OK.
The SGI Management Center interface is displayed.
7.The power-on (green) and power-off (red) buttons are located in the middle of the SGI
Management Center GUI’s Tool Bar, whose icons provide quick access to common tasks
and features.
See the SGI Management Center System Administrator’s Guide for more information.
The Command Line Interface
The UV command line interface (CLI) is accessible by logging into either a system management
node (SMN) or chassis management controller (CMC).
Note: The command line interface is virtually the same when used from either the SMN or the
CMC. Using the command line interface from the SMN may require that the command be targeted
to a specific UV 2000 system if the SMN is managing more than one SSI.
Log in as root (the default password is root) when logging into the CMC.
Log in as sysco when logging into the SMN.
Once a connection to the SMN or CMC is established, various system control commands can be
entered. See “Powering On and Off from the Command Line Interface” on page 14 for specific
examples of using the CLI commands.
Powering On and Off from the Command Line Interface
The SGI UV 2000 command line interface is accessible by logging into either the system
management node (SMN) as root or the CMC as root.
Instructions issued at the command line interface of a local console prompt typically only affect
the local partition or a part of the system. Depending on the directory level you are logged in at,
you may power up an entire partition (SSI), a single rack, or a single IRU enclosure. In CLI
command console mode, you can obtain only limited information about the overall system
configuration. An SMN has information about the IRUs in its SSI. Each IRU has information
about its internal blades, and also (if other IRUs are attached via NUMAlink to the IRU)
information about those IRUs.
Command Options for Power On
The following example command options can be used with either the SMN or CMC CLI:
usage: power [-vcow] on|up [system-SSN]...turns power on
-v, --verbose verbose output
-c, --clear clear EFI variables (system and partition targets only)
-o, --override override partition check
-w, --watch watch boot progress
To monitor the power-on sequence during boot (see the section “Monitoring Power On” on
page 19), the -uvpower option must be included with the command to power on.
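For example, using the options from the usage listing above, a verbose power-on that also
watches boot progress could be entered as:
power -v -w on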
Power On the System From the SMN Command Line
1.Login to the SMN as root, via a terminal window similar to the following:
The default password for logging in to the SMN as root is sgisgi.
# ssh -X root@uv-system-smn
root@system-smn>
Once a connection to the SMN is established, the SMN prompt is presented and various
system control commands can be entered.
2.To see a list of available commands enter the following:
root@uv-system-smn>ls /sysco/bin/help
3.Change the working directory to sysco, similar to the following:
root@uv-system-smn>cd /sysco
In the following example the system is powered on without monitoring the progress or status
of the power-on process. When a power command is issued, it checks to see if the individual
rack units (IRUs) are powered on; if not on, the power command powers up the IRUs and
then the blades in the IRU are powered on.
4.Enter the power on command, similar to the following:
sysco@uv-system-smn>power on
The system will take time to fully power up (depending on size and options).
Specific CLI Commands Used With the SMN
The following list of available CLI commands is specific to the SMN:
auth authenticate SSN/APPWT change
bios perform bios actions
bmc access BMC shell
cmc access CMC shell
config show system configuration
console access system consoles
help list available commands
hel access hardware error logs
hwcfg access hardware configuration variable
leds display system LED values
log display system controller logs
power access power control/status
Type '<cmd> --help' for help on individual commands.
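For example, to review the options accepted by the power command or to display the current
system configuration:
power --help
config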
Optional Power On From the CMC Command Line
Most SGI UV 2000 systems come with a system management node (SMN) and there should be
few reasons for powering on the system from a CMC.
Note: Some basic systems are sold without an SMN and require slightly different administrative
procedures. These “SMN-less” systems are restricted in regards to number of blades and router
options available.
Use the following information if you have a need to power on from a CMC rather than the SMN
CLI or the SGI Management Center GUI. If the SMN is not available you can still boot the system
directly by using the CMC, see “Booting Directly From a CMC”.
Note: The command line interface for the CMC is virtually the same as that for the SMN, with
the exception that the CMC does not have the ability to target a system when multiple systems are
supported from one SMN.
Booting Directly From a CMC
If a system management node (SMN) is not available, it is possible to power on and administer
your system directly from the CMC. When available, the optional SMN should always be the
primary interface to the system.
The console type and how these console types are connected to the SGI UV 2000 systems is
determined by what console option is chosen. To monitor or administer a system through the CMC
network, you will need to establish a mini-USB serial connection to the CMC. See the information
in the following two subsections.
Power On the System Using the CMC Network
You can use a direct mini-USB serial connection to the CMC to power on your UV system;
note that this process is not the standard way to administer a system. Use the following steps:
1.Establish a connection (as detailed in the previous subsections). CMCs have their rack and
“U” position set at the factory. The CMC will have an IP address, similar to the following:
172.17.<rack>.<slot>
2.You can use the IP address of the CMC to login, as follows:
ssh root@<IP-ADDRESS>
Typically, the default password for the CMC set out of the SGI factory is root. The default
password for logging in as sysco on the SMN is sgisgi.
3.Power up your UV system using the power on command, as follows:
CMC:r1i1c> power on
The system will take time to fully power up (depending on size and options). Larger systems take
longer to fully power on. Information on booting Linux from the shell prompt is included at the
end of the subsection (“Monitoring Power On” on page 19).
Optional Power On Using the SMN to Connect to the CMC
Typically, the default password for the CMC set out of the SGI factory is root.
Use the following steps to establish a network connection from the SMN to the CMC and power
on the system using the CMC prompt and the command line interface:
1.Establish a network connection to the CMC by using the ssh command from the SMN to
connect to the CMC, similar to the following example:
Note: This is only valid if your PC or workstation that is connected to the CMC (via the SMN
connection) has its /etc/hosts file setup to include the CMCs.
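A sketch of such a connection follows, assuming the CMC hostname resolves as described in the
note (the hostname r1i1c matches the CMC prompt shown in the next step and is used here only
as an example):
# from the SMN, open a shell on the CMC (the factory default password is root)
ssh root@r1i1c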
2.Power up your UV system using the power-on command, as follows:
CMC:r1i1c> power on
Note that the larger a system is, the more time it will take to power up completely. Information on
booting Linux from the shell prompt is included at the end of the subsection (“Monitoring Power
On” on page 19).
Open a separate window on your PC or workstation and establish another connection to the SMN
or CMC and use the uvcon command to open a system console and monitor the system boot
process. Use the following steps:
CMC:r1i1c> uvcon
uvcon: attempting connection to localhost...
uvcon: connection to SMN/CMC (localhost) established.
uvcon: requesting baseio console access at r001i01b00...
uvcon: tty mode enabled, use ’CTRL-]’ ’q’ to exit
uvcon: console access established
uvcon: CMC <--> BASEIO connection active
************************************************
******* START OF CACHED CONSOLE OUTPUT *******
************************************************
******** [20100512.143541] BMC r001i01b10: Cold Reset via NL
broadcast reset
******** [20100512.143541] BMC r001i01b07: Cold Reset via NL
broadcast reset
******** [20100512.143540] BMC r001i01b08: Cold Reset via NL
broadcast reset
******** [20100512.143540] BMC r001i01b12: Cold Reset via NL
broadcast reset
******** [20100512.143541] BMC r001i01b14: Cold Reset via NL
broadcast reset
******** [20100512.143541] BMC r001i01b04: Cold Reset via NL....
Note: Use CTRL-] q to exit the console.
Depending upon the size of your system, it can take 5 to 10 minutes for the UV system to boot to
the EFI shell. When the shell> prompt appears, enter fs0:, as follows:
shell> fs0:
At the fs0: prompt, enter the Linux boot loader information, as follows:
fs0:\> \efi\SuSE\elilo
The ELILO Linux Boot loader is called and various SGI configuration scripts are run and the
SUSE Linux Enterprise Server 11 Service Pack x installation program appears.
Power off a UV System
To power down the UV system, use the power off command, as follows:
CMC:r1i1c> power off
==== r001i01c (PRI) ====
You can also use the power status command to check the power status of your system:
CMC:r1i1c> power status
==== r001i01c (PRI) ====
on: 0, off: 32, unknown: 0, disabled: 0
The following command options can be used with the power off|down command:
usage: power [-vo] off|down [system-SSN]...turns power off
-v, --verbose verbose output
-o, --override override partition check
Additional CLI Power Command Options
The following are examples of command options related to power status of the system IRUs.
These commands and arguments can be used with either the SMN or CMC CLI.
usage: power [-vchow] reset [system-SSN]...toggle reset
-v, --verbose verbose output
-c, --clear clear EFI variables (system and partition
targets only)
-h, --hold hold reset high
-o, --override override partition check
-w, --watch watch boot progress
usage: power [-v] ioreset [system-SSN]...toggle I/O reset
-v, --verbose verbose output
usage: power [-vhow] cycle [system-SSN]...cycle power off on
-v, --verbose verbose output
-h, --hold hold reset high
-o, --override override partition check
-w, --watch watch boot progress
usage: power [-v10ud] [status] [system-SSN]...show power status
-v, --verbose verbose output
-1, --on show only blades with on status
-0, --off show only blades with off status
-u, --unknown show only blades with unknown status
-d, --disabled show only blades with disabled status
usage: power [-ov] nmi|debug [system-SSN]...issue NMI
-o, --override override partition check
-v, --verbose verbose output
usage: power [-v] margin [high|low|norm|<value>] [system-SSN]...power
margin control
high|low|norm|<value> margin state
-v, --verbose verbose output
usage: power --help
--help display this help and exit
Using Embedded Support Partner (ESP)
Embedded Support Partner (ESP) automatically detects system conditions that indicate potential
future problems and then notifies the appropriate personnel. This enables you and SGI system
support engineers (SSEs) to proactively support systems and resolve issues before they develop
into actual failures.
ESP enables users to monitor one or more systems at a site from a local or remote connection. ESP
can perform the following functions:
•Monitor the system configuration, events, performance, and availability.
•Notify SSEs when specific events occur.
•Generate reports.
ESP also supports the following:
•Remote support and on-site troubleshooting.
•System group management, which enables you to manage an entire group of systems from a
single system.
For additional information on this and other available monitoring services, see the section “SGI
Electronic Support” in Chapter 7.
Optional Components
Besides adding a network-connected system console or basic VGA monitor, you can add or
replace the following hardware items on your SGI UV 2000 series server:
•Peripheral component interface (PCIe) cards into the optional PCIe expansion chassis.
•PCIe cards into the blade-mounted PCIe riser card.
•Disk drives in your dual disk drive riser card equipped compute blade.
PCIe Cards
The PCIe-based I/O sub-systems are industry standard for connecting peripherals, storage, and
graphics to a processor blade. The following are the primary configurable I/O system interfaces
for the SGI UV 2000 series systems:
•The optional full-height two-slot internal PCIe blade is a dual-node compute blade that
supports one full-height x16 PCIe Gen3 card in the top slot and one low-profile x16 PCIe
Gen3 card in the lower slot. See Figure 1-7 on page 24 for an example.
•The optional dual low-profile PCIe blade supports two PCIe x16 Gen3 cards. See Figure 1-8
on page 24 for an example.
•The optional external PCIe I/O expansion chassis supports up to four PCIe cards. The
external PCIe chassis is supported by connection to a compute blade using an optional host
interface card (HIC). Each x16 PCIe enabled blade host interface connector can support one
I/O expansion chassis. See Chapter 6 for more details on the optional external PCIe chassis.
Important: PCIe cards installed in an optional two-slot PCIe blade are not hot swappable or hot
pluggable. The compute blade using the PCIe riser must be powered down and removed from the
system before installation or removal of a PCIe card(s). Also see “Installing Cards in the 1U PCIe
Expansion Chassis” on page 65 for more PCIe related information.
Not all blades or PCIe cards may be available with your system configuration. Check with your
SGI sales or service representative for availability. See Chapter 6, “Add or Replace Procedures”
for detailed instructions on installing or removing PCIe cards or SGI UV 2000 system disk drives.
Figure 1-7 PCIe Option Blade Example with Full-Height and Low-Profile Slots
Figure 1-8 PCIe Option Blade Example with Two Low-Profile Slots
PCIe Drive Controllers in BaseIO Blade
The SGI UV 2000 system offers a RAID or non-RAID (JBOD) PCIe-based drive controller that
resides in the BaseIO blade’s PCIe slot. Figure 1-9 shows an example of the system disk HBA
controller location in the BaseIO blade.
Note: The PCIe drive controller (upper-right section in Figure 1-9) always hosts the system boot
drives.
Figure 1-9 BaseIO Blade and PCIe Disk Controller Example
RAID PCIe Disk Controller
At the time this document was published, the optional RAID controller used in the BaseIO blade
PCIe slot is an LSI MegaRAID SAS 9280-8e. This PCIe 2.0 card uses two external SAS control
connectors and supports the following:
•RAID levels 0, 1, 5, 6, and 10
•Advanced array configuration and management utilities
•Support for global hot spares and dedicated hot spares
•Support for user-defined stripe sizes: 8, 16, 32, 64, 128, 256, 512, or 1024 KB
The RAID controller also supports the following advanced array configuration and management
capabilities:
•Online capacity expansion to add space to an existing drive or a new drive
•No reboot necessary after expansion
•Online RAID level migration, including drive migration, roaming and load balancing
•Media scan
•User-specified rebuild rates (specifying the percentage of system resources to use from 0
percent to 100 percent)
•Nonvolatile random access memory (NVRAM) of 32 KB for storing RAID system
configuration information; the MegaRAID SAS firmware is stored in flash ROM for easy
upgrade.
Non-RAID PCIe Disk Controller
At publication time, the LSI 9200-8e low-profile PCIe drive controller HBA is the default
non-RAID system disk controller for the SGI UV 2000. This drive controller has the following
features:
•Supports SATA and SAS link rates of 1.5 Gb/s, 3.0 Gb/s, and 6.0 Gb/s
•Provides two x4 external mini-SAS connectors (SFF-8088)
•The HBA has onboard Flash memory for the firmware and BIOS
•The HBA is a 6.6-in. x 2.713-in., low-profile board
•The HBA has multiple status and activity LEDs and a diagnostic UART port
•A x8 PCIe slot is required for the HBA to operate within the system
Chapter 2
2.System Control
This chapter describes the general interaction and functions of the overall SGI UV 2000 system
control. System control parameters depend somewhat on the overall size and complexity of the
SGI UV 2000 but will generally include the following three areas:
•The system management node (SMN) which runs the SGI Management Center software
•The chassis management controllers (CMC) boards - one per IRU
•The individual blade-based board management controllers (BMC) - report to the CMCs
Note: While it is possible to operate and administer a very basic (single-rack) SGI UV 2000
system without using an SMN and SGI Management Center, this is an exception rather than the rule.
Levels of System Control
The system control network configuration of your server will depend on the size of the system and
control options selected. Typically, an Ethernet LAN connection to the system controller network
is used. This Ethernet connection is made from a remote PC/workstation connected to the system
management node (SMN). The SMN is a separate stand-alone server installed in the SGI UV 2000
rack. The SMN acts as a gateway and buffer between the UV system control network and any
other public or private local area networks.
Important: The SGI UV system control network is a private, closed network. It should not be
reconfigured in any way to change it from the standard SGI UV factory installation. It should not
be directly connected to any other network. The UV system control network is not designed for
and does not accommodate additional network traffic, routing, address naming (other than its own
schema), or DHCP controls (other than its own configuration). The UV system control network
also is not security hardened, nor is it tolerant of heavy network traffic, and is vulnerable to Denial
of Service attacks.
System Management Node (SMN) Overview
An Ethernet connection directly from the SMN (Figure 2-1 on page 29) to a local private or public
Ethernet allows the system to be administered directly from a local or remote console via the SGI
Management Center interface installed on the SMN. Note that there is no direct inter-connected
system controller function in the optional expansion PCIe modules.
The system controller network is designed into all IRUs. Controllers within the system report and
share status information via the CMC Ethernet interconnects. This maintains controller
configuration and topology information between all controllers in an SSI. Figure 2-2 on page 30
shows an example system control network using an optional and separate (remote) workstation to
monitor a single-rack SGI UV 2000 system. It is also possible to connect an optional PC or
(in-rack) console directly to the SMN, see Figure 2-4 on page 34.
Note: Mass storage option enclosures are not specifically monitored by the system controller
network. Most optional mass storage enclosures have their own internal microcontrollers for
monitoring and controlling all elements of the disk array. See the user’s guide for your mass
storage option for more information on this topic.
For information on administering network connected SGI systems using the SGI Management
Center, see the SGI Management Center System Administrator’s Guide (P/N 007-5642-00x).
Figure 2-1 System Management Node Front and Rear Panels
CMC Overview
The CMC system for the SGI UV 2000 servers manages power control and sequencing, provides
environmental control and monitoring, initiates system resets, stores identification and
configuration information, and provides console/diagnostic and scan interface. A CMC port from
each chassis management controller connects to a dedicated Ethernet switch that provides a
synchronous clock signal to all the CMCs in an SSI.
Viewing the system from the rear, the CMC blade is on the right side of the IRU. The CMC
accepts direction from the SMN and supports powering-up and powering-down individual or
groups of compute blades and environmental monitoring of all units within the IRU. The CMC
sends operational requests to the Baseboard Management Controller (BMC) on each compute
blade installed. The CMC provides data collected from the compute nodes within the IRU to the
system management node upon request.
CMCs can communicate with the blade BMCs and other IRU CMCs when they are linked together
under a single system image (SSI); also called a partition. Each CMC shares its information with
the SMN as well as other CMCs within the SSI. Note that the system management node (server),
optional mass storage units and PCIe expansion enclosures do not have a CMC installed.
Figure 2-2 SGI UV 2000 LAN-attached System Control Network Example
BMC Overview
Each compute blade in an IRU has a baseboard management controller (BMC). The BMC is a
built-in specialized microcontroller hardware component that monitors and reports on the
functional “health” status of the blade. The BMC provides a key functional element in the overall
Intelligent Platform Management Interface (IPMI) architecture.
The BMC acts as an interface to the higher levels of system control such as the IRU’s CMC board
and the higher level control system used in the system management node. The BMC can report
any on-board sensor information that it has regarding temperatures, power status, operating
system condition and other functional parameters that may be reported by the blade. When any of
the preset limits fall out of bounds, the information will be reported by the BMC and an
administrator can take some corrective action. This could entail a node shutdown, reset (NMI) or
power cycling of the individual blade.
The individual blade BMCs do not have information on the status of other blades within the IRU.
This function is handled by the CMCs and the system management node. Note that blades
equipped with an optional BaseIO riser board have a dedicated BMC Ethernet port.
System Controller Interaction
In all SGI UV 2000 servers all the system controller types (SMNs, CMCs and BMCs)
communicate with each other in the following ways:
•System control commands and communications are passed between the SMN and CMCs via
a private dedicated gigabit Ethernet network. The CMCs communicate directly with the
BMC in each installed blade by way of the IRU’s internal backplane.
•All CMCs can communicate with each other via an Ethernet “ring” configuration network
established within an SSI.
•In larger configurations the system control communication path includes a private, dedicated
Ethernet switch that allows communication between an SMN and multiple SSI
environments.
IRU Controllers
All IRUs have a chassis management controller (CMC) board installed. The following subsection
describes the basic features of the controllers:
Note: For additional information on controller commands, see the SGI UV CMC Software User
Guide (P/N 007-5636-00x).
Chassis Management Controller Functions
The following list summarizes the control and monitoring functions that the CMC performs. Many
of the controller functions are common across all IRUs; however, some functions are specific to
the type of enclosure.
•Monitors individual blade status via blade BMCs
•Controls and monitors IRU fan speeds
•Reads system identification (ID) PROMs
•Monitors voltage levels and reports failures
•Monitors and controls warning LEDs
•Monitors the On/Off power process
•Provides the ability to create multiple system partitions
•Provides the ability to flash system BIOS
1U Console Option
The SGI optional 1U console (Figure 2-3 on page 33) is a rackmountable unit that includes a
built-in keyboard/touch pad. It uses a 17-inch (43-cm) LCD flat panel display of up to 1280 x 1024
pixels.
Figure 2-3 Optional 1U Rackmount Console
Flat Panel Rackmount Console Option Features
The 1U flat panel console option has the following listed features:
1.Slide Release - Move this tab sideways to slide the console out. It locks the drawer closed
when the console is not in use and prevents it from accidentally sliding open.
2.Handle - Used to push and pull the module in and out of the rack.
3.LCD Display Controls - The LCD controls include On/Off buttons and buttons to control
the position and picture settings of the LCD display.
4.Power LED - Illuminates blue when the unit is receiving power.
The 1U console attaches to the system management node server using PS/2 and HD15M
connectors or to an optional KVM switch (not provided by SGI). See Figure 2-4 for the SMN
video connection points. The 1U console is basically a “dumb” VGA terminal; it cannot be used
as a workstation or loaded with any system administration program.
The 27-pound (12.27-kg) console automatically goes into sleep mode when the cover is closed.
Figure 2-4 System Management Node (SMN) Direct Video Connection Ports
Chapter 3
3.System Overview
This chapter provides an overview of the physical and architectural aspects of your SGI UV 2000
series system. The major components of the SGI UV 2000 series systems are described and
illustrated.
The SGI UV 2000 series is a family of multiprocessor distributed shared memory (DSM)
computer systems that can scale from 16 to 2,048 Intel processor cores as a cache-coherent single
system image (SSI). Future releases may scale to larger processor counts for single system image
(SSI) applications. Contact your SGI sales or service representative for the most current
information on this topic.
In a DSM system, each processor board contains memory that it shares with the other processors
in the system. Because the DSM system is modular, it combines the advantages of lower
entry-level cost with global scalability in processors, memory, and I/O. You can install and
operate the SGI UV 2000 series system in your lab or server room. Each 42U SGI rack holds one
to four 10U high enclosures that support up to eight compute/memory and I/O sub modules
known as “blades.” These blades contain printed circuit boards (PCBs) with ASICs, processors,
memory components and I/O chipsets mounted on a mechanical carrier. The blades slide directly
in and out of the SGI UV 2000 IRU enclosures.
This chapter consists of the following sections:
•“System Models” on page 37
•“System Architecture” on page 39
•“System Features” on page 41
•“System Components” on page 46
Figure 3-1 shows the front view of a single-rack SGI UV 2000 system.
Figure 3-1 SGI UV 2000 Single-Rack System Example
System Models
System Models
The basic enclosure within the SGI UV 2000 system is the 10U high “individual rack unit” (IRU).
The IRU enclosure contains up to eight compute blades connected to each other via a backplane.
Each IRU has ports that are brought out to external NUMAlink 6 connectors. The 42U rack for
this server houses all IRU enclosures, option modules, and other components; up to 64 processor
sockets (512 processor cores) in a single rack. The SGI UV 2000 server system requires a
minimum of one BaseIO equipped blade for every 2,048 processor cores. Higher core counts in
an SSI may be available in future releases, check with your SGI sales or service representative for
the most current information.
Note: Special systems operated without a system management node (SMN) must have an
optional external DVD drive available to connect to the BaseIO blade.
Figure 3-2 shows an example of how IRU placement is done in a single-rack SGI UV 2000 server.
The system requires a minimum of one 42U tall rack with three single-phase power distribution
unit (PDU) plugs per IRU installed in the rack. Three outlets are required to support each power
shelf. There are three power su pplies per power s helf and two power connections are requ ired for
an SMN.
You can also add additional PCIe expansion enclosures or RAID and non-R AID disk st orag e to
your server system. Power outlet needs for these options should be calculated in advance of
determining the number of outlets needed for the overall system.
Figure 3-2    SGI UV 2000 IRU and Rack (the UV rack shown with four individual rack units [IRUs], a system management node, and a 1U I/O expansion slot)
System Architecture
The SGI UV 2000 computer system is based on a distributed shared memory (DSM) architecture.
The system uses a global-address-space, cache-coherent multiprocessor that scales up to 512
processor cores in a single rack. Because it is modular, the DSM combines the advantages of lower
entry cost with the ability to scale processor count, memory, and I/O independently in each rack.
Note that a maximum of 2,048 cores are supported on a single-system image (SSI). Larger SSI
configurations may be offered in the future; contact your SGI sales or service representative for
additional information.
The system architecture for the SGI UV 2000 system is a sixth-generation NUMAflex DSM
architecture known as NUMAlink 6 or NL6. In the NUMAlink 6 architecture, all processors and
memory can be tied together into a single logical system. This combination of processors,
memory, and internal switches constitutes the interconnect fabric called NUMAlink within and
between each 10U IRU enclosure.
The basic expansion building block for the NUMAlink interconnect is the processor node; each
processor node consists of a dual-Hub ASIC (also known as a HARP) and two eight-core
processors with on-chip secondary caches. The Intel processors are connected to the dual-Hub
ASIC via Intel QuickPath Interconnects (QPIs). Each dual-Hub ASIC is also connected to the system’s
NUMAlink interconnect fabric through one of sixteen NL6 ports.
The dual-Hub ASIC is the heart of the processor and memory node blade technology. This
specialized ASIC acts as a crossbar between the processors and the network interface. The Hub
ASIC enables any processor in the SSI to access the memory of all processors in the SSI.
Figure 3-3 on page 40 shows a functional block diagram of the SGI UV 2000 series system IRU.
System configurations of up to eight IRUs can be constructed without the use of external routers.
Routerless systems can have any number of blades up to a maximum of 64. Routerless system
topologies reduce the number of external NUMAlink cables required to interconnect a system.
Optional external routers are needed to support multi-rack systems with more than four IRUs; see
Chapter 5, “Optional Octal Router Chassis Information” for more information.
Figure 3-3    Functional Block Diagram of the Individual Rack Unit (IRU)
Note: The figure shows only the NUMAlink (NI port) cabling between the compute blades within the
IRU, across the left and right backplanes. External cabling (cabling that exits the IRU) is not shown.
System Features
The main features of the SGI UV 2000 series server systems are discussed in the following
sections:
•“Modularity and Scalability” on page 41
•“Distributed Shared Memory (DSM)” on page 41
•“Chassis Management Controller (CMC)” on page 43
•“Distributed Shared I/O” on page 43
•“Reliability, Availability, and Serviceability (RAS)” on page 44
Modularity and Scalability
The SGI UV 2000 series systems are modular systems. The components are primarily housed in
building blocks referred to as individual rack units (IRUs). Additional optional mass storage may
be added to the rack along with additional IRUs. You can add different types of blade options to
a system IRU to achieve the desired system configuration. You can easily configure systems
around processing capability, I/O capability, memory size, MIC/GPU capability or storage
capacity. The air-cooled IRU enclosure system has redundant, hot-swap fans and redundant,
hot-swap power supplies.
Distributed Shared Memory (DSM)
In the SGI UV 2000 series server, memory is physically distributed both within and among the
IRU enclosures (compute/memory/I/O blades); however, it is accessible to and shared by all
NUMAlinked devices within the single-system image (SSI). That is, all NUMAlinked
components sharing a single Linux operating system operate and share the memory “fabric” of
the system. Memory latency is the amount of time required for a processor to retrieve data from
memory. Memory latency is lowest when a processor accesses local memory. Note the following
sub-types of memory within a system:
•If a processor accesses memory that is directly connected to its resident socket, the memory
is referred to as local memory. Figure 3-4 on page 42 shows a conceptual block diagram of
the blade’s memory, compute and I/O pathways.
•If a processor needs to access memory located in another socket, or on another blade within
the IRU (or other NUMAlinked IRUs), the memory is referred to as remote memory.
•The total memory within the NUMAlinked system is referred to as global memory.
Figure 3-4    Blade Node Block Diagram Example
(The diagram shows the top node PCA and the bottom node PCA with BMC, each with four DDR3
memory channels [Chan-0 through Chan-3], QPI links between the processor sockets, the HARP
connector and HARP PCA, NL6 channels to the backplane connector and to four QSFP/iPass ports,
and a PCIe Gen 3 x16 interface used for a PCIe x16 slot or BaseIO card.)
Distributed Shared I/O
Like DSM, I/O devices are distributed among the blade nodes within the IRUs. Each BaseIO riser
card equipped blade node is accessible by all compute nodes within the SSI (partition) through the
NUMAlink interconnect fabric.
Chassis Management Controller (CMC)
Each IRU has a chassis management controller (CMC) located directly below the cooling fans in
the rear of the IRU. The chassis manager supports powering up and down of the compute blades
and environmental monitoring of all units within the IRU.
One GigE port from each compute blade connects to the CMC blade via the internal IRU
backplane. A second GigE port from each blade slot is also connected to the CMC. This
connection is used to support a BaseIO riser card. Only one BaseIO is supported in an SSI. The
BaseIO must be the first blade (lowest) in the SSI.
ccNUMA Architecture
As the name implies, the cache-coherent non-uniform memory access (ccNUMA) architecture has
two parts, cache coherency and nonuniform memory access, which are discussed in the sections
that follow.
Cache Coherency
The SGI UV 2000 server series uses caches to reduce memory latency. Although data exists in local
or remote memory, copies of the data can exist in various processor caches throughout the system.
Cache coherency keeps the cached copies consistent.
To keep the copies consistent, the ccNUMA architecture uses a directory-based coherence protocol.
In a directory-based coherence protocol, each block of memory (128 bytes) has an entry in a table
that is referred to as a directory. Like the blocks of memory that they represent, the directories are
distributed among the compute/memory blade nodes. A block of memory is also referred to as a
cache line.
Each directory entry indicates the state of the memory block that it represents. For example, when
the block is not cached, it is in an unowned state. When only one processor has a copy of the
memory block, it is in an exclusive state. And when more than one processor has a copy of the
block, it is in a shared state; a bit vector indicates which caches may contain a copy.
When a processor modifies a block of data, the processors that have the same block of data in their
caches must be notified of the modification. The SGI UV 2000 server series uses an invalidation
method to maintain cache coherence. The invalidation method purges all unmodified copies of the
block of data, and the processor that wants to modify the block receives exclusive ownership of
the block.
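The directory states and the invalidation step described above can be illustrated with a short sketch. The following Python fragment is purely illustrative (it is not SGI firmware or system software); only the 128-byte block size and the unowned/exclusive/shared states come from this guide, and the processor names are hypothetical:

# Minimal sketch of directory-based cache coherence with an invalidation
# protocol. Illustration only; not SGI's hardware implementation.

CACHE_LINE_BYTES = 128  # each directory entry tracks one 128-byte block

class DirectoryEntry:
    """Tracks which processor caches may hold a copy of one memory block."""

    def __init__(self):
        self.sharers = set()   # a bit vector in hardware; a set here
        self.owner = None      # holder of an exclusive (writable) copy

    def state(self):
        if self.owner is not None:
            return "exclusive"
        return "shared" if self.sharers else "unowned"

    def read(self, cpu):
        # A read demotes an exclusive owner to a sharer, then adds the reader.
        if self.owner is not None:
            self.sharers.add(self.owner)
            self.owner = None
        self.sharers.add(cpu)

    def write(self, cpu):
        # Invalidation protocol: purge all other cached copies, then grant
        # exclusive ownership to the writer.
        invalidated = (self.sharers | ({self.owner} if self.owner else set())) - {cpu}
        self.sharers.clear()
        self.owner = cpu
        return invalidated

entry = DirectoryEntry()
entry.read("cpu0"); entry.read("cpu1")
print(entry.state())        # shared
print(entry.write("cpu0"))  # {'cpu1'} must be invalidated
print(entry.state())        # exclusive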
Non-uniform Memory Access (NUMA)
In DSM systems, memory is physically located at various distances from the processors. As a
result, memory access times (latencies) are different or “non-uniform.” For example, it takes less
time for a processor blade to reference its locally installed memory than to reference remote
memory.
Reliability, Availability, and Serviceability (RAS)
The SGI UV 2000 server series components have the following features to increase the reliability,
availability, and serviceability (RAS) of the systems.
•Power and cooling:
–IRU power supplies are redundant and can be hot-swapped under most circumstances.
Note that this might not be possible in a “fully loaded” system. If all the blade positions
are filled, be sure to consult with a service technician before removing a power supply
while the system is running.
–IRUs have overcurrent protection at the blade and power supply level.
–Fans are redundant and can be hot-swapped.
–Fans run at multiple speeds in the IRUs. Speed increases automatically when
temperature increases or when a single fan fails.
•System monitoring:
–System controllers monitor the internal power and temperature of the IRUs, and can
automatically shut down an enclosure to prevent overheating.
–All main memory has Intel Single Device Data Correction, to detect and correct 8
contiguous bits failing in a memory device. Additionally, the main memory can detect
and correct any two-bit errors coming from two memory devices (8 bits or more apart).
–All high speed links including Intel Quick Path Interconnect (QPI), Intel Scalable
Memory Interconnect (SMI), and PCIe have CRC check and retry.
–The NUMAlink interconnect network is protected by cyclic redundancy check (CRC).
–Each blade/node installed has status LEDs that indicate the blade’s operational
condition; LEDs are readable at the front of the IRU.
–Systems support the optional Embedded Support Partner (ESP), a tool that monitors the
system; when a condition occurs that may cause a failure, ESP notifies the appropriate
SGI personnel.
–Systems support remote console and maintenance activities.
•Power-on and boot:
–Automatic testing occurs after you power on the system. (These power-on self-tests or
POSTs are also referred to as power-on diagnostics or PODs).
–Processors and memory are automatically de-allocated when a self-test failure occurs.
–Boot times are minimized.
•Further RAS features:
–Systems have a local field-replaceable unit (FRU) analyzer.
–All system faults are logged in files.
–Memory can be scrubbed using error checking code (ECC) when a single-bit error
occurs.
System Components
The SGI UV 2000 series system features the following major components:
•42U rack. This is a custom rack used for both the compute and I/O rack in the SGI UV 2000
system. Up to four IRUs can be installed in each rack. There is also space reserved for a
system management node and other optional 19-inch rackmounted components.
•Individual Rack Unit (IRU). This enclosure contains three power supplies, 2-8
compute/memory blades, BaseIO and other optional riser enabled blades for the SGI UV
2000. The enclosure is 10U high. Figure 3-5 on page 47 shows the SGI UV 2000 IRU
system components.
•Compute blade. Holds two processor sockets and 8 or 16 memory DIMMs. Each compute
blade can be ordered with a riser card that enables the blade to support various I/O options.
•BaseIO enabled compute blade. I/O riser enabled blade that supports all base system I/O
functions including two Ethernet connectors, one BMC Ethernet port and three USB ports.
System disks are always controlled by a PCIe disk controller installed in the BaseIO blade’s
PCIe slot. Figure 3-6 on page 48 shows a front-view example of the BaseIO blade.
Note: While the BaseIO blade is capable of RAID 0 support, SGI does not recommend that the
end user configure it this way. RAID 0 offers no fault tolerance for the system disks and
decreases overall system reliability. The SGI UV 2000 ships with RAID 1 functionality
(disk mirroring) configured if the option is ordered.
•Dual disk enabled compute blade. This riser enabled blade supports two hard disk drives
that normally act as the system disks for the SSI. This blade must be installed adjacent to and
physically connected with the BaseIO enabled compute blade. JBOD, RAID 0 and RAID 1
are supported. Note that you must have the BaseIO riser blade optionally enabled to use
RAID 1 mirroring on your system disk pair.
•Two-Slot Internal PCIe enabled compute blade. The internal PCIe riser based compute
blade supports two internally installed PCI Express option cards. Either two half-height or
one half-height and one full-height cards are supported.
•MIC/GPU PCIe enabled compute blade. This blade supports one optional MIC or GPU
card in the upper slot via a PCIe interface to the bottom node board. Option cards are limited;
check with your SGI sales or service representative for the available supported types.
•External PCIe enabled compute blade. This PCIe enabled board is used in conjunction
with an external PCIe expansion enclosure. A x16 adapter card connects from the blade to
the external expansion enclosure, supporting up to four PCIe option cards.
Note: PCIe card options may be limited; check with your SGI sales or support representative.
Figure 3-5    SGI UV 2000 IRU System Components Example (eight compute blades [blade 0 through blade 7], three power supplies [PS0, PS1, PS2], and the BaseIO connectors: PCIe, SAS0-3/SAS4-7, VGA, USB, serial, LAN0, LAN1, and BMC)
Optional BaseIO SSDs
The BaseIO blade can be configured with one or two internal 1.8-inch solid state drives (SSDs).
The SSDs can be configured as JBOD or RAID1. The RAID1 SSD pair is a software RAID1 and
two SSDs must be ordered with the system BaseIO to enable this configuration.
Figure 3-6    BaseIO Riser Enabled Blade Front Panel Example (PCIe slot, VGA, SAS0-3/SAS4-7, USB, serial, LAN0, LAN1, and BMC ports)
MIC/GPU Enabled Compute Blade
The single-socket MIC/GPU enabled compute blade has one single socket node blade and
supports one PCIe accelerator card. The MIC/GPU enabled compute blade has the following
features:
•One HARP ASIC based board assembly with twelve NUMAlink 6 (NL6) ports that
connect the blade to the backplane and four NL6 ports connecting the blade to external
QSFP ports.
•Specialized connectors support the connection to both the bottom compute node and the top
MIC or GPU board assembly.
•One Bottom compute node board assembly with a single processor socket also supports
eight memory DIMM slots (1600 MT/s memory DIMMs).
•One baseboard management controller (BMC) and one x16 Gen3 PCIe full-height
double-wide slot that supports a single MIC or GPU accelerator card.
•The accelerator card connects directly to the bottom compute board assembly via a ribbon
cable and draws power from the IRU backplane.
Figure 3-7    MIC/GPU Enabled Compute Blade Example Front View
Bay (Unit) Numbering
Bays in the racks are numbered using standard units. A standard unit (SU) or unit (U) is equal to
1.75 inches (4.445 cm). Because IRUs occupy multiple standard units, IRU locations within a rack
are identified by the bottom unit (U) in which the IRU resides. For example, in a 42U rack, an IRU
positioned in U01 through U10 is identified as U01.
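As an illustration of this numbering convention, the following short Python sketch converts an IRU's bottom unit position into its bay label. The helper function is hypothetical and not an SGI utility, but the 1.75-inch unit height and the 10U IRU height come from this guide:

# Hypothetical helper illustrating the bay-numbering convention above.
RACK_UNIT_INCHES = 1.75
IRU_HEIGHT_U = 10

def iru_label(bottom_unit: int) -> str:
    """Return the bay label for an IRU whose lowest occupied unit is bottom_unit."""
    top_unit = bottom_unit + IRU_HEIGHT_U - 1
    return (f"U{bottom_unit:02d} (occupies U{bottom_unit:02d}-U{top_unit:02d}, "
            f"{IRU_HEIGHT_U * RACK_UNIT_INCHES:.2f} in. tall)")

print(iru_label(1))   # U01 (occupies U01-U10, 17.50 in. tall)
print(iru_label(11))  # U11 (occupies U11-U20, 17.50 in. tall)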
Rack Numbering
Each rack is numbered with a three-digit number sequentially beginning with 001. A rack contains
IRU enclosures, optional mass storage enclosures, and potentially other options. In a single
compute rack system, the rack number is always 001.
Optional System Components
Availability of optional components for the SGI UV 2000 systems may vary based on new product
introductions or end-of-life components. Some options are listed in this manual; others may be
introduced after this document goes to production status. Check with your SGI sales or support
representative for current information on available product options not discussed in this manual.
Chapter 4
4. Rack Information
This chapter describes the physical characteristics of the tall (42U) SGI UV 2000 racks in the
following sections:
•“Overview” on page 51
•“SGI UV 2000 Series Rack (42U)” on page 52
•“SGI UV 2000 System Rack Technical Specifications” on page 56
Overview
At the time this document was published, only the tall (42U) SGI UV 2000 rack (shown in
Figure 4-2) was available from the SGI factory for use with the SGI UV 2000 systems. Other racks
may be available to house the system IRUs; check with your SGI sales or service representative
for information.
SGI UV 2000 Series Rack (42U)
The tall rack (shown in Figure 4-1 on page 53) has the following features and components:
•Front and rear door. The front door is opened by grasping the outer end of the
rectangular-shaped door piece and pulling outward. It uses a key lock for security purposes
that should open all the front doors in a multi-rack system (see Figure 4-2 on page 54).
Note: The front door and rear door locks are keyed differently. The optional water-chilled
rear door (see Figure 4-3 on page 55) does not use a lock.
The standard rear door has a push-button key lock to prevent unauthorized access to the
system. The rear doors have a master key that locks and unlocks all rear doors in a system
made up of multiple racks. You cannot use the rear door key to secure the front door lock.
•Cable entry/exit area. Cable access openings are located in the front floor and top of the
rack. Multiple cables are attached to the front of the IRUs; therefore, a significant part of the
cable management occurs in the front part of the rack. The stand-alone system management
nodes have cables that attach at the rear of the rack. Rear cable connections will also be
required for optional storage modules installed in the same rack with the IRU(s). Optional
inter-rack communication cables can pass through the top of the rack. These are necessary
whenever the system consists of multiple racks. I/O and power cables normally pass through
the bottom of the rack.
•Rack structural features. The rack is mounted on four casters; the two rear casters swivel.
There are four leveling pads available at the base of the rack. The base of the rack also has
attachment points to support an optional ground strap, and/or seismic tie-downs.
•Power distribution units in the rack. Up to 12 outlets are required for a single-rack IRU
system, as follows (a worked outlet calculation follows the PDU notes below):
–Allow three outlets for the first IRU
–Two outlets for a maintenance node SMN (server)
–Two outlets for each storage or PCIe expansion chassis
–Allow three more outlets for each additional IRU in the system
Note that an eight-outlet single-phase PDU may be used for the system management node
and other optional equipment.
Each three-phase power distribution unit has 9 outlet connections.
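The per-item allowances listed above can be totaled with a simple calculation. The sketch below is illustrative only; the function name and the example configuration are hypothetical, and actual power planning should be confirmed with your SGI representative:

# Rough outlet-count calculator following the allowances listed above:
# three outlets per IRU, two for the system management node (SMN), and
# two per storage or PCIe expansion chassis.

def outlets_needed(irus: int, smn: bool = True, expansion_chassis: int = 0) -> int:
    total = 3 * irus                 # three single-phase outlets per IRU
    total += 2 if smn else 0         # two outlets for the SMN server
    total += 2 * expansion_chassis   # two outlets per storage/PCIe expansion chassis
    return total

# Example: two IRUs, one SMN, and one PCIe expansion enclosure
print(outlets_needed(irus=2, smn=True, expansion_chassis=1))  # 10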
Figure 4-1    SGI UV 2000 Series Rack Example
Figure 4-2    Front Lock on Tall (42U) Rack
Figure 4-3    Optional Water-Chilled Cooling Units on Rear of SGI 42U Rack
SGI UV 2000 System Rack Technical Specifications
Table 4-1 lists the technical specifications of the SGI UV 2000 series tall rack.
Table 4-1    Tall Rack Technical Specifications

Characteristic                            Specification
Height                                    79.5 in. (201.9 cm)
Width                                     31.3 in. (79.5 cm)
Depth                                     45.8 in. (116.3 cm)
Single-rack shipping weight (approx.)     2,381 lbs. (1,082 kg) air cooled
                                          2,581 lbs. (1,173 kg) water assist cooling
Single-rack system weight (approx.)       2,300 lbs. (1,045 kg) air cooled
                                          2,500 lbs. (1,136 kg) water assist cooling
Voltage range                             North America / International
  Nominal                                 200-240 VAC / 230 VAC
  Tolerance range                         180-264 VAC
Frequency                                 North America / International
  Nominal                                 60 Hz / 50 Hz
  Tolerance range                         47-63 Hz
Phase required                            Single-phase or 3-phase
Power requirements (max)                  34.57 kVA (33.88 kW) approximate
Hold time                                 16 ms
Power cable                               8 ft. (2.4 m) pluggable cords
Chapter 5
5. Optional Octal Router Chassis Information
This chapter describes the optional NUMAlink router technology available in SGI UV 2000
systems consisting of two or more racks. This router technology is available in an enclosure
“package” known as the Octal Router Chassis (ORC). This optional ORC chassis can be mounted
on the top of the SGI UV 2000 rack. NUMAlink 6 advanced router technology reduces UV 2000
system data transfer latency and increases bisection bandwidth performance. Router option
information is covered in the following sections:
•“Overview” on page 57
•“SGI UV 2000 Series NUMAlink Octal Router Chassis” on page 58
•“SGI UV 2000 External NUMAlink System Technical Specifications” on page 60
Overview
At the time this document was published, external NUMAlink router technology was available to
support from 2 to 512 SGI UV 2000 racks. Other “internal” NUMAlink router options are also
available for high-speed communication between smaller groups of SGI UV 2000 racks. For more
information on these topics, contact your SGI sales or service representative.
The standard routers used in the SGI UV 2000 systems are the NL6 router blades located internally
to each IRU. Each of these first-level routers contains a single 16-port NL6 HARP router ASIC.
Twelve ports are used for internal connections (connecting blades together); the remaining four
ports are used for external connections. The NUMAlink ORC enclosure is located at the top of
each SGI UV 2000 rack equipped with the option. Each top-mounted NUMAlink ORC enclosure
contains one to eight 16-port HARP ASIC based router boards. Each of these router boards has a
single NL6 HARP router ASIC. This is the same router ASIC that is used in the NL6 router blades
installed inside the system IRUs.
Note that the ORC chassis also contains a chassis management controller (CMC) board, two
power supplies and its own cooling fans.
SGI UV 2000 Series NUMAlink Octal Router Chassis
The NUMAlink 6 ORC router is a 7U-high fully self-contained chassis that holds up to eight
16-port NL6 router blade assemblies. Figure 5-1 shows an example rear view of the ORC with no
power or NUMAlink cables connected.
The NUMAlink ORC is composed of the following:
•7U-high chassis
•4 or 8 HARP based router blade assemblies
•Cooling-fan assemblies
•Chassis Management Controller (CMC)/power supply assembly (with two power supplies)
Figure 5-2    SGI UV 2000 Optional ORC Chassis Example (Front View)
Note: The NUMAlink unit’s CMC is connected to the CMC in each IRU installed in the rack.
SGI UV 2000 External NUMAlink System Technical Specifications
Table 5-1 lists the basic technical specifications of the SGI UV 2000 series external NUMAlink
ORC chassis.
Table 5-1    External NUMAlink Technical Specifications

Characteristic                                    Specification
Height                                            7U or 12.25 in. (31.1 cm)
Width                                             13.83 in. (35.13 cm)
Depth                                             14.66 in. (37.24 cm)
Top-mount NUMAlink router weight (approximate)    53 lbs. (24.1 kg) not including attached cables
Power supply                                      Three 760-Watt hot-plug power supplies
Voltage range                                     North America / International
  Nominal                                         100-240 VAC / 230 VAC
Frequency                                         North America / International
  Nominal                                         60 Hz / 50 Hz
  Tolerance range                                 47-63 Hz
Phase required                                    Single-phase
Power cables                                      6.5 ft. (2 m) pluggable cords
Chapter 6
6. Add or Replace Procedures
This chapter provides information about installing and removing PCIe cards and system disk
drives from your SGI system, as follows:
•“Maintenance Precautions and Procedures” on page 61
•“Adding or Replacing PCIe Cards in the Expansion Enclosure” on page 64
•“Removing and Replacing an IRU Enclosure Power Su pply” on page 68
Maintenance Precautions and Procedures
This section describes how to open the system for maintenance and upgrade, protect the
components from static damage, and return the system to operation. The following topics are
covered:
•“Preparing the System for Maintenance or Upgrade” on page 62
•“Returning the System to Operation” on page 62
Warning: To avoid problems that could void your warranty, your SGI or other approved
system support engineer (SSE) should perform all the setup, addition, or replacement of
parts, cabling, and service of your SGI UV 2000 system, with the exception of the following:
•Using your system console or network access workstation to enter commands and perform
system functions such as powering on and powering off, as described in this guide.
•Installing, removing or replacing cards in the optional 1U PCIe expansion chassis.
•Adding and replacing disk drives used with your system and using the ESI/ops panel
(operating panel) on optional mass storage.
•Removing and replacing IRU power supplies.
Preparing the System for Maintenance or Upgrade
To prepare the system for maintenance, follow these steps:
1. If you are logged on to the system, log out. Follow standard procedures for gracefully
halting the operating system.
2. Go to the section “Powering-On and Off From the SGI Management Center Interface” in
Chapter 1 if you are not familiar with power down procedures.
3. After the system is powered off, locate the power distribution unit(s) (PDUs) in the front of
the rack and turn off the circuit breaker switches on each PDU.
Note: Powering the system off is not a requirement when replacing a RAID 1 system disk.
Addition of a non-RAID disk can be accomplished while the system is powered on, but the
disk is not automatically recognized by system software.
Returning the System to Operation
When you finish installing or removing components, return the system to operation as follows:
1. Turn each of the PDU circuit breaker switches to the “on” position.
2. Power up the system. If you are not familiar with the proper power-on procedure, review
the section “Powering-On and Off From the SGI Management Center Interface” in
Chapter 1.
3. Verify that the LEDs on the system power supplies and system blades turn on and illuminate
green, which indicates that the power-on procedure is proceeding properly.
If your system does not boot correctly, see “Troubleshooting Chart” in Chapter 7 for
troubleshooting procedures.
Overview of PCI Express (PCIe) Operation
This section provides a brief overview of the PCI Express (PCIe) technology available as an
option with your system. PCI Express has points of compatibility with, and differences from, older
PCI/PCI-X technology. Check with your SGI sales or service representative for more detail on
specific PCI Express board options available with the SGI UV 2000.
PCI Express is compatible with PCI/PCI-X in the following ways:
•Compatible software layers
•Compatible device driver models
•Same basic board form factors
•PCIe controlled devices appear the same as PCI/PCI-X devices to most software
PCI Express technology is different from PCI/PCI-X in the following ways:
•PCI Express uses a point-to-point serial interface vs. a shared parallel bus interface used in
older PCI/PCI-X technology
•PCIe hardware connectors are not compatible with PCI/PCI-X, (see Figure 6-1)
•Potential sustained throughput of x16 PCI Express is approximately four times that of the
fastest PCI-X throughputs
Figure 6-1    Comparison of PCI/PCI-X Connector with PCI Express Connectors (PCI 2.0 32-bit, PCI Express x1, and PCI Express x16)
PCI Express technology uses two pairs of wires for each transmit and receive connection (4 wires
total). These four wires are generally referred to as a lane or x1 connection - also called “by 1”.
SGI UV 2000 PCIe technology is available up to a x16 connector (64 wires) or “by 16” in PCI
Express card slots. This technology will support PCIe boards that use connectors up to x16 in size.
Table 6-1 shows this concept.
For information on which slots in the PCIe expansion chassis support what lane levels, see
Table 6-2 on page 65.
Table 6-1    SGI UV 2000 PCIe Support Levels

SGI x16 PCIe connectors    Support levels in optional chassis
x1 PCIe cards              Supported in all four slots
x2 PCIe cards              Supported in all four slots
x4 PCIe cards              Supported in all four slots
x8 PCIe cards              Supported in two slots
x16 PCIe cards             1 slot supported
x32 PCIe cards             Not supported
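As a quick check of the lane arithmetic described before Table 6-1, the following Python sketch computes wires per link width. The per-lane bandwidth figure assumes PCIe Gen 3 (approximately 985 MB/s per lane), which is an assumption for illustration and not a value stated in this guide:

# Each PCIe lane uses two differential pairs: two wires to transmit and two to receive.
WIRES_PER_LANE = 4
GEN3_MB_PER_S_PER_LANE = 985  # assumed approximate usable bandwidth per Gen 3 lane

for lanes in (1, 4, 8, 16):
    print(f"x{lanes:<2} link: {lanes * WIRES_PER_LANE:>3} wires, "
          f"~{lanes * GEN3_MB_PER_S_PER_LANE / 1000:.1f} GB/s per direction")
# x1  link:   4 wires, ~1.0 GB/s per direction
# x16 link:  64 wires, ~15.8 GB/s per direction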
Adding or Replacing PCIe Cards in the Expansion Enclosure
Warning: Before installing, operating, or servicing any part of this product, read the
“Safety Information” on page 89.
This section provides instructions for adding or replacing a PCIe card in a PCIe expansion
enclosure installed in your system. To maximize the operating efficiency of your cards, be sure to
read all the introductory matter before beginning the installation.
Caution: To protect the PCIe cards from ESD damage, SGI recommends that you use a
grounding wrist strap while installing a PCIe card.
Installing Cards in the 1U PCIe Expansion Chassis
The PCIe expansion chassis functions in a similar manner to a computer chassis that supports PCIe
slots. Always follow the card manufacturer’s instructions or restrictions when installing a card.
Important: Replacement (swapping) of a PCIe card in the 1U chassis may be done while the
system is powered on. Addition of a new card while the system is running requires a reboot to
initiate recognition and functionality. Removal (without replacement) of an existing PCIe card
may cause system error messages. When installing PCIe cards, ensure that the input current rating
specified on the AC input label is not exceeded.
The EB4-1U-SGI chassis provides space for up to four (4) PCIe cards in the following “lane”
bandwidth configurations:
Table 6-2    PCIe Expansion Slot Bandwidth Support Levels

PCIe expansion        PCIe connector level    PCIe slot number location
enclosure slot #      supported by slot       in board “carriage”
Slot 1                Up to x16               Bottom-left side
Slot 2                Up to x4                Top-left side
Slot 3                Up to x8                Top-right side
Slot 4                Up to x4                Bottom-right side
Note: Before installing the PCIe expansion cards, be sure to remove each respective slot cover
and use its screw to secure your expansion card in place.
1. Working from the front of the expansion chassis, locate the two “thumb screws” that hold
the PCIe board “carriage” in the expansion chassis.
2. Turn the two thumb screws counter-clockwise until they disengage from the 1U chassis.
3. Pull the T-shaped board “carriage” out of the chassis until the slots are clear of the unit.
4. Select an available slot based on the lane support your PCIe card requires, see Table 6-2.
5. Remove the metal slot cover from the selected slot and retain its screw.
6. Fit the PCIe card into the slot connector with the connector(s) extending out the front of the
bracket, then secure the board with the screw that previously held the metal slot cover.
7. Push the PCIe board “carriage” back into the enclosure until it is seated and twist the
retaining thumb screws clockwise (right) until fully secure.
Important: After installation, be sure to power on the PCIe expansion enclosure before
re-booting your system.
Figure 6-2    The PCIe Expansion Enclosure
Figure 6-3    Card Slot Locations
Removing and Replacing an IRU Enclosure Power Supply
To remove and replace power supplies in an SGI UV 2000 IRU, you do not need any tools. Under
most circumstances a single power supply in an IRU can be replaced without shutting down the
enclosure or the complete system. In the case of a fully configured (loaded) enclosure, this may
not be possible.
Caution: The body of the power supply may be hot; allow time for cooling and handle with care.
Use the following steps to replace a power supply in the blade enclosure box:
1. Open the front door of the rack and locate the power supply that needs replacement.
2. Disengage the power-cord retention clip and disconnect the power cord from the power
supply that needs replacement.
Figure 6-4    Removing an Enclosure Power Supply (press the latch to release)
3. Press the retention latch of the power supply toward the power connector to release the
supply from the enclosure, see Figure 6-4 on page 68.
4. Using the power supply handle, pull the power supply straight out until it is partly out of the
chassis. Use one hand to support the bottom of the supply as you fully extract it from the
enclosure.
5. Align the rear of the replacement power supply with the enclosure opening.
6. Slide the power supply into the chassis until the retention latch engages; you should hear an
audible click.
7. Reconnect the power cord to the supply and engage the retention clip.
Note: If AC power to the rear fan assembly is disconnected prior to the replacement procedure,
all the fans will come on and run at top speed when power is reapplied. The speeds will readjust
when normal communication with the IRU’s CMC is fully established.
Chapter 7
7. Troubleshooting and Diagnostics
This chapter provides the following sections to help you troubleshoot your system:
•“Troubleshooting Chart” on page 72
•“LED Status Indicators” on page 73
•“SGI Electronic Support” on page 75
Troubleshooting Chart
Table 7-1 lists recommended actions for problems that can occur. To solve problems that are not
listed in this table, use the SGI Electronic Support system or contact your SGI system support
representative. For more information about the SGI Electronic Support system, see the “SGI
Electronic Support” on page 75. For an international list of SGI support centers, see:
http://www.sgi.com/support/supportcenters.html
Table 7-1    Troubleshooting Chart

Problem Description                                          Recommended Action
The system will not power on.                                Ensure that the power cords of the IRU are seated properly in the
                                                             power receptacles. Ensure that the PDU circuit breakers are on and
                                                             properly connected to the wall source. If the power cord is plugged
                                                             in and the circuit breaker is on, contact your technical support
                                                             organization.
An individual IRU will not power on.                         Ensure the power cables of the IRU are plugged in. Confirm the
                                                             PDU(s) supporting the IRU are on.
No status LEDs are lighted on an individual blade.           Confirm the blade is firmly seated in the IRU enclosure. See also
                                                             “Compute/Memory Blade LEDs” on page 74.
The system will not boot the operating system.               Contact your SGI support organization:
                                                             http://www.sgi.com/support/supportcenters.html
The amber (yellow) status LED of an IRU power supply         See Table 7-2 on page 73. Ensure the power cable to the supply is
is lit or the LED is not lit at all.                         firmly connected at both ends and that the PDU is turned to on.
                                                             Check and confirm the supply is fully plugged in. If the green LED
                                                             does not light, contact your support organization.
The PWR LED of a populated PCIe slot is not illuminated.     Reseat the PCI card.
The Fault LED of a populated PCIe slot is illuminated (on).  Reseat the card. If the fault LED remains on, replace the card.
The amber LED of a disk drive is on.                         Replace the disk drive.
LED Status Indicators
There are a number of LEDs on the front of the IRUs that can help you detect, identify and
potentially correct functional interruptions in the system. The following subsections describe
these LEDs and ways to use them to understand potential problem areas.
IRU Power Supply LEDs
Each power supply installed in an IRU has a bi-color status LED. The LED will either light green
or amber (yellow), or flash green or yellow to indicate the status of the individual supply. See
Table 7-2 for a complete list.
Table 7-2    Power Supply LED States

Power supply status                                Green LED    Amber LED
No AC power to the supply                          Off          Off
Power supply has failed                            Off          On
Power supply problem warning                       Off          Blinking
AC available to supply (standby) but IRU is off    Blinking     Off
Power supply on (IRU on)                           On           Off
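For reference, the LED combinations in Table 7-2 can be expressed as a simple lookup. The following Python sketch is a convenience illustration only; the state names and LED combinations come from the table, but the function itself is not part of any SGI software:

# Illustrative lookup of the power-supply LED states in Table 7-2.
LED_STATES = {
    ("off", "off"):      "No AC power to the supply",
    ("off", "on"):       "Power supply has failed",
    ("off", "blinking"): "Power supply problem warning",
    ("blinking", "off"): "AC available to supply (standby) but IRU is off",
    ("on", "off"):       "Power supply on (IRU on)",
}

def diagnose(green: str, amber: str) -> str:
    return LED_STATES.get((green.lower(), amber.lower()),
                          "Unknown combination; contact SGI support")

print(diagnose("blinking", "off"))  # AC available to supply (standby) but IRU is off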
Compute/Memory Blade LEDs
Each compute/memory blade installed in an IRU has a total of seven LED indicators visible
behind the perforated sheet metal of the blade.
At the bottom end (or left side) of the blade (from left to right):
•System power good green LED
•BMC heartbeat green LED
•Blue unit identifier (UID) LED
•BMC Ethernet 1 green LED
•BMC Ethernet 0 green LED
•Green 3.3V auxiliary power LED
•Green 12V power good LED
If the blade is properly seated and the system is powered on and there is no LED activity showing
on the blade, it must be replaced. Figure 7-1 shows the locations of the blade LEDs.
Figure 7-1    UV Compute Blade Status LED Locations Example (green LEDs and blue UID LED)
SGI Electronic Support
SGI Electronic Support provides system support and problem-solving services that function
automatically, which helps resolve problems before they can affect system availability or develop
into actual failures. SGI Electronic Support integrates several services so they work together to
monitor your system, notify you if a problem exists, and search for solutions to problems.
Figure 7-2 shows the sequence of events that occurs if you use all of the SGI Electronic Support
capabilities.
Figure 7-2    Full Support Sequence Example
(Elements shown include the customer's system, an e-mail alert to the SGI global customer support
center, SGI Knowledgebase, Supportfolio Online, a page or e-mail alert to the SGI customer and
SGI support engineer, viewing the case and solutions, and implementing the solution.)
The sequence of events can be described as follows:
1. Embedded Support Partner (ESP) software monitors your system 24 hours a day.
2. When a specified system event is detected, ESP notifies SGI via e-mail (plain text or
encrypted).
3. Applications that are running at SGI analyze the information, determine whether a support
case should be opened, and open a case if necessary. You and SGI support engineers are
contacted (via pager or e-mail) with the case ID and problem description.
4. SGI Knowledgebase searches thousands of tested solutions for possible fixes to the problem.
Solutions that are located in SGI Knowledgebase are attached to the service case.
5. You and the SGI support engineers can view and manage the case by using Supportfolio
Online as well as search for additional solutions or schedule maintenance.
6. Implement the solution.
Most of these actions occur automatically, and you may receive solutions to problems before they
affect system availability. You also may be able to return your system to service sooner if it is out
of service.
In addition to the event monitoring and problem reporting, SGI Electronic Support monitors both
system configuration (to help with asset management) and system availability and performance
(to help with capacity planning).
The following three components compose the integrated SGI Electronic Support system:
SGI Embedded Support Partner (ESP) is a set of tools and utilities that are embedded in the
SGI Linux ProPack release. ESP can monitor a single system or group of systems for system
events, software and hardware failures, availability, performance, and configuration changes, and
then perform actions based on those events. ESP can detect system conditions that indicate
potential problems, and then alert appropriate personnel by pager, console messages, or e-mail
(plain text or encrypted). You also can configure ESP to notify an SGI call center about problems;
ESP then sends e-mail to SGI with information about the event.
SGI Knowledgebase is a database of solutions to problems and answers to questions that can be
searched by sophisticated knowledge management tools. You can log on to SGI Knowledgebase
at any time to describe a problem or ask a question. Knowledgebase searches thousands of
possible causes, problem descriptions, fixes, and how-to instructions for the solutions that best
match your description or question.
Supportfolio Online is a customer support resource that includes the latest information about
patch sets, bug reports, and software releases.
The complete SGI Electronic Support services are available to customers who have a valid SGI
Warranty, FullCare, FullExpress, or Mission-Critical support contract. To purchase a support
contract that allows you to use the complete SGI Electronic Support services, contact your SGI
sales representative. For more information about the various support contracts, see the following
Web page:
http://www.sgi.com/support
For more information about SGI Electronic Support, see the following Web page:
http://www.sgi.com/support/es
Appendix A
A. Technical Specifications and Pinouts
This appendix contains technical specification information about your system, as follows:
•“System-level Specifications” on page 79
•“Physical Specifications” on page 80
•“Environmental Specifications” on page 81
•“Power Specifications” on page 82
•“I/O Port Specifications” on page 83
System-level Specifications
Table A-1 summarizes the SGI UV 2000 system configuration ranges. Note that each
compute/memory blade holds two processor sockets (one per node board); each socket can
support four, six, or eight processor “cores”.
Table A-1    SGI UV 2000 System Configuration Ranges

Category                                   Minimum                          Maximum
Processors (a)                             32 processor cores (2 blades)    2,048 processor cores (per SSI)
Individual Rack Units (IRUs)               1 per rack                       4 per rack
Blades per IRU                             2 per IRU                        8 per IRU
Compute/memory blade DIMM capacity         8 DIMMs per blade                16 DIMMs per blade
CMC units                                  1 per IRU                        1 per IRU
Number of BaseIO riser enabled blades      One per SSI                      One per SSI

a. Dual-node blades support eight to 16 cores per blade.
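The configuration ranges above follow directly from the per-blade arithmetic. The following Python sketch shows the core-count math using the maximum of eight cores per socket (a four- or six-core socket scales the totals down accordingly); the structural limits come from this guide:

# Worked arithmetic for cores per blade, per rack, and per SSI.
CORES_PER_SOCKET = 8      # maximum listed for this system
SOCKETS_PER_BLADE = 2     # one per node board
BLADES_PER_IRU = 8
IRUS_PER_RACK = 4

cores_per_blade = CORES_PER_SOCKET * SOCKETS_PER_BLADE              # 16
cores_per_rack = cores_per_blade * BLADES_PER_IRU * IRUS_PER_RACK   # 512
print(cores_per_blade, cores_per_rack)

# Minimum configuration: 2 blades = 32 cores; a 2,048-core SSI corresponds
# to four fully populated racks.
print(2 * cores_per_blade, 2048 // cores_per_rack)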
Physical Specifications
Table A-2 shows the physical specifications of the SGI UV 2000 system.
Table A-2    SGI UV 2000 Physical Specifications

Feature                                              Specification
Dimensions for a single 24-inch wide tall rack,      Height: 79.5 in. (201.9 cm)
including doors and side panels                      Width: 31.3 in. (79.5 cm)
                                                     Depth: 43.45 in. (110.4 cm)
Shipping dimensions                                  Height: 81.25 in. (206.4 cm)
                                                     Width: 42 in. (106.7 cm)
                                                     Depth: 52 in. (132.1 cm)
Single-rack shipping weight (approximate)            2,381 lbs. (1,082 kg) air cooled
                                                     2,581 lbs. (1,173 kg) water assist cooling
Single-rack system weight (approximate)              2,300 lbs. (1,045 kg) air cooled
                                                     2,500 lbs. (1,136 kg) water assist cooling
Access requirements
  Front                                              48 in. (121.9 cm)
  Rear                                               48 in. (121.9 cm)
  Side                                               None
10U-high Individual Rack Unit (IRU) enclosure        Dimensions: 17.5 in. high x 19 in. (flange width) wide x 27 in. deep
specifications                                       (44.45 cm high x 48.26 cm wide x 68.58 cm deep)
Note: Racks equipped with optional top-mounted NUMAlink (ORC) routers have an additional
weight of 53 lbs. (24.1 kg) plus the weight of additional cables.