Silicon Graphics Rackable C1104G-RP5 User Manual

SGI® Rackable™ C1104G-RP5 System User Guide
007-5839-001
COPYRIGHT
© 2012 SGI. All rights reserved; provided portions may be copyright in third parties, as indicated elsewhere herein. No permission is granted to copy, distribute, or create derivative works from the contents of this electronic documentation in any manner, in whole or in part, without the prior written permission of SGI.
LIMITED RIGHTS LEGEND
The software described in this document is “commercial computer software” provided with restricted rights (except as to included open/free source) as specified in the FAR 52.227-19 and/or the DFAR 227.7202, or successive sections. Use beyond license provisions is a violation of worldwide intellectual property laws, treaties and conventions. This document is provided with limited rights as defined in 52.227-14.
The electronic (software) version of this document was developed at private expense; if acquired under an agreement with the USA government or any contractor thereto, it is acquired as “commercial computer software” subject to the provisions of its applicable license agreement, as specified in (a) 48 CFR 12.212 of the FAR; or, if acquired for Department of Defense units, (b) 48 CFR 227-7202 of the DoD FAR Supplement; or sections succeeding thereto. Contractor/manufacturer is SGI, 46600 Landing Parkway, Fremont, CA 94538.
TRADEMARKS AND ATTRIBUTIONS
SGI and the SGI logo are registered trademarks and Rackable is a trademark of Silicon Graphics International in the United States and/or other countries worldwide.
Intel, Intel QuickPath Interconnect (QPI) and Xeon are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States and other countries. Fusion-MPT, Integrated RAID, MegaRAID, and LSI Logic are trademarks or registered trademarks of LSI Logic Corporation. HyperTransport is a licensed trademark of the HyperTransport Technology Consortium. InfiniBand is a registered trademark of the InfiniBand Trade Association. Internet Explorer and MS-DOS are registered trademarks of Microsoft Corporation. Java and Java Virtual Machine are trademarks or registered trademarks of Sun Microsystems, Inc. Linux is a registered trademark of Linus Torvalds, used with permission by SGI. Novell and Novell Netware are registered trademarks of Novell Inc. PCIe and PCI-X are registered trademarks of PCI SIG. Phoenix and PhoenixBIOS are registered trademarks of Phoenix Technologies Ltd. Red Hat and all Red Hat-based trademarks are trademarks or registered trademarks of Red Hat, Inc. in the United States and other countries. SUSE LINUX and the SUSE logo are registered trademarks of Novell, Inc. UNIX is a registered trademark in the United States and other countries, licensed exclusively through X/Open Company, Ltd.
All other trademarks mentioned herein are the property of their respective owners.
Record of Revision
Version   Description
001       June 2012. First release.
About This Guide
This guide provides an overview of the installation, architecture, general operation, and descriptions of the major components in the SGI® Rackable™ C1104G-RP5 server. It also provides basic troubleshooting and maintenance information, BIOS information, and important safety and regulatory specifications.
Audience
This guide is written for users of SGI Rackable C1104G-RP5 server systems. It is written with the assumption that the reader has a good working knowledge of computers and computer systems. This guide may be useful to installers and system administrators looking for overview information on the server.
Chapter Descriptions
The following topics are covered in this guide:
Chapter 1, “Introduction”
Provides an overview of the server’s components.
Chapter 2, “Server Installation”
Provides a quick setup checklist to get the server operational.
Chapter 3, “System Interface”
Describes several LEDs on the control panel as well as others on the SATA drive carriers that keep you constantly informed of the overall status of the system as well as the activity and health of specific components.
Chapter 4, “System Safety”
Provides general system safety information.
Chapter 5, “System and Serverboard Information”
Provides best-practice procedures for working with a node board in the C1104G-RP5 chassis and for installing memory DIMMs, PCIe expansion cards, and hard disk drives.
Chapter 6, “Basic Troubleshooting and Chassis Service”
Describes some basic steps required to troubleshoot your system. Additional sections in this chapter are intended to guide you through basic component remove and replace procedures.
Appendix A, “BIOS Error Codes”
Provides a brief listing of BIOS error code information.
Appendix B, “System Specifications”
Describes system component environmental specifications and compliance.
Related Publications
The following SGI and LSI documents may be relevant to the use of your server:
MegaRAID SAS Software User’s Guide, publication number, 860-0488-00x
SGI Performance Suite series documentation
SGI InfiniteStorage series documentation
Man pages (online)
You can obtain SGI documentation (as well as the pertinent LSI books), release notes, or man pages in the following ways:
Refer to the SGI Technical Publications Library at http://docs.sgi.com. Various formats are available. This library contains the most recent and most comprehensive set of online books, release notes, man pages, and other information.
You can also view man pages by typing man <title> on a command line.
SGI systems include a set of Linux® man pages, formatted in the standard UNIX® “man page” style. Important system configuration files and commands are documented on man pages. These are found online on the internal system disk (or DVD-CD) and are displayed using the man command. For additional information about displaying man pages using the man command, see man(1).
In addition, the apropos command locates man pages based on keywords. For example, to display a list of man pages that describe disks, type the following on a command line:
apropos disk
For information about setting up and using apropos, see apropos(1).
Conventions
The following conventions are used throughout this document:
Convention    Meaning
Command       This fixed-space font denotes literal items such as commands, files, routines, path names, signals, messages, and programming language structures.
variable      The italic typeface denotes variable entries and words or concepts being defined. Italic typeface is also used for book titles.
user input    This bold fixed-space font denotes literal items that the user enters in interactive sessions. Output is shown in nonbold, fixed-space font.
[ ]           Brackets enclose optional portions of a command or directive line.
...           Ellipses indicate that a preceding element can be repeated.
man page(x)   Man page section identifiers appear in parentheses after man page names.
GUI element   This font denotes the names of graphical user interface (GUI) elements such as windows, screens, dialog boxes, menus, toolbars, icons, buttons, boxes, fields, and lists.
Product Support
SGI provides a comprehensive product support and maintenance program for its products. SGI also offers services to implement and integrate Linux applications in your environment.
Refer to http://www.sgi.com/support/
If you are in North America, contact the Technical Assistance Center at +1 800 800 4SGI or contact your authorized service provider.
If you are outside North America, contact the SGI subsidiary or authorized distributor in your country.
Reader Comments
If you have comments about the technical accuracy, content, or organization of this document, contact SGI. Be sure to include the title and document number of the manual with your comments. (Online, the document number is located in the front matter of the manual. In printed manuals, the document number is located at the bottom of each page.)
You can contact SGI in any of the following ways:
Send e-mail to the following address: techpubs@sgi.com
Contact your customer service representative and ask that an incident be filed in the SGI incident tracking system.
Send mail to the following address:
SGI
Technical Publications
46600 Landing Parkway
Fremont, CA 94538
SGI values your comments and will respond to them promptly.
Contents
1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . 1
Server Board Features . . . . . . . . . . . . . . . . . . . . . . 3
Processors . . . . . . . . . . . . . . . . . . . . . . . . 3
QPI Interconnect . . . . . . . . . . . . . . . . . . . . . . 3
Memory. . . . . . . . . . . . . . . . . . . . . . . . . 3
Serial ATA and Optional SAS . . . . . . . . . . . . . . . . . . . 3
PCI Express Expansion Slots . . . . . . . . . . . . . . . . . . . 4
Onboard Controllers/Ports . . . . . . . . . . . . . . . . . . . . 4
Onboard Graphics Controller . . . . . . . . . . . . . . . . . . . 4
IPMI . . . . . . . . . . . . . . . . . . . . . . . . . 4
Other Features . . . . . . . . . . . . . . . . . . . . . . . 4
Server Chassis Features . . . . . . . . . . . . . . . . . . . . . . 5
System Power . . . . . . . . . . . . . . . . . . . . . . . 5
Serial ATA Subsystems . . . . . . . . . . . . . . . . . . . . 5
Front Control Panel. . . . . . . . . . . . . . . . . . . . . . 5
Serverboard and GPU Subsystem . . . . . . . . . . . . . . . . . . 5
GPU Features . . . . . . . . . . . . . . . . . . . . . . 6
Cooling System. . . . . . . . . . . . . . . . . . . . . . . 6
2 Server Installation . . . . . . . . . . . . . . . . . . . . . . . 9
Unpack the System . . . . . . . . . . . . . . . . . . . . . . . 9
Prepare for Setup . . . . . . . . . . . . . . . . . . . . . . 9
Choose a Setup Location . . . . . . . . . . . . . . . . . . . . 9
System Warnings and Precautions . . . . . . . . . . . . . . . . . . . 10
Server Precautions . . . . . . . . . . . . . . . . . . . . . . 11
Rack Mounting Considerations . . . . . . . . . . . . . . . . . . . . 11
Ambient Operating Temperature . . . . . . . . . . . . . . . . . . 11
Reduced Airflow . . . . . . . . . . . . . . . . . . . . . . 11
Mechanical Loading. . . . . . . . . . . . . . . . . . . . . . 12
Circuit Overloading . . . . . . . . . . . . . . . . . . . . . . 12
Reliable Ground . . . . . . . . . . . . . . . . . . . . . . . 12
Install the System into a Rack . . . . . . . . . . . . . . . . . . . . 12
Separate the Sections of the Rack Rails. . . . . . . . . . . . . . . . . 12
Inner Rail Extensions . . . . . . . . . . . . . . . . . . . . . 14
Installing the Inner Rail Extensions . . . . . . . . . . . . . . . . 14
Assembling the Outer Rails . . . . . . . . . . . . . . . . . . . . 15
Assembling the Outer Rails. . . . . . . . . . . . . . . . . . . 15
Attaching the Outer Rack Rails . . . . . . . . . . . . . . . . . . . 16
Using the Rail Locking Tabs . . . . . . . . . . . . . . . . . . 17
Install the Server in a Rack . . . . . . . . . . . . . . . . . . . . 18
Supply Power to the System . . . . . . . . . . . . . . . . . . 19
3 System Interface. . . . . . . . . . . . . . . . . . . . . . . . 21
Overview . . . . . . . . . . . . . . . . . . . . . . . . . . 21
Control Panel Buttons . . . . . . . . . . . . . . . . . . . . . . 21
Control Panel LEDs . . . . . . . . . . . . . . . . . . . . . . . 22
Power Fail LED . . . . . . . . . . . . . . . . . . . . . . . 23
Overheat/Fan Fail/UID LED . . . . . . . . . . . . . . . . . . . 23
NIC1. . . . . . . . . . . . . . . . . . . . . . . . . . 23
NIC2. . . . . . . . . . . . . . . . . . . . . . . . . . 24
HDD . . . . . . . . . . . . . . . . . . . . . . . . . . 24
Power . . . . . . . . . . . . . . . . . . . . . . . . . 24
Drive Carrier LEDs . . . . . . . . . . . . . . . . . . . . . . . 25
4 System Safety . . . . . . . . . . . . . . . . . . . . . . . . 27
Electrical Safety Precautions. . . . . . . . . . . . . . . . . . . . . 27
Serverboard Battery . . . . . . . . . . . . . . . . . . . . . . 28
ESD Precautions. . . . . . . . . . . . . . . . . . . . . . . 28
Mainboard Replaceable Soldered-in Fuses . . . . . . . . . . . . . . . . 29
General Safety Precautions . . . . . . . . . . . . . . . . . . . . . 29
5 System and Serverboard Information. . . . . . . . . . . . . . . . . . 31
Handling Circuit Boards and Drives . . . . . . . . . . . . . . . . . . . 31
ESD Precautions . . . . . . . . . . . . . . . . . . . . . . 32
Unpacking . . . . . . . . . . . . . . . . . . . . . . . . 32
System Rear I/O Ports . . . . . . . . . . . . . . . . . . . . . . 33
Serverboard Details. . . . . . . . . . . . . . . . . . . . . . 33
CPUs . . . . . . . . . . . . . . . . . . . . . . . . 34
Memory . . . . . . . . . . . . . . . . . . . . . . . 34
GPUs . . . . . . . . . . . . . . . . . . . . . . . . 34
PCIe Expansion Slots . . . . . . . . . . . . . . . . . . . . 34
System Health Monitoring. . . . . . . . . . . . . . . . . . . 34
ACPI Features . . . . . . . . . . . . . . . . . . . . . . 35
Onboard I/O . . . . . . . . . . . . . . . . . . . . . . 35
Serverboard Dimensions . . . . . . . . . . . . . . . . . . . 35
Hard Disk Drives (C1104G-RP5 Chassis) . . . . . . . . . . . . . . . . . 37
Drive Configurations . . . . . . . . . . . . . . . . . . . . . 37
PCIe Expansion Cards . . . . . . . . . . . . . . . . . . . . . . 38
Power Supply Functional Rating . . . . . . . . . . . . . . . . . . 39
6 Basic Troubleshooting and Chassis Service . . . . . . . . . . . . . . . . 41
Basic Troubleshooting Procedures . . . . . . . . . . . . . . . . . . . 41
If the System Does Not Power Up . . . . . . . . . . . . . . . . . . 41
System Powers Up But Will Not Boot . . . . . . . . . . . . . . . . . 42
No Video After System Power Up . . . . . . . . . . . . . . . . . . 42
Memory Errors . . . . . . . . . . . . . . . . . . . . . . . 42
Chassis Service Information. . . . . . . . . . . . . . . . . . . . . 43
Static-Sensitive Devices . . . . . . . . . . . . . . . . . . . . 43
Precautions . . . . . . . . . . . . . . . . . . . . . . . . 43
Unpacking . . . . . . . . . . . . . . . . . . . . . . . . 44
Control Panel . . . . . . . . . . . . . . . . . . . . . . . . 44
Drive Bay Installation/Removal. . . . . . . . . . . . . . . . . . . . 44
Accessing the Drive Bays . . . . . . . . . . . . . . . . . . . . 44
Removing Hard Drives or Carriers from the Chassis . . . . . . . . . . . . . 45
The Hard Drive Backplane . . . . . . . . . . . . . . . . . . 45
Disk Drive Installation . . . . . . . . . . . . . . . . . . . . . 45
Hard Drive Carrier Assembly Usage . . . . . . . . . . . . . . . . . 46
Power Supply. . . . . . . . . . . . . . . . . . . . . . . . . 49
Power Supply Failure . . . . . . . . . . . . . . . . . . . . . 49
Removing/Replacing a Power Supply . . . . . . . . . . . . . . . . . 50
Removing the Power Supply . . . . . . . . . . . . . . . . . . 50
Installing a New Power Supply. . . . . . . . . . . . . . . . . . 50
Accessing the Inside of the Chassis . . . . . . . . . . . . . . . . . . . 52
System Fans . . . . . . . . . . . . . . . . . . . . . . . . . 52
System Fan Failure . . . . . . . . . . . . . . . . . . . . . . 52
Replacing System Fans . . . . . . . . . . . . . . . . . . . . . 53
Remove/Replace a Fan . . . . . . . . . . . . . . . . . . . . 53
Install/Replace a PCIe Expansion Card . . . . . . . . . . . . . . . . . . 57
Install/Replace a Low-profile or Full-height PCIe Card . . . . . . . . . . . . 57
A BIOS Error Codes . . . . . . . . . . . . . . . . . . . . . . . 59
B System Operating and Regulatory Overview . . . . . . . . . . . . . . . . 61
Operating Environment . . . . . . . . . . . . . . . . . . . . . . 61
System Input Requirements . . . . . . . . . . . . . . . . . . . . . 61
Power Supply. . . . . . . . . . . . . . . . . . . . . . . . . 61
Regulatory Compliance . . . . . . . . . . . . . . . . . . . . . . 62
Chapter 1
1. Introduction
Important: SGI Rackable server systems may sometimes require driver versions that are not included in the original operating system release. When required, SGI provides these drivers on an SGI Driver CD, which may ship with the system, or on the system disk (pre-installed in the factory). For more information on this topic check with your sales or service representative.
The Rackable C1104G-RP5 server is a 1U rackmount system (see Figure 1-1 on page 2).
In addition to the serverboard and chassis, various hardware components may be included with the system, as listed:
Ten 4-cm chassis fans
One internal air shroud
Two passive 1U CPU heatsinks
Riser cards as follows:
– One riser for a single PCIe x16 card (left-front side internal GPU card)
– One riser for a single PCIe x16 card (left-back side internal GPU card)
– One riser for a single PCIe x16 card (right-front side internal GPU card)
– One riser for one low-profile PCIe 3.0 x8 card (external-facing rear card)
Three power cables for GPU cards
SATA accessories:
– One SAS/SATA backplane
– Four hot-swap disk drive carriers (RAID must be enabled for hot swap)
Two power supplies
One rackmount kit
Note: The Rackable C1104G-RP5 server does not use an internal CD/DVD drive. Check with your SGI sales or service representative for information on optional external CD/DVD drive units.
Figure 1-1 Rackable C1104G-RP5 Server Front and Rear Views
[Figure callouts: system reset, four disk drive bays, system LEDs, main power, VGA port, Ethernet ports, USB ports, IPMI LAN, PCIe low-profile expansion slot, optional GPU or x16 full-height PCIe slot]
Server Board Features
At the heart of the system is a dual-processor serverboard based on the Intel C602 chipset and designed to provide maximum performance. The main features of the serverboard are described in the following subsections.
Processors
The serverboard supports two four-, six-, or eight-core Intel Xeon E5-2600 series processors. Each processor sits in an LGA 2011 socket, and the two processors are connected by Intel QuickPath Interconnect (QPI) links; see the next subsection for more information on the QPI interconnects.
QPI Interconnect
Separate QPI link pairs connect the two processors and the I/O hub in a network on the motherboard, allowing all of the components to access other components via the network.
Each QPI link comprises two 20-lane point-to-point data links, one in each direction (full duplex), with a separate clock pair in each direction, for a total of 42 signals. Each signal is a differential pair, so the total number of pins is 84. The 20 data lanes are divided into four “quadrants” of 5 lanes each. The basic unit of transfer is the 80-bit “flit”, which is transferred in two clock cycles (four 20-bit transfers, two per clock). The 80-bit flit has 8 bits for error detection, 8 bits for the link-layer header, and 64 bits for data. QPI bandwidths are advertised by computing the transfer of 64 bits (8 bytes) of data every two clock cycles in each direction.
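The signal, pin, and flit figures in this subsection follow from simple arithmetic, which the short Python sketch below works through. The constants are the values quoted in the text; the function name qpi_bandwidth_gb_s and the 8.0 GT/s example rate are illustrative assumptions, not figures specified by this guide for your configuration.

```python
# Worked arithmetic for the QPI figures quoted in this subsection.
DATA_LANES = 20    # data lanes per direction
CLOCK_LANES = 1    # one clock pair per direction
FLIT_BITS = 80     # basic unit of transfer
CRC_BITS = 8       # error detection bits per flit
HEADER_BITS = 8    # link-layer header bits per flit

signals = 2 * (DATA_LANES + CLOCK_LANES)            # both directions: 42
pins = 2 * signals                                  # differential pairs: 84
transfers_per_flit = FLIT_BITS // DATA_LANES        # four 20-bit transfers
clocks_per_flit = transfers_per_flit // 2           # two transfers per clock: 2
payload_bits = FLIT_BITS - CRC_BITS - HEADER_BITS   # 64 data bits (8 bytes)


def qpi_bandwidth_gb_s(transfer_rate_gt_s):
    """Data bandwidth in GB/s for ONE direction of a link (hypothetical helper)."""
    flits_per_second = transfer_rate_gt_s * 1e9 / transfers_per_flit
    return flits_per_second * (payload_bits // 8) / 1e9


if __name__ == "__main__":
    rate = 8.0  # GT/s; assumed example rate, not taken from this guide
    one_way = qpi_bandwidth_gb_s(rate)
    print(f"{signals} signals, {pins} pins, {clocks_per_flit} clocks per flit")
    print(f"{one_way:.1f} GB/s per direction, {2 * one_way:.1f} GB/s both directions")
```

Only the 64 data bits of each flit count toward the result, which is why the advertised bandwidth is lower than the raw bit rate of the lanes.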
Memory
The serverboard has eight DIMM slots (four per processor) that support DDR3 1600/1333/1066/800 MHz RDIMMs. Note that memory speed support is dependent on the type of CPU used on the motherboard. Up to 256 GB of total memory is supported.
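As a rough planning aid, the slot count and memory ceiling above can be checked with a few lines of Python. This is a minimal sketch: the helper check_population is hypothetical, and the 32 GB per-DIMM figure is simply 256 GB divided across eight slots, not a statement of which DIMM sizes SGI qualifies for this system.

```python
# Limits taken from the Memory subsection above.
CPU_SOCKETS = 2
DIMM_SLOTS_PER_CPU = 4
MAX_TOTAL_GB = 256

total_slots = CPU_SOCKETS * DIMM_SLOTS_PER_CPU   # 8
max_dimm_gb = MAX_TOTAL_GB // total_slots        # derived value: 32


def check_population(plan):
    """Return a list of problems for a DIMM plan (hypothetical helper).

    `plan` holds one list per processor, each containing DIMM sizes in GB.
    An empty result means the plan fits the slot count and 256 GB ceiling.
    """
    problems = []
    if len(plan) > CPU_SOCKETS:
        problems.append(f"plan lists {len(plan)} processors, board has {CPU_SOCKETS}")
    for cpu, dimms in enumerate(plan):
        if len(dimms) > DIMM_SLOTS_PER_CPU:
            problems.append(f"processor {cpu}: {len(dimms)} DIMMs exceed {DIMM_SLOTS_PER_CPU} slots")
    total_gb = sum(sum(dimms) for dimms in plan)
    if total_gb > MAX_TOTAL_GB:
        problems.append(f"total {total_gb} GB exceeds {MAX_TOTAL_GB} GB")
    return problems


if __name__ == "__main__":
    print(max_dimm_gb)
    print(check_population([[16, 16, 16, 16], [16, 16, 16, 16]]))
```

The check does not model memory speed, which the text notes depends on the installed CPU.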
Serial ATA and Optional SAS
A Serial ATA controller ASIC is integrated into the system serverboard to provide a standard four-port SATA disk subsystem. Drive ports 0 and 1 are SATA 3.0 ports and the remaining ports (2 and 3) are SATA 2.0 ports. The hot-swappable SATA drives are connected to a backplane that provides power, bus termination and configuration settings. Optional RAID 0, 1, 5, 6 and 10 are supported. Note that your operating system must have RAID support enabled to accommodate hot swapping of disk drives. Certain RAID configurations require use of optional hardware. Check with your sales or service representative to obtain information on set-up procedures if your system did not come pre-configured with SATA or SAS RAID.
PCI Express Expansion Slots
The dual processor serverboard has three PCIe 3.0 x16 slots to support internal double-width GPU cards. An additional slot at the rear of the server supports one PCIe 3.0 x8 low-profile card.
See the section “PCIe Expansion Cards” in Chapter 5 for more information on these topics.
Onboard Controllers/Ports
The color-coded I/O ports include a VGA (monitor) port, two external USB 2.0 ports and two Gigabit Ethernet LAN ports. A dedicated external IPMI LAN port is also included, and an internal COM header is located on the serverboard.
Onboard Graphics Controller
The dual-processor serverboard features an integrated Matrox G200 video controller providing a 16MB DDR2 graphics interface through the system VGA connector. The Matrox video controller in the 1U server features low power consumption, high reliability and superior longevity.
IPMI
IPMI (Intelligent Platform Management Interface) is a hardware-level interface specification that provides remote access, monitoring and administration for your SGI Rackable C1104G-RP5 server platforms. IPMI allows server administrators to view a server’s hardware status remotely, receive an alarm automatically if a failure occurs, and power cycle a system that is non-responsive.
Other Features
Other onboard features that promote system health include onboard voltage monitors, a chassis intrusion header, auto-switching voltage regulators, chassis and CPU overheat sensors, virus protection and BIOS rescue.
Server Chassis Features
The following subsections provide a general outline of the main features of the SGI Rackable C1104G-RP5 server chassis.
System Power
The Rackable C1104G-RP5 1U server chassis features a redundant power supply composed of two separate power modules. This power redundancy feature allows you to replace a failed power supply without shutting down the system. Note that each power supply provides up to 1800 Watts of power to the system.
Serial ATA Subsystems
The server chassis supports up to four 2.5-in SATA drives. Drives 0 and 1 use SATA 3.0 (6-Gb/second) slots; drives 2 and 3 use SATA 2.0 (3-Gb/second) slots. RAID drives are hot-swappable units and are connected to a backplane that provides power and control. Note that the operating system you have installed must support RAID to enable the hot-swap capability of RAID drives. Certain RAID levels require use of optional hardware to support RAIDed hard disk drives in the server.
Front Control Panel
The control panel on the C1104G-RP5 server provides you with system monitoring and control. LEDs indicate system power, HDD activity, and network activity, and a combined LED reports system overheat, fan failure, and UID status. A main power button and a system reset button are also included.
Serverboard and GPU Subsystem
The C1104G-RP5 server chassis is an ATX form factor chassis designed to be used in a 1U rackmount configuration. The serverboard’s I/O backplane supports up to three standard size (double-width) GPUs to enable high-quality GPU computing solutions. A 15-pin VGA port, two USB 2.0 ports, two Gigabit LAN ports and a dedicated RJ-45 IPMI LAN port are also supported. The GPUs process complex image calculations and then route the data out through the VGA port on the serverboard. The GPUs, which come with a passive heatsink attached, have been tested for use with this system.
Important: Check with your SGI sales or service representative prior to using any GPU not sourced at the SGI factory.
Any combination of these cards (up to a total of three) may have come bundled with the system. Power for the GPU cards is provided via a GPU power cable from each of the GPUs to JPW3, 4, 6 and 7 on the serverboard (one cable for each card). Figure 1-2 on page 7 shows a general block diagram of the C1104G-RP5 server’s processor and I/O chipset.
GPU Features
Each of the GPUs will feature some or all of the following:
Hundreds of GPU cores in each card that can deliver more than 660 Gigaflops of double-precision and over one Teraflop of single-precision calculations.
ECC protected internal register files, L1/L2 caches, shared memory, and external DRAM.
Up to 6 GB of GDDR5 memory per GPU enhances performance and reduces data transfers by keeping larger data sets in local memory attached directly to the GPU.
Integrates the GPU subsystem with the C1104G-RP5 server’s monitoring and management capabilities such as IPMI.
Onboard L1 and L2 caches that accelerate algorithms and sparse-matrix multiplication.
Provides faster context switching, concurrent kernel execution and improved thread block scheduling.
Enhances overall system performance by transferring data over the PCIe bus while the computing cores are processing other data.
Provides a flexible programming environment with broad support for various programming languages and APIs.
Cooling System
The 1U server chassis has a cooling design that includes ten internal 4-cm counter-rotating Pulse Width Modulated (PWM) system cooling fans located in the chassis. An air shroud channels the airflow from the system fans to efficiently cool the processor and GPU areas of the system. Each power supply module also includes an internal cooling fan. All chassis and power supply fans operate continuously.
Figure 1-2 Processor, Memory and I/O Chipset System Block Diagram
[Block diagram labels: two E5-2600 series processors (8 SNB cores each) with DDR3 800/1066/1333/1600 memory, QPI links (8G), C602/C604 PCH connected over DMI2, PCIe x16 G3 and PCIe x8 G3 slots (slots 1 through 4), I350/X540 LAN, WPCM450 BMC, W83527 SIO, SPI, SATA3/SAS ports 0 through 3, two rear USB 2.0 ports, VGA, internal COM port]
Chapter 2
2. Server Installation
This chapter provides a quick setup checklist to get the SGI Rackable C1104G-RP5 operational. If your system came already mounted in a rack, you can skip the rack installation procedures.
Unpack the System
Inspect the shipping container that the C1104G-RP5 was shipped in and note if it was damaged in any way. If the server shows damage, file a damage claim with the carrier who delivered it.
Decide on a suitable location for the rack that supports the weight, power requirements, and environmental requirements of the C1104G-RP5 server. It should be situated in a clean, dust-free environment that is well ventilated. Avoid areas where heat, electrical noise, and electromagnetic fields are generated. Place the server rack near a grounded power outlet. Refer also to “System Warnings and Precautions” on page 10.
Prepare for Setup
The shipping container should include two sets of rail assemblies, two rail mounting brackets and the mounting screws that you will use to install the system into a rack. Note that the inner rails should already be attached to the server. Read this section in its entirety before you begin the installation procedure.
Choose a Setup Location
Leave enough clearance in front of the rack to enable you to open the front door completely (~25 inches) and approximately 30 inches of clearance in the back of the rack to allow for sufficient airflow and ease in servicing. This clearance may vary depending on the type of rack and installation site chosen. This product is for installation only in an (IEC 60950) Restricted Access Location, such as dedicated equipment rooms, service closets and labs. See also “Regulatory Compliance” in Appendix B.
System Warnings and Precautions
Warning: The SGI Rackable C1104G-RP5 server weighs up to 37 lbs (16.8 kg). Always use proper lifting techniques when you move the server. Always get the assistance of another qualified person when you install the server in a location above your shoulders. Failure to do so may result in serious personal injury or damage to the equipment.
Warning: Extend the leveling jacks on the bottom of the rack to the floor with the full weight of the rack resting on them. Failure to do so can result in serious injury or death.
Warning: Attach stabilizers to the rack in single rack installations. Failure to do so can result in serious injury or death. Couple racks together in multiple rack installations. Failure to do so can result in serious injury or death.
Warning: Be sure the rack is stable before extending a component from the rack. Failure to do so can result in serious injury or death.
Warning: Extend only one rack component at a time. Extending two or more components simultaneously may cause the rack to tip over and result in serious injury or death.
Figure 2-1 Slide/Rail Equipment Usage Caution
Server Precautions
Review the electrical and general safety precautions.
Determine the placement of each component in the rack before you install the rails.
Install the heaviest server components in the bottom of the rack first, and then work up.
Add a regulating uninterruptible power supply (UPS) to protect the server from power surges and voltage spikes and to keep your system operating in case of a power failure.
Allow the hot-pluggable disk drives and power supply modules to cool before touching.
Always keep the rack’s front door and all panels and components on the servers closed when not servicing to maintain proper cooling.
The server is not considered suitable for visual display work place devices under the German government ordinance for work with visual display units.
Rack Mounting Considerations
Use the guidelines provided in the following subsections to properly install the server in a rack.
Ambient Operating Temperature
If installed in a closed or multi-unit rack assembly, the ambient operating temperature of the rack environment may be greater than the ambient temperature of the room. Therefore, consideration should be given to installing the equipment in an environment compatible with the manufacturer’s maximum rated ambient temperature (35 °C or 95 °F).
Important: In a system using three NVIDIA M2090 GPUs, the maximum rated ambient temperature of the operational environment should be 30 °C (86 °F). A system using two M2090 GPUs in the front slot positions can be operated at 35 °C (95 °F) with 115-watt system CPUs. Check with your SGI sales or service representative for additional information on this topic.
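The temperature limits above can be expressed as a small lookup, which may help when planning a rack that mixes configurations. The following Python sketch is an illustration only: the function names are hypothetical, treating CPUs rated below 115 watts the same as 115-watt CPUs is an assumption, and any configuration the guide does not describe returns None rather than a guessed limit.

```python
def c_to_f(celsius):
    """Convert degrees Celsius to degrees Fahrenheit."""
    return celsius * 9 / 5 + 32


def max_ambient_c(m2090_gpu_count, front_slots_only=True, cpu_watts=115):
    """Maximum rated ambient temperature in Celsius (hypothetical helper).

    Returns None for configurations this guide does not cover; check with
    your SGI sales or service representative in those cases.
    """
    if m2090_gpu_count == 3:
        return 30.0   # three M2090 GPUs: 30 C (86 F)
    if m2090_gpu_count == 2 and front_slots_only and cpu_watts <= 115:
        return 35.0   # two front-slot M2090 GPUs; <= 115 W is an assumption
    if m2090_gpu_count == 0:
        return 35.0   # general maximum rated ambient temperature
    return None       # not described in this guide


if __name__ == "__main__":
    for gpus in (0, 2, 3):
        limit = max_ambient_c(gpus)
        print(f"{gpus} GPUs: {limit} C ({c_to_f(limit)} F)")
```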
Reduced Airflow
Equipment should be mounted into a rack so that the amount of airflow required for safe operation is not compromised.
Mechanical Loading
Equipment should be mounted into a rack so that a hazardous condition does not arise due to uneven mechanical loading. Racks should generally be filled with equipment from the bottom up.
Circuit Overloading
Consideration should be given to the connection of the equipment to the power supply circuitry and the effect that any possible overloading of circuits might have on overcurrent protection and power supply wiring. Appropriate consideration of equipment nameplate ratings should be used when addressing this concern.
Reliable Ground
A reliable ground must be maintained at all times. To ensure this, the rack itself should be grounded. Particular attention should be given to power supply connections other than the direct connections to the branch circuit (for example, the use of power strips, and so on).
Install the System into a Rack
This section provides information on installing the C1104G-RP5 into a rack. If the system has already been mounted into a rack, refer to “Supply Power to the System” on page 19. There are a variety of rack units on the market, which may mean the assembly procedure will differ slightly. You should also refer to the installation instructions that came with the rack unit you are using. Note that this system’s rail kit is designed to fit a rack between 26-in and 33.5-in deep.
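Before ordering or moving a rack, it can be useful to confirm that its depth falls inside the range the rail kit supports. The Python sketch below checks a depth against the 26-in to 33.5-in range stated above and converts it to millimetres; the function names are hypothetical, and how your rack vendor measures depth is outside what this guide specifies.

```python
# Rail kit range taken from the paragraph above.
MIN_RACK_DEPTH_IN = 26.0
MAX_RACK_DEPTH_IN = 33.5
MM_PER_INCH = 25.4


def inches_to_mm(inches):
    """Convert inches to millimetres."""
    return inches * MM_PER_INCH


def rail_kit_fits(rack_depth_in):
    """True if the rack depth is within the rail kit's stated range."""
    return MIN_RACK_DEPTH_IN <= rack_depth_in <= MAX_RACK_DEPTH_IN


if __name__ == "__main__":
    for depth in (24.0, 26.0, 30.0, 33.5, 36.0):
        verdict = "fits" if rail_kit_fits(depth) else "outside range"
        print(f"{depth} in ({inches_to_mm(depth):.1f} mm): {verdict}")
```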
Separate the Sections of the Rack Rails
The chassis package includes two rail assemblies in the rack mounting kit.
Each assembly consists of two sections: an inner fixed chassis rail that secures directly to the server chassis and an outer fixed rack rail that secures directly to the rack itself. Note that the inner rail may be pre-installed on the server by the SGI factory making separation steps unnecessary.
To separate the inner and outer rails, perform the following steps:
1. Locate the rail assembly in the chassis packaging as shown in Figure 2-2.
2. Extend the rail assembly by pulling it outward.
3. Press the quick-release tab.
4. Separate the inner rail from the outer rail assembly.
Figure 2-2 Separating the System Rack Rail Components
Inner Rail Extensions
The chassis includes a set of inner rack rails in two sections:
inner rails
inner rail extensions
The inner rails are preattached to the server chassis and do not interfere with normal use of the system if you decide not to install it into a server rack. Attaching the inner rail extensions to the inner rails stabilizes the chassis within the rack; the procedure is described in the following subsection.
Installing the Inner Rail Extensions
1. Place the inner rail extensions over the preattached inner rails on the side of the chassis. Align the hooks of the inner rail with the rail extension holes. Make sure the extension faces “outward” just like the inner rail.
2. Slide the extension toward the front of the chassis.
3. Secure the chassis with screws as illustrated in Figure 2-3 on page 15.
4. Repeat steps 1-3 for the other inner rail extension.
Figure 2-3 Rail to Server Chassis Attachment Example
Assembling the Outer Rails
Each outer rail is in two sections that must be assembled before mounting on to the rack.
1. Identify the left and right outer rails by examining the ends, which bend outward.
2. Slide the front section of the outer rail into the rear section of the outer rail.
3. The assembly should look similar to the example in Figure 2-4 on page 16.
Figure 2-4 Outer Rail Assembly Example
Attaching the Outer Rack Rails
Outer rails attach to the rack and hold the chassis in place. They extend between 26.5 and 36.4 inches.
1. Measure the depth of the rack (distance from the front rail to the rear rail) to ensure it complies with the limitations listed.
2. Adjust the outer rails to the proper length to fit within the rack. See the placement example in Figure 2-5 on page 17.
3. Hang the hooks of the front of the outer rail onto the slots on the front of the rack. Use screws to secure the outer rails to the rack.
4. Pull out and adjust both the short and long brackets to the proper distance so that the rail fits snugly into the rack; see Figure 2-6 on page 18.
5. Hang the hooks of the rear portion of the outer rail into the slots on the rear of the rack. Secure the long bracket to the rear side of the outer rail with the screws provided.
6. Repeat the previous steps to properly install the left outer rail.
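Step 1 of this procedure asks you to confirm that the rack depth is within the rail limits. As a rough sketch, the check can be expressed as below. It uses the 26.5 to 36.4 inch extension range quoted in this subsection; the rail kit note earlier in the chapter quotes 26 to 33.5 inches, so confirm the figure that applies to your kit. The constant and function names are hypothetical.

```python
# Outer rail extension range from this subsection, in inches.
OUTER_RAIL_MIN_IN = 26.5
OUTER_RAIL_MAX_IN = 36.4


def outer_rail_fits(rack_depth_in: float) -> bool:
    """Return True if the front-to-rear rail distance is within range."""
    return OUTER_RAIL_MIN_IN <= rack_depth_in <= OUTER_RAIL_MAX_IN


for depth in (25.0, 29.0, 37.0):
    status = "fits" if outer_rail_fits(depth) else "out of range"
    print(f"{depth} in: {status}")
```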
Figure 2-5 Outer Rack Rail Assembly/Placement Example
Using the Rail Locking Tabs
Both chassis rails have a locking tab, which serves two functions:
The tabs lock the server into place when it is installed and pushed fully into the rack (its normal operating position).
The tabs also lock the server in place when fully extended from the rack. This prevents the server from coming completely out of the rack when pulled out for servicing. Depress both tabs at the same time to fully remove the server from its rail mounting and extract it from the rack.
Install the Server in a Rack
Warning: The SGI Rackable C1104G-RP5 server weighs up to 37 lbs (16.8 kg). Always use proper lifting techniques when you move the server. Always get the assistance of another qualified person when you install the server in a location above your shoulders. Failure to do so may result in serious personal injury or damage to the equipment.
1. Extend the outer rails on either side of the rack rail assembly.
2. Align the inner rails of the chassis with the outer rails on the rack, see Figure 2-6.
Figure 2-6 Installing the Server in the Rack
3. Slide the inner rails into the outer rails, keeping the pressure even on both sides. When the chassis has been pushed completely into the rack, it should click into the locked position.
4. Optional screws are recommended to secure and hold the front of the chassis to the rack.
Supply Power to the System
Connect the power cords from the power supply modules into a power strip or power distribution unit (PDU) within the rack. An optionally available uninterruptible power supply (UPS) can ensure continued operation in case of a failure of the regular power source.
After all power connections are verified, push the power-on button on the front of the server when you wish to power on the unit.
Chapter 3
3. System Interface
Overview
There are a number of LEDs on the front control panel as well as others on the drive carriers and power supplies to keep you constantly informed of the overall status of the system. See Figure 3-1 for an example of the front control panel. These LEDs provide constant information on the system and on the overall health of system components.
Figure 3-1 System Front Control Panel Indicator Components
Control Panel Buttons
In addition to monitoring the activity and health of specific components using LEDs, the system uses two buttons located on the front of the chassis: a reset button and a power on/off button. Use the reset button to reboot the system as shown in Figure 3-2.
Figure 3-2 System Reset Button
Figure 3-3 shows the main power button, which is used to apply or turn off the main system power.
Turning off system power with this button removes the main power but keeps standby power supplied to the system.
Figure 3-3 System Power On Button
Control Panel LEDs
The control panel located on the front of the chassis has several LEDs. These LEDs provide you with critical information related to different parts of the system. This section explains what each LED indicates when illuminated or flashing and any corrective action you may need to take.
Power Fail LED
The power fail LED indicates a power supply module has failed as shown in Figure 3-4. The second power supply module will take the load and keep the system running but the failed module will need to be replaced. Refer to Chapter 6 for details on replacing the power supply. This LED should be off when the system is operating normally.
Figure 3-4 Power Fail LED
Overheat/Fan Fail/UID LED
When the red overheat/fan/UID LED flashes (shown in Figure 3-5), it indicates a fan failure. When on continuously it indicates an overheat condition, which may be caused by cables obstructing the airflow in the system or the ambient room temperature being too warm.
Check the routing of the cables and make sure all fans are present and operating normally. You should also check to make sure that the chassis covers and fan shrouds are installed properly. This LED will remain flashing or on as long as the indicated condition exists.
The “blue light” function (UID) of this LED is used to identify a specific server in large racks filled with equipment. When activated through the system software the “blue light” will remain on until shut down by the administrator.
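For monitoring scripts or operator checklists, the red LED behavior described above reduces to a simple mapping. The sketch below is illustrative only: the `LedState` enum and the function name are hypothetical, and the "off" result assumes that no fan or overheat condition is being indicated.

```python
from enum import Enum


class LedState(Enum):
    OFF = "off"
    FLASHING = "flashing"
    SOLID = "solid"


def decode_overheat_fan_led(red: LedState) -> str:
    """Map the red overheat/fan LED state to the condition it indicates."""
    if red is LedState.FLASHING:
        return "fan failure"
    if red is LedState.SOLID:
        return "overheat"
    return "no fan or overheat condition indicated"


print(decode_overheat_fan_led(LedState.FLASHING))
```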
Figure 3-5 Overheat/Fan Fail/UID LED
NIC1
When flashing, the NIC1 LED indicates network activity on the LAN1 port (see Figure 3-6).
Figure 3-6 LAN1 Network Activity NIC1 LED
NIC2
When flashing, the NIC2 LED indicates network activity on the LAN2 port (see Figure 3-7).
Figure 3-7 LAN2 Network Activity NIC2 LED
HDD
The HDD LED indicates hard drive activity when flashing (see Figure 3-8).
Figure 3-8 Hard Drive Activity LED
Power
The power LED indicates power is being supplied to the system's power supply unit(s). An example LED is shown in Figure 3-9. This LED should normally be illuminated when the system is operating.
Figure 3-9 Power On LED
Drive Carrier LEDs
The system hard disk drives each have two LEDs that function as listed in the following two paragraphs:
Green: When illuminated, the green LED on the drive carrier indicates drive activity. A connection to the drive backplane enables this LED to blink on and off when that particular drive is being accessed. Please refer to Chapter 6 for instructions on replacing failed drives.
Red: When this LED is flashing it indicates that a RAID drive is rebuilding. A solidly lit red LED indicates a drive failure. If a drive fails, you should be notified by your system management software. Refer to Chapter 6 for instructions on replacing failed drives.
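The two drive carrier LEDs can be read together, with the red LED taking priority because it signals a fault or rebuild. The following is a minimal sketch under that assumption; the state strings and the function name are hypothetical and not part of any SGI software.

```python
def decode_drive_leds(green: str, red: str) -> str:
    """Interpret drive carrier LEDs.

    Each argument is one of "off", "flashing", or "solid" (hypothetical
    labels for this sketch). The red LED is checked first.
    """
    if red == "solid":
        return "drive failure"
    if red == "flashing":
        return "RAID rebuild in progress"
    if green in ("flashing", "solid"):
        return "drive activity"
    return "idle"


print(decode_drive_leds(green="flashing", red="off"))
```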
Chapter 4
4. System Safety
This chapter describes basic safety precautions when using the server.
Electrical Safety Precautions
Basic electrical safety precautions should be followed to protect yourself from harm and the Rackable C1104G-RP5 system from damage, as follows:
Be aware of the locations of the power on/off switch on the chassis as well as the room's emergency power-off switch, disconnection switch or electrical outlet. If an electrical accident occurs, you can then quickly remove power from the system.
Do not work alone when working with high voltage components.
Power should always be disconnected from the system when removing or installing main system components, such as the memory modules and disk drives. When disconnecting power, you should first power down the operating system and then unplug the power cords. The unit can have more than one power supply cord. Disconnect both power supply cords before servicing to avoid electrical shock.
When working around exposed electrical circuits, another person who is familiar with the power-off controls should be nearby to switch off the power if necessary.
Use only one hand when working with powered-on electrical equipment. This is to avoid making a complete circuit, which will cause electrical shock. Use extreme caution when using metal tools, which can easily damage any electrical components or circuit boards they come into contact with.
Do not use mats designed to decrease static electrical discharge as protection from electrical shock. Instead, use rubber mats that have been specifically designed as electrical insulators.
The power supply power cords must include a grounding plug and must be plugged into grounded electrical outlets or power distribution units (PDUs).
Serverboard Battery
Caution: There is a danger of explosion if an onboard battery is installed upside down, which will reverse its polarities (see Figure 4-1). This battery must be replaced only with the same or an equivalent type recommended by the manufacturer. Check with your service representative if you have any questions.
Figure 4-1 Installing the Onboard Battery
Important: Handle used batteries carefully and do not damage the battery in any way; a damaged battery may release hazardous materials into the environment. Do not discard a used battery in the garbage or a public landfill. Dispose of used batteries according to the manufacturer's instructions and in compliance with the regulations set up by your local hazardous waste management agency.
ESD Precautions
Caution: This server contains electronic components and printed circuit boards which are susceptible to electrostatic discharge (ESD) damage. ESD is generated by two objects with different electrical charges coming into contact with each other. An electrical discharge is created to neutralize this difference, which can damage electronic components and printed circuit boards.
The following measures are generally sufficient to neutralize this difference before contact is made to protect your equipment from ESD:
Use a grounded wrist strap designed to prevent static discharge.
Keep all components and printed circuit boards (PCBs) in their antistatic bags until ready for use.
Touch a grounded metal object before removing the board from the antistatic bag.
Do not let components or PCBs come into contact with your clothing, which may retain a charge even if you are wearing a wrist strap.
Handle a board by its edges only; do not touch its components, peripheral chips, memory modules or contacts.
When handling chips or modules, avoid touching their pins.
Put the serverboard and peripherals back into their antistatic bags when not in use.
For grounding purposes, make sure your computer chassis provides excellent conductivity between the power supply, the case, the mounting fasteners and the serverboard.
Mainboard Replaceable Soldered-in Fuses
Important: If your system comes with self-resetting PTC (Positive Temperature Coefficient) fuses on the serverboard, they must be replaced by trained service technicians only. The new fuse must be the same or equivalent as the one replaced. Contact your technical support organization for details and support.
General Safety Precautions
Follow these rules to ensure general safety:
Keep the area around the Rackable C1104G-RP5 system clean and free of clutter.
The Rackable C1104G-RP5 system weighs approximately 37 lbs (16.8 kg) when fully loaded. When lifting the system, two people at either end should lift slowly with their feet spread out to distribute the weight. Always keep your back straight and lift with your legs.
Place the chassis top cover and any system components that have been removed away from the system or on a table so that they won't accidentally be stepped on.
While working on the system, do not wear loose clothing such as neckties and unbuttoned shirt sleeves, which can come into contact with electrical circuits or be pulled into a cooling fan.
Remove any jewelry or metal objects from your body, which are excellent metal conductors that can create short circuits and harm you if they come into contact with printed circuit boards or areas where power is present.
After accessing the inside of the system, close the system back up and secure it to the rack unit with the retention screws after ensuring that all connections have been made.
Chapter 5
5. System and Serverboard Information
This chapter includes best practice procedures to work with a node board in the C1104G-RP5 chassis and understand the system PCIe expansion cards and hard disk drives. Use the information in Chapter 6 to troubleshoot your server and add, remove, or replace system components.
A layout and quick reference chart is included in this chapter for your reference.
Some software products are protected with software license keys derived from the Media Access Control (MAC) Ethernet address. If your system requires the replacement of a node board, the MAC Ethernet address changes. If you are using such a product, you or your service representative must request a new license key after replacement of a node board. Contact your local customer support office:
http://www.sgi.com/support/supportcenters.html
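Because MAC-derived license keys stop matching after a node board swap, it can help to record the MAC address before service and compare it afterward. The sketch below uses Python's standard `uuid.getnode()` to read a hardware address; note that `getnode()` may return a random value if no MAC address can be found, and on a multi-port system it may not return the port your license is tied to. The helper names are hypothetical.

```python
import uuid


def format_mac(node: int) -> str:
    """Format a 48-bit integer as a colon-separated MAC address."""
    return ":".join(f"{(node >> shift) & 0xFF:02x}" for shift in range(40, -1, -8))


def needs_new_license_key(mac_before: str, mac_after: str) -> bool:
    """Return True if the MAC address changed, e.g. after a board swap."""
    return mac_before.lower() != mac_after.lower()


if __name__ == "__main__":
    # Record this value before the node board is replaced.
    print(format_mac(uuid.getnode()))
```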
Caution: Install the chassis cover after you have completed accessing the components inside the server to maintain proper airflow and cooling for the system.
Handling Circuit Boards and Drives
Caution: Electrostatic discharge (ESD) can damage electrostatic-sensitive devices inside the C1104G-RP5 server. Use the ESD precautions described below when you handle printed circuit boards or other components in the system. The following measures are generally sufficient to protect your equipment from electro-static discharge.
ESD Precautions
Use a grounded wrist strap designed to prevent electrostatic discharge.
Touch a grounded metal object before removing any board from its antistatic bag.
Handle each printed circuit board (PCB) by the edges; do not touch the components, peripheral chips, memory modules, or gold contacts on the PCB.
When handling chips or modules, avoid touching the pins.
Store PCIe cards, or other boards and components in antistatic bags when not in use.
Make sure your computer chassis provides a conductive path between the power supply, the case, the mounting fasteners, and the node board to chassis ground.
Unpacking
Caution: System options are shipped in antistatic packaging to avoid electrostatic discharge damage. Be sure to use ESD precautions when you unpack upgrade or replacement components for the C1104G-RP5 server. Failure to do so can result in damage to the equipment.
System Rear I/O Ports
The rear external system I/O ports are color coded in conformance with the PC 99 specification. See Figure 5-1 below for the colors and locations of the various I/O ports. Table 5-1 identifies the functions of each of the I/O ports on the backpanel.
Figure 5-1 I/O Port Locations
Table 5-1 System Backpanel I/O Port Functions
Serverboard Details
The 1U C1104G-RP5 system chassis has one node board. The C1104G-RP5 serverboard is configured with two processors. When configured with two processors, the following rules apply:
Both processor sockets must have identical revisions, core voltage, and bus/core speed.
The stepping between the processors on the board must be identical.
See Figure 5-2 on page 36 for CPU locations on the serverboard - note that the drawing is not to scale.
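The processor matching rules above amount to a field-by-field comparison of the two CPUs. As an illustration, the sketch below compares two hypothetical `CpuInfo` records and reports which attributes differ. The record layout and example values are assumptions made for this sketch; they are not read from any real system interface.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class CpuInfo:
    revision: str
    core_voltage: float
    bus_core_speed_mhz: int
    stepping: str


def mismatched_fields(cpu1: CpuInfo, cpu2: CpuInfo) -> List[str]:
    """Return the names of attributes that differ between the two CPUs."""
    fields = ("revision", "core_voltage", "bus_core_speed_mhz", "stepping")
    return [name for name in fields if getattr(cpu1, name) != getattr(cpu2, name)]


# Example values are made up for illustration.
a = CpuInfo("rev-A", 1.0, 2600, "C2")
b = CpuInfo("rev-A", 1.0, 2600, "C1")
print(mismatched_fields(a, b))
```

An empty list means the pair satisfies the rules as encoded here.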
1. USB port 0
2. USB port 1
3. Dedicated IPMI LAN port
4. LAN port 1
5. LAN port 2
6. VGA port
7. UID switch
Note that the dedicated IPMI LAN port runs at 100 Mb/sec.
CPUs
The C602 chipset is used on the system serverboard.
Memory
Eight DIMM slots supporting 1600/1333/1066/800 MHz registered ECC SDRAM
Note: Check with your authorized sales/service representative for installation of approved DIMM types.
GPUs
A total of three GPUs are supported (true PCI-E 3.0 x16 signal) - GPU types are limited, check with your sales or support representative
PCIe Expansion Slots
An external PCI-Express (PCIe) slot with the following features: – One PCIe Gen 3.0 x8 low-profile card (in x16 slot)
System Health Monitoring
Onboard voltage monitors
Fan status monitor with firmware/software on/off and speed control
Watch Dog
Environmental temperature monitoring via BIOS
Power-up mode control for recovery from AC power loss
System resource alert (via included utility program)
Auto-switching voltage regulator for the CPU core
CPU thermal trip support
I2C temperature sensing logic
Chassis intrusion detection
ACPI Features
Slow blinking LED for suspend state indicator
BIOS support for USB keyboard
Wake-On-LAN (WOL)
Internal/external modem ring-on
Hardware BIOS Virus protection
Onboard I/O
Four disk drive bays supported by an on-chip SATA controller (RAID 0, 1, 5, 6 and 10 are supported in this system)
Two (2) USB (Universal Serial Bus 2.0) ports (rear [external] type A)
Two (2) LAN ports supported by an onboard Intel® Ethernet controller for 10/100/1000Base-T
One (1) dedicated (RJ-45) IPMI LAN port
One (1) VGA port supported by an onboard Matrox® G200 graphics controller (with 16 MB DDR2 memory)
Serverboard Dimensions
Proprietary board format is: 19.7" x 9.2" (500.4 mm x 233.7 mm).
Figure 5-2 Node Board Features
[Figure callouts: X9DRG-HF board (Rev. 1.01) showing the CPU1 and CPU2 sockets, DIMM slots P1-DIMMA through P1-DIMMD and P2-DIMME through P2-DIMMH, the PCI-E slots, I-SATA and S-SATA ports, fan headers, power connectors, and jumpers.]
Hard Disk Drives (C1104G-RP5 Chassis)
The 1U chassis supports a maximum of four 2.5-inch hard disk drives; see Figure 5-3. Install the drives from left to right starting in the lower-left bay. Each disk drive bay must be populated with either a drive or a “drive blank” to maintain system thermals. Failure to follow this guideline may cause system overheating and thermal shutdown of the unit.
Important: The operating system you use must have RAID support to enable the hot-swap capability and RAID functions of the SATA drives.
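The bay population rule (every bay holds either a drive or a drive blank) is easy to check before powering on. The following is a small sketch of such a check. The bay labels "drive" and "blank" and the function name are hypothetical, and bay indices simply count from 0 in list order.

```python
from typing import List, Optional

BAY_COUNT = 4  # the chassis has four 2.5-inch drive bays


def open_bays(bays: List[Optional[str]]) -> List[int]:
    """Return indices of bays that hold neither a drive nor a drive blank."""
    if len(bays) != BAY_COUNT:
        raise ValueError(f"expected {BAY_COUNT} bays, got {len(bays)}")
    return [i for i, content in enumerate(bays) if content not in ("drive", "blank")]


print(open_bays(["drive", "blank", None, "drive"]))
```

Any index returned marks a bay that would disturb airflow if the system were operated as is.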
Figure 5-3 C1104G-RP5 System Disk Drive Locations
Drive Configurations
The disk drive configurations supported in the Rackable C1104G-RP5 server are outlined in the paragraphs that follow. Note that some configurations are dependent on use of optional hardware to support RAID configurations.
The supported disk drive configurations are as follows:
JBOD
This non-RAID disk array supports any number of drives between one and four. The operating system is placed on the disk drive in location 0 (system disk). All other drives are data drives.
RAID 0
Disk striping without parity, supports any number of drives between two and four. Note that all drives must be the same type, speed and capacity. The operating system will be striped across all drives in the system. This configuration is not recommended.
RAID 1
Disk “mirroring” supports exactly two drives. The two drives represent one RAID 1 logical drive. The operating system will be installed on the drives located in drive positions 0 and 1. Note that both drives 0 and 1 must be of the same type, speed and capacity.
RAID 5
Disk striping with distributed parity; each drive in the array holds a share of the parity data. A minimum of three drives is required for a functional RAID 5 array. A single drive failure will result in decreased performance until the damaged drive is replaced and the parity rebuild is complete. RAID 5 is generally favored in a “read-heavy” data environment as write operations will execute at a slower rate. Note that all drives must be the same type, speed and capacity.
RAID 6
RAID 6 enhances RAID 5 style disk striping parity distribution by creating enough parity data to handle two disk failures. You can lose a disk and have an unrecoverable error (URE) during reconstruction and still reconstruct your system data. Calculations for the RAID 6 parity stripes are more complicated than those for RAID 5, virtually doubling the workload for the processor on the RAID controller and exacting a performance penalty on write operations. Up to 40% of data space on each disk must be devoted to RAID 6 parity information. Optional hardware is required to support RAID 6 functionality. Note that all drives must be the same type, speed and capacity.
RAID 10
Mirrored disk striping: the data is striped across one set of drives and then mirrored on another set of drives. A minimum of four drives of the same type is required. The total number of drives must be an even number (4 in this case). A total of four drives is a 2+2 configuration. The operating system will be striped across the drives in the primary set and then mirrored on the secondary set of drives. Note that all drives must be the same type, speed and capacity.
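To compare the configurations above when planning storage, the drive-count limits and a rough usable capacity can be tabulated. The sketch below uses the drive counts stated in this section and general textbook capacity formulas, which may differ from what a particular controller reports. The minimum of four drives for RAID 6 is an assumption, since the guide does not state one, and all names are hypothetical.

```python
BAYS = 4  # maximum number of drives in this chassis

# level: (minimum drives, maximum drives, data-drive count as a function of n)
RAID_LEVELS = {
    "JBOD": (1, BAYS, lambda n: n),
    "RAID0": (2, BAYS, lambda n: n),
    "RAID1": (2, 2, lambda n: 1),
    "RAID5": (3, BAYS, lambda n: n - 1),
    "RAID6": (4, BAYS, lambda n: n - 2),  # minimum of 4 is an assumption
    "RAID10": (4, BAYS, lambda n: n // 2),
}


def usable_capacity_gb(level: str, drives: int, drive_gb: float) -> float:
    """Estimate usable capacity for identical drives of size drive_gb."""
    low, high, data_drives = RAID_LEVELS[level]
    if not low <= drives <= high:
        raise ValueError(f"{level} needs {low} to {high} drives, got {drives}")
    return data_drives(drives) * drive_gb


for level in ("RAID0", "RAID5", "RAID10"):
    print(level, usable_capacity_gb(level, 4, 500))
```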
PCIe Expansion Cards
There are three internal double-width (GPU) PCIe 3.0 x16 expansion slots and two external PCIe slots available with the C1104G-RP5 server. The external option slot functions as listed:
External PCI-Express 3.0 x8 low-profile card
Note: Only specific GPU cards will fit and function in the internal PCIe GPU slots, contact your SGI sales or service representative for information on approved GPU cards.
Power Supply Functional Rating
The C1104G-RP5 server default configuration is two rear-installed 1800-Watt power supplies. The second power supply acts as a redundant power unit for the server. The supplies are “auto-ranging” and can operate from either 100-140V or 180-240V levels at 50 or 60Hz.
Each power supply module has its own cooling fan.
The supplies used have an 80 Plus Platinum Certification rating.
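When planning a site, the auto-ranging input limits stated above can be checked against the facility's power feed. The following is a minimal sketch that encodes only the voltage and frequency ranges given in this section; the function name is hypothetical.

```python
def input_power_supported(volts: float, hertz: float) -> bool:
    """Return True if the AC input falls within the stated supply ranges."""
    in_low_range = 100 <= volts <= 140
    in_high_range = 180 <= volts <= 240
    return (in_low_range or in_high_range) and hertz in (50, 60)


for volts, hertz in ((120, 60), (160, 60), (230, 50)):
    print(volts, hertz, input_power_supported(volts, hertz))
```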
Chapter 6
6. Basic Troubleshooting and Chassis Service
Use the procedures in the first half of this chapter to troubleshoot your system. If you have followed all of the procedures below and still need assistance, check with your authorized support organization.
The subsections in the second half of this chapter, starting with “Chassis Service Information” on page 43, are intended to guide you through basic component remove and replace procedures.
Basic Troubleshooting Procedures
Use the information in the following subsections to remedy basic problems you might encounter when working with the Rackable C1104G-RP5 server.
If the System Does Not Power Up
If the system will not power up when the front power button is pushed, use the following checklist to identify common sources for the problem:
Make sure that both ends of each system power cable are firmly connected to the power supply and the corresponding power source(s) or power distribution unit (PDU).
Check to see if the power fail LED is lit on the front of the unit. This LED should be off if the system is operating normally.
Check that the LED on each power supply is properly lit. The power supply has one status LED located on the left side of the front of the power supply. The LED has three states:
– Dark or off - indicates no AC power present
– Yellow - AC power is present, the server is not turned on (no DC power)
– Green - AC power is present and the server is turned on (DC power present)
Open the system cover, remove the air shroud and check to make sure that no obvious short circuits exist between the serverboard and chassis.
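The three power supply LED states in the checklist above map directly to AC and DC power status. The sketch below expresses that mapping as a lookup table. The color strings and the function name are hypothetical labels for illustration.

```python
# LED color -> (AC power present, DC power present), per the checklist above
PSU_LED_STATES = {
    "off": (False, False),
    "yellow": (True, False),
    "green": (True, True),
}


def describe_psu_led(color: str) -> str:
    """Describe the power condition indicated by a power supply LED."""
    ac_present, dc_present = PSU_LED_STATES[color.lower()]
    if not ac_present:
        return "no AC power present"
    if not dc_present:
        return "AC power present, server not turned on"
    return "AC power present, server turned on"


print(describe_psu_led("yellow"))
```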
System Powers Up But Will Not Boot
If the system powers up but will not boot the Operating System, check the following:
Check the system order document(s) - the C1104G-RP5 server may have been ordered with no operating system. If so, check with your system administrator for OS loading information.
Check the system disk (drive 0) for drive activity and confirm that it is firmly seated in the disk bay. A red light on the front of the disk indicates a functional error. Check with your service provider or local system administrator.
No Video After System Power Up
If the system powers up and appears to be booting normally but no video is present, try the following basic solutions:
Confirm your monitor is plugged in and switched on.
Check all video cables and ensure they are properly connected.
Listen for a BIOS “beep code” error message - one long beep plus 8 short beeps indicates a video error. This beep code message could indicate a video memory error or other video malfunction; contact your service provider.
If using an optional PCIe video card, check the back of the card for LED activity or a fault indicator. Try opening the system, reseating the PCIe card and rebooting; see the section “Install/Replace a PCIe Expansion Card” on page 57.
If you cannot get a video signal after trying basic solutions, contact your support provider.
Memory Errors
If your system experiences memory related errors, try these basic troubleshooting steps to resolve or better identify the problem:
Confirm that the power supply LED is not indicating an error.
Listen for memory error beep codes - five short beeps followed by one long beep is a BIOS signal that no system memory has been detected - See Appendix A, “BIOS Error Codes”.
Shut the system down, remove the covers over the serverboard and make sure that all the DIMM modules are properly and fully installed.
You should be using registered ECC DDR3 memory. Also, it is recommended that you use the same memory type and speed for all DIMMs in the system.
Contact your administrator or support provider if the memory errors continue.
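The two BIOS beep codes mentioned in this chapter (video error and no memory detected) can be kept in a small lookup for service notes. The sketch below encodes only those two patterns; any other sequence is reported as unknown and should be looked up in Appendix A. The tuple encoding and the function name are hypothetical.

```python
# Beep sequences described in this chapter, in the order they are heard.
BEEP_CODES = {
    ("long",) + ("short",) * 8: "video error",
    ("short",) * 5 + ("long",): "no system memory detected",
}


def decode_beeps(beeps) -> str:
    """Look up a sequence of 'long'/'short' beeps."""
    return BEEP_CODES.get(tuple(beeps), "unknown pattern; see Appendix A")


print(decode_beeps(["short"] * 5 + ["long"]))
```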
Chassis Service Information
The following sections cover the steps required to install components and perform maintenance on the C1104G-RP5 chassis. For component installation, follow the steps in the order given to eliminate the most common problems encountered. If some steps are unnecessary, skip ahead to the step that follows.
Important: Always disconnect the AC power cord(s) before adding, changing or installing any internal hardware components.
Tools Required: The only tool you will need to install components and perform maintenance is a Phillips screwdriver.
Static-Sensitive Devices
Electrostatic discharge (ESD) can damage electronic components. To prevent damage to any printed circuit boards (PCBs), it is important to handle them very carefully. The following measures are generally sufficient to protect your equipment from ESD damage.
Precautions
Use a grounded wrist strap designed to prevent static discharge.
Touch a grounded metal object before removing any board from its antistatic bag.
Handle a board by its edges only; do not touch its components, peripheral chips, memory modules or gold contacts.
When handling chips or modules, avoid touching their pins.
Put the serverboard, add-on cards and peripherals back into their antistatic bags when not in use.
For grounding purposes, make sure your computer chassis provides excellent conductivity between the power supply, the case, the mounting fasteners and the serverboard.
Unpacking
Replacement components are usually shipped in antistatic packaging to avoid static damage. When unpacking an upgrade or replacement component, make sure the person handling it is static protected.
Control Panel
The control panel (located on the front of the chassis) must be connected to the JF1 connector on the serverboard to provide you with system status indications. A ribbon cable has bundled these wires together to simplify the connection. Connect the cable from JF1 on the serverboard to the Control Panel PCB (printed circuit board). Make sure the red wire plugs into pin 1 on both connectors. Pull all excess cabling out of the airflow path. The LEDs inform you of system status. See Chapter 3 for details on the LEDs and the control panel buttons.
Drive Bay Installation/Removal
This section describes hard drive installation and removal.
Accessing the Drive Bays
Drives: You do not need to access the inside of the chassis or remove power to replace or swap a RAIDed hard disk drive. Data may be lost or corrupted if you “hot swap” a JBOD disk drive. Shut down system power before removing or replacing a JBOD disk. Removing either a RAID or JBOD drive without replacing it may cause system errors. Proceed to the next section for further hard drive instructions.
Note: You must use approved 2.5" disk drives in the system.
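The hot-swap guidance above depends on whether the drive belongs to a RAID set or is a JBOD disk. As a rough sketch for a service checklist, the rule can be written as follows. The mode strings and the function name are hypothetical, and the check does not cover the separate requirement that the operating system provide RAID support.

```python
def must_power_down_before_removal(drive_mode: str) -> bool:
    """Return True if system power should be shut down before pulling the drive."""
    mode = drive_mode.strip().upper()
    if mode == "JBOD":
        return True   # hot swapping a JBOD disk may lose or corrupt data
    if mode.startswith("RAID"):
        return False  # RAID drives can be swapped with power applied
    raise ValueError(f"unrecognized drive mode: {drive_mode!r}")


for mode in ("JBOD", "RAID 1"):
    print(mode, must_power_down_before_removal(mode))
```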
Removing Hard Drives or Carriers from the Chassis
1. Press the release button on the drive carrier. This extends the drive carrier handle.
2. Use the handle to pull the drive carrier out of the chassis.
Important: Empty carriers without drives must stay in the chassis during operation for proper airflow/cooling purposes except during remove/replace operations. Do not operate the server with carriers removed.
The Hard Drive Backplane
The hard drives plug into a backplane that provides power, drive ID and bus termination. A RAID controller and/or optional RAID software can be used with the backplane to provide data security. The operating system you use must have RAID support to enable the hot-swap capability of the hard drives. The backplane is preconfigured, so no jumper/switch configuration is required.
Caution: Be careful when working around the drive backplane. Do not touch the backplane with your fingers or any metal objects and make sure no ribbon cables touch the backplane or obstruct the holes, which aid in proper airflow.
Disk Drive Installation
The drives are mounted in drive carriers to simplify their installation and removal from the chassis; see Figure 6-1 on page 46 for a disk drive removal example. These carriers also help promote proper airflow for the drives. For this reason, even empty carriers without hard drives installed must remain in the chassis during operation.
See Figure 6-2 on page 48 for an example drive carrier and the “dummy” drive blank used when a working disk is not installed in a drive slot.
Figure 6-1 Remove Drive and Carrier from Front of Server
Hard Drive Carrier Assembly Usage
1. Remove the four screws securing the dummy/bad drive to the hard drive carrier.
2. Insert a new/replacement hard drive into the carrier with the PCB side facing down and the connector end toward the rear of the carrier.
3. Align the hard drive in the disk drive carrier so that the mounting holes of the carrier are aligned with the mounting holes of the drive. Note that there are holes in the carrier which are marked “SATA” to aid in correct installation.
4. Secure the drive to the carrier with four screws. Use the M3 flat-head screws included in the HDD bag of your accessory box. Note: the screws used to secure a dummy drive to the carrier should not be used to secure the hard drive.
5. Insert the hard drive carrier assembly into its bay vertically, keeping the carrier oriented so that the release button is on the bottom. When the carrier reaches the rear of the drive bay, the handle will retract.
6. Using your thumb, push against the upper part of the hard drive handle until the assembly clicks into the locked (fully seated) position; see Figure 6-3 on page 49 for an example.
Note: Your operating system must have RAID support to enable the hot-plug capability of the drives.
Caution: Regardless of how many hard drives are installed, all drive carriers must remain in the drive bays to maintain proper airflow and system cooling.
Figure 6-2 Drive Carrier Attachment to Dummy Drive Blank
Figure 6-3 Hard Disk Drive Installation Example
Power Supply
The system offers a redundant power supply assembly consisting of two 1800-watt power modules. Each power supply module has an auto-switching capability, which enables it to automatically sense and operate at a 100V - 240V input voltage at 50 or 60Hz.
Power Supply Failure
If either of the two power supply modules fails, the other module will take the full load and allow the system to continue operation without interruption. The PWR Fail LED will illuminate and remain on until the failed unit has been replaced. The power supply units have a hot-swap capability, meaning you can replace the failed unit without powering down the system; see Figure 6-4 on page 51 for an example.
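The redundancy described above can be illustrated with a short calculation. The following Python sketch is illustrative only and is not part of any SGI tool: the function name and the load figures are hypothetical, and the only value taken from this guide is the 1800-watt rating of each module. It checks whether the modules that remain after one failure could still carry a given load.

```python
MODULE_WATTS = 1800  # rated output of one power module, as stated in this guide


def survives_single_failure(load_watts, modules=2, module_watts=MODULE_WATTS):
    """Return True if the load still fits after one module fails.

    Hypothetical helper: it compares the load against the combined rating
    of the remaining modules and ignores derating or efficiency losses.
    """
    if modules < 1:
        raise ValueError("at least one power module is required")
    if load_watts < 0:
        raise ValueError("load cannot be negative")
    remaining_capacity = (modules - 1) * module_watts
    return load_watts <= remaining_capacity


if __name__ == "__main__":
    # Example loads are made-up values for illustration.
    for load in (1500, 1800, 2000):
        print(load, survives_single_failure(load))
```

With two modules, any load up to the rating of a single module passes the check, which matches the statement that the surviving module takes the full load.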
Removing/Replacing a Power Supply
You do not need to shut down the system to replace a failed power supply unit. The backup power supply module will keep the system up and running while you replace the failed unit. Replace with the same model.
Removing the Power Supply
1. First unplug the AC power cord from the failed power supply module.
2. Depress the locking tab on the power supply module.
3. Pull it straight out using the rounded handle.
Installing a New Power Supply
1. Replace the failed hot-swap unit with another identical power supply unit.
2. Push the new power supply unit into the power bay until you hear a click.
3. Secure the locking tab on the unit.
4. Finish by plugging the AC power cord back into the unit.
Figure 6-4 Power Supply Remove/Replace Example
Accessing the Inside of the Chassis
1. Grasp the two handles on either side and pull the unit straight out until it locks (you will hear a “click”).
2. Next, depress the two buttons on the top of the chassis to release the top cover and at the same time, push the cover away from you until it stops. You can then lift the top cover from the chassis to gain full access to the inside of the server.
Note: Normally you would power down the system before installing or removing internal components - but it may be necessary to leave system power on to determine which fan has failed.
System Fans
Ten 4-cm counter-rotating fans provide the cooling for the system. Each fan unit is actually made up of two fans joined back-to-back, which rotate in opposite directions. This counter-rotating action generates exceptional airflow and works to dampen vibration levels. It is very important that the chassis top cover is properly installed and making a good seal in order for the cooling air to circulate properly through the chassis and cool the components.
System Fan Failure
Fan speed is controlled by system temperature via a BIOS setting. If a fan fails, the remaining fans will ramp up to full speed and the overheat/fan fail LED on the control panel will flash. Replace any failed fan as soon as possible with the same type and model (the system can continue to run with a failed fan).
Your system administrator may be able to identify which fan has failed using the system BIOS.
If an administrator or service representative is not using the BIOS to determine which fan has failed, you can remove the top chassis cover while the system is still running to determine which of the fans has failed. After determining which is the failed fan, remove power from the system by unplugging the server’s cord. Never run the server for an extended period of time with the top cover open.
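If fan speed readings are available to an administrator (for example, copied from a BIOS hardware monitor screen), the failed fan can be picked out with a simple threshold check. The sketch below is a minimal illustration under stated assumptions: the fan labels, the RPM values, and the 500 RPM threshold are all hypothetical and are not taken from this guide.

```python
from typing import Dict, List


def find_failed_fans(rpm_by_fan: Dict[str, float], min_rpm: float = 500.0) -> List[str]:
    """Return the labels of fans whose reading is below min_rpm.

    min_rpm is an assumed threshold for illustration; use the value that
    your monitoring source documents for this system.
    """
    return sorted(name for name, rpm in rpm_by_fan.items() if rpm < min_rpm)


if __name__ == "__main__":
    # Hypothetical readings; labels such as "FAN1" are placeholders.
    readings = {"FAN1": 9000, "FAN2": 0, "FAN3": 8800}
    print(find_failed_fans(readings))
```

A stopped fan reports a reading at or near zero, so it falls below the threshold while the healthy fans, which ramp up after a failure, stay well above it.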
Replacing System Fans
This section describes how to remove or install a system fan.
Remove/Replace a Fan
1. If you have not already done so, remove the chassis cover to access the fans; see the example in Figure 6-5 on page 54.
2. Turn off the power to the system and unplug the AC power cord.
3. Remove the failed fan's wiring connectors from the serverboard.
4. Remove and retain the four pins securing the fan assembly to the fan tray.
5. Lift the assembly housing the failed fan from the fan tray and out of the chassis; see the example in Figure 6-6 on page 55.
6. Place the new fan into the vacant space in the fan tray, while making sure the arrows on the top of the fan (indicating air direction) point in the same direction as the arrows on the other fans in the same fan tray. See Figure 6-7 on page 56 for a fan assembly example.
7. Reconnect the fan wires to the exact same chassis fan headers as the previous fan.
8. Reconnect the AC power cord, power up the system and check that the fan is working properly before replacing the chassis cover.
Figure 6-5 Cooling Fans Access Example
Figure 6-6 Remove/Replace Fan Assembly Example
Figure 6-7 Individual Fan Remove/Replace Example
Install/Replace a PCIe Expansion Card
Confirm that you have the correct PCIe card for your chassis and that the card includes a standard bracket. The following card types are supported in the server chassis:
One low-profile PCIe 3.0 x8 card
One optional full-height PCIe 3.0 x16 card (optional if only front GPU slots are populated)
Figure 6-8 PCIe Low-profile and Optional Full-height Slot Locations
Install/Replace a Low-profile or Full-height PCIe Card
Use the following steps and illustration to install or replace a PCIe card at the rear of the system:
1. Remove the chassis cover and disconnect both the power cables from the server.
2. Confirm that you have the correct size and type of PCIe expansion card (low-profile or full-height depending on the slot used).
Note: If your system uses three GPU boards then the x16 full-height PCIe slot is not usable.
3. Remove the screw securing the low-profile or full-height PCIe slot cover at the rear of the chassis and slide it sideways to remove from the chassis.
4. Select the appropriate riser connector for your low-profile card. Note that both the low-profile and full-height riser cards use a x16 connector.
5. Align the PCIe card with the rear slot opening and the riser connector, then simultaneously slide the rear bracket into place as you insert the PCIe connector into the riser.
6. Secure the rear bracket in the slot with the screw removed in step 3 and connect cables to the add-on card as necessary. See Figure 6-9 on page 58 for an example.
7. Replace the system cover and plug in the power cords prior to rebooting the server.
Figure 6-9 Low-profile PCIe Card Remove/Replace Example
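The slot rule in this section (the x16 full-height slot is not usable when three GPU boards are installed) can be expressed as a small planning check. This Python sketch is illustrative only: the function name is hypothetical, and it assumes that any GPU count of three or more occupies the full-height position.

```python
from typing import List


def usable_rear_pcie_slots(gpu_boards: int) -> List[str]:
    """Return the rear PCIe slots available for expansion cards.

    Assumption for illustration: with three (or more) GPU boards the
    full-height x16 position is taken by a GPU, as the note in this
    section describes.
    """
    if gpu_boards < 0:
        raise ValueError("GPU board count cannot be negative")
    slots = ["low-profile PCIe 3.0 x8"]
    if gpu_boards < 3:
        slots.append("full-height PCIe 3.0 x16")
    return slots


if __name__ == "__main__":
    for count in (0, 2, 3):
        print(count, usable_rear_pcie_slots(count))
```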
Appendix A
A. BIOS Error Codes
During Power-On Self-Test (POST) routines, which are performed each time the system is powered on, errors may occur.
Non-fatal errors are those which, in most cases, allow the system to continue the boot-up process. The error messages normally appear on the screen.
Fatal errors are those which will not allow the system to continue the boot-up procedure. If a fatal error occurs, you should consult with your system manufacturer for possible repairs.
These fatal errors are usually communicated through a series of audible beeps. The numbers on the fatal error list (see Table A-1) correspond to the number of beeps for the corresponding error.
Table A-1 BIOS Error Codes
Beep Code                    Error Message                                         Description
1 beep                       Refresh                                               Circuits have been reset (Ready to power up)
5 short beeps + 1 long beep  Memory error                                          No memory detected in the system
1 long beep + 8 short beeps  Video display error or video memory read/write error  Video error - adapter missing or with faulty memory
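For scripted diagnostics or service notes, Table A-1 can be captured as a lookup keyed by the number of long and short beeps. The sketch below is illustrative only: the function and variable names are hypothetical, and it assumes that the single "1 beep" entry is a short beep, which the table does not state.

```python
# Keys are (long_beeps, short_beeps); values are (error message, description).
# Entries are transcribed from Table A-1. Treating "1 beep" as one short
# beep is an assumption made for this sketch.
BEEP_CODES = {
    (0, 1): ("Refresh", "Circuits have been reset (Ready to power up)"),
    (1, 5): ("Memory error", "No memory detected in the system"),
    (1, 8): (
        "Video display error or video memory read/write error",
        "Video error - adapter missing or with faulty memory",
    ),
}


def describe_beeps(long_beeps, short_beeps):
    """Return 'message: description' for a beep pattern, or a fallback."""
    entry = BEEP_CODES.get((long_beeps, short_beeps))
    if entry is None:
        return "Unknown beep code"
    message, description = entry
    return f"{message}: {description}"


if __name__ == "__main__":
    print(describe_beeps(1, 5))
    print(describe_beeps(2, 2))
```

Patterns that are not listed in the table return a fallback string rather than a guess, since the guide defines only these three codes.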
Appendix B
B. System Operating and Regulatory Overview
This appendix provides basic environmental operating requirements and regulatory information for the server.
Operating Environment
Operating Temperature: 10° to 35° C (32° to 95° F)
Non-operating Temperature: -40° to 70° C (-40° to 158° F)
Operating Relative Humidity: 8% to 90% (non-condensing)
Non-operating Relative Humidity: 5% to 95% (non-condensing)
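A monitoring script could compare room sensor readings against the operating limits listed above. The following Python sketch is a minimal example under stated assumptions: the names are hypothetical, the limits are the Celsius and relative humidity figures from this appendix, and the non-condensing requirement is not checked because it cannot be derived from these two readings alone.

```python
# Operating limits as listed in this appendix (Celsius and percent RH).
OPERATING_TEMP_C = (10.0, 35.0)
OPERATING_HUMIDITY_PCT = (8.0, 90.0)


def within_operating_range(temp_c, humidity_pct):
    """Return True if both readings fall inside the operating limits.

    Hypothetical helper: limits are treated as inclusive, and the
    non-condensing condition is outside the scope of this check.
    """
    temp_ok = OPERATING_TEMP_C[0] <= temp_c <= OPERATING_TEMP_C[1]
    humidity_ok = OPERATING_HUMIDITY_PCT[0] <= humidity_pct <= OPERATING_HUMIDITY_PCT[1]
    return temp_ok and humidity_ok


if __name__ == "__main__":
    # Sample readings are made-up values for illustration.
    print(within_operating_range(22.0, 45.0))
    print(within_operating_range(40.0, 45.0))
```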
System Input Requirements
AC Input Voltage: 180-240 VAC
Rated Input Current: 1000W: 100-120V/12-10A, 1200W: 120-140V/12-10A, 1800W: 200-240V/10-8.5A
Rated Input Frequency: 50-60 Hz
Power Supply
Rated Output Power: 1800W
Rated Output Voltages: +12V (150A), +5Vsb (4A)
Regulatory Compliance
This product is for installation in a Restricted Access Location only, per clause 1.7.14 of IEC document 60950.
The SGI compliance number for this product is CMN1104-118-18.
Electromagnetic Emissions: FCC Class A, EN 55022 Class A, EN 61000-3-2/-3-3, CISPR 22 Class A
Electromagnetic Immunity: EN 55024/CISPR 24, (EN 61000-4-2, EN 61000-4-3, EN 61000-4-4, EN 61000-4-5, EN 61000-4-6, EN 61000-4-8, EN 61000-4-11)
Safety: CSA/EN/IEC/UL 60950-1 Compliant, UL or CSA Listed (USA and Canada), CE Marking (Europe)
California Best Management Practices Regulations for Perchlorate Materials: This Perchlorate warning applies only to products containing CR (Manganese Dioxide) Lithium coin cells. “Perchlorate Material-special handling may apply. See: www.dtsc.ca.gov/hazardouswaste/perchlorate”