IBM 755 User Manual

IBM United States Hardware Announcement
110-008, dated February 9, 2010
IBM Power 755 server brings IBM POWER7 technology to the High Performance Computing marketplace
Table of contents
• Overview
• Key prerequisites
• Planned availability date
• Description
• Statement of general direction
• Product number
• Publications
• Technical information
• Terms and conditions
• Prices
• Order now
At a glance
The Power® 755 server is a 3.3 GHz 32-core POWER7 processor-based server optimized for high performance computing. Up to sixty-four 32-core nodes can be clustered together, providing up to 2,048 cores. Each 755 server node features:
• Four 8-core POWER7 modules, each with 4 MB L3 cache/core and 256 KB L2 cache/core
• Up to 256 GB of 1066 MHz DDR3 memory
• Five PCI slots (three PCIe and two PCI-X)
• One slot for a 12X InfiniBand adapter
• Eight SFF SAS bays in the CEC for disk or solid-state drives
• Up to 72 TB disk storage using CEC and EXP12S SAS I/O drawers
• Integrated 10/100/1000 Mb quad-port Virtual Ethernet or dual-port 10 Gb Virtual Ethernet
• EnergyScale™ technology
• Integrated DVD-RAM drive
• 4U rack-mount configuration
For ordering, contact your IBM® representative, an IBM Business Partner, or IBM Americas Call Centers at 800-IBM-CALL (Reference: YE001).
Overview
The IBM Power 755 compute node is designed for organizations that require a scalable system with extreme parallel processing performance and dense packaging. Ideal workloads for Power 755 include high performance computing (HPC) applications such as weather and climate modeling, computational chemistry, physics, and petroleum reservoir modeling that require highly intense computations where the workload is aligned with parallel processing methodologies.
IBM is a registered trademark of International Business Machines Corporation
The Power 755 server (8236-E8C) is a 3.3 GHz 32-core POWER7 server aimed at HPC environments such as weather and climate modeling, computational chemistry, physics, computer-aided engineering, computational fluid dynamics, and petroleum reservoir modeling. A single Power 755 provides four 64-bit, eight-core POWER7 processor modules with 4 MB of L3 cache/core and 256 KB of L2 cache/core. Each module is packaged on its own processor card, which has eight DDR3 DIMM slots, offering a maximum of 256 GB of memory when all 32 DIMM slots are filled with 8 GB DIMMs. The memory DIMMs run at 1066 MHz.
Using 12X InfiniBand adapters, up to 64 Power 755 nodes, each with 32 cores, can be clustered together, providing up to 2,048 POWER7 cores. The IBM HPC software stack provides the development tools, libraries, and system management software necessary to manage a Power 755 server cluster.
The Power 755 system unit provides up to five PCI slots, one GX slot for a 12X adapter, eight SFF (small form factor) SAS bays, and a DVD-RAM. Three of the five PCI slots are PCIe 8x and two are PCI-X DDR. The GX slot can hold a 12X InfiniBand adapter supporting 4x connection to other Power 755s. The eight SAS bays contain a minimum of two and a maximum of eight disks or SSDs, providing up to 2.4 TB storage capacity. Up to 156 additional SAS bays are available using the EXP12S SAS disk/SSD drawer (#5886), providing up to 70 TB of additional capacity. All drives are direct dock and hot pluggable.
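The sizing figures quoted so far can be cross-checked with a few lines of arithmetic. The following Python sketch recomputes cores per node, maximum memory per node, maximum cluster cores, and the number of EXP12S drawers implied by 156 additional bays (the 12-bays-per-drawer figure is the one given for feature 5886 in this announcement). The constant and function names are illustrative and do not correspond to any IBM tool.

```python
# Figures quoted in this announcement; the names are illustrative only.
MODULES_PER_NODE = 4         # POWER7 modules, one per processor card
CORES_PER_MODULE = 8
DIMM_SLOTS_PER_CARD = 8
LARGEST_DIMM_GB = 8
MAX_CLUSTER_NODES = 64       # nodes clustered with 12X InfiniBand adapters
BAYS_PER_EXP12S = 12
MAX_ADDITIONAL_BAYS = 156


def cores_per_node() -> int:
    return MODULES_PER_NODE * CORES_PER_MODULE


def max_memory_gb() -> int:
    return MODULES_PER_NODE * DIMM_SLOTS_PER_CARD * LARGEST_DIMM_GB


def cluster_cores(nodes: int) -> int:
    if not 1 <= nodes <= MAX_CLUSTER_NODES:
        raise ValueError(f"node count must be between 1 and {MAX_CLUSTER_NODES}")
    return nodes * cores_per_node()


def max_exp12s_drawers() -> int:
    # 156 additional bays at 12 bays per drawer
    return MAX_ADDITIONAL_BAYS // BAYS_PER_EXP12S


if __name__ == "__main__":
    print(cores_per_node(), max_memory_gb(), cluster_cores(64), max_exp12s_drawers())
    # prints: 32 256 2048 13
```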
The Power 755 system unit also provides a choice of quad gigabit or dual 10 Gb integrated host Ethernet adapters, which can be extensively virtualized. These ports are selected at the time of initial order and do not use a PCI slot.
The Power 755 server contains a minimum of two and a maximum of either eight SFF SAS disks or eight SFF SAS SSDs. The maximum internal disk storage available is 2400 GB. All DASD are direct dock and hot pluggable. A slim media bay is available for a mandatory SATA DVD-RAM.
Also available in the Power 755 system unit is a choice of quad gigabit or dual 10 Gb integrated host Ethernet adapters. These native ports can be selected at the time of initial order. Virtualization of these integrated Ethernet adapters is supported.
Other integrated features include:
• Service Processor
• Integrated SAS/SATA controller for disk/SSD/DVD in system unit
• EnergyScale technology
• Two system ports and three USB ports
• Two HMC ports and two SPCN ports
• Redundant and hot-swap power and cooling
• 4U 19-inch rack-mount packaging
Key prerequisites
If installing the AIX® operating system (one of these):
• AIX Version 6.1 with the 6100-04 Technology Level and Service Pack 2, or later
• AIX Version 6.1 with the 6100-03 Technology Level and Service Pack 5, or later (planned availability: June 25, 2010)
• AIX Version 6.1 with the 6100-02 Technology Level and Service Pack 8, or later (planned availability: June 25, 2010)
• AIX Version 5.3 with the 5300-11 Technology Level and Service Pack 2, or later (planned availability: March 16, 2010)
• AIX Version 5.3 with the 5300-10 Technology Level and Service Pack 4, or later (planned availability: May 28, 2010)
• AIX Version 5.3 with the 5300-09 Technology Level and Service Pack 7, or later (planned availability: May 28, 2010)
Visit the IBM Prerequisite Web site for compatibility information for hardware features and the corresponding AIX Technology Levels:
http://www-912.ibm.com/e_dir/eserverprereq.nsf
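As a rough illustration of how the AIX levels listed above could be checked in a script, the sketch below compares a level string against the minimum Service Pack for each listed Technology Level. It assumes the string has the form release-TL-SP with an optional trailing build field (for example "6100-04-02-0943", the style printed by `oslevel -s`). Two further assumptions are made and marked in the comments: a Technology Level newer than the newest one listed is treated as "or later", and releases not listed here are reported as not meeting the prerequisite. The function name is hypothetical.

```python
# Minimum Service Pack per (release, Technology Level), from the list above.
MIN_SP = {
    ("6100", 4): 2,
    ("6100", 3): 5,
    ("6100", 2): 8,
    ("5300", 11): 2,
    ("5300", 10): 4,
    ("5300", 9): 7,
}


def meets_aix_prerequisite(level: str) -> bool:
    """Return True if a level such as '6100-04-02-0943' satisfies the list."""
    parts = level.strip().split("-")
    if len(parts) < 3:
        raise ValueError(f"unexpected level format: {level!r}")
    release, tl, sp = parts[0], int(parts[1]), int(parts[2])

    listed_tls = [t for (r, t) in MIN_SP if r == release]
    if not listed_tls:
        # Assumption: a release that is not listed is not evaluated here.
        return False
    if tl > max(listed_tls):
        # Assumption: a newer Technology Level counts as "or later".
        return True
    min_sp = MIN_SP.get((release, tl))
    return min_sp is not None and sp >= min_sp
```

For example, `meets_aix_prerequisite("6100-03-05")` returns True, while `meets_aix_prerequisite("6100-04-01")` returns False because Service Pack 1 is below the listed minimum for that Technology Level.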
If installing the Linux® operating system (one of these):
• SUSE Linux Enterprise Server 11 for the Power 755 Server, or later, with current maintenance updates available from Novell to enable all planned functionality
• SUSE Linux Enterprise Server 10 Service Pack 3 for the Power 755 Server, with current maintenance updates available from Novell to enable all planned functionality
Users should also update their systems with the latest Linux for Power service and productivity tools available at
http://www14.software.ibm.com/webapp/set2/sas/f/lopdiags/home.html
If installing VIOS:
• VIOS 2.1.2.11 with Fix Pack 22.1 and Service Pack 1, or later
Java™ 1.4.2 on POWER7
There are unique considerations when running Java 1.4.2 on POWER7. For best exploitation of the outstanding performance capabilities and most recent improvements of POWER7 technology, IBM recommends upgrading Java-based applications to Java 6 or Java 5 whenever possible.
For more information, visit
http://www.ibm.com/developerworks/java/jdk/aix/service.html
Planned availability date
February 19, 2010, except for feature 4526, which is planned to be available on March 16, 2010.
Description
Power 755
Summary of standard features:
• Rack-mount (4U) configuration
• 32-core design with four 3.3 GHz processor cards
• 128 GB of PC3-8500 1066 MHz ECC (error checking and correcting) memory, maximum of 64 GB per processor card (256 GB system maximum)
• 8 x 2.5-inch DASD/SSD/Media backplane with an external SAS port
– 2 to 8 SFF DASD or SSDs (mixing allowed)
• Choice of two integrated virtual Ethernet daughter cards:
– Quad-port 1 Gb IVE
– Dual-port 10 Gb IVE
• One media bay:
– Slim bay for a DVD-RAM (required)
• A maximum of five hot-swap slots:
– Two PCIe x8 slots, short card length (slots 1 and 2)
– One PCIe x8 slot, full card length (slot 3)
– Two PCI-X DDR slots, full card length (slots 4 and 5)
– One GX++ slot (shares same space as PCIe x8 slot 1)
• Integrated:
– Service Processor
– Quad-port 10/100/1000 Mb Ethernet
– EnergyScale technology
– Hot-swap and redundant cooling
– Three USB ports; two system ports
– Two HMC ports; two SPCN ports
• Two Power Supplies, 1725 Watt AC, Hot-swap
The minimum Power 755 configuration must include four processor cards, 32 processor activations, memory, two power supplies and power cords, two DASD, a DASD/SSD/Media backplane, an operator panel cable, an Ethernet daughter card, a DVD-RAM, an operating system indicator, and a Language Group Specify.
The minimum defined configuration, if no choice is made, is:
Feature number   Description
4 x 8332         0/8 core 3.3 GHz POWER7 Processor -- 32-core system
32 x 2325        32 Zero-priced Processor Activations
16 x 4526        8 GB (2 x 4096 MB) Memory -- Total 128 GB memory
2 x 1883         Two 73.4 GB 15k SFF DASD
1878             Operator Panel Cable, Rack-mount drawer with 2.5-inch DASD Backplane
8340             DASD/Media Backplane for 2.5-inch DASD/SATA DVD/Tape with External SAS Port
5624             Quad-port 1 Gb Integrated Ethernet Daughter Card
2 x 7740         Two Power Supplies, 1725 Watt AC, Base
5762             SATA DVD-RAM
9300/97xx        Language Group Specify
2146 or 2147     Primary Operating System Indicator - IBM AIX (2146) or Linux (2147)
2 x 6xxx         Two Power Cords
Notes:
• The 8 GB memory feature (#4526) is planned to be available on March 16, 2010. Eight of feature 4527 can replace 16 of feature 4526.
• The GX Dual-port 12X Channel Attach Adapter Card (#5609) will be defaulted on every 8236-E8C order but may be deselected.
• No internal DASD is required if feature 0837 (Boot from SAN) is selected. In this case, a Fibre Channel or Fibre Channel over Ethernet adapter must also be ordered.
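The minimum-configuration rules above lend themselves to a simple check. The Python sketch below encodes only the rules stated in this section (four processor cards, 32 activations, two power supplies, two to eight internal drives, and the Boot from SAN exception of feature 0837). The `Order` class and its field names are hypothetical and are not part of any IBM configurator.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Order:
    """Hypothetical, simplified view of a Power 755 order."""
    processor_cards: int
    processor_activations: int
    power_supplies: int
    internal_drives: int
    boot_from_san: bool = False            # feature 0837
    has_fc_or_fcoe_adapter: bool = False


def configuration_problems(order: Order) -> List[str]:
    """Return the minimum-configuration rules that the order violates."""
    problems = []
    if order.processor_cards != 4:
        problems.append("four processor cards are required")
    if order.processor_activations != 32:
        problems.append("32 processor activations are required")
    if order.power_supplies != 2:
        problems.append("two power supplies are required")
    if order.internal_drives > 8:
        problems.append("at most eight internal SFF drives are supported")
    if order.boot_from_san:
        # No internal DASD is needed, but a suitable adapter must be ordered.
        if not order.has_fc_or_fcoe_adapter:
            problems.append("Boot from SAN needs a Fibre Channel or FCoE adapter")
    elif order.internal_drives < 2:
        problems.append("at least two internal drives are required")
    return problems
```

An empty list means the order satisfies the rules modeled here. Items such as the backplane, operator panel cable, and Language Group Specify are left out of this sketch for brevity.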
I/O drawer availability
The EXP12S disk-only I/O drawers (#5886) are supported on the Power 755, providing large storage capacity and multiple partition support.
EXP12S SAS Drawer (#5886)
The EXP12S SAS drawer (#5886) is a 2 EIA drawer and mounts in a 19-inch rack. The drawer can hold either SAS disk drives or SSDs. The EXP12S SAS drawer has twelve 3.5-inch SAS disk bays with redundant data paths to each bay. The drawer supports redundant hot-plug power and cooling and redundant hot-swap SAS expanders (Enclosure Services Manager, ESM). Each ESM has an independent SCSI Enclosure Services (SES) diagnostic processor.
The SAS disk drives or SSDs contained in the EXP12S are controlled by one or two PCIe or PCI-X SAS adapters connected to the EXP12S via SAS cables. The SAS cable will vary, depending upon the adapter being used, the operating system being used, and the protection desired.
• The large cache PCI-X adapter (#5904/#5908) uses a SAS Y cable when a single port is running the EXP12S. A SAS X cable is used when a pair of adapters is used for controller redundancy.
• The medium cache PCI-X feature 5902 and PCIe feature 5903 adapters are always paired and use a SAS X cable to attach the feature 5886 I/O drawer.
• The zero cache PCI-X feature 5912 and PCIe feature 5901 use a SAS Y cable when a single port is running the EXP12S. A SAS X cable is used for AIX/Linux environments when a pair of adapters is used for controller redundancy.
In all of the above configurations, all 12 SAS bays are controlled by a single controller or a single pair of controllers.
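The cable rules in the list above can be summarized as a small lookup. The following Python sketch maps each adapter feature number to its cache class and returns the cable type stated in the text. The function name and the "X"/"Y" return values are illustrative shorthand, and the sketch does not model operating system or protection-level differences beyond what the list states.

```python
# Cache class per SAS adapter feature number, as grouped in the list above.
ADAPTER_CACHE_CLASS = {
    "5904": "large", "5908": "large",
    "5902": "medium", "5903": "medium",
    "5912": "zero", "5901": "zero",
}


def exp12s_cable(feature: str, paired: bool) -> str:
    """Return 'Y' or 'X' for the SAS cable used to attach an EXP12S."""
    try:
        cache_class = ADAPTER_CACHE_CLASS[feature]
    except KeyError:
        raise ValueError(f"feature {feature} is not covered by this sketch")
    if cache_class == "medium":
        # Medium cache adapters are always paired and use an X cable.
        if not paired:
            raise ValueError("medium cache adapters are always paired")
        return "X"
    # Large and zero cache: Y cable for a single adapter, X cable for a pair.
    return "X" if paired else "Y"
```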
A second EXP12S drawer can be attached to another drawer using two SAS EE cables, providing 24 SAS bays instead of 12 bays for the same SAS controller port. This is called cascading. In this configuration all 24 SAS bays are controlled by a single controller or a single pair of controllers.
Feature 5886 can also be directly attached to the SAS port on the rear of the Power 755, providing a very low-cost disk storage solution. When used this way, the embedded SAS controllers augmented by the 175 MB write cache RAID enabler (#5679) in the system unit drive the disk drives in EXP12S. A second unit cannot be cascaded to a feature 5886 attached in this way.
Reliability, availability, and serviceability (RAS) features
Reliability, fault tolerance, and data correction
The reliability of systems starts with components, devices, and subsystems that are designed to be fault-tolerant. POWER7 uses lower voltage technology, improving reliability with stacked latches to reduce soft error rate (SER) susceptibility. During the design and development process, subsystems go through rigorous verification and integration testing processes. During system manufacturing, systems go through a thorough testing process to help ensure the highest level of product quality.
The system cache and memory offer ECC (error checking and correcting) fault-tolerant features. ECC is designed to correct environmentally induced, single-bit, intermittent memory failures and single-bit hard failures. With ECC, the likelihood of memory failures will be substantially reduced. ECC also provides double-bit memory error detection that helps protect data in the event of a double-bit memory failure.
The AIX operating system provides disk drive mirroring and disk drive controller duplexing. The Linux operating system supports disk drive mirroring (RAID 1) through software, while other RAID protection schemes are provided via hardware RAID adapters.
The Journaled File System, also known as JFS or JFS2, helps maintain file system consistency and reduces the likelihood of data loss when the system is abnormally halted due to a power failure. JFS, the recommended file system for 32-bit kernels, now supports extents on the Linux operating system. This feature is designed to substantially reduce or eliminate fragmentation. Its successor, JFS2, is the recommended file system for 64-bit kernels.
With 64-bit addressing, a maximum file system size of 32 TB, and maximum file size of 16 TB, JFS2 is highly recommended for systems running the AIX operating system.
Memory error correction extensions
The memory has single-bit-error correction and double-bit-error detection ECC circuitry. The ECC code is also designed such that the failure of any one specific memory module within an ECC word by itself can be corrected absent any other fault.
Memory protection features include scrubbing to detect errors, a means to call for the deallocation of memory pages for a pattern of correctable errors detected, and signaling deallocation of a logical memory block when an error occurs that cannot be corrected by the ECC code.
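To illustrate what single-bit-error correction with double-bit-error detection means in practice, the sketch below implements a textbook extended Hamming code over 4 data bits (8 bits per codeword). This is a teaching example only: it is not the ECC code used in POWER7 memory, whose word size and layout are not described in this announcement. All function names are illustrative.

```python
from typing import List, Tuple


def encode(data: List[int]) -> List[int]:
    """Encode 4 data bits into an 8-bit extended Hamming codeword."""
    if len(data) != 4 or any(bit not in (0, 1) for bit in data):
        raise ValueError("expected exactly four bits")
    code = [0] * 8
    code[3], code[5], code[6], code[7] = data
    # Parity bits at positions 1, 2, and 4 cover the positions whose
    # binary index has the corresponding bit set.
    code[1] = code[3] ^ code[5] ^ code[7]
    code[2] = code[3] ^ code[6] ^ code[7]
    code[4] = code[5] ^ code[6] ^ code[7]
    # Position 0 holds overall parity, making the whole word even.
    code[0] = sum(code[1:]) % 2
    return code


def decode(word: List[int]) -> Tuple[str, List[int]]:
    """Return (status, data); status is 'ok', 'corrected', or 'uncorrectable'."""
    code = list(word)
    syndrome = 0
    for position in range(1, 8):
        if code[position]:
            syndrome ^= position
    overall_parity = sum(code) % 2
    if overall_parity == 1:
        # Odd parity: one bit flipped. The syndrome gives its position;
        # a syndrome of 0 means the overall parity bit itself flipped.
        code[syndrome] ^= 1
        status = "corrected"
    elif syndrome != 0:
        # Even parity with a nonzero syndrome: two bits flipped.
        return "uncorrectable", []
    else:
        status = "ok"
    return status, [code[3], code[5], code[6], code[7]]
```

Flipping any one bit of an encoded word yields "corrected" with the original data, and flipping any two bits yields "uncorrectable". Three or more flipped bits are outside what this code can classify reliably.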
Redundancy for array self-healing
Although the most likely failure event in a processor is a soft single-bit error in one of its caches, other events can occur, and they need to be distinguished from one another. For caches and their directories, hardware and firmware keep track of whether errors are being corrected beyond a threshold. If the threshold is exceeded, a deferred repair error log is created.
Caches and directories on the POWER7 chip are manufactured with spare bits in their arrays that can be accessed via programmable steering logic to replace faulty bits in the respective arrays. This is analogous to the redundant bit steering employed in main storage as a mechanism that is designed to help avoid physical repair, and is also implemented in POWER7 systems. The steering logic is activated during processor initialization and is initiated by the built-in self-test (BIST) at power-on time.
When correctable cache errors exceed a set threshold, systems using the POWER7 processor invoke a dynamic cache line delete function, which enables them to stop using the bad cache line and eliminates exposure to greater problems.
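The threshold behavior described above can be modeled in a few lines. The Python sketch below counts correctable errors per cache line and stops using a line once its count exceeds a threshold. The per-line counting granularity, the threshold value, and the class name are assumptions made for illustration; the announcement does not specify how the hardware and firmware implement this.

```python
from collections import Counter
from typing import Set


class CacheLineMonitor:
    """Toy model of threshold-based cache line delete."""

    def __init__(self, threshold: int) -> None:
        self.threshold = threshold
        self.error_counts: Counter = Counter()
        self.deleted_lines: Set[int] = set()

    def record_correctable_error(self, line: int) -> bool:
        """Record one corrected error; return True if the line was just deleted."""
        if line in self.deleted_lines:
            return False
        self.error_counts[line] += 1
        if self.error_counts[line] > self.threshold:
            # Stop using the line rather than risk an uncorrectable error.
            self.deleted_lines.add(line)
            return True
        return False

    def is_usable(self, line: int) -> bool:
        return line not in self.deleted_lines
```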
Fault monitoring functions
• When a POWER7-based system is powered on, BIST and POST (power-on self-test) check processor, cache, memory, and associated hardware required for proper booting of the operating system. If a noncritical error is detected or if the errors occur in resources that can be removed from the system configuration, the restarting process is designed to proceed to completion. The errors are logged in the system nonvolatile RAM (NVRAM).
• Disk drive fault tracking is designed to alert the system administrator of an impending disk drive failure before it impacts customer operation.
Mutual surveillance
The Service Processor monitors the operation of the firmware during the boot process, and also monitors the Hypervisor™ for termination. The Hypervisor monitors the Service Processor and will perform a reset/reload if it detects the loss of the Service Processor. If the reset/reload does not correct the problem with the Service Processor, the Hypervisor will notify the operating system and the operating system can take appropriate action, including calling for service.
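The Hypervisor side of mutual surveillance follows a check, reset, escalate pattern. The Python sketch below shows that control flow with caller-supplied callbacks. The function name, the callbacks, and the returned status strings are hypothetical, and real firmware interfaces are not described in this announcement.

```python
from typing import Callable


def check_service_processor(
    is_alive: Callable[[], bool],
    reset_reload: Callable[[], None],
    notify_operating_system: Callable[[str], None],
) -> str:
    """One surveillance pass: return 'healthy', 'recovered', or 'escalated'."""
    if is_alive():
        return "healthy"
    # Loss detected: attempt a reset/reload of the Service Processor first.
    reset_reload()
    if is_alive():
        return "recovered"
    # Reset/reload did not help: hand the problem to the operating system,
    # which can take further action such as calling for service.
    notify_operating_system("Service Processor not responding after reset/reload")
    return "escalated"
```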
Environmental monitoring functions
POWER7-based servers include a range of environmental monitoring functions:
• Temperature monitoring warns the system administrator of potential environmental-related problems by monitoring the air inlet temperature. When the inlet temperature rises above a warning threshold, the system initiates an orderly shutdown. When the temperature exceeds the critical level, or if the temperature remains above the warning level for too long, the system will shut down immediately.
• Fan speed is controlled by monitoring actual temperatures on critical components and adjusting accordingly. If internal component temperatures reach critical levels, the system will shut down immediately, regardless of fan speed. When a
redundant fan fails, the system calls out the failing fan and continues running. When a nonredundant fan fails, the system shuts down immediately.
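The temperature and fan rules above amount to a small decision table. The Python sketch below encodes them as two functions. The threshold values are deliberately left as required parameters because the announcement does not state them, and the enum and function names are illustrative.

```python
from enum import Enum


class Action(Enum):
    CONTINUE = "continue"
    ORDERLY_SHUTDOWN = "orderly shutdown"
    IMMEDIATE_SHUTDOWN = "immediate shutdown"
    CALL_OUT_AND_CONTINUE = "call out failing fan and continue"


def inlet_temperature_action(
    temp_c: float,
    seconds_above_warning: float,
    *,
    warning_c: float,
    critical_c: float,
    max_warning_seconds: float,
) -> Action:
    """Map an inlet temperature reading to the action described in the text."""
    if temp_c > critical_c:
        return Action.IMMEDIATE_SHUTDOWN
    if temp_c > warning_c:
        # Staying above the warning level for too long is treated as critical.
        if seconds_above_warning > max_warning_seconds:
            return Action.IMMEDIATE_SHUTDOWN
        return Action.ORDERLY_SHUTDOWN
    return Action.CONTINUE


def fan_failure_action(redundant: bool) -> Action:
    """A redundant fan failure is called out; a nonredundant one stops the system."""
    return Action.CALL_OUT_AND_CONTINUE if redundant else Action.IMMEDIATE_SHUTDOWN
```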
Availability enhancement functions
The POWER7 family of systems continues to offer and introduce significant enhancements designed to increase system availability.
POWER7 processor functions
As in POWER6™, the POWER7 processor has the ability to do processor instruction retry and alternate processor recovery for a number of core-related faults. This significantly reduces exposure to both hard (logic) and soft (transient) errors in the processor core. Soft failures in the processor core are transient (intermittent) errors, often due to cosmic rays or other sources of radiation, and generally are not repeatable. With this function, when an error is encountered in the core, the POWER7 processor will first automatically retry the instruction. If the source of the error was truly transient, the instruction will succeed and the system will continue as before. On IBM systems prior to POWER6, this error would have caused a checkstop.
Hard failures are more difficult, being true logical errors that will be replicated each time the instruction is repeated. Retrying the instruction will not help in this situation because the instruction will continue to fail. As in POWER6, POWER7 processors have the ability to extract the failing instruction from the faulty core and retry it elsewhere in the system for a number of faults, after which the failing core is dynamically deconfigured and called out for replacement. The entire process is transparent to the partition owning the failing instruction. These systems are designed to avoid a full system outage.
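The two-stage recovery described in this section (retry on the same core, then move the instruction to another core and deconfigure the faulty one) can be sketched as a software simulation. In the Python example below, the `Core` class, the fault model, and the single-retry policy are assumptions made for illustration; the actual mechanism is implemented in processor hardware and firmware.

```python
from typing import Callable, List, Tuple


class CoreFault(Exception):
    """Raised by a simulated core when an instruction fails."""


class Core:
    def __init__(self, name: str, transient_faults: int = 0, hard_fault: bool = False):
        self.name = name
        self.transient_faults = transient_faults  # failures before success
        self.hard_fault = hard_fault              # fails every time
        self.deconfigured = False

    def execute(self, instruction: Callable[[], int]) -> int:
        if self.hard_fault:
            raise CoreFault(f"{self.name}: hard fault")
        if self.transient_faults > 0:
            self.transient_faults -= 1
            raise CoreFault(f"{self.name}: transient fault")
        return instruction()


def execute_with_recovery(
    instruction: Callable[[], int], core: Core, spares: List[Core]
) -> Tuple[int, str]:
    """Return (result, name of the core that completed the instruction)."""
    try:
        return core.execute(instruction), core.name
    except CoreFault:
        pass  # first failure: fall through to processor instruction retry
    try:
        return core.execute(instruction), core.name
    except CoreFault:
        # Retry failed too: treat as a hard fault and deconfigure the core.
        core.deconfigured = True
    # Alternate processor recovery: rerun the instruction on a spare core.
    for spare in spares:
        if not spare.deconfigured:
            return spare.execute(instruction), spare.name
    raise RuntimeError("no spare core available")
```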
POWER7 single processor checkstopping
As in POWER6, POWER7 provides single processor checkstopping. This significantly reduces the probability of any one processor affecting total system availability.
Partition availability priority
Also available is the ability to assign availability priorities to partitions. If an alternate processor recovery event requires spare processor resources in order to protect a workload, when no other means of obtaining the spare resources is available, the system will determine which partition has the lowest priority and attempt to claim the needed resource. On a properly configured POWER7 processor-based server, this allows that capacity to be first obtained from, for example, a test partition instead of a financial accounting system.
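Selecting the partition that gives up capacity is a lowest-priority search. The Python sketch below assumes that priorities are integers where a larger number means a more important partition, and that ties are broken by partition name. Both choices, as well as the function name, are assumptions made for illustration.

```python
from typing import Dict, Optional


def choose_donor_partition(priorities: Dict[str, int], protected: str) -> Optional[str]:
    """Return the lowest-priority partition other than the protected one."""
    candidates = {name: p for name, p in priorities.items() if name != protected}
    if not candidates:
        return None
    # Sort key: priority first, then name so the result is deterministic.
    return min(candidates, key=lambda name: (candidates[name], name))
```

With `{"accounting": 200, "test": 10, "web": 100}` and the accounting partition protected, the function returns "test", matching the example in the text.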
POWER7 cache availability
The POWER® processor-based line of servers continues to be at the forefront of cache availability enhancements. The L3 cache is now integrated on the POWER7 processor. The POWER7 processor provides both L2 and L3 cache line delete functions.
Special uncorrectable error handling
Uncorrectable errors are difficult for any system to tolerate, although there are some situations where they can be shown to be irrelevant. For example, if an uncorrectable error occurs in cached data that will never again be read or where a fresh write of the data is imminent, it would be unwise to "protect" the user by forcing an immediate reboot.
Special Uncorrectable Error (SUE) handling was an IBM innovation introduced for POWER5™ processors, where an uncorrectable error in memory or cache does not immediately cause the system to terminate. Rather, the system tags the data and determines whether it will ever be used again. If the error is irrelevant, it will not force a checkstop.
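The idea behind SUE handling, tagging bad data and acting only if it is consumed, can be shown with a toy memory model. In the Python sketch below, an uncorrectable error marks an address as poisoned; a later read raises an exception, while a fresh write clears the tag. The class and exception names are hypothetical, and the real mechanism operates in hardware on cache lines and memory blocks rather than on Python objects.

```python
from typing import Dict, Set


class PoisonedDataError(Exception):
    """Raised when tagged (uncorrectable) data is actually consumed."""


class TaggedMemory:
    def __init__(self) -> None:
        self.cells: Dict[int, int] = {}
        self.poisoned: Set[int] = set()

    def write(self, address: int, value: int) -> None:
        # A fresh write replaces the bad data, so the tag no longer applies.
        self.cells[address] = value
        self.poisoned.discard(address)

    def mark_uncorrectable(self, address: int) -> None:
        # Tag the data instead of stopping the system immediately.
        self.poisoned.add(address)

    def read(self, address: int) -> int:
        if address in self.poisoned:
            # Only now does the error matter to the consumer.
            raise PoisonedDataError(f"address {address} holds uncorrectable data")
        return self.cells[address]
```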
PCI extended error handling
PCI extended error handling (EEH) enabled adapters respond to a special data packet generated from the affected PCI slot hardware by calling system firmware, which will examine the affected bus, allow the device driver to reset it, and continue without a system reboot. For Linux, EEH support extends to the majority of frequently used devices, although some third-party PCI devices may not provide native EEH support.
Predictive failure and dynamic component deallocation
Servers with POWER processors have long had the capability to perform predictive failure analysis on certain critical components such as processors and memory. When these components exhibit symptoms that would indicate a failure is imminent, the system can dynamically deallocate and call home about the failing part before the error is propagated system-wide. In many cases, the system will first attempt to reallocate resources in such a way that will avoid unplanned outages. In the event that insufficient resources exist to maintain full system availability, these servers will attempt to maintain partition availability by user-defined priority.
Uncorrectable error recovery
When the auto-restart option is enabled, the system can automatically restart following an unrecoverable software error, hardware failure, or environmentally induced (AC power) failure.
Serviceability
The purpose of serviceability is to repair the system while attempting to minimize or eliminate service cost (within budget objectives), while maintaining high customer satisfaction. Serviceability includes system installation, MES (system upgrades/downgrades), and system maintenance/repair. Depending upon the system and warranty contract, service may be performed by the customer, an IBM representative, or an authorized warranty service provider.
The Serviceability features delivered in this system provide a highly efficient service environment by incorporating the following attributes:
• Design for Customer Set Up (CSU), Customer Installed Features (CIF), and Customer Replaceable Units (CRU)
• Error detection and Fault Isolation (ED/FI)
• First Failure Data Capture (FFDC)
• Converged service approach across multiple IBM server platforms
Service environments
The HMC is a dedicated server that provides functions for configuring and managing servers in either partitioned or full-system-partition mode using a GUI or command-line interface (CLI). An HMC attached to the system allows support personnel (with client authorization) to remotely log in to review error logs and perform remote maintenance if required.
The POWER7 processor-based platforms support two main service environments:
• Attachment to one or more HMCs. This is the default configuration for servers supporting logical partitions with dedicated or virtual I/O. In this case, all servers have at least one logical partition.
• No HMC:
– Full system partition: A single partition owns all the server resources and only one operating system may be installed.
Service Interface
The Service Interface allows support personnel to communicate with the service support applications in a server using a console, interface, or terminal. Delivering a clear, concise view of available service applications, the Service Interface allows the support team to manage system resources and service information in an efficient and effective way. Applications available via the Service Interface are carefully configured and placed to give service providers access to important service functions.
Different service interfaces are used depending on the state of the system and its operating environment. The primary service interfaces are:
• LEDs
• Operator Panel
• Service Processor menu
• Operating system service menu
• Service Focal Point on the HMC
In the light path LED implementation, when a fault condition is detected on the POWER7 system, an amber FRU fault LED will be illuminated, which will be rolled up to the system fault LED. The light path system pinpoints the exact part by turning on the amber FRU fault LED associated with the part to be replaced.
The system can clearly identify components for replacement by using specific component-level LEDs, and can also guide the servicer directly to the component by signaling (turning on solid) the system fault LED, enclosure fault LED, and the component FRU fault LED. The servicer can also use the identify function to blink the FRU-level LED. When this function is activated, a roll-up to the blue enclosure locate and system locate LEDs will occur. These LEDs will turn on solid and can be used to follow the light path from the system to the enclosure and down to the specific FRU.
First Failure Data Capture and Error Data Analysis
First Failure Data Capture (FFDC) is a technique that helps ensure that when a fault is detected in a system, the root cause of the fault will be captured without the need to re-create the problem or run any sort of extended tracing or diagnostics program. For the vast majority of faults, a good FFDC design means that the root cause can also be detected automatically without servicer intervention.
FFDC information, error data analysis, and fault isolation are necessary to implement the advanced serviceability techniques that enable efficient service of the systems and to help determine the failing items.
In the rare absence of FFDC and Error Data Analysis, diagnostics are required to re-create the failure and determine the failing items.
Diagnostics
General diagnostic objectives are to detect and identify problems such that they can be resolved quickly. Elements of IBM's diagnostics strategy include:
• Provide a common error code format equivalent to a system reference code, system reference number, checkpoint, or firmware error code.
• Provide fault detection and problem isolation procedures.
• Support remote connection ability to be used by the IBM Remote Support Center or IBM Designated Service.
• Provide interactive intelligence within the diagnostics with detailed online failure information while connected to IBM's back-end system.
Automatic diagnostics
Because of the FFDC technology designed into IBM Servers, it is not necessary to perform re-create diagnostics for failures or require user intervention. Solid and
intermittent errors are designed to be correctly detected and isolated at the time the failure occurs. Runtime and boot-time diagnostics fall into this category.
Stand-alone diagnostics
As the name implies, stand-alone or user-initiated diagnostics require user intervention. The user must perform manual steps, including:
• Compact disk-based diagnostics
• Keying in commands
• Interactively selecting steps from a list of choices
Concurrent maintenance
The system will continue to support concurrent maintenance of power, cooling, PCI adapters, DASD, DVD, and firmware updates (when possible). The determination of whether a firmware release can be updated concurrently is identified in the readme information file released with the firmware.
Service labels
Service providers use these labels to assist them in performing maintenance actions. Service labels are found in various formats and positions, and are intended to transmit readily available information to the servicer during the repair process. Following are some of these service labels and their purpose:
Location diagrams
Location diagrams are strategically located on the system hardware, relating information regarding the placement of hardware components. Location diagrams may include location codes, drawings of physical locations, concurrent maintenance status, or other data pertinent to a repair. Location diagrams are especially useful when multiple components are installed such as DIMMs, CPUs, processor books, fans, adapter cards, LEDs, and power supplies.
Remove/replace procedures
Service labels that contain remove/replace procedures are often found on a cover of the system or in other spots accessible to the servicer. These labels provide systematic procedures, including diagrams, detailing how to remove/replace certain serviceable hardware components.
Arrows
Numbered arrows are used to indicate the order of operation and serviceability direction of components. Some serviceable parts such as latches, levers, and touch points need to be pulled or pushed in a certain direction and certain order for the mechanical mechanisms to engage or disengage. Arrows generally improve the ease of serviceability.
Packaging for service
The following service enhancements are included in the physical packaging of the systems to facilitate service:
• Color coding (touch points): Terracotta colored touch points indicate that a component (FRU/CRU) can be concurrently maintained. Blue colored touch points delineate components that are not concurrently maintained -- those that require the system to be turned off for removal or repair.
• Tool-less design: Selected IBM systems support tool-less or simple tool designs. These designs require no tools or simple tools such as flat head screwdrivers to service the hardware components.
• Positive retention: Positive retention mechanisms help to assure proper connections between hardware components such as cables to connectors, and between two cards that attach to each other. Without positive retention, hardware
components run the risk of becoming loose during shipping or installation, preventing a good electrical connection. Positive retention mechanisms like latches, levers, thumb-screws, pop Nylatches (U-clips), and cables are included to help prevent loose connections and aid in installing (seating) parts correctly. These positive retention items do not require tools.
Error Handling and Reporting
In the unlikely event of system hardware or environmentally induced failure, the system runtime error capture capability systematically analyzes the hardware error signature to determine the cause of failure. The analysis result will be stored in system NVRAM. When the system can be successfully restarted either manually or automatically, the error will be reported to the operating system. Error Log Analysis (ELA) can be used to display the failure cause and the physical location of the failing hardware.
With the integrated Service Processor, the system has the ability to automatically send out an alert via phone line to a pager or call for service in the event of a critical system failure. A hardware fault will also turn on the amber system fault LED located on the system unit to alert the user of an internal hardware problem. The indicator may also be set to blink by the operator as a tool to allow system identification. For identification, the blue locate LED on the enclosure and at the system level will turn on solid. The amber system fault LED will be on solid when an error condition occurs.
On POWER7 processor-based servers, hardware and software failures are recorded in the system log. When an HMC is attached, an ELA routine analyzes the error, forwards the event to the Service Focal Point (SFP) application running on the HMC, and notifies the system administrator that it has isolated a likely cause of the system problem. The Service Processor event log also records unrecoverable checkstop conditions, forwards them to the SFP application, and notifies the system administrator. Once the information is logged in the SFP application, if the system is properly configured, a call home service request will be initiated and the pertinent failure data with service parts information and part locations will be sent to an IBM Service organization. Customer contact information and specific system-related data such as the machine type, model, and serial number, along with error log data related to the failure are sent to IBM Service.
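The reporting flow described above can be sketched as a small decision model. This is an illustrative sketch only, not IBM firmware or HMC code: the names ErrorEvent, route_event, hmc_attached, and call_home_configured are hypothetical, and the routing rules are simply those stated in the preceding paragraph. Reporting from systems that are not managed by an HMC is not modeled.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ErrorEvent:
    """Hypothetical record of a logged failure (not an IBM data structure)."""
    description: str
    unrecoverable_checkstop: bool = False


def route_event(event: ErrorEvent, hmc_attached: bool,
                call_home_configured: bool) -> List[str]:
    """Return the ordered reporting steps the text describes for one event."""
    # Hardware and software failures are recorded in the system log.
    steps = ["record in system log"]
    # Unrecoverable checkstops are also recorded by the Service Processor.
    if event.unrecoverable_checkstop:
        steps.append("record in Service Processor event log")
    # Assumption: the SFP application runs on the HMC, so without an HMC
    # this simplified model stops here.
    if not hmc_attached:
        return steps
    # Other failures are analyzed by ELA before being forwarded.
    if not event.unrecoverable_checkstop:
        steps.append("analyze with Error Log Analysis (ELA)")
    steps.append("forward to Service Focal Point (SFP) on HMC")
    steps.append("notify system administrator")
    # A call home request is initiated only on a properly configured system.
    if call_home_configured:
        steps.append("initiate call home service request")
    return steps


if __name__ == "__main__":
    event = ErrorEvent("memory DIMM failure")
    for step in route_event(event, hmc_attached=True,
                            call_home_configured=True):
        print(step)
```

Returning the steps as a list keeps the model easy to inspect; it says nothing about timing, retries, or the actual data formats exchanged with IBM Service.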
Service Processor
The Service Processor provides the capability to diagnose, check the status of, and sense the operational conditions of a system. It runs on its own power boundary and does not require resources from a system processor to be operational to perform its tasks.
The Service Processor supports surveillance of the connection to the HMC and to the system firmware (Hypervisor). It also provides several remote power control options, environmental monitoring, reset, restart, remote maintenance, and diagnostic functions, including console mirroring. The Service Processor's menus (ASMI) can be accessed concurrently with system operation, allowing system default parameters to be changed nondisruptively.
Call Home
Call Home refers to an automatic or manual call from a customer location to the IBM support structure with error log data, server status, or other service-related information. Call Home invokes the service organization in order for the appropriate service action to begin. Call Home can be done through HMC or non-HMC managed systems. While configuring Call Home is optional, clients are encouraged to implement this feature in order to obtain service enhancements such as reduced problem determination time and faster and potentially more accurate transmittal of error information. In general, using the Call Home feature can result in increased system availability. The Electronic Service Agent™ application can be configured for automated call home. Refer to the next section for specific details on this application.
IBM Electronic Services
Electronic Service Agent and the IBM Electronic Services Web portal comprise the IBM Electronic Services solution -- dedicated to providing fast, exceptional support to IBM customers. IBM Electronic Service Agent is a no-charge tool that proactively monitors and reports hardware events such as system errors, performance issues, and inventory. Electronic Service Agent can help customers focus on their company's strategic business initiatives, save time, and spend less effort managing day-to-day IT maintenance issues.
Integrated in the operating system in addition to the HMC, Electronic Service Agent is designed to automatically and electronically report system failures and customer-perceived issues to IBM, which can result in faster problem resolution and increased availability. System configuration and inventory information collected by Electronic Service Agent also can be viewed on the secure Electronic Services Web portal and used to improve problem determination and resolution between the customer and the IBM support team. As part of an increased focus to provide even better service to IBM customers, Electronic Service Agent tool configuration and activation comes standard with the system. In support of this effort, a new HMC External Connectivity security whitepaper has been published, which describes data exchanges between the HMC and the IBM Service Delivery Center (SDC) and the methods and protocols for this exchange. To read the whitepaper and prepare for Electronic Service Agent installation, go to the "Reference Guide" section of
http://www.ibm.com/support/electronic
Select your country.
Click on "IBM Electronic Service Agent Connectivity Guide."
Benefits
Increased uptime: Electronic Service Agent is designed to enhance the warranty and maintenance service by providing faster hardware error reporting and uploading system information to IBM Support. This can reduce the time spent monitoring the symptoms, diagnosing the error, and manually calling IBM Support to open a problem record. In addition, 24 x 7 monitoring and reporting means no more dependency on human intervention or off-hours customer personnel when errors are encountered in the middle of the night.
Security: Electronic Service Agent is secure in monitoring, reporting, and storing the data at IBM. Electronic Service Agent securely transmits via the Internet (HTTPS or VPN) and can be configured to communicate securely through gateways to provide customers a single point of exit from their site. Communication between the customer and IBM flows only one way; activating Service Agent does not enable IBM to call into a customer's system. System inventory information is stored in a secure database, which is protected behind IBM firewalls. The customer's business applications and business data are never transmitted to IBM.
More accurate reporting: Because system information and error logs are automatically uploaded to the IBM Support Center in conjunction with the service request, customers are not required to find and send system information, decreasing the risk of misreported or misdiagnosed errors. Once inside IBM, problem error data is run through a data knowledge management system and knowledge articles are appended to the problem record.
Customized support: Using the IBM ID entered during activation, customers can view system and support information in the "My Systems" and "Premium Search" sections of the Electronic Services Web site.
The Electronic Services Web portal is a single Internet entry point that replaces the multiple entry points traditionally used to access IBM Internet services and support. This Web portal enables you to gain easier access to IBM resources for assistance in resolving technical problems. The newly improved My Systems and Premium Search functions make it even easier for Electronic Service Agent-enabled customers to track system inventory and find pertinent fixes.
My Systems provides valuable reports of installed hardware and software using information collected from the systems by IBM Electronic Service Agent. Reports are available for any system associated with the customer's IBM ID. Premium Search combines the function of search and the value of Electronic Service Agent information, providing advanced search of the technical support knowledgebase. Using Premium Search and the Service Agent information that has been collected from the system, customers are able to see search results that apply specifically to their systems.
For more information on how to use IBM Electronic Services, visit the following Web site or contact an IBM Systems Services Representative:
http://www.ibm.com/support/electronic
Accessibility by people with disabilities
A U.S. Section 508 Voluntary Product Accessibility Template (VPAT) containing details on accessibility compliance can be requested at
http://www.ibm.com/able/product_accessibility/index.html
Section 508 of the U.S. Rehabilitation Act
IBM Power 755 server is capable as of February 19, 2010, when used in accordance with associated IBM documentation, of satisfying the applicable requirements of Section 508 of the Rehabilitation Act, provided that any assistive technology used with the product properly interoperates with it. A U.S. Section 508 Voluntary Product Accessibility Template (VPAT) can be requested at
http://www-03.ibm.com/able/product_accessibility/index.html
Statement of general direction
IBM is working with Red Hat on POWER7 support. Red Hat plans to support the Power 750, 755, 770, and 780 models in an upcoming release targeted for availability in the first half of 2010. For additional questions on the availability of this release, contact Red Hat.
All statements regarding IBM's future direction and intent are subject to change or withdrawal without notice, and represent goals and objectives only. Any reliance on these Statements of Direction is at the relying party's sole risk and will not create liability or obligation for IBM.
The information on the new product is intended to outline our general product direction and it should not be relied on in making a purchasing decision. The information on the new product is for informational purposes only and may not be incorporated into any contract. The information on the new product is not a commitment, promise, or legal obligation to deliver any material, code, or functionality. The development, release, and timing of any features or functionality described for our products remains at our sole discretion.
Product number
The following are newly announced features on the specific models of the IBM Power Systems 8236 machine type:
Description  MT  Model  Feature number
IBM Power 755  8236  E8C
AIX Partition Specify  8236  E8C  0265
Linux Partition Specify  8236  E8C  0266
CSC Specify  8236  E8C  0275
V.24/EIA232 6.1m (20-Ft) PCI Cable  8236  E8C  0348
V.35 6.1m (20-Ft) PCI Cable  8236  E8C  0353
X.21 6.1m (20-Ft) PCI Cable  8236  E8C  0359
Customer Specified Placement  8236  E8C  0456
SSD Placement Indicator - CEC  8236  E8C  0462
SSD Placement Indicator - 5886  8236  E8C  0464
19 inch, 1.8 meter high rack  8236  E8C  0551
19 inch, 2.0 meter high rack  8236  E8C  0553
Rack Filler Panel Kit  8236  E8C  0599
SAN Load Source Specify  8236  E8C  0837
US TAA Compliance Indicator  8236  E8C  0983
1.5 Meter 12X to 4X Channel Conversion Cable  8236  E8C  1828
3 Meter 12X to 4X Channel Conversion Cable  8236  E8C  1841
10 Meter 12X to 4X Enhanced Channel Conversion Cable  8236  E8C  1854
0.6 Meter 12X DDR Cable  8236  E8C  1861
Op Panel Cable for Rack-mount Drawer w/2.5" DASD  8236  E8C  1878
146.8GB 10K RPM SAS SFF Disk Drive  8236  E8C  1882
73.4 GB 15K RPM SAS SFF Disk Drive  8236  E8C  1883
300GB 10K RPM SFF SAS Disk Drive  8236  E8C  1885
146GB 15K RPM SFF SAS Disk Drive  8236  E8C  1886
69GB SFF SAS Solid State Drive  8236  E8C  1890
Primary OS - AIX  8236  E8C  2146
Primary OS - Linux  8236  E8C  2147
Zero-priced Processor Activation for #8332  8236  E8C  2325
2M LC-SC 50 Micron Fiber Converter Cable  8236  E8C  2456
2M LC-SC 62.5 Micron Fiber Converter Cable  8236  E8C  2459
4 port USB PCIe Adapter  8236  E8C  2728
PCIe 2-Line WAN w/Modem  8236  E8C  2893
3M Asynchronous Terminal/Printer Cable EIA-232  8236  E8C  2934
Asynchronous Cable EIA-232/V.24 3M  8236  E8C  2936
Serial-to-Serial Port Cable for Drawer/Drawer- 3.7M  8236  E8C  3124
Serial-to-Serial Port Cable for Rack/Rack- 8M  8236  E8C  3125
69GB 3.5" SAS Solid State Drive  8236  E8C  3586
Widescreen LCD Monitor  8236  E8C  3632
146GB 15K RPM SAS Disk Drive  8236  E8C  3647
300GB 15K RPM SAS Disk Drive  8236  E8C  3648
450GB 15K RPM SAS Disk Drive  8236  E8C  3649
SAS Cable (EE) Drawer to Drawer 1M  8236  E8C  3652
SAS Cable (EE) Drawer to Drawer 3M  8236  E8C  3653
SAS Cable (EE) Drawer to Drawer 6M  8236  E8C  3654
SAS Cable (X) Adapter to SAS Enclosure, Dual Controller/Dual Path 3M:  8236  E8C  3661
SAS Cable (X) Adapter to SAS Enclosure, Dual Controller/Dual Path 6M:  8236  E8C  3662
SAS Cable (X) Adapter to SAS Enclosure, Dual Controller/Dual Path 15M:  8236  E8C  3663
SAS Cable, DASD Backplane to Rear Bulkhead  8236  E8C  3668
SAS Cable (AI)- Adapter to Internal drive 1M  8236  E8C  3679
3M SAS CABLE, ADPTR TO ADPTR (AA)  8236  E8C  3681
6M SAS CABLE, ADPTR TO ADPTR (AA)  8236  E8C  3682
SAS Cable (AE) Adapter to Enclosure, single controller/single path 3M  8236  E8C  3684
SAS Cable (AE) Adapter to Enclosure, single controller/single path 6M  8236  E8C  3685
SAS Cable (YI) System to SAS Enclosure, Single Controller/Dual Path 1.5M  8236  E8C  3686
SAS Cable (YI) System to SAS Enclosure, Single Controller/Dual Path 3M  8236  E8C  3687
SAS Cable (YO) Adapter to SAS Enclosure, Single Controller/Dual Path 1.5 M  8236  E8C  3691
SAS Cable (YO) Adapter to SAS Enclosure, Single Controller/Dual Path 3 M  8236  E8C  3692
SAS Cable (YO) Adapter to SAS Enclosure, Single Controller/Dual Path 6 M  8236  E8C  3693
SAS Cable (YO) Adapter to SAS Enclosure, Single Controller/Dual Path 15 M  8236  E8C  3694
0.3M Serial Port Converter Cable, 9-Pin to 25-Pin  8236  E8C  3925
Asynch Printer/Terminal Cable, 9-pin to 25-pin, 4M  8236  E8C  3926
Serial Port Null Modem Cable, 9-pin to 9-pin, 3.7M  8236  E8C  3927
Serial Port Null Modem Cable, 9-pin to 9-pin, 10M  8236  E8C  3928
1.8 M (6-ft) Extender Cable for Displays (15-pin D-shell to 15-pin D-shell)  8236  E8C  4242
Extender Cable - USB Keyboards, 2M  8236  E8C  4256
VGA to DVI Connection Converter  8236  E8C  4276
8GB (2x4GB) Memory DIMMs, 1066 MHz, 2Gb DDR3 DRAM  8236  E8C  4526
16GB (2x8GB) Memory DIMMs, 1066 MHz, 2Gb DDR3 DRAM  8236  E8C  4527
Rack Indicator- Not Factory Integrated  8236  E8C  4650
Rack Indicator, Rack #1  8236  E8C  4651
Rack Indicator, Rack #2  8236  E8C  4652
Rack Indicator, Rack #3  8236  E8C  4653
Rack Indicator, Rack #4  8236  E8C  4654
Rack Indicator, Rack #5  8236  E8C  4655
Rack Indicator, Rack #6  8236  E8C  4656
Rack Indicator, Rack #7  8236  E8C  4657
Rack Indicator, Rack #8  8236  E8C  4658
Rack Indicator, Rack #9  8236  E8C  4659
Rack Indicator, Rack #10  8236  E8C  4660
Rack Indicator, Rack #11  8236  E8C  4661
Rack Indicator, Rack #12  8236  E8C  4662
Rack Indicator, Rack #13  8236  E8C  4663
Rack Indicator, Rack #14  8236  E8C  4664
Rack Indicator, Rack #15  8236  E8C  4665
Rack Indicator, Rack #16  8236  E8C  4666
PCI-X Cryptographic Coprocessor (FIPS 4)  8236  E8C  4764
RFID TAGS FOR SERVERS, BLADES, BLADECENTERS, RACKS, AND HMCS  8236  E8C  5524
GX Dual-port 12X Channel Attach  8236  E8C  5609
Dual Port (SR) Integrated Virtual Ethernet 10Gb Daughter Card  8236  E8C  5613
4-Port 1Gb Integrated Virtual Ethernet Daughter Card  8236  E8C  5624
Blind Swap Type III Cassette- PCIe, Short Slot  8236  E8C  5646
Blind Swap Type III Cassette- PCI-X or PCIe, Standard Slot  8236  E8C  5647
IBM 2-Port 10/100/1000 Base-TX Ethernet PCI-X Adapter  8236  E8C  5706
10Gb FCoE PCIe Dual Port Adapter  8236  E8C  5708
1 Gigabit iSCSI TOE PCI-X on Copper Media Adapter  8236  E8C  5713
4-Port 10/100/1000 Base-TX PCI Express Adapter  8236  E8C  5717
10 Gigabit Ethernet-CX4 PCI Express Adapter  8236  E8C  5732
8 Gigabit PCI Express Dual Port Fibre Channel Adapter  8236  E8C  5735
POWER GXT145 PCI Express Graphics Accelerator  8236  E8C  5748
4 Gb Dual-Port Fibre Channel PCI-X 2.0 DDR Adapter  8236  E8C  5759
SATA Slimline DVD-RAM Drive  8236  E8C  5762
2-Port 10/100/1000 Base-TX Ethernet PCI Express Adapter  8236  E8C  5767
2-Port Gigabit Ethernet-SX PCI Express Adapter  8236  E8C  5768
10 Gigabit Ethernet-SR PCI Express Adapter  8236  E8C  5769
10 Gigabit Ethernet-LR PCI Express Adapter  8236  E8C  5772
4 Gigabit PCI Express Dual Port Fibre Channel