IBM NeXtScale nx360 M4 Installation And Service Manual

IBM NeXtScale nx360 M4 Type 5455
Installation and Service Guide

Note
Before using this information and the product it supports, read the general information in Appendix D, “Getting help and technical assistance,” on page 373, “Notices” on page 377, the Warranty Information document, and the Safety Information and Environmental Notices and User Guide documents on the IBM Documentation CD.
© Copyright IBM Corporation 2014.
US Government Users Restricted Rights – Use, duplication or disclosure restricted by GSA ADP Schedule Contract with IBM Corp.

Contents

Safety
Guidelines for trained service technicians
Inspecting for unsafe conditions
Guidelines for servicing electrical equipment
Safety statements
Chapter 1. The IBM NeXtScale nx360 M4 Compute Node Type 5455
The IBM Documentation CD
Hardware and software requirements
The Documentation Browser
Related documentation
Notices and statements in this document
Features and specifications
What your compute node offers
Reliability, availability, and serviceability features
Major components of the compute node
Major components of the storage tray
Major components of the GPU tray
Power, controls, and indicators
Compute node controls, connectors, and LEDs
Console breakout cable
Turning on the compute node
Turning off the compute node
System-board layouts
System-board internal connectors
System-board external connectors
System-board switches and jumpers
System-board LEDs and controls
Chapter 2. Configuration information and instructions
Updating the firmware
Configuring the server
Using the ServerGuide Setup and Installation CD
Using the Setup utility
Using the Boot Manager
Starting the backup server firmware
The UpdateXpress System Pack Installer
Changing the Power Policy option to the default settings after loading UEFI defaults
Using the integrated management module
Using the remote presence and blue-screen capture features
Using the embedded hypervisor
Configuring the Ethernet controller
Enabling Features on Demand Ethernet software
Enabling Features on Demand RAID software
Configuring RAID arrays
IBM Advanced Settings Utility program
Updating IBM Systems Director
Updating the Universal Unique Identifier (UUID)
Updating the DMI/SMBIOS data
Chapter 3. Troubleshooting
Start here
Diagnosing a problem
Undocumented problems
Service bulletins
Checkout procedure
About the checkout procedure
Performing the checkout procedure
Diagnostic tools
Power-supply LEDs
System pulse LEDs
Event logs
POST
IBM Dynamic System Analysis
Automated service request (call home)
IBM Electronic Service Agent
Error messages
Troubleshooting by symptom
General problems
Hard disk drive problems
Hypervisor problems
Intermittent problems
Keyboard, mouse, or USB-device problems
Memory problems
Microprocessor problems
Monitor and video problems
Network connection problems
Optional-device problems
Power problems
Serial-device problems
ServerGuide problems
Software problems
Universal Serial Bus (USB) port problems
Video problems
Solving power problems
Solving Ethernet controller problems
Solving undetermined problems
Problem determination tips
Recovering the server firmware (UEFI update failure)
In-band manual recovery method
In-band automated boot recovery method
Out-of-band method
Automated boot recovery (ABR)
Nx-boot failure
Chapter 4. Parts listing, IBM NeXtScale nx360 M4 Compute Node Type 5455
Replaceable server components
Structural parts
Power cords
Chapter 5. Removing and replacing components
Installation tools
Installing an optional device
Installation guidelines
System reliability guidelines
Handling static-sensitive devices
Returning a device or component
Updating the compute node configuration
Removing a compute node from a chassis
Installing a compute node in a chassis
Removing a storage tray from a compute node
Installing a storage tray into a compute node
Removing a GPU tray from a compute node
Installing a GPU tray into a compute node
Removing and replacing structural parts
Removing the compute node cover
Installing the compute node cover
Removing the air baffle
Replacing the air baffle
Removing a RAID adapter battery holder
Replacing a RAID adapter battery holder
Removing the PCI riser filler
Replacing the PCI riser filler
Removing the filler from the GPU tray
Replacing the filler on to the GPU tray
Removing the front handle
Installing the front handle
Removing the hard disk drive cage
Installing the hard disk drive cage
Removing and replacing Tier 1 CRUs
Removing the operator information panel
Installing the operator information panel
Removing the power paddle card from the GPU tray
Replacing the power paddle card on to the GPU tray
Removing the system battery
Replacing the system battery
Removing a memory module
Installing a memory module
Removing the optional 3.5-inch hard disk drive hardware RAID cage
Installing the optional 3.5-inch hard disk drive hardware RAID cage
Removing the hard disk drive backplate
Installing the hard disk drive backplate
Removing and installing drives
Removing a PCI riser-cage assembly
Replacing a PCI riser-cage assembly
Removing a PCI riser-cage assembly in the GPU tray
Replacing a PCI riser-cage assembly in the GPU tray
Removing an adapter/GPU adapter
Replacing an adapter/GPU adapter
Removing the USB flash drive
Installing the USB flash drive
Removing and replacing Tier 2 CRUs
Removing a microprocessor and heat sink
Replacing a microprocessor and heat sink
Removing the compute node
Installing the compute node
Internal cable routing and connectors
Cabling hard disk drive with software RAID signal cable
Cabling hard disk drive with ServeRAID SAS/SATA controller
Appendix A. Integrated Management Module II (IMM2) error messages
Appendix B. UEFI (POST) error codes
Appendix C. DSA diagnostic test results
DSA Broadcom network test results
DSA Brocade test results
DSA checkpoint panel test results
DSA CPU stress test results
DSA Emulex adapter test results
DSA EXA port ping test results
DSA hard drive test results
DSA Intel network test results
DSA LSI hard drive test results
DSA Mellanox adapter test results
DSA memory isolation test results
DSA memory stress test results
DSA Nvidia GPU test results
DSA optical drive test results
DSA system management test results
DSA tape drive test results
Appendix D. Getting help and technical assistance
Before you call
Using the documentation
Getting help and information from the World Wide Web
How to send DSA data to IBM
Creating a personalized support web page
Software service and support
Hardware service and support
IBM Taiwan product service
Notices
Trademarks
Important notes
Particulate contamination
Documentation format
Telecommunication regulatory statement
Electronic emission notices
Federal Communications Commission (FCC) statement
Industry Canada Class A emission compliance statement
Avis de conformité à la réglementation d'Industrie Canada
Australia and New Zealand Class A statement
European Union EMC Directive conformance statement
Germany Class A statement
Japan VCCI Class A statement
Japan Electronics and Information Technology Industries Association (JEITA) statement
Korea Communications Commission (KCC) statement
Russia Electromagnetic Interference (EMI) Class A statement
People's Republic of China Class A electronic emission statement
Taiwan Class A compliance statement
German Ordinance for Work gloss statement
Index

Safety

Before installing this product, read the Safety Information.
Antes de instalar este produto, leia as Informações de Segurança.
Læs sikkerhedsforskrifterne, før du installerer dette produkt.
Lees voordat u dit product installeert eerst de veiligheidsvoorschriften.
Ennen kuin asennat tämän tuotteen, lue turvaohjeet kohdasta Safety Information.
Avant d'installer ce produit, lisez les consignes de sécurité.
Vor der Installation dieses Produkts die Sicherheitshinweise lesen.
Prima di installare questo prodotto, leggere le Informazioni sulla Sicurezza.
Les sikkerhetsinformasjonen (Safety Information) før du installerer dette produktet.
Antes de instalar este produto, leia as Informações sobre Segurança.
Antes de instalar este producto, lea la información de seguridad.
Läs säkerhetsinformationen innan du installerar den här produkten.
Bu ürünü kurmadan önce güvenlik bilgilerini okuyun.

Guidelines for trained service technicians

This section contains information for trained service technicians.

Inspecting for unsafe conditions

Use this information to help you identify potential unsafe conditions in an IBM product that you are working on.
Each IBM product, as it was designed and manufactured, has required safety items to protect users and service technicians from injury. The information in this section addresses only those items. Use good judgment to identify potential unsafe conditions that might be caused by non-IBM alterations or attachment of non-IBM features or optional devices that are not addressed in this section. If you identify an unsafe condition, you must determine how serious the hazard is and whether you must correct the problem before you work on the product.
Consider the following conditions and the safety hazards that they present:
• Electrical hazards, especially primary power. Primary voltage on the frame can cause serious or fatal electrical shock.
• Explosive hazards, such as a damaged CRT face or a bulging capacitor.
• Mechanical hazards, such as loose or missing hardware.
To inspect the product for potential unsafe conditions, complete the following steps:
1. Make sure that the power is off and the power cords are disconnected.
2. Make sure that the exterior cover is not damaged, loose, or broken, and observe any sharp edges.
3. Check the power cords:
   • Make sure that the third-wire ground connector is in good condition. Use a meter to measure third-wire ground continuity for 0.1 ohm or less between the external ground pin and the frame ground.
   • Make sure that the power cords are the correct type.
   • Make sure that the insulation is not frayed or worn.
4. Remove the cover.
5. Check for any obvious non-IBM alterations. Use good judgment as to the safety of any non-IBM alterations.
6. Check inside the system for any obvious unsafe conditions, such as metal filings, contamination, water or other liquid, or signs of fire or smoke damage.
7. Check for worn, frayed, or pinched cables.
8. Make sure that the power-supply cover fasteners (screws or rivets) have not been removed or tampered with.

Guidelines for servicing electrical equipment

Observe these guidelines when you service electrical equipment.
• Check the area for electrical hazards such as moist floors, nongrounded power extension cords, and missing safety grounds.
• Use only approved tools and test equipment. Some hand tools have handles that are covered with a soft material that does not provide insulation from live electrical current.
• Regularly inspect and maintain your electrical hand tools for safe operational condition. Do not use worn or broken tools or testers.
• Do not touch the reflective surface of a dental mirror to a live electrical circuit. The surface is conductive and can cause personal injury or equipment damage if it touches a live electrical circuit.
• Some rubber floor mats contain small conductive fibers to decrease electrostatic discharge. Do not use this type of mat to protect yourself from electrical shock.
• Do not work alone under hazardous conditions or near equipment that has hazardous voltages.
• Locate the emergency power-off (EPO) switch, disconnecting switch, or electrical outlet so that you can turn off the power quickly in the event of an electrical accident.
• Disconnect all power before you perform a mechanical inspection, work near power supplies, or remove or install main units.
• Before you work on the equipment, disconnect the power cord. If you cannot disconnect the power cord, have the customer power-off the wall box that supplies power to the equipment and lock the wall box in the off position.
• Never assume that power has been disconnected from a circuit. Check it to make sure that it has been disconnected.
• If you have to work on equipment that has exposed electrical circuits, observe the following precautions:
  – Make sure that another person who is familiar with the power-off controls is near you and is available to turn off the power if necessary.
  – When you work with powered-on electrical equipment, use only one hand. Keep the other hand in your pocket or behind your back to avoid creating a complete circuit that could cause an electrical shock.
  – When you use a tester, set the controls correctly and use the approved probe leads and accessories for that tester.
  – Stand on a suitable rubber mat to insulate you from grounds such as metal floor strips and equipment frames.
• Use extreme care when you measure high voltages.
• To ensure proper grounding of components such as power supplies, pumps, blowers, fans, and motor generators, do not service these components outside of their normal operating locations.
• If an electrical accident occurs, use caution, turn off the power, and send another person to get medical aid.

Safety statements

These statements provide the caution and danger information that is used in this documentation.
Important:
Each caution and danger statement in this documentation is labeled with a number. This number is used to cross reference an English-language caution or danger statement with translated versions of the caution or danger statement in the Safety Information document.
For example, if a caution statement is labeled Statement 1, translations for that caution statement are in the Safety Information document under Statement 1.
Be sure to read all caution and danger statements in this documentation before you perform the procedures. Read any additional safety information that comes with your system or optional device before you install the device.
Statement 1
DANGER
Electrical current from power, telephone, and communication cables is hazardous.
To avoid a shock hazard:
• Do not connect or disconnect any cables or perform installation, maintenance, or reconfiguration of this product during an electrical storm.
• Connect all power cords to a properly wired and grounded electrical outlet.
• Connect to properly wired outlets any equipment that will be attached to this product.
• When possible, use one hand only to connect or disconnect signal cables.
• Never turn on any equipment when there is evidence of fire, water, or structural damage.
• Disconnect the attached power cords, telecommunications systems, networks, and modems before you open the device covers, unless instructed otherwise in the installation and configuration procedures.
• Connect and disconnect cables as described in the following table when installing, moving, or opening covers on this product or attached devices.
To Connect:
1. Turn everything OFF.
2. First, attach all cables to devices.
3. Attach signal cables to connectors.
4. Attach power cords to outlet.
5. Turn device ON.
To Disconnect:
1. Turn everything OFF.
2. First, remove power cords from outlet.
3. Remove signal cables from connectors.
4. Remove all cables from devices.
Statement 2
CAUTION: When replacing the lithium battery, use only IBM Part Number 33F8354 or an equivalent type battery recommended by the manufacturer. If your system has a module containing a lithium battery, replace it only with the same module type made by the same manufacturer. The battery contains lithium and can explode if not properly used, handled, or disposed of.
Do not:
• Throw or immerse into water
• Heat to more than 100°C (212°F)
• Repair or disassemble
Dispose of the battery as required by local ordinances or regulations.
Statement 3
CAUTION: When laser products (such as CD-ROMs, DVD drives, fiber optic devices, or transmitters) are installed, note the following:
• Do not remove the covers. Removing the covers of the laser product could result in exposure to hazardous laser radiation. There are no serviceable parts inside the device.
• Use of controls or adjustments or performance of procedures other than those specified herein might result in hazardous radiation exposure.
DANGER
Some laser products contain an embedded Class 3A or Class 3B laser diode. Note the following.
Laser radiation when open. Do not stare into the beam, do not view directly with optical instruments, and avoid direct exposure to the beam.
Class 1 Laser Product Laser Klasse 1 Laser Klass 1 Luokan 1 Laserlaite Appareil A Laser de Classe 1
Statement 4
CAUTION: Use safe practices when lifting.
18 kg (39.7 lb) 32 kg (70.5 lb) 55 kg (121.2 lb)
Statement 5
CAUTION: The power control button on the device and the power switch on the power supply do not turn off the electrical current supplied to the device. The device also might have more than one power cord. To remove all electrical current from the device, ensure that all power cords are disconnected from the power source.
Statement 6
CAUTION: If you install a strain-relief bracket option over the end of the power cord that is connected to the device, you must connect the other end of the power cord to an easily accessible power source.
Statement 8
CAUTION: Never remove the cover on a power supply or any part that has the following label attached.
Hazardous voltage, current, and energy levels are present inside any component that has this label attached. There are no serviceable parts inside these components. If you suspect a problem with one of these parts, contact a service technician.
Statement 12
CAUTION: The following label indicates a hot surface nearby.
Statement 26
CAUTION: Do not place any object on top of rack-mounted devices.
Statement 27
CAUTION: Hazardous moving parts are nearby.
Rack Safety Information, Statement 2
DANGER
• Always lower the leveling pads on the rack cabinet.
• Always install stabilizer brackets on the rack cabinet.
• Always install servers and optional devices starting from the bottom of the rack cabinet.
• Always install the heaviest devices in the bottom of the rack cabinet.

Chapter 1. The IBM NeXtScale nx360 M4 Compute Node Type 5455

The IBM NeXtScale nx360 M4 Compute Node Type 5455 is a high-availability, scalable compute node that is optimized to support the next-generation microprocessor technology and is ideally suited for medium and large businesses.
The IBM NeXtScale nx360 M4 Compute Node Type 5455 is supported in the IBM NeXtScale n1200 Enclosure only.
This documentation provides the following information about setting up and troubleshooting the compute node:
• Starting and configuring the compute node
• Installing the operating system
• Diagnosing problems
• Installing, removing, and replacing components
Packaged with the compute node are software CDs that help you configure hardware, install device drivers, and install the operating system.
If firmware and documentation updates are available, you can download them from the IBM website. The server might have features that are not described in the documentation that comes with the server, and the documentation might be updated occasionally to include information about those features, or technical updates might be available to provide additional information that is not included in the server documentation. To check for updates, go to http://www.ibm.com/supportportal.
The compute node comes with a limited warranty. For information about the terms of the warranty and getting service and assistance, see the Warranty Information document for your compute node.
You can download the IBM ServerGuide Setup and Installation CD to help you configure the hardware, install device drivers, and install the operating system.
For a list of supported optional devices for the server, see http://www.ibm.com/systems/info/x86servers/serverproven/compat/us.
See the Rack Installation Instructions document on the IBM System x Documentation CD for complete rack installation and removal instructions.
You can obtain up-to-date information about the server and other IBM server products at http://www.ibm.com/systems/x. At http://www.ibm.com/supportportal, you can create a personalized support page by identifying IBM products that are of interest to you. From this personalized page, you can subscribe to weekly email notifications about new technical documents, search for information and downloads, and access various administrative services.
The compute node might have features that are not described in the documentation that comes with the compute node. The documentation might be updated occasionally to include information about those features. Technical updates might also be available to provide additional information that is not included in the compute node documentation. To obtain the most up-to-date documentation for this product, go to http://publib.boulder.ibm.com/infocenter/flexsys/information/index.jsp.
You can subscribe to information updates that are specific to your compute node at http://www.ibm.com/support/mynotifications.
The model number and serial number are on the ID label on the bezel on the front of the compute node, and on a label on the bottom of the compute node that is visible when the compute node is not in the IBM NeXtScale n1200 Enclosure. If the compute node comes with an RFID tag, the RFID tag covers the ID label on the bezel on the front of the compute node, but you can open the RFID tag to see the ID label behind it.
Note: The illustrations in this document might differ slightly from your hardware.
Figure 1. NeXtScale nx360 M4 compute node (callout: node serial number)
In addition, the system service label, which is on the cover of the server, provides a QR code for mobile access to service information. You can scan the QR code using a QR code reader and scanner with a mobile device and get quick access to the IBM Service Information website. The IBM Service Information website provides additional information for parts installation and replacement videos, and error codes for server support.
The following illustration shows the QR code (http://ibm.co/1hrOZP0):
Figure 2. QR code

The IBM Documentation CD

The IBM Documentation CD contains documentation for the server in Portable Document Format (PDF) and includes the IBM Documentation Browser to help you find information quickly.

Hardware and software requirements

This section lists the hardware and software requirements of the IBM Documentation CD.
The IBM Documentation CD requires the following minimum hardware and software:
• Microsoft Windows or Red Hat Linux
• 100 MHz microprocessor
• 32 MB of RAM
• Adobe Acrobat Reader 3.0 (or later) or xpdf, which comes with Linux operating systems

The Documentation Browser

Use the Documentation Browser to browse the contents of the CD, read brief descriptions of the documents, and view documents, using Adobe Acrobat Reader or xpdf.
The Documentation Browser automatically detects the regional settings in use in your server and displays the documents in the language for that region (if available). If a document is not available in the language for that region, the English-language version is displayed. Use one of the following procedures to start the Documentation Browser:
• If Autostart is enabled, insert the CD into the CD or DVD drive. The Documentation Browser starts automatically.
• If Autostart is disabled or is not enabled for all users, use one of the following procedures:
  – If you are using a Windows operating system, insert the CD into the CD or DVD drive and click Start > Run. In the Open field, type:
    e:\win32.bat
    where e is the drive letter of the CD or DVD drive, and click OK.
  – If you are using Red Hat Linux, insert the CD into the CD or DVD drive; then, run the following command from the /mnt/cdrom directory:
    sh runlinux.sh
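The two manual start procedures can be sketched as a small Python helper that picks the launcher for the running operating system. This is only an illustration: the function names `browser_command` and `start_browser` and the `cd_root` argument are hypothetical, and running the batch file through `cmd /c` is an assumption. Only `win32.bat`, `runlinux.sh`, and the `/mnt/cdrom` directory come from the procedure itself.

```python
from __future__ import annotations

import platform
import subprocess
from pathlib import PureWindowsPath


def browser_command(system: str, cd_root: str) -> tuple[list[str], str]:
    """Return (command, working directory) for starting the Documentation Browser.

    system  -- value of platform.system(), for example "Windows" or "Linux"
    cd_root -- where the CD is available (assumed: "e:\\" or "/mnt/cdrom")
    """
    if system == "Windows":
        # Same effect as typing e:\win32.bat in Start > Run (cmd /c is an assumption).
        batch_file = str(PureWindowsPath(cd_root) / "win32.bat")
        return (["cmd", "/c", batch_file], cd_root)
    if system == "Linux":
        # Same effect as running "sh runlinux.sh" from the CD directory.
        return (["sh", "runlinux.sh"], cd_root)
    raise ValueError(f"no start procedure is documented for {system!r}")


def start_browser(cd_root: str) -> None:
    """Start the Documentation Browser from the CD at cd_root."""
    command, working_dir = browser_command(platform.system(), cd_root)
    subprocess.run(command, cwd=working_dir, check=True)
```

Calling `start_browser("/mnt/cdrom")` on Linux or `start_browser("e:\\")` on Windows would then mirror the manual steps, provided the CD is actually available at that location.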
Select the server from the Product menu. The Available Topics list displays all the documents for the server. Some documents might be in folders. A plus sign (+) indicates each folder or document that has additional documents under it. Click the plus sign to display the additional documents.
When you select a document, a description of the document is displayed under Topic Description. To select more than one document, press and hold the Ctrl key while you select the documents. Click View Book to view the selected document or documents in Acrobat Reader or xpdf. If you selected more than one document, all the selected documents are opened in Acrobat Reader or xpdf.
To search all the documents, type a word or word string in the Search field and click Search. The documents in which the word or word string appears are listed in order of the most occurrences. Click a document to view it, and press Ctrl+F to use the Acrobat search function, or press Alt+F to use the xpdf search function within the document.
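The ordering of search results by number of occurrences can be illustrated with a short Python sketch. It is not the Documentation Browser's actual implementation: the function name `rank_documents`, the case-insensitive matching, and the alphabetical tie-break are all assumptions made for the example.

```python
from __future__ import annotations


def rank_documents(documents: dict[str, str], query: str) -> list[tuple[str, int]]:
    """List (title, occurrences) for documents containing query, most occurrences first.

    documents -- mapping of document title to its text (hypothetical input format)
    query     -- the word or word string typed in the Search field
    """
    needle = query.lower()
    if not needle:
        return []
    # Count non-overlapping, case-insensitive occurrences in each document.
    counts = {title: text.lower().count(needle) for title, text in documents.items()}
    hits = [(title, count) for title, count in counts.items() if count > 0]
    # Most occurrences first; ties are broken alphabetically (an assumption).
    return sorted(hits, key=lambda item: (-item[1], item[0]))
```

Documents that do not contain the search string are left out of the result, which matches the description of listing only the documents in which the string appears.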
Click Help for detailed information about using the Documentation Browser.

Related documentation

This Installation and Service Guide contains general information about the server including how to set up and cable the server, how to install supported optional devices, how to configure the server, and information to help you solve problems yourself and information for service technicians.
The following documentation also comes with the server:
• Warranty Information
  This document is in printed format and comes with the server. It contains warranty terms and a pointer to the IBM Statement of Limited Warranty on the IBM website.
• Important Notices
  This document is in printed format and comes with the server. It contains information about the safety, environmental, and electronic emission notices for your IBM product.
• Environmental Notices and User Guide
  This document is in PDF format on the IBM Documentation CD. It contains translated environmental notices.
• IBM License Agreement for Machine Code
  This document is in PDF format on the IBM Documentation CD. It provides translated versions of the IBM License Agreement for Machine Code for your product.
• Licenses and Attributions Document
  This document is in PDF format on the IBM Documentation CD. It provides the open source notices.
• Safety Information
  This document is in PDF format on the IBM Documentation CD. It contains translated caution and danger statements. Each caution and danger statement that appears in the documentation has a number that you can use to locate the corresponding statement in your language in the Safety Information document.
Depending on the server model, additional documentation might be included on the IBM Documentation CD.
The ToolsCenter for System x and BladeCenter is an online information center that contains information about tools for updating, managing, and deploying firmware, device drivers, and operating systems. The ToolsCenter for System x and BladeCenter is at http://publib.boulder.ibm.com/infocenter/toolsctr/v1r0/.
The server might have features that are not described in the documentation that you received with the server. The documentation might be updated occasionally to include information about those features, or technical updates might be available to provide additional information that is not included in the server documentation. These updates are available from the IBM website. To check for updates, go to http://www.ibm.com/supportportal.

Notices and statements in this document

The caution and danger statements in this document are also in the multilingual Safety Information document, which is on the IBM Documentation CD. Each statement is numbered for reference to the corresponding statement in your language in the Safety Information document.
The following notices and statements are used in this document:
• Note: These notices provide important tips, guidance, or advice.
• Important: These notices provide information or advice that might help you avoid inconvenient or problem situations.
• Attention: These notices indicate potential damage to programs, devices, or data. An attention notice is placed just before the instruction or situation in which damage might occur.
• Caution: These statements indicate situations that can be potentially hazardous to you. A caution statement is placed just before the description of a potentially hazardous procedure step or situation.
• Danger: These statements indicate situations that can be potentially lethal or extremely hazardous to you. A danger statement is placed just before the description of a potentially lethal or extremely hazardous procedure step or situation.

Features and specifications

Use this information to view specific information about the compute node, such as compute node hardware features and the dimensions of the compute node.
Notes:
1. Power, cooling, and chassis systems management are provided by the IBM NeXtScale n1200 Enclosure chassis.
2. The operating system in the compute node must provide USB support for the compute node to recognize and use USB media drives and devices. The IBM NeXtScale n1200 Enclosure chassis uses USB for internal communication with these devices.
The following table is a summary of the features and specifications of the NeXtScale nx360 M4 compute node.
Microprocessor (depending on the model):
v Supports up to two multi-core microprocessors (one installed)
v Level-3 cache
v Two QuickPath Interconnect (QPI) links with speeds up to 8.0 GT per second
Note:
v Use the Setup utility to determine the type and speed of the microprocessors in the server.
v For a list of supported microprocessors, see http://www.ibm.com/systems/info/x86servers/serverproven/compat/us.
Memory:
v 8 dual inline memory module (DIMM) connectors
v Type: Low-profile (LP) double-data rate (DDR3) DRAM
v Supports 4 GB, 8 GB, and 16 GB DIMMs with up to 128 GB of total memory on the system board
v Support for UDIMMs and RDIMMs (combining is not supported)
Integrated functions:
v Integrated Management Module II (IMM2), which consolidates multiple management functions in a single chip
v Concurrent COM/VGA/2x USB (KVM)
v System error LEDs
v Software RAID support for RAID level-0, RAID level-1, or RAID level-10
v Hardware RAID support for RAID level-0, RAID level-1, RAID level-5, or RAID level-10
v Wake on LAN (WOL)
Drive expansion bays (depending on the model):
Supports up to eight 3.5-inch SATA (if the storage tray is installed, up to 7 in the storage tray and 1 in the compute node), two 2.5-inch SATA/SAS, or four 1.8-inch solid-state drives (see Note 1).
Attention: As a general consideration, do not mix standard 512-byte and advanced 4-KB format drives in the same RAID array because it might lead to performance issues.
Upgradeable firmware:
All firmware is field upgradeable.
PCI expansion slots (depending on your model):
v Compute node
– PCI Express x16 (x8 mechanically) slots (PCIe3.0, full-height,
half-length)
v GPU tray
– Two PCI Express x16 (x16 mechanically) slots (PCIe3.0, full-height,
full-length)
Size:
v Compute node
– Height: 41 mm (1.6 in)
– Depth: 659 mm (25.9 in)
– Width: 216 mm (8.5 in)
– Weight estimation (based on the LFF HDD within the compute node): 6.05 kg (13.31 lb)
v Storage tray
– Height: 58.3 mm (2.3 in)
– Depth: 659 mm (25.9 in)
– Width: 216 mm (8.5 in)
– Weight estimation (with 7 hard disk drives installed): 8.64 kg (19 lb)
v GPU tray
– Height: 58.3 mm (2.3 in)
– Depth: 659 mm (25.9 in)
– Width: 216 mm (8.5 in)
– Weight estimation (with no GPU adapter installed): 3.33 kg (7.34 lb)
Environment:
The NeXtScale nx360 M4 compute node complies with ASHRAE class A3 specifications.
Server on (see Note 2):
v Temperature: 5°C to 40°C (41°F to 104°F) (see Note 3)
v Humidity, non-condensing: -12°C dew point (10.4°F) and 8% to 85% relative humidity (see Notes 4 and 5)
v Maximum dew point: 24°C (75°F)
v Maximum altitude: 3048 m (10,000 ft)
v Maximum rate of temperature change: 5°C/hr (9°F/hr) (see Note 6)
Server off (see Note 7):
v Temperature: 5°C to 45°C (41°F to 113°F)
v Relative humidity: 8% to 85%
v Maximum dew point: 27°C (80.6°F)
Storage (non-operating):
v Temperature: 1°C to 60°C (33.8°F to 140.0°F)
v Maximum altitude: 3,050 m (10,000 ft)
v Relative humidity: 5% to 80%
v Maximum dew point: 29°C (84.2°F)
Shipment (non-operating) (see Note 8):
v Temperature: -40°C to 60°C (-40°F to 140.0°F)
v Maximum altitude: 10,700 m (35,105 ft)
v Relative humidity: 5% to 100% (see Note 9)
v Maximum dew point: 29°C (84.2°F)
Particulate contamination
Attention:
v Design to ASHRAE Class A3, temperature: 36°C - 40°C (96.8°F - 104°F), with relaxed support:
– Support cloud-like workloads with no performance degradation acceptable (turbo-off)
– Under no circumstance can any combination of the worst case workload and configuration result in system shutdown or design exposure at 40°C
– The worst case workload (such as linpack and turbo-on) may have performance degradation
v Airborne particulates and reactive gases acting alone or in combination with other environmental factors such as humidity or temperature might pose a risk to the compute node. For information about the limits for particulates and gases, see “Particulate contamination” on page 379.
Notes:
1. Onboard LSI software SATA RAID supports SATA drives and Solid state drives (SSD). SAS drives are not supported for software RAID. The booting and use of internal drives with VMware is not supported with the ServeRAID C100 (software RAID) controller.
2. Chassis is powered on.
3. A3 - Derate maximum allowable temperature 1°C/175 m above 950 m.
4. The minimum humidity level for class A3 is the higher (more moisture) of the -12°C dew point and the 8% relative humidity. These intersect at approximately 25°C. Below this intersection (~25°C), the dew point (-12°C) represents the minimum moisture level; above the intersection, relative humidity (8%) is the minimum.
5. Moisture levels lower than 0.5°C DP, but not lower than -10°C DP or 8% relative humidity, can be accepted if appropriate control measures are implemented to limit the generation of static electricity on personnel and equipment in the data center. All personnel and mobile furnishings and equipment must be connected to ground via an appropriate static control system. The following items are considered the minimum requirements:
a. Conductive materials (conductive flooring, conductive footwear on all personnel who go into the data center; all mobile furnishings and equipment will be made of conductive or static dissipative materials).
b. During maintenance on any hardware, a properly functioning wrist strap must be used by any personnel who contact IT equipment.
6. 5°C/hr for data centers employing tape drives and 20°C/hr for data centers employing disk drives.
7. Chassis is removed from original shipping container and is installed but not in use, for example, during repair, maintenance, or upgrade.
8. The equipment acclimation period is 1 hour per 20°C of temperature change from the shipping environment to the operating environment.
9. Condensation, but not rain, is acceptable.
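The arithmetic in Notes 3 and 8 can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the derating is assumed to be linear (continuous) above 950 m, which the note does not state explicitly.

```python
# Illustrative sketch of Notes 3 and 8; names are hypothetical, not from IBM.

BASE_MAX_TEMP_C = 40.0       # class A3 upper operating temperature (table above)
DERATE_START_M = 950.0       # Note 3: derating begins above this altitude
DERATE_M_PER_DEG_C = 175.0   # Note 3: 1 degree C per 175 m
MAX_ALTITUDE_M = 3048.0      # maximum operating altitude (table above)
ACCLIMATION_DEG_C_PER_HOUR = 20.0  # Note 8: 1 hour per 20 degrees C of change


def max_operating_temp_c(altitude_m: float) -> float:
    """Maximum allowable operating temperature at a given altitude.

    Assumes linear derating; a stepwise reading of Note 3 would differ.
    """
    if altitude_m > MAX_ALTITUDE_M:
        raise ValueError("altitude exceeds the 3048 m operating limit")
    excess_m = max(0.0, altitude_m - DERATE_START_M)
    return BASE_MAX_TEMP_C - excess_m / DERATE_M_PER_DEG_C


def acclimation_hours(shipping_temp_c: float, operating_temp_c: float) -> float:
    """Acclimation period for a move between two temperatures (Note 8)."""
    change_c = abs(operating_temp_c - shipping_temp_c)
    return change_c / ACCLIMATION_DEG_C_PER_HOUR
```

For example, under these assumptions a node at 1300 m is 350 m above the derating threshold, so its limit drops by 2°C, and equipment moved from -20°C to 20°C would need 2 hours of acclimation.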

What your compute node offers

Your compute node offers features such as the integrated management module II, hard disk drive support, systems-management support, microprocessor technology, integrated network support, I/O expansion, large system-memory capacity, light path diagnostics LEDs, PCI Express®, and power throttling.
v Features on Demand
If a Features on Demand feature is integrated in the compute node or in an optional device that is installed in the compute node, you can purchase an activation key to activate the feature. For information about Features on Demand, see http://www.ibm.com/systems/x/fod/.
v Flexible network support
The compute node provides flexible network capabilities:
– Models with embedded Ethernet
The server comes with an integrated dual-port Intel Gigabit Ethernet controller, which supports connection to a 10 Mbps, 100 Mbps, or 1000 Mbps network.
– Models without embedded Ethernet
The compute node has connectors on the system board for optional expansion adapters for adding network communication capabilities to the compute node. This provides the flexibility to install expansion adapters that support a variety of network communication technologies.
v Hard disk drive support
The compute node supports up to one 3.5-inch simple-swap SATA, two 2.5-inch simple-swap SATA/SAS, or four 1.8-inch simple-swap solid-state drives. You can implement RAID 0, RAID 1, RAID 5, or RAID 10 for the drives with hardware RAID. 2.5-inch SATA drives and solid-state drives (SSDs) also support software RAID.
v IBM ServerGuide Setup and Installation CD
The ServerGuide Setup and Installation CD, which you can download from the web, provides programs to help you set up the server and install a Windows operating system. The ServerGuide program detects installed optional hardware devices and provides the correct configuration programs and device drivers. For more information about the ServerGuide Setup and Installation CD, see “Using the ServerGuide Setup and Installation CD” on page 24.
v Integrated management module II (IMM2)
The integrated management module II (IMM2) combines service processor functions, video controller, and remote presence and blue-screen capture features in a single chip. The IMM provides advanced service-processor control, monitoring, and alerting functions. If an environmental condition exceeds a threshold or if a system component fails, the IMM lights LEDs to help you diagnose the problem, records the error in the IMM event log, and alerts you to the problem. Optionally, the IMM also provides a virtual presence capability for remote server management. The IMM provides remote server management through the following industry-standard interfaces:
– Intelligent Platform Management Interface (IPMI) version 2.0
– Simple Network Management Protocol (SNMP) version 3.0
– Common Information Model (CIM)
– Web browser
For additional information, see “Using the integrated management module” on page 33 and the Integrated Management Module II User’s Guide at http://www.ibm.com/supportportal.
v Large system-memory capacity
The compute node supports up to 128 GB of system memory. The memory controller provides support for up to 8 industry-standard registered ECC DDR3 low-profile (LP) DIMMs on the system board. For the most current list of supported DIMMs, see http://www.ibm.com/systems/info/x86servers/serverproven/compat/us.
v Light path diagnostics
Light path diagnostics provides LEDs to help you diagnose problems. For more information about light path diagnostics and the LEDs, see “Compute node controls, connectors, and LEDs” on page 13.
v Microprocessor technology
The compute node supports up to two multi-core Intel Xeon microprocessors. For more information about supported microprocessors and their part numbers, see http://www.ibm.com/systems/info/x86servers/serverproven/compat/us.
Note: The optional microprocessors that IBM supports are limited by the capacity and capability of the compute node. Any microprocessor that you install must have the same specifications as the microprocessor that came with the compute node.
v Mobile access to IBM Service Information website
The server provides a QR code on the system service label, which is on the cover of the server, that you can scan using a QR code reader and scanner with a mobile device to get quick access to the IBM Service Information website. The IBM Service Information website provides additional information for parts installation and replacement videos, and error codes for server support. For the QR code, see Chapter 1, “The IBM NeXtScale nx360 M4 Compute Node Type 5455,” on page 1.
v PCI Express
PCI Express is a serial interface that is used for chip-to-chip interconnect and expansion adapter interconnect. You can add optional I/O and storage devices.
Optional expansion nodes are available to provide a cost-effective way for you to increase and customize the capabilities of the compute node. Expansion nodes support a wide variety of industry-standard PCI Express, network, storage, and graphics adapters. For additional information, see .
v Power throttling
By enforcing a power policy known as power-domain oversubscription, the IBM NeXtScale n1200 Enclosure can share the power load between twelve power supplies to ensure sufficient power for each device in the IBM NeXtScale n1200 Enclosure. This policy is enforced when the initial power is applied to the IBM NeXtScale n1200 Enclosure or when a compute node is inserted into the IBM NeXtScale n1200 Enclosure.
The following settings for this policy are available:
– Basic power management
– Power module redundancy
– Power module redundancy with compute node throttling allowed

Reliability, availability, and serviceability features

Three of the most important features in compute node design are reliability, availability, and serviceability (RAS). These RAS features help to ensure the integrity of the data that is stored in the compute node, the availability of the compute node when you need it, and the ease with which you can diagnose and correct problems.
The compute node has the following RAS features:
v Advanced Configuration and Power Interface (ACPI)
v Automatic server restart (ASR)
v Built-in diagnostics using DSA Preboot
v Built-in monitoring for temperature, voltage, and hard disk drives
v Customer support center 24 hours per day, 7 days a week (service availability varies by country; response time varies depending on the number and nature of incoming calls)
v Customer upgrade of flash ROM-resident code and diagnostics
v Customer-upgradeable Unified Extensible Firmware Interface (UEFI) code and diagnostics
v ECC protected DDR3 DIMMs
v ECC protection on the L2 cache
v Error codes and messages
v Integrated management module II (IMM2)
v Light path diagnostics
v Memory parity testing
v Microprocessor built-in self-test (BIST) during power-on self-test (POST)
v Microprocessor serial number access
v Processor presence detection
v ROM-resident diagnostics
v System-error logging
v Vital product data (VPD) on memory
v Wake on LAN capability
v Wake on PCI (PME) capability

Major components of the compute node

Use this information to locate the major components on the compute node.
The following illustration shows the major components of the compute node.
Figure 3. Major components of the compute node (callouts: Cover, Left air baffle, Right air baffle, Heatsink, Heatsink filler, Microprocessor, DIMM, Battery holder, PCI riser cage, 1.8-inch solid-state drive, 1.8-inch solid-state drive cage, 1.8-inch solid-state drive backplate, 2.5-inch hard disk drive, 2.5-inch hard disk drive cage, 2.5-inch hard disk drive backplate, 3.5-inch hard disk drive, 3.5-inch hard disk drive cage)

Major components of the storage tray

Use this information to locate the major components on the storage tray.
The storage tray is installed on the top of a compute node. Each storage tray supports up to seven 3.5-inch LFF SATA hard disk drives.
A ServeRAID adapter can be connected from the compute node through the PCIe interface to support RAID level-0, RAID level-1, RAID level-5, or RAID level-10.
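To make the trade-off between these RAID levels concrete, the following Python sketch estimates usable capacity from drive count and drive size using the conventional definitions of each level. It is not taken from ServeRAID documentation: the function name is hypothetical, RAID 1 is assumed to be a two-drive mirror, and real controllers may impose other limits or reserve some space.

```python
# Illustrative only: textbook capacity rules, not ServeRAID-specific behavior.

def usable_capacity_gb(raid_level: int, drive_count: int, drive_gb: float) -> float:
    """Estimate usable capacity for equal-sized drives in one array."""
    if raid_level == 0:
        # Striping: all capacity is usable, no redundancy.
        if drive_count < 1:
            raise ValueError("RAID 0 needs at least one drive")
        return drive_count * drive_gb
    if raid_level == 1:
        # Mirroring (assumed here to be exactly two drives).
        if drive_count != 2:
            raise ValueError("this sketch models RAID 1 as a two-drive mirror")
        return drive_gb
    if raid_level == 5:
        # Striping with distributed parity: one drive's worth of parity.
        if drive_count < 3:
            raise ValueError("RAID 5 needs at least three drives")
        return (drive_count - 1) * drive_gb
    if raid_level == 10:
        # Striped mirrors: half of the raw capacity is usable.
        if drive_count < 4 or drive_count % 2 != 0:
            raise ValueError("RAID 10 needs an even number of drives, four or more")
        return (drive_count // 2) * drive_gb
    raise ValueError(f"unsupported RAID level: {raid_level}")
```

Under these assumptions, a fully populated storage tray (seven drives of 1000 GB each) in a single RAID 5 array would yield about 6000 GB of usable space.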
The following illustration shows the major components of the storage tray.
Figure 4. Major components of the storage tray

Major components of the GPU tray

Use this information to locate the major components on the GPU tray.
The GPU tray is installed on the top of a compute node. Each GPU tray supports up to two full-height, full-length graphics processing unit (GPU) adapters.
The following illustration shows the major components of the GPU tray.
Figure 5. Major components of the GPU tray (callouts: Front PCI riser assembly, Rear PCI riser assembly, Air baffle, Power paddle card)

Power, controls, and indicators

Use this information to view power features, turn on and turn off the compute node, and view the functions of the controls and indicators.

Compute node controls, connectors, and LEDs

Use this information for details about the controls, connectors, and LEDs.
The following illustration identifies the buttons, connectors, and LEDs on the control panel.
Figure 6. Compute node control panel buttons, connectors, and LEDs (callouts: Power-on LED/power button, Check log LED, Locator LED, System-error LED, Dual-port network adapter (optional), Pull-out tag, KVM connector, Ethernet 1 connector (shared management port), Ethernet 2 connector, Ethernet link activity/status LED, Ethernet connection speed LED, Management connector (dedicated management port))
Power button/LED
When the compute node is connected to power through the IBM NeXtScale n1200 Enclosure, press this button to turn on or turn off the compute node.
This button is also the power LED. This green LED indicates the power status of the compute node:
v Flashing rapidly: The LED flashes rapidly for the following reasons:
– The compute node has been installed in a chassis. When you install the compute node, the LED flashes rapidly for up to 90 seconds while the integrated management module II (IMM2) in the compute node is initializing.
– The IBM NeXtScale n1200 Enclosure does not have enough power to turn on the compute node.
– The IMM2 in the compute node is not communicating with the Chassis Management Module.
v Flashing slowly: The compute node is connected to power through the IBM NeXtScale n1200 Enclosure and is ready to be turned on.
v Lit continuously: The compute node is connected to power through the IBM NeXtScale n1200 Enclosure and is turned on.
When the compute node is on, pressing this button causes an orderly shutdown of the compute node so that it can be removed safely from the chassis. This includes shutting down the operating system (if possible) and removing power from the compute node.
If an operating system is running, you might have to press the button for approximately 4 seconds to initiate the shutdown.
Attention: Pressing the button for 4 seconds forces the operating system to shut down immediately. Data loss is possible.
Locator LED
The system administrator can remotely light this blue LED to aid in visually locating the compute node.
Check log LED
When this yellow LED is lit, it indicates that a system error has occurred. Check the “Event logs” on page 55 for additional information.
System error LED
When this yellow LED is lit, it indicates that a system error has occurred. A system-error LED is also on the rear of the server. An LED on the light path diagnostics panel on the operator information panel or on the system board is also lit to help isolate the error. This LED is controlled by the IMM.
KVM connector
Connect the console breakout cable to this connector (see “Console breakout cable” on page 15 for more information).
Note: It is best practice to connect the console breakout cable to only one compute node at a time in each IBM NeXtScale n1200 Enclosure.
Ethernet connectors
Use either of these connectors to connect the server to a network. When you enable shared Ethernet for IMM2 in the Setup utility, you can access the IMM2 using either the Ethernet 1 or the system-management Ethernet (default) connector. See “Using the Setup utility” on page 25 for more information.
Ethernet link activity/status LED
When any of these LEDs is lit, it indicates that the server is transmitting to or receiving signals from the Ethernet LAN that is connected to the Ethernet port that corresponds to that LED.
Management connector
Use this connector to connect the server to a network for full systems-management information control. This connector is used only by the Integrated Management Module II (IMM2). A dedicated management network provides additional security by physically separating the management network traffic from the production network. You can use the Setup utility to configure the server to use a dedicated systems management network or a shared network.
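The power LED states described under "Power button/LED" above lend themselves to a small lookup table, for example in a monitoring or documentation script. The sketch below simply restates the manual's descriptions in Python; the enum and function names are hypothetical and do not correspond to any IBM API.

```python
# Illustrative lookup of the documented power LED states; names are hypothetical.
from enum import Enum


class PowerLedState(Enum):
    FLASHING_RAPIDLY = "flashing rapidly"
    FLASHING_SLOWLY = "flashing slowly"
    LIT_CONTINUOUSLY = "lit continuously"


# Possible meanings of each state, as described in this section.
POWER_LED_MEANINGS = {
    PowerLedState.FLASHING_RAPIDLY: (
        "IMM2 is initializing after installation (up to 90 seconds)",
        "the n1200 Enclosure does not have enough power to turn on the node",
        "IMM2 is not communicating with the Chassis Management Module",
    ),
    PowerLedState.FLASHING_SLOWLY: (
        "connected to power through the enclosure and ready to be turned on",
    ),
    PowerLedState.LIT_CONTINUOUSLY: (
        "connected to power through the enclosure and turned on",
    ),
}


def ready_to_turn_on(state: PowerLedState) -> bool:
    """True only for the state the manual describes as ready to be turned on."""
    return state is PowerLedState.FLASHING_SLOWLY
```

A rapidly flashing LED has three possible causes, so a script built on this table should report all of them rather than pick one.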

Console breakout cable

Use this information for details about the console breakout cable.
Use the console breakout cable to connect external I/O devices to the compute node. The console breakout cable connects through the KVM connector (see “Compute node controls, connectors, and LEDs” on page 13). The console breakout cable has connectors for a display device (video), two USB connectors for a USB keyboard and mouse, and a serial interface connector.
The following illustration identifies the connectors and components on the console breakout cable.
Figure 7. Console breakout cable (callouts: Serial connector, USB ports (2), VGA connector, Captive screws, KVM connector)
Note: When you install the KVM cable, gently press the pull-out tag down slightly so that it does not interfere with the KVM cable.

Turning on the compute node

Use this information for details about turning on the compute node.
After you connect the compute node to power through the IBM NeXtScale n1200 Enclosure, the compute node can be started in any of the following ways:
v You can press the power button on the front of the compute node (see “Compute node controls, connectors, and LEDs” on page 13) to start the compute node. The power button works only if local power control is enabled for the compute node.
Notes:
1. Wait until the power LED on the compute node flashes slowly before you press the power button. While the IMM2 in the compute node is initializing and synchronizing with the Chassis Management Module, the power LED flashes rapidly, and the power button on the compute node does not respond. This process can take approximately 90 seconds after the compute node has been installed.
2. While the compute node is starting, the power LED on the front of the compute node is lit and does not flash. See “Compute node controls, connectors, and LEDs” on page 13 for the power LED states.
v You can turn on the compute node through the Wake on LAN feature. The
compute node must be connected to power (the power LED is flashing slowly) and must be communicating with the Chassis Management Module. The operating system must support the Wake on LAN feature, and the Wake on LAN feature must be enabled through the Chassis Management Module web interface.
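For readers unfamiliar with how Wake on LAN is triggered from another machine, the sketch below builds and sends a conventional "magic packet": six 0xFF bytes followed by the target MAC address repeated 16 times. This manual does not describe the packet format; the layout is the general Wake on LAN convention, and the function names, the broadcast address, and UDP port 9 are assumptions for illustration.

```python
# Illustrative Wake on LAN sender; names, address, and port are assumptions.
import socket


def build_magic_packet(mac: str) -> bytes:
    """Return the 102-byte magic packet for a MAC such as '00:11:22:33:44:55'."""
    hex_digits = mac.replace(":", "").replace("-", "")
    if len(hex_digits) != 12:
        raise ValueError("MAC address must contain 12 hexadecimal digits")
    mac_bytes = bytes.fromhex(hex_digits)  # raises ValueError on non-hex input
    return b"\xff" * 6 + mac_bytes * 16


def send_magic_packet(mac: str, broadcast: str = "255.255.255.255", port: int = 9) -> None:
    """Broadcast the magic packet over UDP on the local network."""
    packet = build_magic_packet(mac)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        sock.sendto(packet, (broadcast, port))
```

The packet only has an effect when the prerequisites listed above are met, that is, the node is connected to power, is communicating with the Chassis Management Module, and has Wake on LAN enabled.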

Turning off the compute node

Use this information for details about turning off the compute node.
When you turn off the compute node, it is still connected to power through the IBM NeXtScale n1200 Enclosure. The compute node can respond to requests from the IMM2, such as a remote request to turn on the compute node. To remove all power from the compute node, you must remove it from the IBM NeXtScale n1200 Enclosure.
Before you turn off the compute node, shut down the operating system. See the operating-system documentation for information about shutting down the operating system.
The compute node can be turned off in any of the following ways:
v You can press the power button on the compute node (see “Compute node controls, connectors, and LEDs” on page 13). This starts an orderly shutdown of the operating system, if this feature is supported by the operating system.
v If the operating system stops functioning, you can press and hold the power button for more than 4 seconds to turn off the compute node.
Attention: Pressing the power button for 4 seconds forces the operating system to shut down immediately. Data loss is possible.
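Because the IMM2 supports IPMI 2.0 and stays powered while the node is off, remote power requests can in principle be issued with a generic IPMI client. The sketch below assembles a command line for the open-source ipmitool utility, which this manual does not document. It assumes that ipmitool is installed, that IPMI over LAN is enabled on the IMM2, and that the password is supplied through the IPMI_PASSWORD environment variable (the `-E` option). The host name and user name are placeholders.

```python
# Illustrative only: builds an ipmitool command line; host and user are placeholders.
import subprocess

# ipmitool "chassis power" actions used here:
#   status - report the power state
#   on     - power on
#   soft   - request an orderly (ACPI) operating-system shutdown
#   off    - remove power immediately; data loss is possible
POWER_ACTIONS = ("status", "on", "soft", "off")


def ipmi_power_command(host: str, user: str, action: str) -> list:
    """Return the argv list for an IPMI-over-LAN chassis power request."""
    if action not in POWER_ACTIONS:
        raise ValueError(f"unsupported power action: {action}")
    return [
        "ipmitool", "-I", "lanplus",
        "-H", host, "-U", user,
        "-E",  # read the password from the IPMI_PASSWORD environment variable
        "chassis", "power", action,
    ]


def run_power_command(host: str, user: str, action: str) -> str:
    """Run the request and return ipmitool's output (requires ipmitool)."""
    result = subprocess.run(
        ipmi_power_command(host, user, action),
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()
```

As with the power button, prefer the orderly `soft` action and reserve `off` for an operating system that has stopped responding.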

System-board layouts

Use this information to locate the connectors, LEDs, jumpers, and switches on the system board.

System-board internal connectors

The following illustrations show the internal connectors on the system board.
Figure 8. Internal connectors on system board (callouts: PCI riser connector 1, PCI riser connector 2, DIMM 1 through DIMM 8, Microprocessor 1, Microprocessor 2, Power distribution board connector, Operator information panel, 10 Gb Ethernet card connector, USB hypervisor key, 3 V lithium battery, SATA connector, LED signal connector)

System-board external connectors

The following illustration shows the external connectors on the system board.
Figure 9. External connectors on system board (callouts: KVM connector, Ethernet 1 connector, Ethernet 2 connector, Management connector)

System-board switches and jumpers

The following illustration shows the location and description of the switches and jumpers.
Figure 10. Location and description of switches and jumpers (callouts: Lightpath button, UEFI boot recovery jumper, Clear CMOS jumper, NMI button)
Note: If there is a clear protective sticker on the top of the switch blocks, you must remove and discard it to access the switches.
Notes:
1. Before you change any switch settings or move any jumpers, turn off the server. Review the information in “Safety” on page vii, “Installation guidelines” on page 93, “Handling static-sensitive devices” on page 95, and “Turning off the compute node” on page 16.
2. Any system-board switch or jumper block that is not shown in the illustrations in this document is reserved.

System-board LEDs and controls

The following illustration shows the light-emitting diodes (LEDs) on the system board.
Any error LED can be lit after ac power has been removed from the system-board tray so that you can isolate a problem. After ac power has been removed from the system-board tray, power remains available to these LEDs for up to 90 seconds. To view the error LEDs, press and hold the light path button on the system board to light the error LEDs. The error LEDs that were lit while the system-board tray was running will be lit again while the button is pressed.
The following illustration shows the LEDs and controls on the system board.
Figure 11. LEDs and controls on system board (callouts: Microprocessor mismatch LED, Microprocessor 1 error LED, Microprocessor 2 error LED, DIMM 2-1 error LEDs, DIMM 4-3 error LEDs, DIMM 6-5 error LEDs, DIMM 8-7 error LEDs, HDD 0-3 error LEDs, System board error LED, Lightpath LED, Battery error LED, Slot 1 error LED, Ethernet card error LED, RTMM heartbeat LED, IMM heartbeat LED)

Chapter 2. Configuration information and instructions

This chapter provides information about updating the firmware and using the configuration utilities.

Updating the firmware

Use this information to update the system firmware.
Important:
1. Some cluster solutions require specific code levels or coordinated code updates. If the device is part of a cluster solution, verify that the latest level of code is supported for the cluster solution before you update the code.
2. Before you update the firmware, be sure to back up any data that is stored in the Trusted Platform Module (TPM), in case any of the TPM characteristics are changed by the new firmware. For instructions, see your encryption software documentation.
3. Installing the wrong firmware or device-driver update might cause the server to malfunction. Before you install a firmware or device-driver update, read any readme and change history files that are provided with the downloaded update. These files contain important information about the update and the procedure for installing the update, including any special procedure for updating from an early firmware or device-driver version to the latest version.
You can install code updates that are packaged as an UpdateXpress System Pack or UpdateXpress CD image. An UpdateXpress System Pack contains an integration-tested bundle of online firmware and device-driver updates for your server. Use UpdateXpress System Pack Installer to acquire and apply UpdateXpress System Packs and individual firmware and device-driver updates. For additional information and to download the UpdateXpress System Pack Installer, go to the ToolsCenter for System x and BladeCenter at http://www.ibm.com/support/entry/portal/docdisplay?lndocid=TOOL-CENTER and click UpdateXpress System Pack Installer.
When you click an update, an information page is displayed, including a list of the problems that the update fixes. Review this list for your specific problem; however, even if your problem is not listed, installing the update might solve the problem.
Be sure to separately install any listed critical updates that have release dates that are later than the release date of the UpdateXpress System Pack or UpdateXpress image.
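The rule above amounts to a simple date filter: any update marked critical whose release date is later than that of the pack must be installed on its own. A minimal Python sketch follows. The record type, field names, and sample data are hypothetical; in practice the release dates and criticality come from the update information pages.

```python
# Illustrative filter for critical updates newer than the pack; data is hypothetical.
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Update:
    name: str
    released: date
    critical: bool


def updates_to_install_separately(pack_released: date, updates: list) -> list:
    """Critical updates released after the UpdateXpress System Pack itself."""
    return [u for u in updates if u.critical and u.released > pack_released]


# Example with made-up entries.
pack_date = date(2014, 3, 1)
listed = [
    Update("UEFI firmware", date(2014, 4, 15), critical=True),
    Update("IMM2 firmware", date(2014, 2, 10), critical=True),
    Update("Ethernet driver", date(2014, 5, 2), critical=False),
]
pending = updates_to_install_separately(pack_date, listed)
```

With this made-up data, only the UEFI firmware entry is both critical and newer than the pack, so it is the one to install separately.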
The firmware for the server is periodically updated and is available for download on the IBM website. To check for the latest level of firmware, such as the UEFI firmware, device drivers, and integrated management module (IMM) firmware, go to http://www.ibm.com/support/fixcentral.
Download the latest firmware for the server; then, install the firmware, using the instructions that are included with the downloaded files.
When you replace a device in the server, you might have to update the firmware that is stored in memory on the device or restore the pre-existing firmware from a CD or DVD image.
The following list indicates where the firmware is stored:
v UEFI firmware is stored in ROM on the system board.
v IMM2 firmware is stored in ROM on the system board.
v Ethernet firmware is stored in ROM on the Ethernet controller and on the system board.
v ServeRAID firmware is stored in ROM on the system board and the RAID adapter (if one is installed).
v SAS/SATA firmware is stored in ROM on the SAS/SATA controller on the system board.

Configuring the server

The following configuration programs come with the server:
v Setup utility
The Setup utility is part of the UEFI firmware. Use it to perform configuration tasks such as changing interrupt request (IRQ) settings, changing the startup-device sequence, setting the date and time, and setting passwords. For information about using this program, see “Using the Setup utility” on page 25.
v Boot Manager program
The Boot Manager is part of the UEFI firmware. Use it to override the startup sequence that is set in the Setup utility and temporarily assign a device to be first in the startup sequence. For more information about using this program, see “Using the Boot Manager” on page 32.
v IBM ServerGuide Setup and Installation CD
The ServerGuide program provides software-setup tools and installation tools that are designed for the server. Use this CD during the installation of the server to configure basic hardware features, such as an integrated SAS/SATA controller with RAID capabilities, and to simplify the installation of your operating system. For information about using this CD, see “Using the ServerGuide Setup and Installation CD” on page 24.
v Integrated management module
Use the integrated management module II (IMM2) for configuration, to update the firmware and sensor data record/field replaceable unit (SDR/FRU) data, and to remotely manage a network. For information about using the IMM, see “Using the integrated management module” on page 33 and the Integrated Management Module II User's Guide at http://www-947.ibm.com/support/entry/portal/docdisplay?lndocid=migr-5086346.
v VMware ESXi embedded hypervisor
An optional USB flash device with VMware ESXi embedded hypervisor software is available for purchase. Hypervisor is virtualization software that enables multiple operating systems to run on a host system at the same time. The USB embedded hypervisor flash device can be installed in USB connectors 3 and 4 on the system board. For more information about using the embedded hypervisor, see “Using the embedded hypervisor” on page 36.
v Remote presence capability and blue-screen capture
The remote presence and blue-screen capture features are integrated functions of the integrated management module (IMM2). The remote presence feature provides the following functions:
– Remotely viewing video with graphics resolutions up to 1600 x 1200 at 75 Hz, regardless of the system state
– Remotely accessing the server, using the keyboard and mouse from a remote client
– Mapping the CD or DVD drive, diskette drive, and USB flash drive on a remote client, and mapping ISO and diskette image files as virtual drives that are available for use by the server
– Uploading a diskette image to the IMM memory and mapping it to the server as a virtual drive
The blue-screen capture feature captures the video display contents before the IMM restarts the server when the IMM detects an operating-system hang condition. A system administrator can use the blue-screen capture feature to assist in determining the cause of the hang condition. For more information, see “Using the remote presence and blue-screen capture features” on page 34.
v Ethernet controller configuration
For information about configuring the Ethernet controller, see “Configuring the Ethernet controller” on page 37.
v Features on Demand software Ethernet software
The server provides Features on Demand software Ethernet support. You can purchase a Features on Demand software upgrade key for Fibre Channel over Ethernet (FCoE) and iSCSI storage protocols. For more information, see “Enabling Features on Demand Ethernet software” on page 37.
v Features on Demand software RAID software
The server provides Features on Demand software RAID support. You can purchase a Features on Demand software upgrade key for RAID. For more information, see “Enabling Features on Demand RAID software” on page 37.
v IBM Advanced Settings Utility (ASU) program
Use this program as an alternative to the Setup utility for modifying UEFI settings and IMM settings. Use the ASU program online or out of band to modify UEFI settings from the command line without the need to restart the server to run the Setup utility. For more information about using this program, see “IBM Advanced Settings Utility program” on page 38.
v Configuring RAID arrays
For information about configuring RAID arrays, see “Configuring RAID arrays” on page 38.
The following table lists the different server configurations and the applications that are available for configuring and managing RAID arrays.
Table 1. Server configuration and applications for configuring and managing RAID arrays
Server configuration | RAID array configuration (before operating system is installed) | RAID array management (after operating system is installed)
ServeRAID-H1110 adapter | LSI Utility (Setup utility, press Ctrl+C), ServerGuide, Human Interface Infrastructure (HII) | MegaRAID Storage Manager (MSM), SAS2IRCU (Command Line) Utility for Storage Management
ServeRAID-M1115 adapter | MegaRAID BIOS Configuration Utility (press Ctrl+H to start), pre-boot CLI (press Ctrl+P to start), ServerGuide, HII | MegaRAID Storage Manager (MSM), MegaCLI (Command Line Interface), and IBM Director
ServeRAID-C100 | HII | MegaRAID Storage Manager (MSM), MegaCLI, and IBM Director
Notes:
1. For more information about the Human Interface Infrastructure (HII) and SAS2IRCU, go to http://www-947.ibm.com/support/entry/portal/docdisplay?lndocid=MIGR-5088601.
2. For more information about MegaRAID, go to http://www-947.ibm.com/support/entry/portal/docdisplay?lndocid=MIGR-5073015.

Using the ServerGuide Setup and Installation CD

Use this information as an overview for using the ServerGuide Setup and Installation CD.
The ServerGuide Setup and Installation CD provides software setup tools and installation tools that are designed for your server. The ServerGuide program detects the server model and optional hardware devices that are installed and uses that information during setup to configure the hardware. The ServerGuide simplifies the operating-system installations by providing updated device drivers and, in some cases, installing them automatically.
You can download a free image of the ServerGuide Setup and Installation CD from http://www.ibm.com/support/entry/portal/docdisplay?lndocid=SERV-GUIDE.
In addition to the ServerGuide Setup and Installation CD, you must have your operating-system CD to install the operating system.
ServerGuide features
This information provides an overview of the ServerGuide features.
Features and functions can vary slightly with different versions of the ServerGuide program. To learn more about the version that you have, start the ServerGuide Setup and Installation CD and view the online overview. Not all features are supported on all server models.
The ServerGuide program has the following features:
v An easy-to-use interface
v Diskette-free setup, and configuration programs that are based on detected hardware
v Device drivers that are provided for the server model and detected hardware
v Operating-system partition size and file-system type that are selectable during setup
The ServerGuide program performs the following tasks:
v Sets system date and time
v Detects installed hardware options and provides updated device drivers for most adapters and devices
v Provides diskette-free installation for supported Windows operating systems
v Includes an online readme file with links to tips for your hardware and operating-system installation
Setup and configuration overview
Use this information for the ServerGuide setup and configuration.
When you use the ServerGuide Setup and Installation CD, you do not need setup diskettes. You can use the CD to configure any supported IBM server model. The setup program provides a list of tasks that are required to set up your server model. On a server with a ServeRAID adapter or SAS/SATA controller with RAID capabilities, you can run the SAS/SATA RAID configuration program to create logical drives.
Note: Features and functions can vary slightly with different versions of the ServerGuide program.
Typical operating-system installation
This section details the ServerGuide typical operating-system installation.
The ServerGuide program can reduce the time it takes to install an operating system. It provides the device drivers that are required for your hardware and for the operating system that you are installing. This section describes a typical ServerGuide operating-system installation.
Note: Features and functions can vary slightly with different versions of the ServerGuide program.
1. After you have completed the setup process, the operating-system installation program starts. (You will need your operating-system CD to complete the installation.)
2. The ServerGuide program stores information about the server model, service processor, hard disk drive controllers, and network adapters. Then, the program checks the CD for newer device drivers. This information is stored and then passed to the operating-system installation program.
3. The ServerGuide program presents operating-system partition options that are based on your operating-system selection and the installed hard disk drives.
4. The ServerGuide program prompts you to insert your operating-system CD and restart the server. At this point, the installation program for the operating system takes control to complete the installation.
Installing your operating system without using ServerGuide
Use this information to install the operating system on the server without using ServerGuide.
If you have already configured the server hardware and you are not using the ServerGuide program to install your operating system, you can download operating-system installation instructions for the server from http://www.ibm.com/supportportal.

Using the Setup utility

Use these instructions to start the Setup utility.
Use the Unified Extensible Firmware Interface (UEFI) Setup Utility program to perform the following tasks:
v View configuration information
v View and change assignments for devices and I/O ports
v Set the date and time
v Set and change passwords
v Set the startup characteristics of the server and the order of startup devices
v Set and change settings for advanced hardware features
v View, set, and change settings for power-management features
v View and clear error logs
v Change interrupt request (IRQ) settings
v Resolve configuration conflicts
Starting the Setup utility
Use this information to start up the Setup utility.
To start the Setup utility, complete the following steps:
1. Turn on the server.
Note: Approximately 5 to 10 seconds after the server is connected to power, the power-control button becomes active.
2. When the prompt <F1> Setup is displayed, press F1. If you have set an administrator password, you must type the administrator password to access the full Setup utility menu. If you do not type the administrator password, a limited Setup utility menu is available.
3. Select settings to view or change.
Setup utility menu choices
Use the Setup utility main menu to view and configure server configuration data and settings.
The following choices are on the Setup utility main menu for the UEFI. Depending on the version of the firmware, some menu choices might differ slightly from these descriptions.
v System Information
Select this choice to view information about the server. When you make changes through other choices in the Setup utility, some of those changes are reflected in the system information; you cannot change settings directly in the system information. This choice is on the full Setup utility menu only.
– System Summary
Select this choice to view configuration information, including the ID, speed, and cache size of the microprocessors, machine type and model of the server, the serial number, the system UUID, and the amount of installed memory. When you make configuration changes through other options in the Setup utility, the changes are reflected in the system summary; you cannot change settings directly in the system summary.
– Product Data
Select this choice to view the system-board identifier, the revision level or issue date of the firmware, the integrated management module and diagnostics code, and the version and date.
This choice is on the full Setup utility menu only.
v System Settings
Select this choice to view or change the server component settings.
– Adapters and UEFI Drivers
Select this choice to view information about the UEFI 1.10 and UEFI 2.0 compliant adapters and drivers installed in the server.
– Processors
Select this choice to view or change the processor settings.
– Memory
Select this choice to view or change the memory settings.
– Devices and I/O Ports
Select this choice to view or change assignments for devices and input/output (I/O) ports. You can configure the serial ports, configure remote console redirection, enable or disable integrated Ethernet controllers, the SAS/SATA controllers, SATA optical drive channels, PCI slots, and video controller. If you disable a device, it cannot be configured, and the operating system will not be able to detect it (this is equivalent to disconnecting the device).
– Power
Select this choice to view or change power capping to control consumption, processors, and performance states.
– Operating Modes
Select this choice to view or change the operating profile (performance and power utilization).
– Legacy Support
Select this choice to view or set legacy support.
- Force Legacy Video on Boot
Select this choice to force INT video support, if the operating system does not support UEFI video output standards.
- Rehook INT 19h
Select this choice to enable or disable devices from taking control of the boot process. The default is Disable.
- Legacy Thunk Support
Select this choice to enable or disable UEFI to interact with PCI mass storage devices that are non-UEFI compliant. The default is Enable.
- Infinite Boot Retry
Select this choice to enable or disable UEFI to infinitely retry the legacy boot order. The default is Disable.
- BBS Boot
Select this choice to enable or disable legacy boot in BBS manner. The default is Enable.
– System Security
Select this choice to view or configure Trusted Platform Module (TPM) support.
– Integrated Management Module
Select this choice to view or change the settings for the integrated management module.
- Power Restore Policy
Select this choice to set the mode of operation after power is lost.
- Commands on USB Interface
Select this choice to enable or disable the Ethernet over USB interface on the IMM. The default is Enable.
- Network Configuration
Select this choice to view the system management network interface port, the IMM MAC address, the current IMM IP address, and the host name; define the static IMM IP address, subnet mask, and gateway address; specify whether to use the static IP address or have DHCP assign the IMM2 IP address; save the network changes; and reset the IMM.
- Reset IMM to Defaults
Select this choice to view or reset the IMM to the default settings.
- Reset IMM
Select this choice to reset the IMM.
– Recovery
Select this choice to view or change the system recovery parameters.
- POST Attempts
Select this choice to view or change the number of attempts to POST.
v POST Attempts Limit
Select this choice to view or change the Nx boot failure parameters.
- System Recovery
Select this choice to view or change system recovery settings.
v POST Watchdog Timer
Select this choice to view or enable the POST watchdog timer.
v POST Watchdog Timer Value
Select this choice to view or set the POST loader watchdog timer value.
v Reboot System on NMI
Select this choice to enable or disable restarting the system whenever a nonmaskable interrupt (NMI) occurs. Enable is the default.
v Halt on Severe Error
Select this choice to enable or disable halting the startup whenever a severe error is detected. When this choice is enabled, the system does not boot into the operating system and displays the POST event viewer instead. Disable is the default.
– Storage
Select this choice to view or change the storage device settings.
– Network
Select this choice to view or change the network device options, such as iSCSI.
– Drive Health
Select this choice to view the status of the controllers installed in the server.
v Date and Time
Select this choice to set the date and time in the server, in 24-hour format (hour:minute:second).
This choice is on the full Setup utility menu only.
v Start Options
Select this choice to view or change the start options, including the startup sequence, keyboard NumLock state, PXE boot option, and PCI device boot priority. Changes in the startup options take effect when you start the server.
The startup sequence specifies the order in which the server checks devices to find a boot record. The server starts from the first boot record that it finds. If the server has Wake on LAN hardware and software and the operating system supports Wake on LAN functions, you can specify a startup sequence for the Wake on LAN functions. For example, you can define a startup sequence that checks for a disc in the CD-RW/DVD drive, then checks the hard disk drive, and then checks a network adapter.
This choice is on the full Setup utility menu only.
v Boot Manager
Select this choice to view, add, delete, or change the device boot priority, boot from a file, select a one-time boot, or reset the boot order to the default setting.
v System Event Logs
Select this choice to enter the System Event Manager, where you can view the POST event log and the system-event log. You can use the arrow keys to move between pages in the error log. This choice is on the full Setup utility menu only.
The POST event log contains the most recent error codes and messages that were generated during POST.
The system-event log contains POST and system management interrupt (SMI) events and all events that are generated by the baseboard management controller that is embedded in the integrated management module (IMM).
Important: If the system-error LED on the front of the server is lit but there are no other error indications, clear the system-event log. Also, after you complete a repair or correct an error, clear the system-event log to turn off the system-error LED on the front of the server.
– POST Event Viewer
Select this choice to enter the POST event viewer to view the POST error messages.
– System Event Log
Select this choice to view the system event log.
– Clear System Event Log
Select this choice to clear the system event log.
v User Security
Select this choice to set, change, or clear passwords. See “Passwords” on page 30 for more information.
This choice is on the full and limited Setup utility menu.
– Set Power-on Password
Select this choice to set or change a power-on password. See “Power-on password” on page 30 for more information.
– Clear Power-on Password
Select this choice to clear a power-on password. See “Power-on password” on page 30 for more information.
– Set Administrator Password
Select this choice to set or change an administrator password. An administrator password is intended to be used by a system administrator; it limits access to the full Setup utility menu. If an administrator password is set, the full Setup utility menu is available only if you type the administrator password at the password prompt. See “Administrator password” on page 32 for more information.
– Clear Administrator Password
Select this choice to clear an administrator password. See “Administrator password” on page 32 for more information.
v Save Settings
Select this choice to save the changes that you have made in the settings.
v Restore Settings
Select this choice to cancel the changes that you have made in the settings and restore the previous settings.
v Load Default Settings
Select this choice to cancel the changes that you have made in the settings and restore the factory settings.
v Exit Setup
Select this choice to exit from the Setup utility. If you have not saved the changes that you have made in the settings, you are asked whether you want to save the changes or exit without saving them.
Passwords
From the User Security menu choice, you can set, change, and delete a power-on password and an administrator password.
The User Security menu choice is on the full Setup utility menu only.
If you set only a power-on password, you must type the power-on password to complete the system startup and to have access to the full Setup utility menu.
An administrator password is intended to be used by a system administrator; it limits access to the full Setup utility menu. If you set only an administrator password, you do not have to type a password to complete the system startup, but you must type the administrator password to access the Setup utility menu.
If you set a power-on password for a user and an administrator password for a system administrator, you must type the power-on password to complete the system startup. A system administrator who types the administrator password has access to the full Setup utility menu; the system administrator can give the user authority to set, change, and delete the power-on password. A user who types the power-on password has access to only the limited Setup utility menu; the user can set, change, and delete the power-on password, if the system administrator has given the user that authority.
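The three password cases above can be summarized as a small decision function. This is only an illustrative sketch of the rules as the text states them, not firmware behavior taken from IBM code: the function name, the string labels, and the "blocked" result (startup does not complete) are hypothetical, and the case where no password is set at all is an assumption, because the text does not describe it.

```python
from typing import Optional


def setup_menu_access(power_on_set: bool, admin_set: bool, typed: Optional[str]) -> str:
    """Return "full", "limited", or "blocked" for one startup attempt.

    typed is "power-on", "administrator", or None (no password entered).
    "blocked" is a hypothetical label meaning that startup does not complete.
    """
    if typed not in ("power-on", "administrator", None):
        raise ValueError("typed must be 'power-on', 'administrator', or None")

    if not power_on_set and not admin_set:
        # Assumption: with no passwords set, nothing restricts the menu.
        return "full"

    if admin_set and typed == "administrator":
        # The administrator password always unlocks the full menu.
        return "full"

    if power_on_set and not admin_set:
        # Only a power-on password: it gates startup and gives the full menu.
        return "full" if typed == "power-on" else "blocked"

    if admin_set and not power_on_set:
        # Only an administrator password: startup needs no password, but
        # without the administrator password only the limited menu is shown.
        return "limited"

    # Both passwords set: the power-on password completes startup but
    # reaches only the limited menu.
    return "limited" if typed == "power-on" else "blocked"
```

For example, `setup_menu_access(True, True, "power-on")` returns `"limited"`, which matches the user case described above, while the same call with `"administrator"` returns `"full"`.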
Power-on password:
If a power-on password is set, when you turn on the server, you must type the power-on password to complete the system startup. You can use any combination of 6 to 20 printable ASCII characters for the password.
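Before you type a new password into the Setup utility, you can check a candidate against the stated rule (6 to 20 printable ASCII characters). The following is a minimal sketch, not an IBM tool. The function name is hypothetical, and treating the space character (0x20) as printable is an assumption, because the text does not say whether spaces are accepted.

```python
def is_valid_setup_password(candidate: str) -> bool:
    """Check the documented rule: 6 to 20 printable ASCII characters."""
    if not 6 <= len(candidate) <= 20:
        return False
    # Printable ASCII is taken here as 0x20 (space) through 0x7E (~).
    # Assumption: the firmware accepts the space character.
    return all(0x20 <= ord(ch) <= 0x7E for ch in candidate)
```

A five-character string, a 21-character string, or a string that contains a non-ASCII character such as "ä" is rejected by this check.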
When a power-on password is set, you can enable the Unattended Start mode, in which the keyboard and mouse remain locked but the operating system can start. You can unlock the keyboard and mouse by typing the power-on password.
If you forget the power-on password, you can regain access to the server in any of the following ways:
v If an administrator password is set, type the administrator password at the password prompt. Start the Setup utility and reset the power-on password.
Attention: If you set an administrator password and then forget it, there is no way to change, override, or remove it. You must replace the system board.
v Remove the battery from the server, wait 30 seconds, and then reinstall it.
v Change the position of the power-on password switch (enable switch 3 of the system board switch block SW4) to bypass the password check (see “System-board switches and jumpers” on page 18 for more information).
Figure 12. Power-on password switch (figure callouts: Lightpath button, UEFI boot recovery jumper, Clear CMOS jumper, NMI button)
Attention: Before you change any switch settings or move any jumpers, turn off the server; then, disconnect all power cords and external cables. See the safety information that begins “Safety” on page vii. Do not change settings or move jumpers on any system-board switch or jumper blocks that are not shown in this document.
The default for all of the switches on switch block SW3 is Off. While the server is turned off, move switch 4 of the switch block SW3 to the On position to enable the power-on password override. You can then start the Setup utility and reset the power-on password. You do not have to return the switch to the previous position.
The power-on password override switch does not affect the administrator password.
Administrator password:
If an administrator password is set, you must type the administrator password for access to the full Setup utility menu. You can use any combination of 6 to 20 printable ASCII characters for the password.
Attention: If you set an administrator password and then forget it, there is no way to change, override, or remove it. You must replace the system board.

Using the Boot Manager

Use this information for the Boot Manager.
The Boot Manager program is a built-in, menu-driven configuration utility program that you can use to temporarily redefine the first startup device without changing settings in the Setup utility.
To use the Boot Manager program, complete the following steps:
1. Turn off the server.
2. Restart the server.
3. When the prompt <F12> Select Boot Device is displayed, press F12.
4. Use the Up arrow and Down arrow keys to select an item from the menu and press Enter.
The next time the server starts, it returns to the startup sequence that is set in the Setup utility.
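The relationship between the stored startup sequence and a one-time Boot Manager selection can be modeled in a few lines. This is a conceptual sketch only: the function and device names are hypothetical, and the fall-through to the rest of the sequence when the one-time device has no boot record is an assumption that the text does not confirm.

```python
from typing import Dict, List, Optional


def pick_boot_device(
    sequence: List[str],
    has_boot_record: Dict[str, bool],
    one_time: Optional[str] = None,
) -> Optional[str]:
    """Return the first device in the effective order that has a boot record.

    sequence is the order stored in the Setup utility; one_time models a
    Boot Manager selection that applies to this start only.
    """
    order = list(sequence)  # copy, so the stored sequence is never modified
    if one_time is not None:
        # Assumption: the one-time device is tried first, then the rest.
        order = [one_time] + [dev for dev in order if dev != one_time]
    for device in order:
        if has_boot_record.get(device, False):
            return device
    return None


stored = ["CD/DVD", "Hard disk", "Network"]
records = {"CD/DVD": False, "Hard disk": True, "Network": True}

this_start = pick_boot_device(stored, records, one_time="Network")  # "Network"
next_start = pick_boot_device(stored, records)                      # "Hard disk"
```

Because the override is passed as an argument and never written back to `stored`, the next start uses the sequence from the Setup utility again, as described above.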

Starting the backup server firmware

Use this information to start the backup server firmware.
The system board contains a backup copy area for the server firmware. This is a secondary copy of the server firmware that you update only during the process of updating the server firmware. If the primary copy of the server firmware becomes damaged, use this backup copy.
To force the server to start from the backup copy, turn off the server; then, change the position of the UEFI boot backup switch (move switch 1 of switch block SW4 to the On position) to enable the UEFI recovery mode.
Use the backup copy of the server firmware until the primary copy is restored. After the primary copy is restored, turn off the server; then, return the UEFI boot backup switch to its original position (move switch 1 of switch block SW4 to the Off position).

The UpdateXpress System Pack Installer

The UpdateXpress System Pack Installer detects supported and installed device drivers and firmware in the server and installs available updates.
For additional information and to download the UpdateXpress System Pack Installer, go to the ToolsCenter for System x and BladeCenter at http://publib.boulder.ibm.com/infocenter/toolsctr/v1r0/ and click UpdateXpress System Pack Installer.

Changing the Power Policy option to the default settings after loading UEFI defaults

The default settings for the Power Policy option are set by the IMM2.
To change the Power Policy option to the default settings, complete the following steps.
1. Turn on the server.
Note: Approximately 20 seconds after the server is connected to AC power, the power-control button becomes active.
2. When the prompt <F1> Setup is displayed, press F1. If you have set an administrator password, you must type the administrator password to access the full Setup utility menu. If you do not type the administrator password, a limited Setup utility menu is available.
3. Select System Settings > Integrated Management Module, and then set the Power Restore Policy setting to Restore.
4. Go back to System Configuration and Boot Management > Save Settings.
5. Go back and check the Power Policy setting to verify that it is set to Restore (the default).
Attention: If you set an administrator password and then forget it, there is no way to change, override, or remove it. You must replace the system board.

Using the integrated management module

The integrated management module (IMM) is a second generation of the functions that were formerly provided by the baseboard management controller hardware. It combines service processor functions, video controller, and remote presence function in a single chip.
The IMM supports the following basic systems-management features:
v Active Energy Manager.
v Alerts (in-band and out-of-band alerting, PET traps - IPMI style, SNMP, e-mail).
v Auto Boot Failure Recovery (ABR).
v Automatic microprocessor disable on failure and restart in a two-microprocessor configuration when one microprocessor signals an internal error. When one of the microprocessors fails, the server disables the failing microprocessor and restarts with the other microprocessor.
v Automatic Server Restart (ASR) when POST is not complete or the operating system hangs and the operating system watchdog timer times out. The IMM might be configured to watch for the operating system watchdog timer and reboot the system after a timeout, if the ASR feature is enabled. Otherwise, the IMM allows the administrator to generate a nonmaskable interrupt (NMI) by pressing an NMI button on the light path diagnostics panel for an operating-system memory dump. ASR is supported by IPMI.
v Boot sequence manipulation.
v Command-line interface.
v Configuration save and restore.
v DIMM error assistance. The Unified Extensible Firmware Interface (UEFI) disables a failing DIMM that is detected during POST, and the IMM lights the associated system error LED and the failing DIMM error LED.
v Environmental monitor with fan speed control for temperature, voltages, fan failure, power supply failure, and power backplane failure.
v Intelligent Platform Management Interface (IPMI) Specification V2.0 and Intelligent Platform Management Bus (IPMB) support.
v Invalid system configuration (CONFIG) LED support.
v Light path diagnostics LED indicators to report errors that occur with fans, power supplies, microprocessor, hard disk drives, and system errors.
v Local firmware code flash update.
v Nonmaskable interrupt (NMI) detection and reporting.
v Operating-system failure blue screen capture.
v PCI configuration data.
v Power/reset control (power-on, hard and soft shutdown, hard and soft reset, schedule power control).
v Query power-supply input power.
v ROM-based IMM firmware flash updates.
v Serial over LAN (SOL).
v Serial port redirection over telnet or ssh.
v SMI handling.
v System event log (SEL) - user readable event log.
The IMM also provides the following remote server management capabilities through the OSA SMBridge management utility program:
v Command-line interface (IPMI Shell)
The command-line interface provides direct access to server management functions through the IPMI 2.0 protocol. Use the command-line interface to issue commands to control the server power, view system information, and identify the server. You can also save one or more commands as a text file and run the file as a script.
v Serial over LAN
Establish a Serial over LAN (SOL) connection to manage servers from a remote location. You can remotely view and change the UEFI settings, restart the server, identify the server, and perform other management functions. Any standard Telnet client application can access the SOL connection.
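The kind of IPMI 2.0 commands described above (power control, system information, server identification, and a script of saved commands) can be sketched with the open-source ipmitool client instead of the OSA SMBridge utility that this guide names. This substitution is an assumption for illustration: ipmitool is not part of this product documentation, the helper names are hypothetical, and the example requires ipmitool to be installed and IPMI over LAN to be enabled on the IMM.

```python
import os
import subprocess
from typing import Iterable, List

# ipmitool subcommands that correspond to the tasks named in the text.
POWER_STATUS = ("chassis", "power", "status")
IDENTIFY = ("chassis", "identify")
SYSTEM_INFO = ("fru", "print")
EVENT_LOG = ("sel", "list")
SERIAL_OVER_LAN = ("sol", "activate")


def ipmitool_args(host: str, user: str, *command: str) -> List[str]:
    """Build an ipmitool command line for an IPMI 2.0 (lanplus) session.

    -E makes ipmitool read the password from the IPMI_PASSWORD
    environment variable, so it does not appear in the argument list.
    """
    return ["ipmitool", "-I", "lanplus", "-H", host, "-U", user, "-E", *command]


def run_script(host: str, user: str, password: str, lines: Iterable[str]) -> List[str]:
    """Run saved commands (one per line) and return the output of each."""
    env = dict(os.environ, IPMI_PASSWORD=password)
    outputs = []
    for line in lines:
        words = line.split()
        if not words or words[0].startswith("#"):
            continue  # skip blank lines and comment lines
        result = subprocess.run(
            ipmitool_args(host, user, *words),
            env=env, capture_output=True, text=True, check=True,
        )
        outputs.append(result.stdout)
    return outputs
```

For example, `run_script("192.168.70.125", "USERID", password, open("commands.txt"))` would run each line of a hypothetical commands.txt file, such as `chassis power status`, against the IMM at the default static address.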
For more information about IMM, see the Integrated Management Module II User's Guide at http://www-947.ibm.com/support/entry/portal/docdisplay?lndocid=migr-5086346.

Using the remote presence and blue-screen capture features

The remote presence and blue-screen capture features are integrated functions of the integrated management module II (IMM2).
The remote presence feature provides the following functions:
v Remotely viewing video with graphics resolutions up to 1600 x 1200 at 75 Hz, regardless of the system state
v Remotely accessing the server, using the keyboard and mouse from a remote client
v Mapping the CD or DVD drive, diskette drive, and USB flash drive on a remote client, and mapping ISO and diskette image files as virtual drives that are available for use by the server
v Uploading a diskette image to the IMM memory and mapping it to the server as a virtual drive
The blue-screen capture feature captures the video display contents before the IMM restarts the server when the IMM detects an operating-system hang condition. A system administrator can use the blue-screen capture to assist in determining the cause of the hang condition.
Obtaining the IMM host name
Use this information to obtain the IMM host name.
If you are logging on to the IMM for the first time after installation, the IMM defaults to DHCP. If a DHCP server is not available, the IMM uses a static IP address of 192.168.70.125. The default IPv4 host name is “IMM-” (plus the last 12 characters on the IMM MAC address). The default host name also comes on the IMM network access tag that comes attached to the power supply on the rear of the server. The IMM network access tag provides the default host name of the IMM and does not require you to start the server.
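The default host name rule and the static fallback address can be expressed as a short helper, which is convenient when you prepare a list of addresses to try for a newly installed compute node. This is a sketch under stated assumptions: the function names are hypothetical, and writing the MAC characters in uppercase follows the example format used in this guide rather than a documented requirement.

```python
FALLBACK_IP = "192.168.70.125"  # static address used when no DHCP server answers


def default_imm_hostname(mac: str) -> str:
    """Return "IMM-" plus the last 12 characters of the IMM MAC address."""
    digits = "".join(ch for ch in mac if ch not in ":- ").upper()
    if len(digits) < 12 or any(ch not in "0123456789ABCDEF" for ch in digits):
        raise ValueError("expected a MAC address with 12 hexadecimal characters")
    return "IMM-" + digits[-12:]


def addresses_to_try(mac: str) -> list:
    """Default host name first, then the static fallback address."""
    return [default_imm_hostname(mac), FALLBACK_IP]
```

For the MAC address 5C:F3:FC:5E:AA:D0, the helper returns the host name IMM-5CF3FC5EAAD0.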
The IPv6 link-local address (LLA) is derived from the IMM default host name. The IMM LLA is on the IMM network access tag on the power supply on the rear of the server. To derive the link-local address, complete the following steps:
1. Take the last 12 characters on the IMM MAC address (for example, 5CF3FC5EAAD0).
2. Separate the number into pairs of hexadecimal characters (for example, 5C:F3:FC:5E:AA:D0).
3. Separate the first six and last six hexadecimal characters.
4. Add “FF” and “FE” in the middle of the 12 characters (for example, 5C F3 FC FF FE 5E AA D0).
5. Convert the first pair of hexadecimal characters to binary (for example, 5=0101, C=1100, which results in 01011100 F3 FC FF FE 5E AA D0).
6. Flip the 7th binary character from left (0 to 1 or 1 to 0), which results in 01011110 F3 FC FF FE 5E AA D0.
7. Convert the binary back to hexadecimal (for example, 5E F3FCFFFE5EAAD0).
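The seven steps above can be sketched in Python as follows. The function names are hypothetical. The first function reproduces the steps exactly; the second one additionally prepends the standard IPv6 link-local prefix fe80::/64, which this guide does not state explicitly and is included here as an assumption based on general IPv6 addressing rules.

```python
import ipaddress


def imm_interface_id(mac: str) -> str:
    """Apply steps 1 to 7 to a MAC address and return 16 hex characters."""
    digits = mac.replace(":", "").replace("-", "").upper()
    if len(digits) != 12:
        raise ValueError("expected 12 hexadecimal characters")
    octets = list(bytes.fromhex(digits))  # raises ValueError on non-hex input
    # Steps 5 to 7: flip the 7th bit from the left of the first octet.
    octets[0] ^= 0b00000010
    # Steps 3 and 4: insert FF FE between the first and last three octets.
    eui64 = octets[:3] + [0xFF, 0xFE] + octets[3:]
    return bytes(eui64).hex().upper()


def imm_link_local(mac: str) -> str:
    """Combine the interface identifier with the fe80::/64 prefix."""
    interface_id = int(imm_interface_id(mac), 16)
    return str(ipaddress.IPv6Address((0xFE80 << 112) | interface_id))
```

For the example MAC address 5CF3FC5EAAD0, `imm_interface_id` returns 5EF3FCFFFE5EAAD0, which matches the result of step 7, and `imm_link_local` returns fe80::5ef3:fcff:fe5e:aad0.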
Obtaining the IP address for the IMM
Use this information to obtain the IP address for the IMM.
To access the web interface to use the remote presence feature, you need the IP address or host name of the IMM. You can obtain the IMM IP address through the Setup utility and you can obtain the IMM host name from the IMM network access tag. The server comes with a default IP address for the IMM of 192.168.70.125.
To obtain the IP address, complete the following steps:
1. Turn on the server.
Note: Approximately 5 to 10 seconds after the server is connected to power, the power-control button becomes active.
2. When the prompt <F1> Setup is displayed, press F1. (This prompt is displayed on the screen for only a few seconds. You must press F1 quickly.) If you have set both a power-on password and an administrator password, you must type the administrator password to access the full Setup utility menu.
3. From the Setup utility main menu, select System Settings.
4. On the next screen, select Integrated Management Module.
5. On the next screen, select Network Configuration.
6. Find the IP address and write it down.
7. Exit from the Setup utility.
Logging on to the web interface
Use this information to log on to the web interface.
To log on to the IMM web interface, complete the following steps:
1. On a system that is connected to the server, open a web browser. In the Address or URL field, type the IP address or host name of the IMM to which you want to connect.
Note: If you are logging on to the IMM for the first time after installation, the IMM defaults to DHCP. If a DHCP server is not available, the IMM assigns a static IP address of 192.168.70.125. The IMM network access tag provides the default host name of the IMM and does not require you to start the server.
2. On the Login page, type the user name and password. If you are using the IMM for the first time, you can obtain the user name and password from your system administrator. All login attempts are documented in the system-event log.
Note: The IMM is set initially with a user name of USERID and password of PASSW0RD (with a zero, not the letter O). You have read/write access. You must change the default password the first time you log on.
3. Click Log in to start the session. The System Status and Health page provides a quick view of the system status.
Note: If you boot to the operating system while in the IMM GUI and the message “Booting OS or in unsupported OS” is displayed under System Status > System State, disable the Windows 2008 firewall or type the following command in the Windows 2008 console. This might also affect blue-screen capture features.
netsh firewall set icmpsetting type=8 mode=ENABLE
By default, Windows firewall blocks ICMP packets. After you change the setting as indicated above, the IMM status changes to “OS booted” in both the web and CLI interfaces.

Using the embedded hypervisor

The VMware ESXi embedded hypervisor software is available on the optional IBM USB flash device with embedded hypervisor.
The USB flash device can be installed in USB connectors on the system board (see “Internal cable routing and connectors” on page 180 for the location of the connectors). Hypervisor is virtualization software that enables multiple operating systems to run on a host system at the same time. The USB flash device is required to activate the hypervisor functions.
To start using the embedded hypervisor functions, you must add the USB flash device to the startup sequence in the Setup utility.
To add the USB flash device to the startup sequence, complete the following steps:
1. Turn on the server.
Note: Approximately 5 to 10 seconds after the server is connected to power, the power-control button becomes active.
2. When the prompt <F1> Setup is displayed, press F1.
3. From the Setup utility main menu, select Boot Manager.
4. Select Add Boot Option; then, select Generic Boot Option > Embedded Hypervisor. Press Enter, and then press Esc.
5. Select Change Boot Order > Change the order. Use the Up arrow and Down arrow keys to select Embedded Hypervisor and use the plus (+) and minus (-) keys to move Embedded Hypervisor in the boot order. When Embedded Hypervisor is in the correct location in the boot order, press Enter. Select Commit Changes and press Enter.
6. Select Save Settings and then select Exit Setup.
If the embedded hypervisor flash device image becomes corrupt, you can download the image from http://www-03.ibm.com/systems/x/os/vmware/esxi/.
For additional information and instructions, see VMware vSphere 4.1 Documentation at http://www.vmware.com/support/pubs/vs_pages/vsp_pubs_esxi41_e_vc41.html or the VMware vSphere Installation and Setup Guide at http://pubs.vmware.com/vsphere-50/topic/com.vmware.ICbase/PDF/vsphere-esxi-vcenter-server-50-installation-setup-guide.pdf.

Configuring the Ethernet controller

Use this information to configure the Ethernet controller.
The Ethernet controllers are integrated on the system board. They provide an interface for connecting to a 10 Mbps, 100 Mbps, or 1 Gbps network and provide full-duplex (FDX) capability, which enables simultaneous transmission and reception of data on the network. If the Ethernet ports in the server support auto-negotiation, the controllers detect the data-transfer rate (10BASE-T, 100BASE-TX, or 1000BASE-T) and duplex mode (full-duplex or half-duplex) of the network and automatically operate at that rate and mode.
You do not have to set any jumpers or configure the controllers. However, you must install a device driver to enable the operating system to address the controllers.
To find device drivers and information about configuring the Ethernet controllers, go to http://www.ibm.com/supportportal.
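The auto-negotiation behavior described above can be pictured as choosing the highest-priority mode that both ends of the link advertise. The following Python sketch is a simplified model under that assumption; it is not the controller firmware, it covers only the three rates named in this section, and the names PRIORITY and negotiate are hypothetical.

```python
# Simplified priority list, highest capability first. Only the rates and
# duplex modes named in this section are modelled (an assumption).
PRIORITY = [
    ("1000BASE-T", "full"),
    ("1000BASE-T", "half"),
    ("100BASE-TX", "full"),
    ("100BASE-TX", "half"),
    ("10BASE-T", "full"),
    ("10BASE-T", "half"),
]


def negotiate(local_modes, partner_modes):
    """Return the best (rate, duplex) pair that both ends advertise, or None."""
    common = set(local_modes) & set(partner_modes)
    for mode in PRIORITY:
        if mode in common:
            return mode
    return None


# Hypothetical link partner that supports only 10 Mbps and 100 Mbps.
switch_port = [
    ("100BASE-TX", "full"),
    ("100BASE-TX", "half"),
    ("10BASE-T", "full"),
    ("10BASE-T", "half"),
]
print(negotiate(PRIORITY, switch_port))  # ('100BASE-TX', 'full')
```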

Enabling Features on Demand Ethernet software

Use this information to enable Features on Demand Ethernet software.
You can activate the Features on Demand (FoD) software upgrade key for Fibre Channel over Ethernet (FCoE) and iSCSI storage protocols that is integrated in the integrated management module. For more information and instructions for activating the Features on Demand Ethernet software key, see the IBM Features on Demand User’s Guide. To download the document, go to http://www.ibm.com/systems/x/fod/, log in, and click Help.

Enabling Features on Demand RAID software

Use this information to enable Features on Demand RAID software.
You can activate the Features on Demand (FoD) software upgrade key for RAID that is integrated in the integrated management module. For more information and instructions for activating the Features on Demand RAID software key, see the IBM Features on Demand User’s Guide. To download the document, go to http://www.ibm.com/systems/x/fod/, log in, and click Help.

Configuring RAID arrays

Use the Setup utility to configure RAID arrays.
The specific procedure for configuring arrays depends on the RAID controller that you are using. For details, see the documentation for your RAID controller. To access the utility for your RAID controller, complete the following steps:
1. Turn on the server.
Note: Approximately 10 seconds after the server is connected to power, the power-control button becomes active.
2. When the prompt <F1> Setup is displayed, press F1. If you have set an administrator password, you must type the administrator password to access the full Setup utility menu. If you do not type the administrator password, a limited Setup utility menu is available.
3. Select System Settings > Storage.
4. Press Enter to refresh the list of device drivers.
5. Select the device driver for your RAID controller and press Enter.
6. Follow the instructions in the documentation for your RAID controller.

IBM Advanced Settings Utility program

The IBM Advanced Settings Utility (ASU) program is an alternative to the Setup utility for modifying UEFI settings.
Use the ASU program online or out of band to modify UEFI settings from the command line without the need to restart the system to access the Setup utility.
You can also use the ASU program to configure the optional remote presence features or other IMM2 settings. The remote presence features provide enhanced systems-management capabilities.
In addition, the ASU program provides IMM LAN over USB interface configuration through the command-line interface.
Use the command-line interface to issue setup commands. You can save any of the settings as a file and run the file as a script. The ASU program supports scripting environments through a batch-processing mode.
For more information and to download the ASU program, go to http://www.ibm.com/support/entry/portal/docdisplay?lndocid=TOOL-ASU.

Updating IBM Systems Director

Use this information to update the IBM Systems Director.
If you plan to use IBM Systems Director to manage the server, you must check for the latest applicable IBM Systems Director updates and interim fixes.
Note: Changes are made periodically to the IBM website. The actual procedure might vary slightly from what is described in this document.
Installing a newer version
To locate and install a newer version of IBM Systems Director, complete the following steps:
1. Check for the latest version of IBM Systems Director:
a. Go to http://www-03.ibm.com/systems/software/director/resources.html.
b. If a newer version of IBM Systems Director than what comes with the server is shown in the drop-down list, follow the instructions on the web page to download the latest version.
2. Install the IBM Systems Director program.
Installing updates when your management server is connected to the Internet
If your management server is connected to the Internet, to locate and install updates and interim fixes, complete the following steps:
1. Make sure that you have run the Discovery and Inventory collection tasks.
2. On the Welcome page of the IBM Systems Director web interface, click View updates.
3. Click Check for updates. The available updates are displayed in a table.
4. Select the updates that you want to install, and click Install to start the installation wizard.
Installing updates when your management server is not connected to the Internet
If your management server is not connected to the Internet, to locate and install updates and interim fixes, complete the following steps:
1. Make sure that you have run the Discovery and Inventory collection tasks.
2. On a system that is connected to the Internet, go to http://www.ibm.com/support/fixcentral.
3. From the Product family list, select IBM Systems Director.
4. From the Product list, select IBM Systems Director.
5. From the Installed version list, select the latest version, and click Continue.
6. Download the available updates.
7. Copy the downloaded files to the management server.
8. On the management server, on the Welcome page of the IBM Systems Director web interface, click the Manage tab, and click Update Manager.
9. Click Import updates and specify the location of the downloaded files that you copied to the management server.
10. Return to the Welcome page of the web interface, and click View updates.
11. Select the updates that you want to install, and click Install to start the installation wizard.

Updating the Universal Unique Identifier (UUID)

The Universal Unique Identifier (UUID) must be updated when the system board is replaced. Use the Advanced Settings Utility to update the UUID in the UEFI-based server.
The ASU is an online tool that supports several operating systems. Make sure that you download the version for your operating system. You can download the ASU from the IBM website. To download the ASU and update the UUID, complete the following steps.
Note: Changes are made periodically to the IBM website. The actual procedure might vary slightly from what is described in this document.
1. Download the Advanced Settings Utility (ASU):
a. Go to http://www.ibm.com/supportportal.
b. Click on the Downloads tab at the top of the panel.
c. Under ToolsCenter, select View ToolsCenter downloads.
d. Select Advanced Settings Utility (ASU).
e. Scroll down and click on the link and download the ASU version for your operating system.
2. ASU sets the UUID in the Integrated Management Module (IMM). Select one of the following methods to access the Integrated Management Module (IMM) to set the UUID:
• Online from the target system (LAN or keyboard console style (KCS) access)
• Remote access to the target system (LAN based)
• Bootable media containing ASU (LAN or KCS, depending upon the bootable media)
3. Copy and unpack the ASU package, which also includes other required files, to the server. Make sure that you unpack the ASU and the required files to the same directory. In addition to the application executable (asu or asu64), the following files are required:
• For Windows based operating systems:
ibm_rndis_server_os.inf
device.cat
• For Linux based operating systems:
cdc_interface.sh
4. After you install ASU, use the following command syntax to set the UUID:
asu set SYSTEM_PROD_DATA.SysInfoUUID <uuid_value> [access_method]
Where:
<uuid_value>
A hexadecimal value of up to 16 bytes that you assign.
[access_method]
The access method that you selected to use from the following methods:
• Online authenticated LAN access, type the command:
[--host <imm_internal_ip>] [--user <imm_user_id>] [--password <imm_password>]
Where:
imm_internal_ip
The IMM internal LAN/USB IP address. The default value is 169.254.95.118.
imm_user_id
The IMM account (1 of 12 accounts). The default value is USERID.
imm_password
The IMM account password (1 of 12 accounts). The default value is PASSW0RD (with a zero, not the letter O).
Note: If you do not specify any of these parameters, ASU will use the default values. When the default values are used and ASU is unable to access the IMM using the online authenticated LAN access method, ASU will automatically use the unauthenticated KCS access method.
The following commands are examples of using the userid and password default values and not using the default values:
Example that does not use the userid and password default values:
asu set SYSTEM_PROD_DATA.SysInfoUUID <uuid_value> --user <user_id> --password <password>
Example that does use the userid and password default values:
asu set SYSTEM_PROD_DATA.SysInfoUUID <uuid_value>
• Online KCS access (unauthenticated and user restricted):
You do not need to specify a value for access_method when you use this access method.
Example:
asu set SYSTEM_PROD_DATA.SysInfoUUID <uuid_value>
The KCS access method uses the IPMI/KCS interface. This method requires that the IPMI driver be installed. Some operating systems have the IPMI driver installed by default. ASU provides the corresponding mapping layer. See the Advanced Settings Utility Users Guide for more details. You can access the ASU Users Guide from the IBM website.
Note: Changes are made periodically to the IBM website. The actual procedure might vary slightly from what is described in this document.
a. Go to http://www.ibm.com/supportportal.
b. Click on the Downloads tab at the top of the panel.
c. Under ToolsCenter, select View ToolsCenter downloads.
d. Select Advanced Settings Utility (ASU).
e. Scroll down and click on the link and download the ASU version for your operating system. Scroll down and look under Online Help to download the Advanced Settings Utility Users Guide.
• Remote LAN access, type the command:
Note: When using the remote LAN access method to access IMM using the LAN from a client, the host and the imm_external_ip address are required parameters.
--host <imm_external_ip> [--user <imm_user_id>] [--password <imm_password>]
Where:
imm_external_ip
The external IMM LAN IP address. There is no default value. This parameter is required.
imm_user_id
The IMM account (1 of 12 accounts). The default value is USERID.
imm_password
The IMM account password (1 of 12 accounts). The default value is PASSW0RD (with a zero, not the letter O).
The following commands are examples of using the userid and password default values and not using the default values:
Example that does not use the userid and password default values:
asu set SYSTEM_PROD_DATA.SysInfoUUID <uuid_value> --host <imm_ip> --user <user_id> --password <password>
Example that does use the userid and password default values:
asu set SYSTEM_PROD_DATA.SysInfoUUID <uuid_value> --host <imm_ip>
• Bootable media:
You can also build bootable media by using the applications available through the ToolsCenter website at http://www.ibm.com/support/entry/portal/docdisplay?lndocid=TOOL-CENTER. From the IBM ToolsCenter page, scroll down for the available tools.
5. Restart the server.
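When the UUID update is scripted, it can help to check the value and assemble the command line before running ASU. The following Python sketch builds the argument list for the syntax shown in step 4. It is an illustration under assumptions: build_uuid_command is a hypothetical helper, the check treats “up to 16 bytes” as 1 to 32 hexadecimal digits, and the sketch only constructs the command without running ASU or contacting an IMM.

```python
import re

UUID_SETTING = "SYSTEM_PROD_DATA.SysInfoUUID"


def build_uuid_command(uuid_value, host=None, user=None, password=None):
    """Return the asu argument list for setting the UUID.

    Omitting host, user, and password leaves ASU to apply its defaults,
    as described for the online access methods above.
    """
    # Assumption: "up to 16 bytes" means 1 to 32 hexadecimal digits.
    if not re.fullmatch(r"[0-9A-Fa-f]{1,32}", uuid_value):
        raise ValueError("UUID must be 1 to 32 hexadecimal digits")
    argv = ["asu", "set", UUID_SETTING, uuid_value]
    if host is not None:
        argv += ["--host", host]
    if user is not None:
        argv += ["--user", user]
    if password is not None:
        argv += ["--password", password]
    return argv


# Placeholder UUID and documentation-range IP address, not real values.
print(build_uuid_command("0123456789ABCDEF0123456789ABCDEF", host="192.0.2.10"))
```

The returned list can be passed to a process launcher such as subprocess.run on a system where ASU is installed.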

Updating the DMI/SMBIOS data

Use this information to update the DMI/SMBIOS data.
The Desktop Management Interface (DMI) must be updated when the system board is replaced. Use the Advanced Settings Utility to update the DMI in the UEFI-based server. The ASU is an online tool that supports several operating systems. Make sure that you download the version for your operating system. You can download the ASU from the IBM website. To download the ASU and update the DMI, complete the following steps.
Note: Changes are made periodically to the IBM website. The actual procedure might vary slightly from what is described in this document.
1. Download the Advanced Settings Utility (ASU):
a. Go to http://www.ibm.com/supportportal.
b. Click on the Downloads tab at the top of the panel.
c. Under ToolsCenter, select View ToolsCenter downloads.
d. Select Advanced Settings Utility (ASU).
e. Scroll down and click on the link and download the ASU version for your operating system.
2. ASU sets the DMI in the Integrated Management Module (IMM). Select one of the following methods to access the Integrated Management Module (IMM) to set the DMI:
• Online from the target system (LAN or keyboard console style (KCS) access)
• Remote access to the target system (LAN based)
• Bootable media containing ASU (LAN or KCS, depending upon the bootable media)
3. Copy and unpack the ASU package, which also includes other required files, to the server. Make sure that you unpack the ASU and the required files to the same directory. In addition to the application executable (asu or asu64), the following files are required:
• For Windows based operating systems:
ibm_rndis_server_os.inf
device.cat
• For Linux based operating systems:
cdc_interface.sh
4. After you install ASU, type the following commands to set the DMI:
asu set SYSTEM_PROD_DATA.SysInfoProdName <m/t_model> [access_method]
asu set SYSTEM_PROD_DATA.SysInfoSerialNum <s/n> [access_method]
asu set SYSTEM_PROD_DATA.SysEncloseAssetTag <asset_tag> [access_method]
Where:
<m/t_model>
The server machine type and model number. Type mtm xxxxyyy, where xxxx is the machine type and yyy is the server model number.
<s/n>
The serial number on the server. Type sn zzzzzzz, where zzzzzzz is the serial number.
<asset_tag>
The server asset tag number. Type asset aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa, where aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa is the asset tag number.
[access_method]
The access method that you select to use from the following methods:
• Online authenticated LAN access, type the command:
[--host <imm_internal_ip>] [--user <imm_user_id>] [--password <imm_password>]
Where:
imm_internal_ip
The IMM internal LAN/USB IP address. The default value is 169.254.95.118.
imm_user_id
The IMM account (1 of 12 accounts). The default value is USERID.
imm_password
The IMM account password (1 of 12 accounts). The default value is PASSW0RD (with a zero, not the letter O).
Note: If you do not specify any of these parameters, ASU will use the default values. When the default values are used and ASU is unable to access the IMM using the online authenticated LAN access method, ASU will automatically use the unauthenticated KCS access method.
The following commands are examples of using the userid and password default values and not using the default values:
Examples that do not use the userid and password default values:
asu set SYSTEM_PROD_DATA.SysInfoProdName <m/t_model> --user <imm_user_id> --password <imm_password>
asu set SYSTEM_PROD_DATA.SysInfoSerialNum <s/n> --user <imm_user_id> --password <imm_password>
asu set SYSTEM_PROD_DATA.SysEncloseAssetTag <asset_tag> --user <imm_user_id> --password <imm_password>
Examples that do use the userid and password default values:
asu set SYSTEM_PROD_DATA.SysInfoProdName <m/t_model>
asu set SYSTEM_PROD_DATA.SysInfoSerialNum <s/n>
asu set SYSTEM_PROD_DATA.SysEncloseAssetTag <asset_tag>
• Online KCS access (unauthenticated and user restricted):
You do not need to specify a value for access_method when you use this access method.
The KCS access method uses the IPMI/KCS interface. This method requires that the IPMI driver be installed. Some operating systems have the IPMI driver installed by default. ASU provides the corresponding mapping layer. To download the Advanced Settings Utility Users Guide, complete the following steps:
Note: Changes are made periodically to the IBM website. The actual procedure might vary slightly from what is described in this document.
a. Go to http://www.ibm.com/supportportal.
b. Click on the Downloads tab at the top of the panel.
c. Under ToolsCenter, select View ToolsCenter downloads.
d. Select Advanced Settings Utility (ASU).
e. Scroll down and click on the link and download the ASU version for your operating system. Scroll down and look under Online Help to download the Advanced Settings Utility Users Guide.
Examples:
asu set SYSTEM_PROD_DATA.SysInfoProdName <m/t_model>
asu set SYSTEM_PROD_DATA.SysInfoSerialNum <s/n>
asu set SYSTEM_PROD_DATA.SysEncloseAssetTag <asset_tag>
• Remote LAN access, type the command:
Note: When using the remote LAN access method to access IMM using the LAN from a client, the host and the imm_external_ip address are required parameters.
--host <imm_external_ip> [--user <imm_user_id>] [--password <imm_password>]
Where:
imm_external_ip
The external IMM LAN IP address. There is no default value. This parameter is required.
imm_user_id
The IMM account (1 of 12 accounts). The default value is USERID.
imm_password
The IMM account password (1 of 12 accounts). The default value is PASSW0RD (with a zero, not the letter O).
The following commands are examples of using the userid and password default values and not using the default values:
Examples that do not use the userid and password default values:
asu set SYSTEM_PROD_DATA.SysInfoProdName <m/t_model> --host <imm_ip> --user <imm_user_id> --password <imm_password>
asu set SYSTEM_PROD_DATA.SysInfoSerialNum <s/n> --host <imm_ip> --user <imm_user_id> --password <imm_password>
asu set SYSTEM_PROD_DATA.SysEncloseAssetTag <asset_tag> --host <imm_ip> --user <imm_user_id> --password <imm_password>
Examples that do use the userid and password default values:
asu set SYSTEM_PROD_DATA.SysInfoProdName <m/t_model> --host <imm_ip>
asu set SYSTEM_PROD_DATA.SysInfoSerialNum <s/n> --host <imm_ip>
asu set SYSTEM_PROD_DATA.SysEncloseAssetTag <asset_tag> --host <imm_ip>
• Bootable media:
You can also build bootable media by using the applications available through the ToolsCenter website at http://www.ibm.com/support/entry/portal/docdisplay?lndocid=TOOL-CENTER. From the IBM ToolsCenter page, scroll down for the available tools.
5. Restart the server.
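Because the DMI update always issues the same three settings, the commands from step 4 can be generated together. The following Python sketch does that for any of the access methods. It is a sketch under assumptions: build_dmi_commands is a hypothetical helper, the sample values are placeholders, and the values are passed through exactly as you supply them. The sketch only produces command strings; it does not run ASU.

```python
import shlex

# Setting names as they appear in the commands in step 4.
DMI_SETTINGS = (
    "SYSTEM_PROD_DATA.SysInfoProdName",
    "SYSTEM_PROD_DATA.SysInfoSerialNum",
    "SYSTEM_PROD_DATA.SysEncloseAssetTag",
)


def build_dmi_commands(model, serial, asset_tag, host=None, user=None, password=None):
    """Return the three asu command lines, in the order shown in step 4."""
    values = (model, serial, asset_tag)
    if any(not value for value in values):
        raise ValueError("model, serial number, and asset tag are all required")
    access = []
    if host is not None:
        access += ["--host", host]
    if user is not None:
        access += ["--user", user]
    if password is not None:
        access += ["--password", password]
    return [
        shlex.join(["asu", "set", name, value] + access)
        for name, value in zip(DMI_SETTINGS, values)
    ]


# Placeholder values and a documentation-range IP address.
for line in build_dmi_commands("5455AC1", "1234567", "RACK12", host="192.0.2.10"):
    print(line)
```

Using shlex.join quotes any value that contains spaces, so the printed lines can be pasted into a POSIX shell.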

Chapter 3. Troubleshooting

This chapter describes the diagnostic tools and troubleshooting information that are available to help you solve problems that might occur in the server.
If you cannot diagnose and correct a problem by using the information in this chapter, see Appendix D, “Getting help and technical assistance,” on page 373 for more information.

Start here

You can solve many problems without outside assistance by following the troubleshooting procedures in this documentation and on the World Wide Web.
This document describes the diagnostic tests that you can perform, troubleshooting procedures, and explanations of error messages and error codes. The documentation that comes with your operating system and software also contains troubleshooting information.

Diagnosing a problem

Before you contact IBM or an approved warranty service provider, follow these procedures in the order in which they are presented to diagnose a problem with your server.
1. Return the server to the condition it was in before the problem occurred. If any hardware, software, or firmware was changed before the problem occurred, if possible, reverse those changes. This might include any of the following items:
• Hardware components
• Device drivers and firmware
• System software
• UEFI firmware
• System input power or network connections
2. View the light path diagnostics LEDs and event logs. The server is designed for ease of diagnosis of hardware and software problems.
• Light path diagnostics LEDs: See “Fan and power controller indicators, controls, and connectors” in the IBM NeXtScale n1200 Enclosure Type 5456 Installation and Service Guide for information about using light path diagnostics LEDs.
• Event logs: See “Event logs” on page 55 for information about notification events and diagnosis.
• Software or operating-system error codes: See the documentation for the software or operating system for information about a specific error code. See the manufacturer's website for documentation.
3. Run IBM Dynamic System Analysis (DSA) and collect system data. Run Dynamic System Analysis (DSA) to collect information about the hardware, firmware, software, and operating system. Have this information available when you contact IBM or an approved warranty service provider. For instructions for running DSA, see the Dynamic System Analysis Installation and User's Guide.
To download the latest version of DSA code and the Dynamic System Analysis Installation and User's Guide, go to http://www.ibm.com/support/entry/portal/docdisplay?lndocid=SERV-DSA.
4. Check for and apply code updates. Fixes or workarounds for many problems might be available in updated UEFI firmware, device firmware, or device drivers. To display a list of available updates for the server, go to http://www.ibm.com/support/fixcentral.
Attention: Installing the wrong firmware or device-driver update might cause the server to malfunction. Before you install a firmware or device-driver update, read any readme and change history files that are provided with the downloaded update. These files contain important information about the update and the procedure for installing the update, including any special procedure for updating from an early firmware or device-driver version to the latest version.
Important: Some cluster solutions require specific code levels or coordinated code updates. If the device is part of a cluster solution, verify that the latest level of code is supported for the cluster solution before you update the code.
a. Install UpdateXpress system updates. You can install code updates that are packaged as an UpdateXpress System Pack or UpdateXpress CD image. An UpdateXpress System Pack contains an integration-tested bundle of online firmware and device-driver updates for your server. In addition, you can use IBM ToolsCenter Bootable Media Creator to create bootable media that is suitable for applying firmware updates and running preboot diagnostics. For more information about UpdateXpress System Packs, see “Updating the firmware” on page 21. For more information about the Bootable Media Creator, see http://www.ibm.com/support/entry/portal/docdisplay?lndocid=TOOL-BOMC.
Be sure to separately install any listed critical updates that have release dates that are later than the release date of the UpdateXpress System Pack or UpdateXpress image (see step 4b).
b. Install manual system updates.
1) Determine the existing code levels.
In DSA, click Firmware/VPD to view system firmware levels, or click Software to view operating-system levels.
2) Download and install updates of code that is not at the latest level.
To display a list of available updates for the server, go to http://www.ibm.com/support/fixcentral.
When you click an update, an information page is displayed, including a list of the problems that the update fixes. Review this list for your specific problem; however, even if your problem is not listed, installing the update might solve the problem.
5. Check for and correct an incorrect configuration. If the server is incorrectly configured, a system function can fail to work when you enable it; if you make an incorrect change to the server configuration, a system function that has been enabled can stop working.
a. Make sure that all installed hardware and software are supported. See http://www.ibm.com/systems/info/x86servers/serverproven/compat/us to verify that the server supports the installed operating system, optional devices, and software levels. If any hardware or software component is not supported, uninstall it to determine whether it is causing the problem. You must remove nonsupported hardware before you contact IBM or an approved warranty service provider for support.
b. Make sure that the server, operating system, and software are installed and configured correctly. Many configuration problems are caused by loose power or signal cables or incorrectly seated adapters. You might be able to solve the problem by turning off the server, reconnecting cables, reseating adapters, and turning the server back on. For information about performing the checkout procedure, see “About the checkout procedure” on page 50. For information about configuring the server, see Chapter 2, “Configuration information and instructions,” on page 21.
6. See controller and management software documentation. If the problem is associated with a specific function (for example, if a RAID hard disk drive is marked offline in the RAID array), see the documentation for the associated controller and management or controlling software to verify that the controller is correctly configured.
Problem determination information is available for many devices such as RAID and network adapters.
For problems with operating systems or IBM software or devices, go to http://www.ibm.com/supportportal.
7. Check for troubleshooting procedures and RETAIN tips. Troubleshooting procedures and RETAIN tips document known problems and suggested solutions. To search for troubleshooting procedures and RETAIN tips, go to http://www.ibm.com/supportportal.
8. Use the troubleshooting tables. See “Troubleshooting by symptom” on page 62 to find a solution to a problem that has identifiable symptoms.
A single problem might cause multiple symptoms. Follow the troubleshooting procedure for the most obvious symptom. If that procedure does not diagnose the problem, use the procedure for another symptom, if possible.
If the problem remains, contact IBM or an approved warranty service provider for assistance with additional problem determination and possible hardware replacement. To open an online service request, go to http://www.ibm.com/support/entry/portal/Open_service_request. Be prepared to provide information about any error codes and collected data.
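Step 4a asks you to separately install any critical updates that were released after the UpdateXpress System Pack that you applied. If you keep a list of available updates, that comparison can be sketched in Python as follows. The record layout, the update names, and the dates are hypothetical and are not taken from any IBM tool or website.

```python
from datetime import date


def critical_updates_after_pack(pack_release, updates):
    """Return names of critical updates released after the System Pack."""
    return [
        update["name"]
        for update in updates
        if update["critical"] and update["released"] > pack_release
    ]


# Hypothetical data for illustration only.
pack_release = date(2014, 3, 1)
updates = [
    {"name": "uefi-fix", "critical": True, "released": date(2014, 4, 15)},
    {"name": "nic-driver", "critical": False, "released": date(2014, 5, 2)},
    {"name": "imm-fix", "critical": True, "released": date(2014, 1, 20)},
]
print(critical_updates_after_pack(pack_release, updates))  # ['uefi-fix']
```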

Undocumented problems

If you have completed the diagnostic procedure and the problem remains, the problem might not have been previously identified by IBM. After you have verified that all code is at the latest level, all hardware and software configurations are valid, and no light path diagnostics LEDs or log entries indicate a hardware component failure, contact IBM or an approved warranty service provider for assistance.
To open an online service request, go to http://www.ibm.com/support/entry/portal/Open_service_request. Be prepared to provide information about any error codes and collected data and the problem determination procedures that you have used.

Service bulletins

IBM continually updates the support website with the latest tips and techniques that you can use to solve problems that you might have with the IBM NeXtScale nx360 M4 Compute Node server.
To find service bulletins that are available for the IBM NeXtScale nx360 M4 Compute Node server, go to and search for Type 5455, and retain.
Chapter 3. Troubleshooting 49

Checkout procedure

The checkout procedure is the sequence of tasks that you should follow to diagnose a problem in the server.

About the checkout procedure

Before you perform the checkout procedure for diagnosing hardware problems, review the following information:
• Read the safety information that begins in “Safety” on page vii.
• IBM Dynamic System Analysis (DSA) provides the primary methods of testing the major components of the server, such as the system board, Ethernet controller, keyboard, mouse (pointing device), serial ports, and hard disk drives. You can also use DSA to test some external devices. If you are not sure whether a problem is caused by the hardware or by the software, you can use the diagnostic programs to confirm that the hardware is working correctly.
• When you run DSA, a single problem might cause more than one error message. When this happens, correct the cause of the first error message. The other error messages usually will not occur the next time you run DSA.
Exception: If multiple error codes or light path diagnostics LEDs indicate a microprocessor error, the error might be in the microprocessor or in the microprocessor socket. See “Microprocessor problems” on page 67 for information about diagnosing microprocessor problems.
• Before you run DSA, you must determine whether the failing server is part of a shared hard disk drive cluster (two or more servers sharing external storage devices). If it is part of a cluster, you can run all diagnostic programs except the ones that test the storage unit (that is, a hard disk drive in the storage unit) or the storage adapter that is attached to the storage unit. The failing server might be part of a cluster if any of the following conditions is true:
– You have identified the failing server as part of a cluster (two or more servers sharing external storage devices).
– One or more external storage units are attached to the failing server and at least one of the attached storage units is also attached to another server or unidentifiable device.
– One or more servers are located near the failing server.
Important: If the server is part of a shared hard disk drive cluster, run one test at a time. Do not run any suite of tests, such as “quick” or “normal” tests, because this might enable the hard disk drive diagnostic tests.
• If the server is halted and a POST error code is displayed, see Appendix B, “UEFI (POST) error codes,” on page 309. If the server is halted and no error message is displayed, see “Troubleshooting by symptom” on page 62 and “Solving undetermined problems” on page 78.
• For information about power-supply problems, see “Solving power problems” on page 75, “Power problems” on page 72, and “Power-supply LEDs” on page 53.
• For intermittent problems, check the event log; see “Event logs” on page 55 and Appendix C, “DSA diagnostic test results,” on page 321.
50 IBM NeXtScale nx360 M4 Type 5455: Installation and Service Guide

Performing the checkout procedure

Use this information to perform the checkout procedure.
To perform the checkout procedure, complete the following steps:
1. Is the server part of a cluster?
   • No: Go to step 2.
   • Yes: Shut down all failing servers that are related to the cluster. Go to step 2.
2. Complete the following steps:
   a. Check the power supply LEDs (see “Power-supply LEDs” on page 53).
   b. Turn off the server and all external devices.
   c. Check all internal and external devices for compatibility at http://www.ibm.com/systems/info/x86servers/serverproven/compat/us.
   d. Check all cables and power cords.
   e. Set all display controls to the middle positions.
   f. Turn on all external devices.
   g. Turn on the server. If the server does not start, see “Troubleshooting by symptom” on page 62.
   h. Check the system-error LED on the operator information panel. If it is lit, check the light path diagnostics LEDs (see “Compute node controls, connectors, and LEDs” on page 13).
   i. Check for the following results:
      • Successful completion of POST (see “POST” on page 58 for more information)
      • Successful completion of startup, which is indicated by a readable display of the operating-system desktop
3. Is there a readable image on the monitor screen?
   • No: Find the failure symptom in “Troubleshooting by symptom” on page 62; if necessary, see “Solving undetermined problems” on page 78.
   • Yes: Run DSA (see “Running DSA Preboot diagnostic programs” on page 60).
     – If DSA reports an error, follow the instructions in Appendix C, “DSA diagnostic test results,” on page 321.
     – If DSA does not report an error but you still suspect a problem, see “Solving undetermined problems” on page 78.

Diagnostic tools

This section introduces the tools that are available to help you diagnose and solve hardware-related problems.
• Light path diagnostics
Use light path diagnostics to diagnose system errors quickly. See Light path diagnostics for more information.
• Event logs
The event logs list the error codes and messages that are generated when an error is detected for the subsystems IMM2, POST, DSA, and the server baseboard management controller. See “Event logs” on page 55 for more information.
• Integrated management module II
The integrated management module II (IMM2) combines service processor functions, video controller, and remote presence and blue-screen capture features in a single chip. The IMM provides advanced service-processor control, monitoring, and alerting functions. If an environmental condition exceeds a threshold or if a system component fails, the IMM lights LEDs to help you diagnose the problem, records the error in the IMM event log, and alerts you to the problem. Optionally, the IMM also provides a virtual presence capability for remote server management. The IMM provides remote server management through the following industry-standard interfaces:
– Intelligent Platform Management Interface (IPMI) version 2.0
– Simple Network Management Protocol (SNMP) version 3
– Common Information Model (CIM)
– Web browser
For more information about the integrated management module II (IMM2), see “Using the integrated management module” on page 33, Appendix A, “Integrated Management Module II (IMM2) error messages,” on page 185, and the Integrated Management Module II User's Guide at http://www-947.ibm.com/support/entry/portal/docdisplay?lndocid=migr-5086346.
• IBM Dynamic System Analysis
Two editions of IBM Dynamic System Analysis (DSA) are available for diagnosing problems, DSA Portable and DSA Preboot:
– DSA Portable
DSA Portable collects and analyzes system information to aid in diagnosing server problems. DSA Portable runs on the server operating system and collects the following information about the server:
- Drive health information
- Event logs for ServeRAID controllers and service processors
- Installed hardware, including PCI and USB information
- Installed applications and hot fixes
- Kernel modules
- Light path diagnostics status
- Microprocessor, input/output hub, and UEFI error logs
- Network interfaces and settings
- RAID controller configuration
- Service processor (integrated management module) status and configuration
- System configuration
- Vital product data, firmware, and UEFI configuration
DSA Portable creates a DSA log, which is a chronologically ordered merge of the system-event log (as the IPMI event log), the integrated management module (IMM) event log (as the ASM event log), and the operating-system event logs. You can send the DSA log as a file to IBM Support (when requested by IBM Support) or view the information as a text file or HTML file.
Note: Use the latest available version of DSA to make sure you are using the most recent configuration data. For documentation and download information for DSA, see http://www.ibm.com/systems/management.
For additional information, see “IBM Dynamic System Analysis” on page 58 and DSA messages.
– DSA Preboot
The DSA Preboot diagnostic program is stored in the integrated USB memory on the server. DSA Preboot collects and analyzes system information to aid in diagnosing server problems and offers a rich set of diagnostic tests of the major components of the server. DSA Preboot collects the following information about the server:
- Drive health information
- Event logs for ServeRAID controllers and service processors
- Installed hardware, including PCI and USB information
- Light path diagnostics status
- Microprocessor, input/output hub, and UEFI error logs
- Network interfaces and settings
- RAID controller configuration
- Service processor (integrated management module) status and configuration
- System configuration
- Vital product data, firmware, and UEFI configuration
DSA Preboot also provides diagnostics for the following system components (when they are installed):
1. Emulex network adapter
2. IMM I2C bus
3. Light path diagnostics panel
4. Memory modules
5. Microprocessors
6. Optical devices (CD or DVD)
7. SAS or SATA drives
See “Running DSA Preboot diagnostic programs” on page 60 for more information on running the DSA Preboot program on the server.
• Troubleshooting by symptom
These tables list problem symptoms and actions to correct the problems. See “Troubleshooting by symptom” on page 62 for more information.

Power-supply LEDs

The following minimum configuration is required for the server to start.
• One microprocessor in microprocessor socket 1
• One 2 GB DIMM on the system board
• One power supply
• Power cord
• Four cooling fans
• One PCI riser-card assembly in PCI connector 1
AC power-supply LEDs
Use this information to view AC power-supply LEDs.
The following minimum configuration is required for the DC LED on the power supply to be lit:
• Power supply
• Power cord
Note: You must turn on the server for the DC LED on the power supply to be lit.
The following illustration shows the locations of the power-supply LEDs on the ac power supply.
Figure 13. AC power-supply LEDs. Callouts: AC power LED (green), DC power LED (green), power-supply error LED (yellow).
The following table describes the problems that are indicated by various combinations of the power-supply LEDs on an ac power supply and suggested actions to correct the detected problems.
AC power-supply LEDs
AC: On, DC: On, Error (!): Off
Description: Normal operation.
AC: Off, DC: Off, Error (!): Off
Description: No ac power to the server or a problem with the ac power source.
Action:
1. Check the ac power to the server.
2. Make sure that the power cord is connected to a functioning power source.
3. Restart the server. If the error remains, check the power-supply LEDs.
4. If the problem remains, replace the power supply.
Notes: This is a normal condition when no ac power is present.
AC: Off, DC: Off, Error (!): On
Description: The power supply has failed.
Action: Replace the power supply.
AC: Off, DC: On, Error (!): Off
Description: The power supply has failed.
Action: Replace the power supply.
AC: Off, DC: On, Error (!): On
Description: The power supply has failed.
Action: Replace the power supply.
AC: On, DC: Off, Error (!): Off
Description: Power supply not fully seated, faulty system board, or the power supply has failed.
Action:
1. Reseat the power supply.
2. Follow actions in “Power problems” on page 72.
3. Follow actions in “Solving power problems” on page 75 until the problem is solved.
Notes: Typically indicates that a power supply is not fully seated.
AC: On, DC: Off, Error (!): On
Description: The power supply has failed.
Action: Replace the power supply.
AC: On, DC: On, Error (!): On
Description: The power supply has failed.
Action: Replace the power supply.
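For scripted triage, the LED combinations in the table above can be expressed as a lookup. The following Python sketch is illustrative only and is not part of any IBM tool; the names PSU_LED_STATES and diagnose_psu are hypothetical, and the action text is abbreviated from the table.

```python
# Illustrative lookup for the AC power-supply LED table.
# Each key is (AC LED lit, DC LED lit, error LED lit).
# PSU_LED_STATES and diagnose_psu are hypothetical names, not an IBM API.
FAILED = ("The power supply has failed.", "Replace the power supply.")

PSU_LED_STATES = {
    (True, True, False): ("Normal operation.", None),
    (False, False, False): (
        "No ac power to the server or a problem with the ac power source.",
        "Check the ac power and the power cord, restart the server, "
        "and replace the power supply if the problem remains.",
    ),
    (False, False, True): FAILED,
    (False, True, False): FAILED,
    (False, True, True): FAILED,
    (True, False, False): (
        "Power supply not fully seated, faulty system board, "
        "or the power supply has failed.",
        "Reseat the power supply, then follow the power-problem procedures.",
    ),
    (True, False, True): FAILED,
    (True, True, True): FAILED,
}


def diagnose_psu(ac_on, dc_on, error_on):
    """Return (description, suggested action) for one LED combination."""
    return PSU_LED_STATES[(bool(ac_on), bool(dc_on), bool(error_on))]


if __name__ == "__main__":
    description, action = diagnose_psu(ac_on=True, dc_on=False, error_on=False)
    print(description)
    print(action)
```

All eight LED combinations are covered, so the lookup never raises a KeyError for Boolean inputs.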

System pulse LEDs

Use this information to view the system pulse LEDs.
The following LEDs are on the system board and monitor the system power-on and power-off sequencing and boot progress (see “System-board LEDs and controls” on page 19 for the location of these LEDs).
Table 2. System pulse LEDs
LED: RTMM heartbeat
Description: Power-on and power-off sequencing.
Action:
1. If the LED blinks at 1 Hz, it is functioning properly and no action is necessary.
2. If the LED is not blinking, (trained technician only) replace the system board.
LED: IMM2 heartbeat
Description: IMM2 heartbeat boot process.
Action: The following steps describe the different stages of the IMM2 heartbeat sequencing process.
1. When this LED is blinking fast (approximately 4 Hz), the IMM2 code is in the loading process.
2. When this LED goes off momentarily, the IMM2 code has loaded completely.
3. When this LED goes off momentarily and then starts blinking slowly (approximately 1 Hz), IMM2 is fully operational. You can now press the power-control button to power on the server.
4. If this LED does not blink within 30 seconds of connecting a power source to the server, (trained technician only) replace the system board.
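The IMM2 heartbeat stages described above can be summarized as a small classification function. This Python sketch is an illustration under stated assumptions: the function name imm2_heartbeat_state is hypothetical, and the frequency cut-offs (3 Hz and 0.5 Hz) are assumed tolerances around the approximate 4 Hz and 1 Hz rates given in the table, not IBM-specified values.

```python
def imm2_heartbeat_state(blink_hz, seconds_since_power):
    """Classify the IMM2 heartbeat LED from an observed blink rate.

    blink_hz: observed blink frequency in Hz (0 means not blinking).
    seconds_since_power: time since a power source was connected.
    The 3.0 Hz and 0.5 Hz cut-offs are assumptions, not IBM values.
    """
    if blink_hz >= 3.0:
        # Fast blinking (approximately 4 Hz): IMM2 code is loading.
        return "loading"
    if blink_hz >= 0.5:
        # Slow blinking (approximately 1 Hz): IMM2 is fully operational.
        return "operational"
    # Anything slower is treated as not blinking.
    if seconds_since_power >= 30:
        # No blink within 30 seconds: system board replacement
        # by a trained technician is indicated.
        return "fault"
    return "waiting"


if __name__ == "__main__":
    print(imm2_heartbeat_state(4.0, 5))
    print(imm2_heartbeat_state(1.0, 20))
    print(imm2_heartbeat_state(0.0, 45))
```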

Event logs

Error codes and messages are displayed in the POST event log, the system-event log, the integrated management module II (IMM2) event log, and the DSA event log.
• POST event log: This log contains the most recent error codes and messages that were generated during POST. You can view the contents of the POST event log from the Setup utility (see “Starting the Setup utility” on page 26). For more information about POST error codes, see Appendix B, “UEFI (POST) error codes,” on page 309.
• System-event log: This log contains POST and system management interrupt (SMI) events and all events that are generated by the baseboard management controller that is embedded in the integrated management module (IMM). You can view the contents of the system-event log through the Setup utility and through the Dynamic System Analysis (DSA) program (as IPMI event log).
The system-event log is limited in size. When it is full, new entries will not overwrite existing entries; therefore, you must periodically clear the system-event log through the Setup utility. When you are troubleshooting an error, you might have to save and then clear the system-event log to make the most recent events available for analysis. For more information about the system-event log, see Appendix A, “Integrated Management Module II (IMM2) error messages,” on page 185.
Messages are listed on the left side of the screen, and details about the selected message are displayed on the right side of the screen. To move from one entry to the next, use the Up Arrow (↑) and Down Arrow (↓) keys.
Some IMM sensors cause assertion events to be logged when their setpoints are reached. When a setpoint condition no longer exists, a corresponding deassertion event is logged. However, not all events are assertion-type events.
• Integrated management module II (IMM2) event log: This log contains a filtered subset of all IMM, POST, and system management interrupt (SMI) events. You can view the IMM event log through the IMM web interface. For more information, see “Logging on to the web interface” on page 36. You can also view the IMM event log through the Dynamic System Analysis (DSA) program (as the ASM event log). For more information about IMM error messages, see Appendix A, “Integrated Management Module II (IMM2) error messages,” on page 185.
• DSA event log: This log is generated by the Dynamic System Analysis (DSA) program, and it is a chronologically ordered merge of the system-event log (as the IPMI event log), the IMM chassis-event log (as the ASM event log), and the operating-system event logs. You can view the DSA event log through the DSA program (see “Viewing event logs without restarting the server”). For more information about DSA and DSA messages, see “IBM Dynamic System Analysis” on page 58 and Appendix C, “DSA diagnostic test results,” on page 321.
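To illustrate what a chronologically ordered merge of several logs means, the following Python sketch merges three time-sorted lists with the standard-library heapq.merge function. It is not how DSA is implemented; the sample entries, the tuple layout, and the name merge_event_logs are hypothetical.

```python
import heapq
from datetime import datetime

# Hypothetical sample entries: (timestamp, source log, message).
ipmi_log = [
    (datetime(2014, 3, 1, 9, 0, 5), "IPMI", "Power supply presence detected"),
    (datetime(2014, 3, 1, 9, 2, 0), "IPMI", "Temperature threshold asserted"),
]
asm_log = [
    (datetime(2014, 3, 1, 9, 1, 30), "ASM", "Remote login successful"),
]
os_log = [
    (datetime(2014, 3, 1, 9, 0, 45), "OS", "Network interface up"),
    (datetime(2014, 3, 1, 9, 3, 10), "OS", "Disk warning"),
]


def merge_event_logs(*logs):
    """Merge logs that are each already sorted by timestamp."""
    return list(heapq.merge(*logs, key=lambda entry: entry[0]))


if __name__ == "__main__":
    for timestamp, source, message in merge_event_logs(ipmi_log, asm_log, os_log):
        print(timestamp.isoformat(), source, message)
```

Because each input list is already in time order, the merge only has to compare the next entry from each log, which keeps the combined view in a single timeline.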
Viewing event logs through the Setup utility
To view the POST event log or system-event log, complete the following steps:
1. Turn on the server.
2. When the prompt <F1> Setup is displayed, press F1. If you have set both a power-on password and an administrator password, you must type the administrator password to view the event logs.
3. Select System Event Logs and use one of the following procedures:
   • To view the POST event log, select POST Event Viewers.
   • To view the system-event log, select System Event Log.
Viewing event logs without restarting the server
If the server is not hung and the IMM is connected to a network, methods are available for you to view one or more event logs without having to restart the server.
If you have installed Dynamic System Analysis (DSA) Portable, you can use it to view the system-event log (as the IPMI event log), the IMM event log (as the ASM event log), the operating-system event logs, or the merged DSA log. You can also use DSA Preboot to view these logs, although you must restart the server to use DSA Preboot. To install DSA Portable or check for and download a later version of the DSA Preboot CD image, go to http://www.ibm.com/support/entry/portal/docdisplay?lndocid=SERV-DSA.
If IPMItool is installed in the server, you can use it to view the system-event log. Most recent versions of the Linux operating system come with a current version of IPMItool. For an overview of IPMI, go to http://www.ibm.com/developerworks/linux/blueprints/ and click Using Intelligent Platform Management Interface (IPMI) on IBM Linux platforms.
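As a minimal sketch of viewing the system-event log with IPMItool from a script, the following Python example runs the ipmitool sel list command and splits each line into fields. It assumes that ipmitool and an IPMI driver are installed in the operating system and that each output line uses the pipe-separated layout shown in the sample; the function names and the sample line are hypothetical, and the exact field layout can vary between IPMItool versions.

```python
import subprocess

# Field names assumed for one pipe-separated line of "ipmitool sel list".
SEL_FIELDS = ("id", "date", "time", "sensor", "event", "direction")


def parse_sel_line(line):
    """Split one SEL line into a dict; return None if it does not fit."""
    fields = [field.strip() for field in line.split("|")]
    if len(fields) < 5:
        return None
    return dict(zip(SEL_FIELDS, fields))


def read_sel():
    """Run ipmitool locally and return the parsed system-event log."""
    result = subprocess.run(
        ["ipmitool", "sel", "list"],
        capture_output=True,
        text=True,
        check=True,
    )
    entries = (parse_sel_line(line) for line in result.stdout.splitlines())
    return [entry for entry in entries if entry is not None]


if __name__ == "__main__":
    # Hypothetical sample line, parsed without contacting any hardware.
    sample = "   1 | 10/06/2014 | 13:02:11 | Power Supply #0x71 | Presence detected | Asserted"
    print(parse_sel_line(sample))
```

Call read_sel() only on a system where IPMItool can reach the service processor; the parsing function can be exercised separately on saved output.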
You can view the IMM event log through the Event Log link in the integrated management module II (IMM2) web interface. For more information, see “Logging on to the web interface” on page 36.
The following table describes the methods that you can use to view the event logs, depending on the condition of the server. The first three conditions generally do not require that you restart the server.
Table 3. Methods for viewing event logs
Condition: The server is not hung and is connected to a network (using operating-system-controlled network ports).
Action: Use any of the following methods:
• Run DSA Portable to view the diagnostic event log (requires IPMI driver) or create an output file that you can send to IBM service and support (using ftp or local copy).
• Use IPMItool to view the system-event log (requires IPMI driver).
• Use the web browser interface to the IMM to view the system-event log locally (requires RNDIS USB LAN driver).
Condition: The server is not hung and is not connected to a network (using operating-system-controlled network ports).
Action:
• Run DSA Portable to view the diagnostic event log (requires IPMI driver) or create an output file that you can send to IBM service and support (using ftp or local copy).
• Use IPMItool to view the system-event log (requires IPMI driver).
• Use the web browser interface to the IMM to view the system-event log locally (requires RNDIS USB LAN driver).
Condition: The server is not hung and the integrated management module II (IMM2) is connected to a network.
Action: In a web browser, type the IP address for the IMM2 and go to the Event Log page. For more information, see “Obtaining the IMM host name” on page 35 and “Logging on to the web interface” on page 36.
Condition: The server is hung, and no communication can be made with the IMM.
Action:
• If DSA Preboot is installed, restart the server and press F2 to start DSA Preboot and view the event logs (see “Running DSA Preboot diagnostic programs” on page 60 for more information).
• Alternatively, you can restart the server and press F1 to start the Setup utility and view the POST event log or system-event log. For more information, see “Viewing event logs through the Setup utility” on page 56.
Clearing the event logs
Use this information to clear the event logs.
To clear the event logs, complete the following steps:
Note: The POST error log is automatically cleared each time the server is restarted.
1. Turn on the server.
2. When the prompt <F1> Setup is displayed, press F1. If you have set both a power-on password and an administrator password, you must type the administrator password to view the event logs.
3. To clear the IMM system-event log, select System Event Logs > Clear System Event Log; then, press Enter twice.

POST

When you turn on the server, it performs a series of tests to check the operation of the server components and some optional devices in the server. This series of tests is called the power-on self-test, or POST.
Note: This server does not use beep codes for server status.
If a power-on password is set, you must type the password and press Enter (when you are prompted), for POST to run.
If POST detects a problem, an error message is displayed. See Appendix B, “UEFI (POST) error codes,” on page 309 for more information.
If POST detects a problem, an error message is sent to the POST event log. See “Event logs” on page 55 for more information.

IBM Dynamic System Analysis

IBM Dynamic System Analysis (DSA) collects and analyzes system information to aid in diagnosing server problems.
DSA collects the following information about the server:
• Drive health information
• Event logs for ServeRAID controllers and service processors
• Hardware inventory, including PCI and USB information
• Installed applications and hot fixes (available in DSA Portable only)
• Kernel modules (available in DSA Portable only)
• Light path diagnostics status
• Network interfaces and settings
• Performance data and details about processes that are running
• RAID controller configuration
• Service processor (integrated management module) status and configuration
• System configuration
• Vital product data and firmware information
For system-specific information about the action that you should take as a result of a message that DSA generates, see Appendix C, “DSA diagnostic test results,” on page 321.
If you cannot find a problem by using DSA, see “Solving undetermined problems” on page 78 for information about testing the server.
Note: DSA Preboot might appear to be unresponsive when you start the program. This is normal operation while the program loads.
Make sure that the server has the latest version of the DSA code. To obtain DSA code and the Dynamic System Analysis Installation and User's Guide, go to http://www.ibm.com/support/entry/portal/docdisplay?lndocid=SERV-DSA.
DSA editions
Two editions of Dynamic System Analysis are available.
• DSA Portable
DSA Portable Edition runs within the operating system; you do not have to restart the server to run it. It is packaged as a self-extracting file that you download from the web. When you run the file, it self-extracts to a temporary folder and performs comprehensive collection of hardware and operating-system information. After it runs, it automatically deletes the temporary files and folder and leaves the results of the data collection and diagnostics on the server.
If you are able to start the server, use DSA Portable.
• DSA Preboot
DSA Preboot runs outside of the operating system; you must restart the server to run it. It is provided in the flash memory on the server, or you can create bootable media such as a CD, DVD, ISO, USB, or PXE using the IBM ToolsCenter Bootable Media Creator (BoMC). For more details, see the BoMC Installation and User's Guide at http://www.ibm.com/support/entry/portal/docdisplay?lndocid=TOOL-BOMC. In addition to the capabilities of the other editions of DSA, DSA Preboot includes diagnostic routines that would be disruptive to run within the operating-system environment (such as resetting devices and causing loss of network connectivity). It has a graphical user interface that you can use to specify which diagnostics to run and to view the diagnostic and data collection results.
DSA Preboot provides diagnostics for the following system components, if they are installed:
– Emulex network adapter
– Optical devices (CD or DVD)
– Tape drives (SCSI, SAS, or SATA)
– Memory
– Microprocessor
– Checkpoint panel
– I2C bus
– SAS and SATA drives
If you are unable to restart the server or if you need comprehensive diagnostics, use DSA Preboot.
For more information and to download the utilities, go to http://www.ibm.com/support/entry/portal/docdisplay?lndocid=SERV-DSA.
Running DSA Preboot diagnostic programs
Use this information to run the DSA Preboot diagnostic programs.
Note: The DSA memory test might take up to 30 minutes to run. If the problem is not a memory problem, skip the memory test.
To run the DSA Preboot diagnostic programs, complete the following steps:
1. If the server is running, turn off the server and all attached devices.
2. Turn on all attached devices; then, turn on the server.
3. When the prompt <F2> Diagnostics is displayed, press F2.
Note: The DSA Preboot diagnostic program might appear to be unresponsive for an unusual length of time when you start the program. This is normal operation while the program loads. The loading process may take up to 10 minutes.
4. Optionally, select Quit to DSA to exit from the stand-alone memory diagnostic program.
Note: After you exit from the stand-alone memory diagnostic environment, you must restart the server to access the stand-alone memory diagnostic environment again.
5. Type gui to display the graphical user interface, or type cmd to display the DSA interactive menu.
6. Follow the instructions on the screen to select the diagnostic test to run.
If the diagnostic programs do not detect any hardware errors but the problem remains during normal server operation, a software error might be the cause. If you suspect a software problem, see the information that comes with your software.
A single problem might cause more than one error message. When this happens, correct the cause of the first error message. The other error messages usually will not occur the next time you run the diagnostic programs.
If the server stops during testing and you cannot continue, restart the server and try running the DSA Preboot diagnostic programs again. If the problem remains, replace the component that was being tested when the server stopped.
Diagnostic text messages
Diagnostic text messages are displayed while the tests are running.
A diagnostic text message contains one of the following results:
Passed: The test was completed without any errors.
Failed: The test detected an error.
Aborted: The test could not proceed because of the server configuration.
Additional information concerning test failures is available in the extended diagnostic results for each test.
Viewing the test log results and transferring the DSA collection
Use this information to view the test log results and transfer the DSA collection.
To view the test log for the results when the tests are completed, use the method for the interface that you are running:
• If you are running the DSA graphical user interface (GUI), click the Success link in the Status column, or select Diagnostic Event Log.
• If you are running the DSA interactive menu (CLI), type :x to exit the Execute Tests menu; then, select completed tests to view the results.
To transfer DSA Preboot collections to an external USB device, type the copy command in the DSA interactive menu.
You can also send the DSA error log to IBM support to aid in diagnosing the server problems.

Automated service request (call home)

IBM provides tools that can automatically collect and send data or call IBM Support when an error is detected. These tools can help IBM Support speed up the process of diagnosing problems.
The following sections provide information about the call home tools.

IBM Electronic Service Agent

IBM Electronic Service Agent monitors, tracks, and captures system hardware errors and hardware and software inventory information, and reports serviceable problems directly to IBM Support. You can also choose to collect data manually. It uses minimal system resources, and can be downloaded from the IBM website.
For more information and to download IBM Electronic Service Agent, go to http://www-01.ibm.com/support/esa/.

Error messages

This section provides the list of error codes and messages for UEFI/POST, IMM, and DSA that are generated when a problem is detected.
See UEFI/POST error codes, Integrated management module II (IMM2) error messages, and DSA messages for more information.

Troubleshooting by symptom

Use the troubleshooting tables to find solutions to problems that have identifiable symptoms.
If you cannot find a solution to the problem in these tables, see DSA messages for information about testing the server and “Running DSA Preboot diagnostic programs” on page 60 for additional information about running the DSA Preboot program. For additional information to help you solve problems, see “Start here” on page 47.
If you have just added new software or a new optional device and the server is not working, complete the following steps before you use the troubleshooting tables:
1. Check the system-error LED on the operator information panel; if it is lit, check the light path diagnostics LEDs (see Light path diagnostics).
2. Remove the software or device that you just added.
3. Run IBM Dynamic System Analysis (DSA) to determine whether the server is
running correctly (for information about using DSA, see DSA messages).
4. Reinstall the new software or new device.

General problems

Use this information to solve general problems.
• Follow the suggested actions in the order in which they are listed in the Action column until the problem is solved.
• If an action step is preceded by “(Trained technician only),” that step must be performed only by a trained technician.
• Go to the IBM support website to check for technical information, hints, tips, and new device drivers or to submit a request for information.
Symptom: A cover latch is broken, an LED is not working, or a similar problem has occurred.
Action: If the part is a CRU, replace it. If the part is a microprocessor or the system board, the part must be replaced by a trained technician.
Symptom: The server is hung while the screen is on. Cannot start the Setup utility by pressing F1.
Action:
1. See “Nx-boot failure” on page 83 for more information.
2. See “Recovering the server firmware (UEFI update failure)” on page 80 for more information.

Hard disk drive problems

Table 4. Hard disk drive symptoms and actions
• Follow the suggested actions in the order in which they are listed in the Action column until the problem is solved.
• If an action step is preceded by “(Trained technician only)”, that step must be performed only by a trained technician.
• Go to the IBM support website at http://www.ibm.com/supportportal to check for technical information, hints, tips, and new device drivers or to submit a request for information.
Symptom: Not all drives are recognized by the hard disk drive diagnostic tests.
Action: Remove the drive that is indicated by the diagnostic tests; then, run the hard disk drive diagnostic tests again. If the remaining drives are recognized, replace the drive that you removed with a new one.
Symptom: The server stops responding during the hard disk drive diagnostic test.
Action: Remove the hard disk drive that was being tested when the server stopped responding, and run the diagnostic test again. If the hard disk drive diagnostic test runs successfully, replace the drive that you removed with a new one.
Symptom: A hard disk drive was not detected while the operating system was being started.
Action: Reseat all hard disk drives and cables; then, run the hard disk drive diagnostic tests again.
Symptom: A hard disk drive passes the diagnostic Fixed Disk Test, but the problem remains.
Action: Run the diagnostic SCSI Fixed Disk Test (see “Running DSA Preboot diagnostic programs” on page 60).
Note: This test is not available on servers that have RAID arrays or servers that have SATA hard disk drives.

Hypervisor problems

Use this information to solve hypervisor problems.
v Follow the suggested actions in the order in which they are listed in the Action column until the problem is
solved.
v If an action step is preceded by “(Trained technician only),” that step must be performed only by a trained
technician.
v Go to the IBM support website at http://www.ibm.com/supportportal to check for technical information, hints, tips, and new device drivers or to
submit a request for information.
Symptom Action
If an optional embedded hypervisor flash device is not listed in the expected boot order, does not appear in the list of boot devices, or a similar problem has occurred.
1. Make sure that the optional embedded hypervisor flash device is selected on the boot manager <F12> Select Boot Device at startup.
2. Make sure that the embedded hypervisor flash device is seated in the connector correctly (see “Removing the USB flash drive” on page 162 and “Installing the USB flash drive” on page 163).
3. See the documentation that comes with the optional embedded hypervisor flash device for setup and configuration information.
4. Make sure that other software works on the server.

Intermittent problems

Use this information to solve intermittent problems.
v Follow the suggested actions in the order in which they are listed in the Action column until the problem is
solved.
v If an action step is preceded by “(Trained technician only),” that step must be performed only by a trained
technician.
v Go to the IBM support website at http://www.ibm.com/supportportal to check for technical information, hints, tips, and new device drivers or to
submit a request for information.
Symptom: A problem occurs only occasionally and is difficult to diagnose.
Action:
1. Make sure that:
   v All cables and cords are connected securely to the rear of the server and attached devices.
   v When the server is turned on, air is flowing from the fan grille. If there is no airflow, the fan is not working. This can cause the server to overheat and shut down.
2. Check the system-error log or IMM event logs (see “Event logs” on page 55).
Symptom: The server resets (restarts) occasionally.
Action:
1. If the reset occurs during POST and the POST watchdog timer is enabled (click System Settings > Recovery > System Recovery > POST Watchdog Timer in the Setup utility to see the POST watchdog setting), make sure that sufficient time is allowed in the watchdog timeout value (POST Watchdog Timer). If the server continues to reset during POST, see UEFI/POST error codes and DSA messages.
2. If neither condition applies, check the system-error log or IMM system-event log (see “Event logs” on page 55).

Keyboard, mouse, or USB-device problems

Use this information to solve keyboard, mouse, or USB-device problems.
v Follow the suggested actions in the order in which they are listed in the Action column until the problem is
solved.
v If an action step is preceded by “(Trained technician only),” that step must be performed only by a trained
technician.
v Go to the IBM support website at http://www.ibm.com/supportportal to check for technical information, hints, tips, and new device drivers or to
submit a request for information.
Symptom: All or some keys on the keyboard do not work.
Action:
1. Make sure that:
   v The keyboard cable is securely connected.
   v The server and the monitor are turned on.
2. If you are using a USB keyboard, run the Setup utility and enable keyboardless operation.
3. If you are using a USB keyboard and it is connected to a USB hub, disconnect the keyboard from the hub and connect it directly to the server.
4. Replace the keyboard.
Symptom: The mouse or USB device does not work.
Action:
1. Make sure that:
   v The mouse or USB device cable is securely connected to the server.
   v The mouse or USB device drivers are installed correctly.
   v The server and the monitor are turned on.
   v The mouse option is enabled in the Setup utility.
2. If you are using a USB mouse or USB device and it is connected to a USB hub, disconnect the mouse or USB device from the hub and connect it directly to the server.
3. Replace the mouse or USB device.

Memory problems

Use this information to solve memory problems.
v Follow the suggested actions in the order in which they are listed in the Action column until the problem is
solved.
v If an action step is preceded by “(Trained technician only),” that step must be performed only by a trained
technician.
v Go to the IBM support website at http://www.ibm.com/supportportal to check for technical information, hints, tips, and new device drivers or to
submit a request for information.
Symptom: The amount of system memory that is displayed is less than the amount of installed physical memory.
Action:
Note: Each time you install or remove a DIMM, you must disconnect the server from the power source; then, wait 10 seconds before restarting the server.
1. Make sure that:
   v No error LEDs are lit on the operator information panel.
   v No DIMM error LEDs are lit on the system board.
   v Memory mirrored channel does not account for the discrepancy.
   v The memory modules are seated correctly.
   v You have installed the correct type of memory.
   v If you changed the memory, you updated the memory configuration in the Setup utility.
   v All banks of memory are enabled. The server might have automatically disabled a memory bank when it detected a problem, or a memory bank might have been manually disabled.
   v There is no memory mismatch when the server is at the minimum memory configuration.
2. Reseat the DIMMs, and then restart the server.
3. Check the POST error log:
   v If a DIMM was disabled by a systems-management interrupt (SMI), replace the DIMM.
   v If a DIMM was disabled by the user or by POST, reseat the DIMM; then, run the Setup utility and enable the DIMM.
4. Check that all DIMMs are initialized in the Setup utility; then, run memory diagnostics (see “Running DSA Preboot diagnostic programs” on page 60).
5. Reverse the DIMMs between the channels (of the same microprocessor), and then restart the server. If the problem is related to a DIMM, replace the failing DIMM.
6. Re-enable all DIMMs using the Setup utility, and then restart the server.
7. (Trained technician only) Install the failing DIMM into a DIMM connector for microprocessor 2 (if installed) to verify that the problem is not the microprocessor or the DIMM connector.
8. (Trained technician only) Replace the system board.
Symptom: Multiple DIMMs in a channel are identified as failing.
Action:
Note: Each time you install or remove a DIMM, you must disconnect the server from the power source; then, wait 10 seconds before restarting the server.
1. Reseat the DIMMs; then, restart the server.
2. Remove the highest-numbered DIMM of those that are identified and replace it with an identical known good DIMM; then, restart the server. Repeat as necessary. If the failures continue after all identified DIMMs are replaced, go to step 4.
3. Return the removed DIMMs, one at a time, to their original connectors, restarting the server after each DIMM, until a DIMM fails. Replace each failing DIMM with an identical known good DIMM, restarting the server after each DIMM replacement. Repeat step 3 until you have tested all removed DIMMs.
4. Replace the highest-numbered DIMM of those identified; then, restart the server. Repeat as necessary.
5. Reverse the DIMMs between the channels (of the same microprocessor), and then restart the server. If the problem is related to a DIMM, replace the failing DIMM.
6. (Trained technician only) Install the failing DIMM into a DIMM connector for microprocessor 2 (if installed) to verify that the problem is not the microprocessor or the DIMM connector.
7. (Trained technician only) Replace the system board.
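The POST error log check in step 3 of the first memory symptom branches on how the DIMM was disabled. The following is a minimal sketch of that branch, assuming the cause has already been read from the log by hand. The function name dimm_action and the labels "smi", "user", and "post" are illustrative only; they are not part of any IBM tool or log format.

```python
def dimm_action(disabled_by):
    """Return the step-3 action for a DIMM that the POST error log shows as disabled.

    disabled_by is a hypothetical label chosen for this sketch:
    "smi" (systems-management interrupt), "user", or "post".
    """
    source = disabled_by.strip().lower()
    if source == "smi":
        # A DIMM disabled by an SMI is replaced.
        return "Replace the DIMM."
    if source in ("user", "post"):
        # A DIMM disabled by the user or by POST is reseated and re-enabled.
        return "Reseat the DIMM; then, run the Setup utility and enable the DIMM."
    raise ValueError("unknown cause: " + repr(disabled_by))


if __name__ == "__main__":
    for cause in ("SMI", "user", "POST"):
        print(cause, "->", dimm_action(cause))
```

Any other cause raises an error in this sketch, so that an unexpected log entry is not silently mapped to an action.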

Microprocessor problems

Use this information to solve microprocessor problems.
v Follow the suggested actions in the order in which they are listed in the Action column until the problem is
solved.
v If an action step is preceded by “(Trained technician only),” that step must be performed only by a trained
technician.
v Go to the IBM support website at http://www.ibm.com/supportportal to check for technical information, hints, tips, and new device drivers or to
submit a request for information.
Symptom: The server goes directly to the POST Event Viewer when it is turned on.
Action:
1. Correct any errors that are indicated by the light path diagnostics LEDs (see Light path diagnostics).
2. Make sure that the server supports all the microprocessors and that the microprocessors match in speed and cache size. To view the microprocessor information, run the Setup utility and select System Information > System Summary > Processor Details.
3. (Trained technician only) Make sure that microprocessor 1 is seated correctly.
4. (Trained technician only) Remove microprocessor 2 and restart the server.
5. Replace the following components one at a time, in the order shown, restarting the server each time:
   a. (Trained technician only) Microprocessor
   b. (Trained technician only) System board
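Step 2 of the microprocessor actions asks that the installed microprocessors match in speed and cache size. The comparison can be sketched as follows. The Processor record and its field names are hypothetical, and the values would be copied by hand from System Information > System Summary > Processor Details in the Setup utility.

```python
from collections import namedtuple

# Hypothetical record for this sketch; the fields mirror the two
# properties that the manual says must match.
Processor = namedtuple("Processor", ["speed_mhz", "cache_kb"])


def processors_match(processors):
    """Return True if every processor has the same speed and cache size."""
    distinct = {(p.speed_mhz, p.cache_kb) for p in processors}
    # Zero or one distinct (speed, cache) pair means there is no mismatch.
    return len(distinct) <= 1


if __name__ == "__main__":
    installed = [Processor(2600, 20480), Processor(2600, 20480)]
    print(processors_match(installed))
```

The example values (2600 MHz, 20480 KB) are placeholders, not specifications of any supported microprocessor.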

Monitor and video problems

Use this information to solve monitor and video problems.
Some IBM monitors have their own self-tests. If you suspect a problem with your monitor, see the documentation that comes with the monitor for instructions for testing and adjusting the monitor. If you cannot diagnose the problem, call for service.
v Follow the suggested actions in the order in which they are listed in the Action column until the problem is
solved.
v If an action step is preceded by “(Trained technician only),” that step must be performed only by a trained
technician.
v Go to the IBM support website at http://www.ibm.com/supportportal to check for technical information, hints, tips, and new device drivers or to
submit a request for information.
Symptom: Testing the monitor.
Action:
1. Make sure that the monitor cables are firmly connected.
2. Try using a different monitor on the server, or try using the monitor that is being tested on a different server.
3. Run the diagnostic programs. If the monitor passes the diagnostic programs, the problem might be a video device driver.
4. (Trained technician only) Replace the system board.
Symptom: The screen is blank.
Action:
1. If the server is attached to a KVM switch, bypass the KVM switch to eliminate it as a possible cause of the problem: connect the monitor cable directly to the correct connector on the rear of the server.
2. The IMM2 remote presence function is disabled if you install an optional video adapter. To use the IMM2 remote presence function, remove the optional video adapter.
3. If graphical adapters are installed in the server, the IBM logo is displayed on the screen approximately 3 minutes after you turn on the server. This is normal operation while the system loads.
4. Make sure that:
   v The server is turned on. If there is no power to the server, see “Power problems” on page 72.
   v The monitor cables are connected correctly.
   v The monitor is turned on and the brightness and contrast controls are adjusted correctly.
5. Make sure that the correct server is controlling the monitor, if applicable.
6. Make sure that damaged server firmware is not affecting the video; see “Updating the firmware” on page 21.
7. Observe the checkpoint LEDs on the system board; if the codes are changing, go to step 6.
8. Replace the following components one at a time, in the order shown, restarting the server each time:
   a. Monitor
   b. Video adapter (if one is installed)
   c. (Trained technician only) System board
9. See “Solving undetermined problems” on page 78.
Symptom: The monitor works when you turn on the server, but the screen goes blank when you start some application programs.
Action:
1. Make sure that:
   v The application program is not setting a display mode that is higher than the capability of the monitor.
   v You installed the necessary device drivers for the application.
2. Run video diagnostics (see “Running DSA Preboot diagnostic programs” on page 60).
   v If the server passes the video diagnostics, the video is good; see “Solving undetermined problems” on page 78.
   v (Trained technician only) If the server fails the video diagnostics, replace the system board.
Symptom: The monitor has screen jitter, or the screen image is wavy, unreadable, rolling, or distorted.
Action:
1. If the monitor self-tests show that the monitor is working correctly, consider the location of the monitor. Magnetic fields around other devices (such as transformers, appliances, fluorescent lights, and other monitors) can cause screen jitter or wavy, unreadable, rolling, or distorted screen images. If this happens, turn off the monitor.
   Attention: Moving a color monitor while it is turned on might cause screen discoloration.
   Move the device and the monitor at least 305 mm (12 in.) apart, and turn on the monitor.
   Notes:
   a. To prevent diskette drive read/write errors, make sure that the distance between the monitor and any external diskette drive is at least 76 mm (3 in.).
   b. Non-IBM monitor cables might cause unpredictable problems.
2. Reseat the monitor cable.
3. Replace the following components one at a time, in the order shown, restarting the server each time:
   a. Monitor cable
   b. Video adapter (if one is installed)
   c. Monitor
   d. (Trained technician only) System board
Symptom: Wrong characters appear on the screen.
Action:
1. If the wrong language is displayed, update the server firmware to the latest level (see “Updating the firmware” on page 21) with the correct language.
2. Reseat the monitor cable.
3. Replace the following components one at a time, in the order shown, restarting the server each time:
   a. Monitor cable
   b. Video adapter (if one is installed)
   c. Monitor
   d. (Trained technician only) System board

Network connection problems

Use this information to solve network connection problems.
v Follow the suggested actions in the order in which they are listed in the Action column until the problem is
solved.
v If an action step is preceded by “(Trained technician only),” that step must be performed only by a trained
technician.
v Go to the IBM support website at http://www.ibm.com/supportportal to check for technical information, hints, tips, and new device drivers or to
submit a request for information.
Symptom: Unable to wake the server using the Wake on LAN feature.
Action:
1. If you are using the dual-port network adapter and the server is connected to the network using the Ethernet 5 connector, check the system-error log or IMM2 system event log (see “Event logs” on page 55), and make sure that:
   a. Fan 3 is running in standby mode, if the Emulex dual-port 10GBase-T embedded adapter is installed.
   b. The room temperature is not too high (see “Features and specifications” on page 5).
   c. The air vents are not blocked.
   d. The air baffle is installed securely.
2. Reseat the dual-port network adapter.
3. Turn off the server and disconnect it from the power source; then, wait 10 seconds before restarting the server.
4. If the problem still remains, replace the dual-port network adapter.
Symptom: Login fails when using an LDAP account with SSL enabled.
Action:
1. Make sure that the license key is valid.
2. Generate a new license key and log in again.

Optional-device problems

Use this information to solve optional-device problems.
v Follow the suggested actions in the order in which they are listed in the Action column until the problem is
solved.
v If an action step is preceded by “(Trained technician only),” that step must be performed only by a trained
technician.
v Go to the IBM support website at http://www.ibm.com/supportportal to check for technical information, hints, tips, and new device drivers or to
submit a request for information.
Symptom: An IBM optional device that was just installed does not work.
Action:
1. Make sure that:
   v The device is designed for the server (see http://www.ibm.com/systems/info/x86servers/serverproven/compat/us).
   v You followed the installation instructions that came with the device and the device is installed correctly.
   v You have not loosened any other installed devices or cables.
   v You updated the configuration information in the Setup utility. Whenever memory or any other device is changed, you must update the configuration.
2. Reseat the device that you just installed.
3. Replace the device that you just installed.
Symptom: An IBM optional device that worked previously does not work now.
Action:
1. Make sure that all of the cable connections for the device are secure.
2. If the device comes with test instructions, use those instructions to test the device.
3. If the failing device is a SCSI device, make sure that:
   v The cables for all external SCSI devices are connected correctly.
   v The last device in each SCSI chain, or the end of the SCSI cable, is terminated correctly.
   v Any external SCSI device is turned on. You must turn on an external SCSI device before you turn on the server.
4. Reseat the failing device.
5. Replace the failing device.

Power problems

Use this information to solve power problems.
v Follow the suggested actions in the order in which they are listed in the Action column until the problem is
solved.
v If an action step is preceded by “(Trained technician only),” that step must be performed only by a trained
technician.
v Go to the IBM support website at http://www.ibm.com/supportportal to check for technical information, hints, tips, and new device drivers or to
submit a request for information.
Symptom: The power-control button does not work, and the reset button does not work (the server does not start).
Note: The power-control button will not function until approximately 5 to 10 seconds after the server has been connected to power.
Action:
1. Make sure that the power-control button is working correctly:
   a. Disconnect the server power cords.
   b. Reconnect the power cords.
   c. (Trained technician only) Reseat the operator information panel cable, and then repeat steps 1a and 1b.
      v (Trained technician only) If the server starts, reseat the operator information panel. If the problem remains, replace the operator information panel.
      v If the server does not start, bypass the power-control button by using the force power-on jumper. If the server starts, reseat the operator information panel. If the problem remains, replace the operator information panel.
2. Make sure that the reset button is working correctly:
   a. Disconnect the server power cords.
   b. Reconnect the power cords.
   c. (Trained technician only) Reseat the operator information panel cable, and then repeat steps 2a and 2b.
      v (Trained technician only) If the server starts, replace the operator information panel.
      v If the server does not start, go to step 3.
3. Make sure that both power supplies installed in the server are of the same type. Mixing different power supplies in the server will cause a system error (the system-error LED on the front panel turns on).
4. Make sure that:
   v The power cords are correctly connected to the server and to a working electrical outlet.
   v The type of memory that is installed is correct.
   v The DIMMs are fully seated.
   v The LEDs on the power supply do not indicate a problem.
   v The microprocessors are installed in the correct sequence.
5. Reseat the following components:
   a. Operator information panel connector
   b. Power supplies
6. Replace the components listed in step 5 one at a time, in the order shown, restarting the server each time.
7. If you just installed an optional device, remove it, and restart the server. If the server now starts, you might have installed more devices than the power supply supports.
8. See “Power-supply LEDs” on page 53.
9. See “Solving undetermined problems” on page 78.
Symptom: The server does not turn off.
Action:
1. Determine whether you are using an Advanced Configuration and Power Interface (ACPI) or a non-ACPI operating system. If you are using a non-ACPI operating system, complete the following steps:
   a. Press Ctrl+Alt+Delete.
   b. Turn off the server by pressing the power-control button and holding it down for 5 seconds.
   c. Restart the server.
   d. If the server fails POST and the power-control button does not work, disconnect the power cord for 20 seconds; then, reconnect the power cord and restart the server.
2. If the problem remains or if you are using an ACPI-aware operating system, suspect the system board.
Symptom: The server unexpectedly shuts down, and the LEDs on the operator information panel are not lit.
Action: See “Solving undetermined problems” on page 78.

Serial-device problems

Use this information to solve serial-device problems.
v Follow the suggested actions in the order in which they are listed in the Action column until the problem is
solved.
v If an action step is preceded by “(Trained technician only),” that step must be performed only by a trained
technician.
v Go to the IBM support website at http://www.ibm.com/supportportal to check for technical information, hints, tips, and new device drivers or to
submit a request for information.
Symptom: The number of serial ports that are identified by the operating system is less than the number of installed serial ports.
Action:
1. Make sure that:
   v Each port is assigned a unique address in the Setup utility and none of the serial ports is disabled.
   v The serial-port adapter (if one is present) is seated correctly.
2. Reseat the serial-port adapter.
3. Replace the serial-port adapter.
Symptom: A serial device does not work.
Action:
1. Make sure that:
   v The device is compatible with the server.
   v The serial port is enabled and is assigned a unique address.
   v The device is connected to the correct connector (see “System-board internal connectors” on page 16).
2. Reseat the following components:
   a. Failing serial device
   b. Serial cable
3. Replace the components listed in step 2 one at a time, in the order shown, restarting the server each time.
4. (Trained technician only) Replace the system board.

ServerGuide problems

Use this information to solve ServerGuide problems.
v Follow the suggested actions in the order in which they are listed in the Action column until the problem is
solved.
v If an action step is preceded by “(Trained technician only),” that step must be performed only by a trained
technician.
v Go to the IBM support website at http://www.ibm.com/supportportal to check for technical information, hints, tips, and new device drivers or to
submit a request for information.
Symptom: The MegaRAID Storage Manager program cannot view all installed drives, or the operating system cannot be installed.
Action:
1. Make sure that the hard disk drive is connected correctly.
2. Make sure that the SAS/SATA hard disk drive cables are securely connected.
Symptom: The operating-system installation program continuously loops.
Action: Make more space available on the hard disk.
Symptom: The ServerGuide program will not start the operating-system CD.
Action: Make sure that the operating-system CD is supported by the ServerGuide program. For a list of supported operating-system versions, go to http://www.ibm.com/support/entry/portal/docdisplay?lndocid=SERV-GUIDE, click the link for your ServerGuide version, and scroll down to the list of supported Microsoft Windows operating systems.
Symptom: The operating system cannot be installed; the option is not available.
Action: Make sure that the server supports the operating system. If it does, either no logical drive is defined (SCSI RAID servers), or the ServerGuide System Partition is not present. Run the ServerGuide program and make sure that setup is complete.

Software problems

Use this information to solve software problems.
v Follow the suggested actions in the order in which they are listed in the Action column until the problem is
solved.
v If an action step is preceded by “(Trained technician only),” that step must be performed only by a trained
technician.
v Go to the IBM support website at http://www.ibm.com/supportportal to check for technical information, hints, tips, and new device drivers or to
submit a request for information.
Symptom: You suspect a software problem.
Action:
1. To determine whether the problem is caused by the software, make sure that:
   v The server has the minimum memory that is needed to use the software. For memory requirements, see the information that comes with the software. If you have just installed an adapter or memory, the server might have a memory-address conflict.
   v The software is designed to operate on the server.
   v Other software works on the server.
   v The software works on another server.
2. If you received any error messages when using the software, see the information that comes with the software for a description of the messages and suggested solutions to the problem.
3. Contact the software vendor.

Universal Serial Bus (USB) port problems

Use this information to solve Universal Serial Bus (USB) port problems.
v Follow the suggested actions in the order in which they are listed in the Action column until the problem is
solved.
v If an action step is preceded by “(Trained technician only),” that step must be performed only by a trained
technician.
v Go to the IBM support website at http://www.ibm.com/supportportal to check for technical information, hints, tips, and new device drivers or to
submit a request for information.
Symptom: A USB device does not work.
Action:
1. Make sure that:
   v The correct USB device driver is installed.
   v The operating system supports USB devices.
2. Make sure that the USB configuration options are set correctly in the Setup utility (see “Using the Setup utility” on page 25 for more information).
3. If you are using a USB hub, disconnect the USB device from the hub and connect it directly to the server.

Video problems

Use this information to solve video problems.
See “Monitor and video problems” on page 68.

Solving power problems

Use this information to solve power problems.
Power problems can be difficult to solve. For example, a short circuit can exist anywhere on any of the power distribution buses. Usually, a short circuit will cause the power subsystem to shut down because of an overcurrent condition. To diagnose a power problem, use the following general procedure:
1. Turn off the server and disconnect all power cords.
2. Check for loose cables in the power subsystem. Also check for short circuits, for
example, if a loose screw is causing a short circuit on a circuit board.
3. Check the lit LEDs on the operator information panel (see Light path diagnostics).
4. If the check log LED on the light path diagnostics panel is lit, check the IMM event log for faulty Pwr rail and complete the following steps. Table 5 identifies the components that are associated with each Pwr rail and the order in which to troubleshoot the components.
a. Disconnect the cables and power cords to all internal and external devices
(see “Internal cable routing and connectors” on page 180). Leave the power-supply cords connected.
b. For Pwr rail A error, complete the following steps:
1) (Trained technician only) Replace the system board.
2) (Trained technician only) Replace the microprocessor.
c. For other rail errors (for a Pwr rail A error, see step 4b), remove each component that is associated with the faulty Pwr rail, one at a time, in the sequence indicated in Table 5, restarting the server each time, until the cause of the overcurrent condition is identified.
Table 5. Components associated with power rail errors
Pwr rail error in the IMM event log, followed by its components:
Pwr rail A error
   v Microprocessor 1
Pwr rail B error
   v Microprocessor 2
Pwr rail C error
   v Adapter (if one is installed) in PCI riser-card assembly 1
   v PCI riser-card assembly 1
   v Fan 1
   v DIMMs 1 through 6
Pwr rail D error
   v Dual-port network adapter
   v Fan 2
   v DIMMs 7 through 12
Pwr rail E error
   v Hard disk drives
   v DIMMs 13 through 18
Pwr rail F error
   v Adapter (if one is installed) in PCI riser-card assembly 1
   v PCI riser-card assembly 1
   v Fan 4
   v DIMMs 19 through 24
Pwr rail G error
   v PCI adapter power cable (if one is present)
   v Fan 3
   v Hard disk drives
   v Hard disk drive backplane assembly
Pwr rail H error
   v Hard disk drive power cable
   v Hard disk drives
   v Hard disk drive backplane
   or
   v PCI adapter power cable
   v Adapter installed in PCI riser-card assembly 2
   v PCI riser-card assembly 2
d. Replace the identified component.
5. Remove the adapters and disconnect the cables and power cords to all internal
and external devices until the server is at the minimum configuration that is required for the server to start (see “Power-supply LEDs” on page 53 for the minimum configuration).
6. Reconnect all power cords and turn on the server. If the server starts successfully, reseat the adapters and devices one at a time until the problem is isolated.
If the server does not start from the minimum configuration, see “Power-supply LEDs” on page 53 to replace the components in the minimum configuration one at a time until the problem is isolated.
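The lookup that Table 5 describes can be sketched in Python. This is an illustration only: the names RAIL_COMPONENTS and components_for_rail are hypothetical and are not part of any IBM tool, and the component lists are transcribed from Table 5 in the order given there. The sketch assumes that the event-log text contains the string "Pwr rail X error".

```python
import re

# Hypothetical lookup table transcribed from Table 5. Each rail maps to one
# or more alternative component lists, in the troubleshooting order given.
RAIL_COMPONENTS = {
    "A": [["Microprocessor 1"]],
    "B": [["Microprocessor 2"]],
    "C": [["Adapter (if one is installed) in PCI riser-card assembly 1",
           "PCI riser-card assembly 1", "Fan 1", "DIMMs 1 through 6"]],
    "D": [["Dual-port network adapter", "Fan 2", "DIMMs 7 through 12"]],
    "E": [["Hard disk drives", "DIMMs 13 through 18"]],
    "F": [["Adapter (if one is installed) in PCI riser-card assembly 1",
           "PCI riser-card assembly 1", "Fan 4", "DIMMs 19 through 24"]],
    "G": [["PCI adapter power cable (if one is present)", "Fan 3",
           "Hard disk drives", "Hard disk drive backplane assembly"]],
    # Rail H lists two alternatives in the table ("or").
    "H": [["Hard disk drive power cable", "Hard disk drives",
           "Hard disk drive backplane"],
          ["PCI adapter power cable",
           "Adapter installed in PCI riser-card assembly 2",
           "PCI riser-card assembly 2"]],
}


def components_for_rail(event_text):
    """Return the Table 5 component lists for a 'Pwr rail X error' entry."""
    match = re.search(r"Pwr rail ([A-H]) error", event_text)
    if match is None:
        raise ValueError("no power rail error found in: %r" % event_text)
    return RAIL_COMPONENTS[match.group(1)]


if __name__ == "__main__":
    for alternative in components_for_rail("Pwr rail D error"):
        for position, component in enumerate(alternative, start=1):
            print(position, component)
```

The function only reports which components to remove and in which order; removing the components and restarting the server remain manual steps.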

Solving Ethernet controller problems

Use this information to solve Ethernet controller problems.
The method that you use to test the Ethernet controller depends on which operating system you are using. See the operating-system documentation for information about Ethernet controllers, and see the Ethernet controller device-driver readme file.
Try the following procedures:
v Make sure that the correct device drivers, which come with the server, are installed and that they are at the latest level.
v Make sure that the Ethernet cable is installed correctly.
– The cable must be securely attached at all connections. If the cable is attached but the problem remains, try a different cable.
– If you set the Ethernet controller to operate at 100 Mbps, you must use Category 5 cabling.
– If you directly connect two servers (without a hub), or if you are not using a hub with X ports, use a crossover cable. To determine whether a hub has an X port, check the port label. If the label contains an X, the hub has an X port.
v Determine whether the hub supports auto-negotiation. If it does not, try configuring the integrated Ethernet controller manually to match the speed and duplex mode of the hub.
v Check the Ethernet controller LEDs on the rear panel of the server. These LEDs indicate whether there is a problem with the connector, cable, or hub.
– The Ethernet link status LED is lit when the Ethernet controller receives a link pulse from the hub. If the LED is off, there might be a defective connector or cable or a problem with the hub.
– The Ethernet transmit/receive activity LED is lit when the Ethernet controller sends or receives data over the Ethernet network. If the Ethernet transmit/receive activity LED is off, make sure that the hub and network are operating and that the correct device drivers are installed.
v Check the LAN activity LED on the rear of the server. The LAN activity LED is lit when data is active on the Ethernet network. If the LAN activity LED is off, make sure that the hub and network are operating and that the correct device drivers are installed.
v Check for operating-system-specific causes of the problem.
v Make sure that the device drivers on the client and server are using the same protocol.
If the Ethernet controller still cannot connect to the network but the hardware appears to be working, the network administrator must investigate other possible causes of the error.
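The link, speed, and duplex checks can also be made from the operating system. The following Python sketch is one possible approach. It assumes a Linux operating system that exposes interface attributes under /sys/class/net, and the interface name eth0 is only an example. The function names read_link_state and link_is_up are hypothetical. The sketch complements the LED checks and does not replace them.

```python
from pathlib import Path


def read_link_state(interface, sysfs_root="/sys/class/net"):
    """Read link attributes for one interface from Linux sysfs.

    Returns a dict with the keys operstate, carrier, speed, and duplex.
    A value is None when the attribute cannot be read, which can happen
    when the interface does not exist or is administratively down.
    """
    base = Path(sysfs_root) / interface
    state = {}
    for name in ("operstate", "carrier", "speed", "duplex"):
        try:
            state[name] = (base / name).read_text().strip()
        except OSError:
            state[name] = None
    return state


def link_is_up(state):
    """Return True when the kernel reports a carrier (link detected)."""
    return state.get("carrier") == "1"


if __name__ == "__main__":
    state = read_link_state("eth0")  # "eth0" is an assumed interface name
    print(state)
    if not link_is_up(state):
        print("No link detected: check the cable, connector, and hub.")
```

If the reported speed or duplex mode does not match the hub, that is consistent with the auto-negotiation check in the list above.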

Solving undetermined problems

If Dynamic System Analysis (DSA) did not diagnose the failure or if the server is inoperative, use the information in this section.
If you suspect that a software problem is causing failures (continuous or intermittent), see “Software problems” on page 75.
Corrupted data in CMOS memory or corrupted UEFI firmware can cause undetermined problems. To reset the CMOS data, use the CMOS clear jumper (JP1) to clear the CMOS memory and override the power-on password; see “System-board switches and jumpers” on page 18 for more information. If you suspect that the UEFI firmware is corrupted, see “Recovering the server firmware (UEFI update failure)” on page 80.
If the power supplies are working correctly, complete the following steps:
1. Turn off the server.
2. Make sure that the server is cabled correctly.
3. Remove or disconnect the following devices, one at a time, until you find the failure. Turn on the server and reconfigure it each time.
v Any external devices.
v Surge-suppressor device (on the server).
v Printer, mouse, and non-IBM devices.
v Each adapter.
v Hard disk drives.
v Memory modules. The minimum configuration requirement is 2 GB DIMM in slot 1.
4. Turn on the server.
If the problem is solved when you remove an adapter from the server but the problem recurs when you reinstall the same adapter, suspect the adapter; if the problem recurs when you replace the adapter with a different one, suspect the riser card.
If you suspect a networking problem and the server passes all the system tests, suspect a network cabling problem that is external to the server.
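The one-at-a-time removal procedure in this section can be sketched as a simple loop. This Python sketch is illustrative only: isolate_failing_device is a hypothetical name, and the problem_present callback stands in for the manual step of turning on the server and observing whether the failure still occurs.

```python
def isolate_failing_device(devices, problem_present):
    """Remove devices one at a time until the problem disappears.

    devices: device names, in the order in which they are removed.
    problem_present: callable that takes the list of devices that are
        still installed and returns True if the failure still occurs.
    Returns the device whose removal cleared the problem, or None if the
    problem persists after every listed device has been removed.
    """
    installed = list(devices)
    for device in devices:
        installed.remove(device)
        if not problem_present(installed):
            return device
    return None


if __name__ == "__main__":
    removal_order = [
        "external devices",
        "surge-suppressor device",
        "adapter 1",
        "hard disk drives",
    ]
    # Simulated fault: the failure occurs whenever "adapter 1" is installed.
    suspect = isolate_failing_device(
        removal_order, lambda installed: "adapter 1" in installed
    )
    print(suspect)
```

A result of None corresponds to the case in which the failure remains at the minimum configuration, so the remaining components are the suspects.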

Problem determination tips

Because of the variety of hardware and software combinations that you can encounter, use the following information to assist you in problem determination.
If possible, have this information available when requesting assistance from IBM.
The model name and serial number are located on the ID label on the front of the server as shown in the following illustration.
Note: The illustrations in this document might differ slightly from your hardware.
Figure 14. ID label
v Machine type and model
v Microprocessor or hard disk drive upgrades
v Failure symptom
– Does the server fail the diagnostic tests?
– What occurs? When? Where?
– Does the failure occur on a single server or on multiple servers?
– Is the failure repeatable?
– Has this configuration ever worked?
– What changes, if any, were made before the configuration failed?
– Is this the original reported failure?
v Diagnostic program type and version level
v Hardware configuration (print screen of the system summary)
v UEFI firmware level
v IMM firmware level
v Operating system software
You can solve some problems by comparing the configuration and software setups between working and nonworking servers. When you compare servers to each other for diagnostic purposes, consider them identical only if all the following factors are exactly the same in all the servers:
v Machine type and model
v UEFI firmware level
v IMM firmware level
v Adapters and attachments, in the same locations
v Address jumpers, terminators, and cabling
v Software versions and levels
v Diagnostic program type and version level
v Configuration option settings
v Operating-system control-file setup
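A comparison of the factors in the preceding list can be sketched as follows. The Python names COMPARISON_FACTORS and differing_factors are hypothetical, as are the dictionary keys, and the sketch assumes that you have recorded one value per factor for each server by hand or with your own inventory tooling.

```python
# One key per comparison factor from the list above (key names are
# illustrative, not defined by IBM).
COMPARISON_FACTORS = (
    "machine_type_and_model",
    "uefi_firmware_level",
    "imm_firmware_level",
    "adapters_and_attachments",
    "jumpers_terminators_cabling",
    "software_versions_and_levels",
    "diagnostic_program_version",
    "configuration_option_settings",
    "os_control_file_setup",
)


def differing_factors(working, nonworking):
    """Return the factors whose recorded values differ between two servers.

    Both arguments are dicts that must contain every key in
    COMPARISON_FACTORS; a missing key raises KeyError so that an
    incomplete record is not mistaken for a match.
    """
    return [
        factor
        for factor in COMPARISON_FACTORS
        if working[factor] != nonworking[factor]
    ]


if __name__ == "__main__":
    good = {factor: "recorded value" for factor in COMPARISON_FACTORS}
    bad = dict(good, uefi_firmware_level="different value")
    print(differing_factors(good, bad))
```

An empty result means that the two records match on every listed factor, which is the condition under which the servers can be considered identical for diagnostic purposes.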
See Appendix D, “Getting help and technical assistance,” on page 373 for information about calling IBM for service.

Recovering the server firmware (UEFI update failure)

Use this information to recover the server firmware.
Important: Some cluster solutions require specific code levels or coordinated code updates. If the device is part of a cluster solution, verify that the latest level of code is supported for the cluster solution before you update the code.
If the server firmware has become corrupted, such as from a power failure during an update, you can recover the server firmware in one of the following ways:
v In-band method: Recover the server firmware by using either the boot block jumper or Automated Boot Recovery, together with a server Firmware Update Package Service Pack.
v Out-of-band method: Use the IMM web interface to update the firmware, using the latest server firmware update package.
Note: You can obtain a server update package from one of the following sources:
v Download the server firmware update from the World Wide Web.
v Contact your IBM service representative.
To download the server firmware update package from the World Wide Web, go to .
The flash memory of the server consists of a primary bank and a backup bank. You must maintain a bootable UEFI firmware image in the backup bank. If the server firmware in the primary bank becomes corrupted, you can manually boot the backup bank with the UEFI boot backup jumper (JP2). In the case of image corruption, the server switches to the backup bank automatically through the Automated Boot Recovery function.
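The relationship between the two banks, the JP2 jumper, and Automated Boot Recovery can be summarized in a short Python sketch. This is a simplified model for illustration, not the firmware logic itself. The function name select_boot_bank is hypothetical, and the pin positions follow the in-band manual recovery steps in this chapter (pins 1 and 2 for normal operation, pins 2 and 3 for recovery).

```python
def select_boot_bank(jumper_pins, primary_image_ok):
    """Return "primary" or "backup" for a simplified two-bank model.

    jumper_pins: (1, 2) for the normal JP2 position, or (2, 3) for the
        recovery position that forces a boot from the backup bank.
    primary_image_ok: False when the primary image is corrupted, in which
        case Automated Boot Recovery switches to the backup bank.
    """
    if jumper_pins == (2, 3):
        return "backup"   # manual recovery through the jumper
    if jumper_pins != (1, 2):
        raise ValueError("JP2 must be on pins (1, 2) or (2, 3)")
    if not primary_image_ok:
        return "backup"   # automatic switch (Automated Boot Recovery)
    return "primary"


if __name__ == "__main__":
    print(select_boot_bank((1, 2), primary_image_ok=True))
    print(select_boot_bank((1, 2), primary_image_ok=False))
    print(select_boot_bank((2, 3), primary_image_ok=True))
```

In every case that returns "backup", the model depends on the backup bank holding a bootable image, which is why that image must be maintained.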

In-band manual recovery method

Use this information to recover the server firmware and restore the server operation to the primary bank.
To recover the server firmware and restore the server operation to the primary bank, complete the following steps:
1. Read the safety information in “Safety” on page vii and “Installation guidelines” on page 93.
2. Turn off the server, and disconnect all power cords and external cables.
3. Remove the cover (see “Removing the compute node cover” on page 110).
4. Locate the UEFI boot backup jumper (JP2) on the system board.
Figure 15. UEFI boot backup jumper (JP2) location (illustration callouts: Lightpath button, UEFI boot recovery jumper, Clear CMOS jumper, NMI button; jumper pins are numbered 1, 2, 3)
5. Move the UEFI boot backup jumper (JP2) from pins 1 and 2 to pins 2 and 3 to enable the UEFI recovery mode.
6. Reinstall the server cover; then, reconnect all power cords.
7. Restart the server. The system begins the power-on self-test (POST).
8. Boot the server to an operating system that is supported by the firmware
update package that you downloaded.
9. Perform the firmware update by following the instructions that are in the firmware update package readme file.
10. Turn off the server and disconnect all power cords and external cables, and then remove the cover (see “Removing the compute node cover” on page 110).
11. Move the UEFI boot backup jumper (JP2) from pins 2 and 3 back to the
primary position (pins 1 and 2).
12. Reinstall the cover (see “Installing the compute node cover” on page 111).
13. Reconnect the power cord and any cables that you removed.
14. Restart the server. The system begins the power-on self-test (POST). If this
does not recover the primary bank, continue with the following steps.
15. Remove the cover (see “Removing the compute node cover” on page 110).
16. Reset the CMOS by removing the system battery (see “Removing the system
battery” on page 130).
17. Leave the system battery out of the server for approximately 5 to 15 minutes.
18. Reinstall the system battery (see “Replacing the system battery” on page 131).
19. Reinstall the cover (see “Installing the compute node cover” on page 111).
20. Reconnect the power cord and any cables that you removed.
21. Restart the server. The system begins the power-on self-test (POST).
22. If these recovery efforts fail, contact your IBM service representative for
support.

In-band automated boot recovery method

Use this information to use the in-band automated boot recovery method.
Note: Use this method if the system-error LED on the operator information panel is lit and there is a log entry or Booting Backup Image is displayed on the firmware splash screen; otherwise, use the in-band manual recovery method.
1. Boot the server to an operating system that is supported by the firmware update package that you downloaded.
2. Perform the firmware update by following the instructions that are in the firmware update package readme file.
3. Restart the server.
4. At the firmware splash screen, press F3 when prompted to restore to the
primary bank. The server boots from the primary bank.

Out-of-band method

Use this information to use the out-of-band method.
See the IMM2 documentation (Integrated Management Module II User's Guide) at http://www-947.ibm.com/support/entry/portal/docdisplay?lndocid=migr-5086346.

Automated boot recovery (ABR)

While the server is starting, if the integrated management module II detects problems with the server firmware in the primary bank, the server automatically switches to the backup firmware bank and gives you the opportunity to recover the firmware in the primary bank.
For instructions for recovering the UEFI firmware, see “Recovering the server firmware (UEFI update failure)” on page 80. After you have recovered the firmware in the primary bank, complete the following steps:
1. Restart the server.
2. When the prompt Press F3 to restore to primary is displayed, press F3 to
start the server from the primary bank.