Before using this information and the product it supports, read the general information in Appendix D, “Getting help and
technical assistance,” on page 373, “Notices” on page 377, the Warranty Information document, and the Safety Information and
Environmental Notices and User Guide documents on the IBM Documentation CD.
Removing a storage tray from a compute node . . 105
Installing a storage tray into a compute node. . . 106
Removing a GPU tray from a compute node . . . 108
Installing a GPU tray into a compute node. . . 109
Removing and replacing structural parts ....110
Removing the compute node cover .....110
Installing the compute node cover .....111
Removing the air baffle .........113
Replacing the air baffle .........114
Removing a RAID adapter battery holder . . . 115
Replacing a RAID adapter battery holder . . . 115
Removing the PCI riser filler .......116
Replacing the PCI riser filler .......117
Removing the filler from the GPU tray ....117
Replacing the filler on to the GPU tray ....118
Removing the front handle ........119
Installing the front handle ........120
Removing the hard disk drive cage .....121
Installing the hard disk drive cage .....123
Removing and replacing Tier 1 CRUs .....125
Removing the operator information panel . . . 125
Installing the operator information panel . . . 127
Removing the power paddle card from the GPU
tray...............128
Replacing the power paddle card on to the GPU
tray...............129
Removing the system battery .......130
Replacing the system battery .......131
Removing a memory module .......132
Installing a memory module .......133
Removing the optional 3.5-inch hard disk drive
hardware RAID cage ..........138
Installing the optional 3.5-inch hard disk drive
hardware RAID cage ..........140
Removing the hard disk drive backplate . . . 142
Installing the hard disk drive backplate. . . 143
Removing and installing drives ......145
Removing a PCI riser-cage assembly....154
Replacing a PCI riser-cage assembly.....155
Removing a PCI riser-cage assembly in the GPU
tray...............156
Replacing a PCI riser-cage assembly in the GPU
tray...............157
Removing an adapter/GPU adapter .....159
Replacing an adapter/GPU adapter .....160
Removing the USB flash drive.......162
Installing the USB flash drive .......163
Removing and replacing Tier 2 CRUs .....165
Removing a microprocessor and heat sink . . . 165
Replacing a microprocessor and heat sink . . . 168
Removing the compute node .......176
Installing the compute node .......178
Internal cable routing and connectors .....180
Cabling hard disk drive with software RAID
signal cable .............180
Cabling hard disk drive with ServeRAID
SAS/SATA controller ..........181
Appendix A. Integrated Management
Module II (IMM2) error messages . . . 185
Appendix B. UEFI (POST) error codes . . 309
Appendix C. DSA diagnostic test
results ..............321
DSA Broadcom network test results ......321
DSA Brocade test results..........324
DSA checkpoint panel test results......326
DSA CPU stress test results.........327
DSA Emulex adapter test results .......328
DSA EXA port ping test results .......329
DSA hard drive test results .........330
DSA Intel network test results ........330
DSA LSI hard drive test results .......332
DSA Mellanox adapter test results ......332
DSA memory isolation test results ......333
DSA memory stress test results .......360
DSA Nvidia GPU test results ........361
DSA optical drive test results ........363
DSA system management test results .....364
DSA tape drive test results .........369
Appendix D. Getting help and
technical assistance ........373
Before you call .............373
Using the documentation .........374
Getting help and information from the World Wide
Web................374
How to send DSA data to IBM.......374
Creating a personalized support web page. . . 374
Software service and support ........375
Hardware service and support.......375
IBM Taiwan product service ........375
Notices ..............377
Trademarks ..............377
Important notes............378
Particulate contamination .........379
Documentation format ..........380
Telecommunication regulatory statement ....380
Electronic emission notices .........380
Federal Communications Commission (FCC)
statement..............380
Industry Canada Class A emission compliance
statement..............381
Avis de conformité à la réglementation
d'Industrie Canada ..........381
Australia and New Zealand Class A statement . 381
IBM NeXtScale nx360 M4 Type 5455: Installation and Service Guide
European Union EMC Directive conformance
statement..............381
Germany Class A statement.......382
Japan VCCI Class A statement.......383
Japan Electronics and Information Technology
Industries Association (JEITA) statement . . . 383
Korea Communications Commission (KCC)
statement..............383
Russia Electromagnetic Interference (EMI) Class
A statement .............383
People's Republic of China Class A electronic
emission statement ..........383
Taiwan Class A compliance statement ....384
German Ordinance for Work gloss
statement .............385
Index ...............387
Safety
Before installing this product, read the Safety Information.
Antes de instalar este produto, leia as Informações de Segurança.
Læs sikkerhedsforskrifterne, før du installerer dette produkt.
Lees voordat u dit product installeert eerst de veiligheidsvoorschriften.
Ennen kuin asennat tämän tuotteen, lue turvaohjeet kohdasta Safety Information.
Avant d'installer ce produit, lisez les consignes de sécurité.
Vor der Installation dieses Produkts die Sicherheitshinweise lesen.
Prima di installare questo prodotto, leggere le Informazioni sulla Sicurezza.
Les sikkerhetsinformasjonen (Safety Information) før du installerer dette produktet.
Antes de instalar este produto, leia as Informações sobre Segurança.
Antes de instalar este producto, lea la información de seguridad.
Läs säkerhetsinformationen innan du installerar den här produkten.
Bu ürünü kurmadan önce güvenlik bilgilerini okuyun.
Guidelines for trained service technicians
This section contains information for trained service technicians.
Inspecting for unsafe conditions
Use this information to help you identify potential unsafe conditions in an IBM
product that you are working on.
Each IBM product, as it was designed and manufactured, has required safety items
to protect users and service technicians from injury. The information in this section
addresses only those items. Use good judgment to identify potential unsafe
conditions that might be caused by non-IBM alterations or attachment of non-IBM
features or optional devices that are not addressed in this section. If you identify
an unsafe condition, you must determine how serious the hazard is and whether
you must correct the problem before you work on the product.
Consider the following conditions and the safety hazards that they present:
v Electrical hazards, especially primary power. Primary voltage on the frame can
cause serious or fatal electrical shock.
v Explosive hazards, such as a damaged CRT face or a bulging capacitor.
v Mechanical hazards, such as loose or missing hardware.
To inspect the product for potential unsafe conditions, complete the following
steps:
1. Make sure that the power is off and the power cords are disconnected.
2. Make sure that the exterior cover is not damaged, loose, or broken, and observe
any sharp edges.
3. Check the power cords:
v Make sure that the third-wire ground connector is in good condition. Use a
meter to measure third-wire ground continuity for 0.1 ohm or less between
the external ground pin and the frame ground.
v Make sure that the power cords are the correct type.
v Make sure that the insulation is not frayed or worn.
4. Remove the cover.
5. Check for any obvious non-IBM alterations. Use good judgment as to the safety
of any non-IBM alterations.
6. Check inside the system for any obvious unsafe conditions, such as metal
filings, contamination, water or other liquid, or signs of fire or smoke damage.
7. Check for worn, frayed, or pinched cables.
8. Make sure that the power-supply cover fasteners (screws or rivets) have not
been removed or tampered with.
Guidelines for servicing electrical equipment
Observe these guidelines when you service electrical equipment.
v Check the area for electrical hazards such as moist floors, nongrounded power
extension cords, and missing safety grounds.
v Use only approved tools and test equipment. Some hand tools have handles that
are covered with a soft material that does not provide insulation from live
electrical current.
v Regularly inspect and maintain your electrical hand tools for safe operational
condition. Do not use worn or broken tools or testers.
v Do not touch the reflective surface of a dental mirror to a live electrical circuit.
The surface is conductive and can cause personal injury or equipment damage if
it touches a live electrical circuit.
v Some rubber floor mats contain small conductive fibers to decrease electrostatic
discharge. Do not use this type of mat to protect yourself from electrical shock.
v Do not work alone under hazardous conditions or near equipment that has
hazardous voltages.
v Locate the emergency power-off (EPO) switch, disconnecting switch, or electrical
outlet so that you can turn off the power quickly in the event of an electrical
accident.
v Disconnect all power before you perform a mechanical inspection, work near
power supplies, or remove or install main units.
v Before you work on the equipment, disconnect the power cord. If you cannot
disconnect the power cord, have the customer power off the wall box that
supplies power to the equipment and lock the wall box in the off position.
v Never assume that power has been disconnected from a circuit. Check it to
make sure that it has been disconnected.
v If you have to work on equipment that has exposed electrical circuits, observe
the following precautions:
– Make sure that another person who is familiar with the power-off controls is
near you and is available to turn off the power if necessary.
– When you work with powered-on electrical equipment, use only one hand.
Keep the other hand in your pocket or behind your back to avoid creating a
complete circuit that could cause an electrical shock.
– When you use a tester, set the controls correctly and use the approved probe
leads and accessories for that tester.
– Stand on a suitable rubber mat to insulate you from grounds such as metal
floor strips and equipment frames.
v Use extreme care when you measure high voltages.
v To ensure proper grounding of components such as power supplies, pumps,
blowers, fans, and motor generators, do not service these components outside of
their normal operating locations.
v If an electrical accident occurs, use caution, turn off the power, and send another
person to get medical aid.
Safety statements
These statements provide the caution and danger information that is used in this
documentation.
Important:
Each caution and danger statement in this documentation is labeled with a
number. This number is used to cross reference an English-language caution or
danger statement with translated versions of the caution or danger statement in
the Safety Information document.
For example, if a caution statement is labeled Statement 1, translations for that
caution statement are in the Safety Information document under Statement 1.
Be sure to read all caution and danger statements in this documentation before you
perform the procedures. Read any additional safety information that comes with
your system or optional device before you install the device.
Statement 1
DANGER
Electrical current from power, telephone, and communication cables is
hazardous.
To avoid a shock hazard:
v Do not connect or disconnect any cables or perform installation,
maintenance, or reconfiguration of this product during an electrical storm.
v Connect all power cords to a properly wired and grounded electrical outlet.
v Connect to properly wired outlets any equipment that will be attached to
this product.
v When possible, use one hand only to connect or disconnect signal cables.
v Never turn on any equipment when there is evidence of fire, water, or
structural damage.
v Disconnect the attached power cords, telecommunications systems,
networks, and modems before you open the device covers, unless
instructed otherwise in the installation and configuration procedures.
v Connect and disconnect cables as described in the following table when
installing, moving, or opening covers on this product or attached devices.
To Connect:
1. Turn everything OFF.
2. First, attach all cables to devices.
3. Attach signal cables to connectors.
4. Attach power cords to outlet.
5. Turn device ON.
To Disconnect:
1. Turn everything OFF.
2. First, remove power cords from outlet.
3. Remove signal cables from connectors.
4. Remove all cables from devices.
Statement 2
CAUTION:
When replacing the lithium battery, use only IBM Part Number 33F8354 or an
equivalent type battery recommended by the manufacturer. If your system has a
module containing a lithium battery, replace it only with the same module type
made by the same manufacturer. The battery contains lithium and can explode if
not properly used, handled, or disposed of.
Do not:
v Throw or immerse into water
v Heat to more than 100°C (212°F)
v Repair or disassemble
Dispose of the battery as required by local ordinances or regulations.
Statement 3
CAUTION:
When laser products (such as CD-ROMs, DVD drives, fiber optic devices, or
transmitters) are installed, note the following:
v Do not remove the covers. Removing the covers of the laser product could
result in exposure to hazardous laser radiation. There are no serviceable parts
inside the device.
v Use of controls or adjustments or performance of procedures other than those
specified herein might result in hazardous radiation exposure.
DANGER
Some laser products contain an embedded Class 3A or Class 3B laser diode.
Note the following.
Laser radiation when open. Do not stare into the beam, do not view directly
with optical instruments, and avoid direct exposure to the beam.
Class 1 Laser Product
Laser Klasse 1
Laser Klass 1
Luokan 1 Laserlaite
Appareil A Laser de Classe 1
Statement 4
CAUTION:
Use safe practices when lifting.
≥ 18 kg (39.7 lb)   ≥ 32 kg (70.5 lb)   ≥ 55 kg (121.2 lb)
Statement 5
CAUTION:
The power control button on the device and the power switch on the power
supply do not turn off the electrical current supplied to the device. The device
also might have more than one power cord. To remove all electrical current from
the device, ensure that all power cords are disconnected from the power source.
Statement 6
CAUTION:
If you install a strain-relief bracket option over the end of the power cord that is
connected to the device, you must connect the other end of the power cord to an
easily accessible power source.
Statement 8
CAUTION:
Never remove the cover on a power supply or any part that has the following
label attached.
Hazardous voltage, current, and energy levels are present inside any component
that has this label attached. There are no serviceable parts inside these
components. If you suspect a problem with one of these parts, contact a service
technician.
Statement 12
CAUTION:
The following label indicates a hot surface nearby.
Statement 26
CAUTION:
Do not place any object on top of rack-mounted devices.
Statement 27
CAUTION:
Hazardous moving parts are nearby.
Rack Safety Information, Statement 2
DANGER
v Always lower the leveling pads on the rack cabinet.
v Always install stabilizer brackets on the rack cabinet.
v Always install servers and optional devices starting from the bottom of the
rack cabinet.
v Always install the heaviest devices in the bottom of the rack cabinet.
Chapter 1. The IBM NeXtScale nx360 M4 Compute Node Type
5455
The IBM NeXtScale nx360 M4 Compute Node Type 5455 is a high-availability,
scalable compute node that is optimized to support the next-generation
microprocessor technology and is ideally suited for medium and large businesses.
The IBM NeXtScale nx360 M4 Compute Node Type 5455 is supported in the IBM
NeXtScale n1200 Enclosure only.
This documentation provides the following information about setting up and
troubleshooting the compute node:
v Starting and configuring the compute node
v Installing the operating system
v Diagnosing problems
v Installing, removing, and replacing components
Packaged with the compute node are software CDs that help you configure
hardware, install device drivers, and install the operating system.
If firmware and documentation updates are available, you can download them
from the IBM website. The server might have features that are not described in the
documentation that comes with the server. The documentation might be updated
occasionally to include information about those features, and technical updates
might be available to provide additional information that is not included in the
server documentation. To check for updates, go to
http://www.ibm.com/supportportal.
The compute node comes with a limited warranty. For information about the terms
of the warranty and getting service and assistance, see the Warranty Information
document for your compute node.
You can download the IBM ServerGuide Setup and Installation CD to help you
configure the hardware, install device drivers, and install the operating system.
For a list of supported optional devices for the server, see
http://www.ibm.com/systems/info/x86servers/serverproven/compat/us.
See the Rack Installation Instructions document on the IBM System x Documentation
CD for complete rack installation and removal instructions.
You can obtain up-to-date information about the server and other IBM server
products at http://www.ibm.com/systems/x. At http://www.ibm.com/supportportal,
you can create a personalized support page by identifying IBM
products that are of interest to you. From this personalized page, you can subscribe
to weekly email notifications about new technical documents, search for
information and downloads, and access various administrative services.
The compute node might have features that are not described in the
documentation that comes with the compute node. The documentation might be
updated occasionally to include information about those features. Technical
updates might also be available to provide additional information that is not
included in the compute node documentation. To obtain the most up-to-date
documentation for this product, go to
http://publib.boulder.ibm.com/infocenter/flexsys/information/index.jsp.
You can subscribe to information updates that are specific to your compute node at
http://www.ibm.com/support/mynotifications.
The model number and serial number are on the ID label on the bezel on the front
of the compute node, and on a label on the bottom of the compute node that is
visible when the compute node is not in the IBM NeXtScale n1200 Enclosure. If the
compute node comes with an RFID tag, the RFID tag covers the ID label on the
bezel on the front of the compute node, but you can open the RFID tag to see the
ID label behind it.
Note: The illustrations in this document might differ slightly from your hardware.
Figure 1. NeXtScale nx360 M4 compute node (callout: node serial number)
In addition, the system service label, which is on the cover of the server, provides a
QR code for mobile access to service information. You can scan the QR code with
a QR code reader on a mobile device to get quick access to the IBM Service
Information website. The website provides additional information, such as parts
installation and replacement videos and error codes for server support.
The following illustration shows the QR code (http://ibm.co/1hrOZP0):
Figure 2. QR code
The IBM Documentation CD
The IBM Documentation CD contains documentation for the server in Portable
Document Format (PDF) and includes the IBM Documentation Browser to help
you find information quickly.
Hardware and software requirements
This section lists the hardware and software requirements of the IBM Documentation CD.
The IBM Documentation CD requires the following minimum hardware and
software:
v Microsoft Windows or Red Hat Linux
v 100 MHz microprocessor
v 32 MB of RAM
v Adobe Acrobat Reader 3.0 (or later) or xpdf, which comes with Linux operating
systems
The Documentation Browser
Use the Documentation Browser to browse the contents of the CD, read brief
descriptions of the documents, and view documents, using Adobe Acrobat Reader
or xpdf.
The Documentation Browser automatically detects the regional settings in use in
your server and displays the documents in the language for that region (if
available). If a document is not available in the language for that region, the
English-language version is displayed. Use one of the following procedures to start
the Documentation Browser:
v If Autostart is enabled, insert the CD into the CD or DVD drive. The
Documentation Browser starts automatically.
v If Autostart is disabled or is not enabled for all users, use one of the following
procedures:
– If you are using a Windows operating system, insert the CD into the CD or
DVD drive and click Start > Run. In the Open field, type:
e:\win32.bat
where e is the drive letter of the CD or DVD drive, and click OK.
– If you are using Red Hat Linux, insert the CD into the CD or DVD drive;
then, run the following command from the /mnt/cdrom directory:
sh runlinux.sh
Select the server from the Product menu. The Available Topics list displays all the
documents for the server. Some documents might be in folders. A plus sign (+)
indicates each folder or document that has additional documents under it. Click
the plus sign to display the additional documents.
When you select a document, a description of the document is displayed under
Topic Description. To select more than one document, press and hold the Ctrl key
while you select the documents. Click View Book to view the selected document
or documents in Acrobat Reader or xpdf. If you selected more than one document,
all the selected documents are opened in Acrobat Reader or xpdf.
To search all the documents, type a word or word string in the Search field and
click Search. The documents in which the word or word string appears are listed
in order of the most occurrences. Click a document to view it, and press Ctrl+F to
use the Acrobat search function, or press Alt+F to use the xpdf search function
within the document.
Click Help for detailed information about using the Documentation Browser.
Related documentation
This Installation and Service Guide contains general information about the server,
including how to set up and cable the server, how to install supported optional
devices, how to configure the server, and how to solve problems yourself. It also
contains information for service technicians.
The following documentation also comes with the server:
v Warranty Information
This document is in printed format and comes with the server. It contains
warranty terms and a pointer to the IBM Statement of Limited Warranty on the
IBM website.
v Important Notices
This document is in printed format and comes with the server. It contains
information about the safety, environmental, and electronic emission notices for
your IBM product.
v Environmental Notices and User Guide
This document is in PDF format on the IBM Documentation CD. It contains
translated environmental notices.
v IBM License Agreement for Machine Code
This document is in PDF on the IBM Documentation CD. It provides translated
versions of the IBM License Agreement for Machine Code for your product.
v Licenses and Attributions Document
This document is in PDF on the IBM Documentation CD. It provides the open
source notices.
v Safety Information
This document is in PDF on the IBM Documentation CD. It contains translated
caution and danger statements. Each caution and danger statement that appears
in the documentation has a number that you can use to locate the corresponding
statement in your language in the Safety Information document.
Depending on the server model, additional documentation might be included on
the IBM Documentation CD.
The ToolsCenter for System x and BladeCenter is an online information center that
contains information about tools for updating, managing, and deploying firmware,
device drivers, and operating systems. The ToolsCenter for System x and
BladeCenter is at http://publib.boulder.ibm.com/infocenter/toolsctr/v1r0/.
The server might have features that are not described in the documentation that
you received with the server. The documentation might be updated occasionally to
include information about those features, or technical updates might be available
to provide additional information that is not included in the server documentation.
These updates are available from the IBM website. To check for updates, go to
http://www.ibm.com/supportportal.
Notices and statements in this document
The caution and danger statements in this document are also in the multilingual
Safety Information document, which is on the IBM Documentation CD. Each
statement is numbered for reference to the corresponding statement in your
language in the Safety Information document.
The following notices and statements are used in this document:
v Note: These notices provide important tips, guidance, or advice.
v Important: These notices provide information or advice that might help you
avoid inconvenient or problem situations.
v Attention: These notices indicate potential damage to programs, devices, or data.
An attention notice is placed just before the instruction or situation in which
damage might occur.
v Caution: These statements indicate situations that can be potentially hazardous
to you. A caution statement is placed just before the description of a potentially
hazardous procedure step or situation.
v Danger: These statements indicate situations that can be potentially lethal or
extremely hazardous to you. A danger statement is placed just before the
description of a potentially lethal or extremely hazardous procedure step or
situation.
Features and specifications
Use this information to view specific information about the compute node, such as
compute node hardware features and the dimensions of the compute node.
Notes:
1. Power, cooling, and chassis systems management are provided by the IBM
NeXtScale n1200 Enclosure chassis.
2. The operating system in the compute node must provide USB support for the
compute node to recognize and use USB media drives and devices. The IBM
NeXtScale n1200 Enclosure chassis uses USB for internal communication with
these devices.
The following table is a summary of the features and specifications of the
NeXtScale nx360 M4 compute node.
Microprocessor (depending on the model):
v Supports up to two multi-core microprocessors (one installed)
v Level-3 cache
v Two QuickPath Interconnect (QPI) links, with speeds of up to 8.0 GT per second
Note:
v Use the Setup utility to determine the type and speed of the
microprocessors in the server.
v For a list of supported microprocessors, see
http://www.ibm.com/systems/info/x86servers/serverproven/compat/us.
Memory:
v 8 dual inline memory module (DIMM) connectors
v Type: Low-profile (LP) double-data rate (DDR3) DRAM
v Supports 4 GB, 8 GB, and 16 GB DIMMs with up to 128 GB of total
memory on the system board
v Support for UDIMMs and RDIMMs (combining is not supported)
Integrated functions:
v Integrated Management Module II (IMM2), which consolidates multiple
management functions in a single chip.
v Concurrent COM/VGA/2x USB (KVM)
v System error LEDs
v Software RAID supportability for RAID level-0, RAID level-1, or RAID
level-10
v Hardware RAID supportability for RAID level-0, RAID level-1, RAID
level-5, or RAID level-10
v Wake on LAN (WOL)
Drive expansion bays (depending on the model):
Supports up to eight 3.5-inch SATA drives (if the storage tray is installed, up to 7
in the storage tray and 1 in the compute node), two 2.5-inch SATA/SAS drives, or
four 1.8-inch solid-state drives (see note 1).
Attention: As a general consideration, do not mix standard 512-byte and
advanced 4-KB format drives in the same RAID array because it might
lead to performance issues.
PCI expansion slots:
– Two PCI Express x16 (x16 mechanically) slots (PCIe3.0, full-height,
full-length)
Size:
v Compute node
– Height: 41 mm (1.6 in)
– Depth: 659 mm (25.9 in)
– Width: 216 mm (8.5 in)
– Weight estimation (based on the LFF HDD within the compute node): 6.05 kg (13.31 lb)
v Storage tray
– Height: 58.3 mm (2.3 in)
– Depth: 659 mm (25.9 in)
– Width: 216 mm (8.5 in)
– Weight estimation (with 7 hard disk drives installed): 8.64 kg (19 lb)
v GPU tray
– Height: 58.3 mm (2.3 in)
– Depth: 659 mm (25.9 in)
– Width: 216 mm (8.5 in)
– Weight estimation (with no GPU adapter installed): 3.33 kg (7.34 lb)
Environment:
The NeXtScale nx360 M4 compute node complies with ASHRAE class A3
specifications.
Server on (see note 2):
v Temperature: 5°C to 40°C (41°F to 104°F) (see note 3)
v Humidity, non-condensing: -12°C dew point (10.4°F) and 8% to 85%
relative humidity (see notes 4 and 5)
v Maximum dew point: 24°C (75°F)
v Maximum altitude: 3048 m (10,000 ft)
v Maximum rate of temperature change: 5°C/hr (9°F/hr) (see note 6)
Server off (see note 7):
v Temperature: 5°C to 45°C (41°F to 113°F)
v Relative humidity: 8% to 85%
v Maximum dew point: 27°C (80.6°F)
Storage (non-operating):
v Temperature: 1°C to 60°C (33.8°F to 140.0°F)
v Maximum altitude: 3,050 m (10,000 ft)
v Relative humidity: 5% to 80%
v Maximum dew point: 29°C (84.2°F)
Shipment (non-operating) (see note 8):
v Temperature: -40°C to 60°C (-40°F to 140.0°F)
v Maximum altitude: 10,700 m (35,105 ft)
v Relative humidity: 5% to 100% (see note 9)
v Maximum dew point: 29°C (84.2°F)
Particulate contamination
Attention:
v Design to ASHRAE Class A3, temperature: 36°C - 40°C (96.8°F - 104°F),
with relaxed support:
– Supports cloud-like workloads with no performance degradation
acceptable (turbo-off)
– Under no circumstance can any combination of the worst-case
workload and configuration result in system shutdown or design
exposure at 40°C
– The worst-case workload (such as linpack and turbo-on) may have
performance degradation
v Airborne particulates and reactive gases acting alone or in combination
with other environmental factors such as humidity or temperature might
pose a risk to the compute node. For information about the limits for
particulates and gases, see “Particulate contamination” on page 379.
Notes:
1. Onboard LSI software SATA RAID supports SATA drives and Solid state drives
(SSD). SAS drives are not supported for software RAID. The booting and use of
internal drives with VMware is not supported with the ServeRAID C100
(software RAID) controller.
2. Chassis is powered on.
3. A3 - Derate maximum allowable temperature 1°C/175 m above 950 m.
4. The minimum humidity level for class A3 is the higher (more moisture) of the
-12°C dew point and the 8% relative humidity. These intersect at approximately
25°C. Below this intersection (~25°C), the dew point (-12°C) represents the
minimum moisture level; above the intersection, relative humidity (8%) is the
minimum.
5. Moisture levels lower than 0.5°C DP, but not lower than -10°C DP or 8% relative
humidity, can be accepted if appropriate control measures are implemented to
limit the generation of static electricity on personnel and equipment in the data
center. All personnel and mobile furnishings and equipment must be connected
to ground via an appropriate static control system. The following items are
considered the minimum requirements:
Chapter 1. The IBM NeXtScale nx360 M4 Compute Node Type 54557
a. Conductive materials (conductive flooring, conductive footwear on all
personnel who go into the datacenter; all mobile furnishings and equipment
will be made of conductive or static dissipative materials).
b. During maintenance on any hardware, a properly functioning wrist strap
must be used by any personnel who contacts IT equipment.
6. 5°C/hr for data centers employing tape drives and 20°C/hr for data centers
employing disk drives.
7. Chassis is removed from original shipping container and is installed but not in
use, for example, during repair, maintenance, or upgrade.
8. The equipment acclimation period is 1 hour per 20°C of temperature change
from the shipping environment to the operating environment.
9. Condensation, but not rain, is acceptable.
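The arithmetic in Notes 3, 4, and 8 can be sketched as follows. This is an illustration only, not part of the product documentation. It assumes a 40°C ceiling below 950 m (the upper end of the Class A3 range quoted above) and uses the Magnus approximation for saturation vapor pressure, which this guide does not specify. All function names are hypothetical.

```python
import math

# Assumed ceiling below the derating altitude (upper end of the
# 36°C - 40°C Class A3 range quoted above).
A3_MAX_TEMP_C = 40.0
DERATE_START_M = 950.0      # Note 3: derating begins above 950 m
DERATE_M_PER_DEG = 175.0    # Note 3: 1°C for every 175 m


def max_allowable_temp_c(altitude_m: float) -> float:
    """Maximum allowable temperature after altitude derating (Note 3)."""
    excess_m = max(0.0, altitude_m - DERATE_START_M)
    return A3_MAX_TEMP_C - excess_m / DERATE_M_PER_DEG


def saturation_vapor_pressure_kpa(temp_c: float) -> float:
    """Magnus approximation over water (an assumption, not from this guide)."""
    return 0.61094 * math.exp(17.625 * temp_c / (temp_c + 243.04))


def min_relative_humidity_pct(temp_c: float) -> float:
    """Minimum relative humidity for Class A3 (Note 4).

    The limit is the moister of a -12°C dew point and 8% relative humidity.
    """
    rh_at_dew_point_limit = (
        100.0
        * saturation_vapor_pressure_kpa(-12.0)
        / saturation_vapor_pressure_kpa(temp_c)
    )
    return max(8.0, rh_at_dew_point_limit)


def acclimation_hours(shipping_temp_c: float, operating_temp_c: float) -> float:
    """Acclimation period: 1 hour per 20°C of temperature change (Note 8)."""
    return abs(operating_temp_c - shipping_temp_c) / 20.0
```

Under these assumptions, the -12°C dew point sets the humidity floor in a cool room and the 8% relative humidity limit sets it in a warm room, with the crossover slightly below 25°C, which agrees with Note 4.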
What your compute node offers
Your compute node offers features such as the integrated management module II,
hard disk drive support, systems-management support, microprocessor technology,
integrated network support, I/O expansion, large system-memory capacity, light
path diagnostics LEDs, PCI Express®, and power throttling.
v Features on Demand
If a Features on Demand feature is integrated in the compute node or in an
optional device that is installed in the compute node, you can purchase an
activation key to activate the feature. For information about Features on
Demand, see http://www.ibm.com/systems/x/fod/.
v Flexible network support
The compute node provides flexible network capabilities:
– Models with embedded Ethernet
The server comes with an integrated dual-port Intel Gigabit Ethernet
controller, which supports connection to a 10 Mbps, 100 Mbps, or 1000 Mbps
network.
– Models without embedded Ethernet
The compute node has connectors on the system board for optional expansion
adapters for adding network communication capabilities to the compute
node. This provides the flexibility to install expansion adapters that support a
variety of network communication technologies.
v Hard disk drive support
The compute node supports up to one 3.5-inch simple-swap SATA, two 2.5-inch
simple-swap SATA/SAS, or four 1.8-inch simple-swap solid-state drives. You can
implement RAID 0, RAID 1, RAID 5, or RAID 10 for the drives with hardware
RAID. 2.5-inch SATA and Solid state drives (SSD) support software RAID as
well.
v IBM ServerGuide Setup and Installation CD
The ServerGuide Setup and Installation CD, which you can download from the
web, provides programs to help you set up the server and install a Windows
operating system. The ServerGuide program detects installed optional hardware
devices and provides the correct configuration programs and device drivers. For
more information about the ServerGuide Setup and Installation CD, see “Using the
ServerGuide Setup and Installation CD” on page 24.
v Integrated management module II (IMM2)
The integrated management module II (IMM2) combines service processor
functions, video controller, and remote presence and blue-screen capture features
in a single chip. The IMM provides advanced service-processor control,
monitoring, and alerting function. If an environmental condition exceeds a
threshold or if a system component fails, the IMM lights LEDs to help you
diagnose the problem, records the error in the IMM event log, and alerts you to
the problem. Optionally, the IMM also provides a virtual presence capability for
remote server management capabilities. The IMM provides remote server
management through the following industry-standard interfaces:
– Intelligent Platform Management Interface (IPMI) version 2.0
– Simple Network Management Protocol (SNMP) version 3.0
– Common Information Model (CIM)
– Web browser
For additional information, see “Using the integrated management module” on
page 33 and the Integrated Management Module II User’s Guide at
http://www.ibm.com/supportportal.
v Large system-memory capacity
The compute node supports up to 128 GB of system memory. The memory
controller provides support for up to 8 industry-standard registered ECC DDR3
on low-profile (LP) DIMMs on the system board. For the most current list of
supported DIMMs, see
http://www.ibm.com/systems/info/x86servers/serverproven/compat/us.
v Light path diagnostics
Light path diagnostics provides LEDs to help you diagnose problems. For more
information about light path diagnostics and the LEDs, see “Compute node
controls, connectors, and LEDs” on page 13.
v Microprocessor technology
The compute node supports up to two multi-core Intel Xeon microprocessors.
For more information about supported microprocessors and their part numbers,
see http://www.ibm.com/systems/info/x86servers/serverproven/compat/us.
Note: The optional microprocessors that IBM supports are limited by the
capacity and capability of the compute node. Any microprocessor that you
install must have the same specifications as the microprocessor that came with
the compute node.
v Mobile access to IBM Service Information website
The server provides a QR code on the system service label, which is on the
cover of the server, that you can scan using a QR code reader and scanner with
a mobile device to get quick access to the IBM Service Information website. The
IBM Service Information website provides additional information for parts
installation and replacement videos, and error codes for server support. For the
QR code, see Chapter 1, “The IBM NeXtScale nx360 M4 Compute Node Type
5455,” on page 1.
v PCI Express
PCI Express is a serial interface that is used for chip-to-chip interconnect and
expansion adapter interconnect. You can add optional I/O and storage devices.
Optional expansion nodes are available to provide a cost-effective way for you
to increase and customize the capabilities of the compute node. Expansion nodes
support a wide variety of industry-standard PCI Express, network, storage, and
graphics adapters. For additional information, see .
v Power throttling
By enforcing a power policy known as power-domain oversubscription, the IBM
NeXtScale n1200 Enclosure can share the power load between twelve power
supplies to ensure sufficient power for each device in the IBM NeXtScale n1200
Enclosure. This policy is enforced when the initial power is applied to the IBM
NeXtScale n1200 Enclosure or when a compute node is inserted into the IBM
NeXtScale n1200 Enclosure.
The following settings for this policy are available:
– Basic power management
– Power module redundancy
– Power module redundancy with compute node throttling allowed
Reliability, availability, and serviceability features
Three of the most important features in compute node design are reliability,
availability, and serviceability (RAS). These RAS features help to ensure the
integrity of the data that is stored in the compute node, the availability of the
compute node when you need it, and the ease with which you can diagnose and
correct problems.
The compute node has the following RAS features:
v Advanced Configuration and Power Interface (ACPI)
v Automatic server restart (ASR)
v Built-in diagnostics using DSA Preboot
v Built-in monitoring for temperature, voltage, and hard disk drives
v Customer support center 24 hours per day, 7 days a week (see footnote 1)
v Customer upgrade of flash ROM-resident code and diagnostics
v Customer-upgradeable Unified Extensible Firmware Interface (UEFI) code and
diagnostics
v ECC protected DDR3 DIMMs
v ECC protection on the L2 cache
v Error codes and messages
v Integrated management module II (IMM2)
v Light path diagnostics
v Memory parity testing
v Microprocessor built-in self-test (BIST) during power-on self-test (POST)
v Microprocessor serial number access
v Processor presence detection
v ROM-resident diagnostics
v System-error logging
v Vital product data (VPD) on memory
v Wake on LAN capability
v Wake on PCI (PME) capability
Major components of the compute node
Use this information to locate the major components on the compute node.
The following illustration shows the major components of the compute node.
1. Service availability varies by country. Response time varies depending on the number and nature of incoming calls.
Illustration labels: Left air baffle, Battery holder, Heatsink filler, Heatsink,
Cover, 1.8-inch solid-state drive cage, 1.8-inch solid-state drive backplate,
2.5-inch hard disk drive backplate, 2.5-inch hard disk drive cage, 3.5-inch hard
disk drive cage, Microprocessor, 1.8-inch solid-state drive, 2.5-inch hard disk
drive, 3.5-inch hard disk drive
Figure 3. Major components of the compute node
Major components of the storage tray
Use this information to locate the major components on the storage tray.
The storage tray is installed on the top of a compute node. Each storage tray
supports up to seven 3.5-inch LFF SATA hard disk drives.
The ServeRAID adapter connects to the compute node through the PCIe interface
and supports RAID level-0, RAID level-1, RAID level-5, or RAID level-10.
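As a rough illustration of how the RAID level affects usable space on a storage tray, the following sketch applies the textbook capacity rules for RAID 0, 1, 5, and 10. The drive-count minimums are the generic ones and are an assumption here; a specific ServeRAID adapter might impose different limits. The function name and the drive size used in the example are hypothetical.

```python
def usable_capacity_gb(raid_level: int, drive_count: int, drive_size_gb: float) -> float:
    """Return usable capacity for equally sized drives under textbook RAID rules."""
    if raid_level == 0:
        # Striping only: every drive contributes its full capacity.
        if drive_count < 1:
            raise ValueError("RAID 0 needs at least one drive")
        return drive_count * drive_size_gb
    if raid_level == 1:
        # Two-drive mirror: capacity of a single drive.
        if drive_count != 2:
            raise ValueError("this sketch models RAID 1 as a two-drive mirror")
        return drive_size_gb
    if raid_level == 5:
        # One drive's worth of capacity is used for parity.
        if drive_count < 3:
            raise ValueError("RAID 5 needs at least three drives")
        return (drive_count - 1) * drive_size_gb
    if raid_level == 10:
        # Striped mirrors: half of the drives hold mirror copies.
        if drive_count < 4 or drive_count % 2 != 0:
            raise ValueError("RAID 10 needs an even number of drives, at least four")
        return (drive_count // 2) * drive_size_gb
    raise ValueError(f"unsupported RAID level: {raid_level}")


if __name__ == "__main__":
    # Hypothetical example: seven 1000 GB drives in a RAID 5 array.
    print(usable_capacity_gb(5, 7, 1000.0))
```

With seven drives, RAID 10 can use only six of them, because mirrored pairs require an even drive count.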
The following illustration shows the major components of the storage tray.
Illustration labels as extracted: DIMM, Right air baffle, PCI riser cage;
numbered callouts 0 through 7
Figure 4. Major components of the storage tray
Major components of the GPU tray
Use this information to locate the major components on the GPU tray.
The GPU tray is installed on the top of a compute node. Each GPU tray supports
up to two full-height, full-length graphics processing unit (GPU) enclosures.
The following illustration shows the major components of the GPU tray.
Illustration labels: Front PCI riser assembly, Air baffle, Rear PCI riser
assembly, Power paddle card
Figure 5. Major components of the GPU tray
Power, controls, and indicators
Use this information to view power features, turn on and turn off the compute
node, and view the functions of the controls and indicators.
Compute node controls, connectors, and LEDs
Use this information for details about the controls, connectors, and LEDs.
The following illustration identifies the buttons, connectors, and LEDs on the
control panel.
Illustration labels: Power-on LED/power button, Check log LED, Locator LED,
System-error LED, Dual-port network adapter (optional), Pull out tag, KVM
connector, Ethernet 1 connector (shared management port), Ethernet link
activity/status LED, Ethernet 2 connector, Ethernet connection speed LED,
Management connector (dedicated management port)
Figure 6. Compute node control panel buttons, connectors, and LEDs
Power button/LED
When the compute node is connected to power through the IBM NeXtScale
n1200 Enclosure, press this button to turn on or turn off the compute node.
This button is also the power LED. This green LED indicates the power
status of the compute node:
v Flashing rapidly: The LED flashes rapidly for the following reasons:
– The compute node has been installed in a chassis. When you install
the compute node, the LED flashes rapidly for up to 90 seconds while
the integrated management module II (IMM2) in the compute node is
initializing.
– The IBM NeXtScale n1200 Enclosure does not have enough power to
turn on the compute node.
– The IMM2 in the compute node is not communicating with the
Chassis Management Module.
v Flashing slowly: The compute node is connected to power through the
IBM NeXtScale n1200 Enclosure and is ready to be turned on.
v Lit continuously: The compute node is connected to power through the
IBM NeXtScale n1200 Enclosure and is turned on.
When the compute node is on, pressing this button causes an orderly
shutdown of the compute node so that it can be removed safely from the
chassis. This includes shutting down the operating system (if possible) and
removing power from the compute node.
If an operating system is running, you might have to press the button for
approximately 4 seconds to initiate the shutdown.
Attention: Pressing the button for 4 seconds forces the operating system
to shut down immediately. Data loss is possible.
Locator LED
The system administrator can remotely light this blue LED to aid in
visually locating the compute node.
Check log LED
When this yellow LED is lit, it indicates that a system error has occurred.
Check the “Event logs” on page 55 for additional information.
System error LED
When this yellow LED is lit, it indicates that a system error has occurred.
A system-error LED is also on the rear of the server. An LED on the light
path diagnostics panel on the operator information panel or on the system
board is also lit to help isolate the error. This LED is controlled by the
IMM.
KVM connector
Connect the console breakout cable to this connector (see “Console
breakout cable” on page 15 for more information).
Note: It is best practice to connect the console breakout cable to only one
compute node at a time in each IBM NeXtScale n1200 Enclosure.
Ethernet connectors
Use either of these connectors to connect the server to a network. When
you enable shared Ethernet for IMM2 in the Setup utility, you can access
the IMM2 using either the Ethernet 1 or the system-management Ethernet
(default) connector. See “Using the Setup utility” on page 25 for more information.
Ethernet link activity/status LED
When any of these LEDs is lit, it indicates that the server is transmitting
to or receiving signals from the Ethernet LAN that is connected to the
Ethernet port that corresponds to that LED.
Management connector
Use this connector to connect the server to a network for full
systems-management information control. This connector is used only by
the Integrated Management Module II (IMM2). A dedicated management
network provides additional security by physically separating the
management network traffic from the production network. You can use the
Setup utility to configure the server to use a dedicated systems
management network or a shared network.
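The power LED states described under “Power button/LED” can be summarized as a lookup table. The sketch below is an illustration for monitoring scripts or operator checklists. The class and function names are hypothetical, and the state descriptions are condensed from the text above.

```python
from enum import Enum


class PowerLed(Enum):
    FLASHING_RAPIDLY = "flashing rapidly"
    FLASHING_SLOWLY = "flashing slowly"
    LIT_CONTINUOUSLY = "lit continuously"


# Possible meanings of each green power LED state, as described in this section.
POWER_LED_MEANING = {
    PowerLed.FLASHING_RAPIDLY: [
        "IMM2 is initializing after installation (up to 90 seconds)",
        "Enclosure does not have enough power to turn on the compute node",
        "IMM2 is not communicating with the Chassis Management Module",
    ],
    PowerLed.FLASHING_SLOWLY: [
        "Connected to power through the enclosure and ready to be turned on",
    ],
    PowerLed.LIT_CONTINUOUSLY: [
        "Connected to power through the enclosure and turned on",
    ],
}


def power_button_ready(state: PowerLed) -> bool:
    """The compute node can be turned on only while the LED flashes slowly."""
    return state is PowerLed.FLASHING_SLOWLY


if __name__ == "__main__":
    for state, meanings in POWER_LED_MEANING.items():
        print(f"{state.value}: {'; '.join(meanings)}")
```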
Console breakout cable
Use this information for details about the console breakout cable.
Use the console breakout cable to connect external I/O devices to the compute
node. The console breakout cable connects through the KVM connector (see
“Compute node controls, connectors, and LEDs” on page 13). The console breakout
cable has connectors for a display device (video), two USB connectors for a USB
keyboard and mouse, and a serial interface connector.
The following illustration identifies the connectors and components on the console
breakout cable.
Illustration labels: Serial connector, USB ports (2), VGA connector, Captive
screws, KVM connector
Figure 7. Console breakout cable
Note: When you install the KVM cable, gently press down on the pull out tag
so that it does not interfere with the KVM cable.
Turning on the compute node
Use this information for details about turning on the compute node.
After you connect the compute node to power through the IBM NeXtScale n1200
Enclosure, the compute node can be started in any of the following ways:
v You can press the power button on the front of the compute node (see
“Compute node controls, connectors, and LEDs” on page 13) to start the
compute node. The power button works only if local power control is enabled
for the compute node.
Notes:
1. Wait until the power LED on the compute node flashes slowly before you
press the power button. While the IMM2 in the compute node is initializing
and synchronizing with the Chassis Management Module, the power LED
flashes rapidly, and the power button on the compute node does not
respond. This process can take approximately 90 seconds after the compute
node has been installed.
2. While the compute node is starting, the power LED on the front of the
compute node is lit and does not flash. See “Compute node controls,
connectors, and LEDs” on page 13 for the power LED states.
v You can turn on the compute node through the Wake on LAN feature. The
compute node must be connected to power (the power LED is flashing slowly)
and must be communicating with the Chassis Management Module. The
operating system must support the Wake on LAN feature, and the Wake on
LAN feature must be enabled through the Chassis Management Module web
interface.
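This guide does not describe how a Wake on LAN request is generated. As background, the widely used magic-packet format consists of six 0xFF bytes followed by the target MAC address repeated 16 times, usually sent as a UDP broadcast. The following sketch builds and sends such a packet. The MAC address, broadcast address, and port are placeholder assumptions, and the function names are hypothetical.

```python
import socket


def build_magic_packet(mac: str) -> bytes:
    """Build a Wake on LAN magic packet for a MAC such as 00:11:22:33:44:55."""
    hex_digits = mac.replace(":", "").replace("-", "")
    if len(hex_digits) != 12:
        raise ValueError("MAC address must contain 12 hexadecimal digits")
    mac_bytes = bytes.fromhex(hex_digits)  # raises ValueError on non-hex input
    # Six 0xFF bytes, then the MAC address repeated 16 times (102 bytes total).
    return b"\xff" * 6 + mac_bytes * 16


def send_magic_packet(mac: str, broadcast: str = "255.255.255.255", port: int = 9) -> None:
    """Send the magic packet as a UDP broadcast (port 9 is a common convention)."""
    packet = build_magic_packet(mac)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        sock.sendto(packet, (broadcast, port))
```

The packet wakes the node only if the conditions listed above are met, including Wake on LAN being enabled through the Chassis Management Module web interface.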
Turning off the compute node
Use this information for details about turning off the compute node.
When you turn off the compute node, it is still connected to power through the
IBM NeXtScale n1200 Enclosure. The compute node can respond to requests from
the IMM2, such as a remote request to turn on the compute node. To remove all
power from the compute node, you must remove it from the IBM NeXtScale n1200
Enclosure.
Before you turn off the compute node, shut down the operating system. See the
operating-system documentation for information about shutting down the
operating system.
The compute node can be turned off in any of the following ways:
v You can press the power button on the compute node (see “Compute node
controls, connectors, and LEDs” on page 13). This starts an orderly shutdown of
the operating system, if this feature is supported by the operating system.
v If the operating system stops functioning, you can press and hold the power
button for more than 4 seconds to turn off the compute node.
Attention: Pressing the power button for 4 seconds forces the operating system
to shut down immediately. Data loss is possible.
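Because the IMM2 supports IPMI version 2.0, power state can also be queried or changed remotely with a generic IPMI client. The sketch below assembles a command line for the open-source ipmitool utility, which is not documented in this guide. It assumes that IPMI over LAN is enabled on the IMM2 and that the password is supplied through the IPMI_PASSWORD environment variable (the ipmitool -E option). The host address and user name are placeholders, and the function names are hypothetical.

```python
import subprocess

# ipmitool "chassis power" actions used in this sketch:
#   status - report the current power state
#   on     - turn the node on
#   soft   - request an orderly operating-system shutdown (ACPI)
#   off    - remove power immediately; data loss is possible
ALLOWED_ACTIONS = ("status", "on", "soft", "off")


def ipmi_power_command(host: str, user: str, action: str) -> list:
    """Build the ipmitool argument list for a chassis power action."""
    if action not in ALLOWED_ACTIONS:
        raise ValueError(f"unsupported power action: {action}")
    return [
        "ipmitool", "-I", "lanplus",  # IPMI 2.0 over LAN
        "-H", host,
        "-U", user,
        "-E",                         # read password from IPMI_PASSWORD
        "chassis", "power", action,
    ]


def run_power_action(host: str, user: str, action: str) -> str:
    """Run the command and return its output (requires ipmitool to be installed)."""
    result = subprocess.run(
        ipmi_power_command(host, user, action),
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()
```

The "soft" action corresponds to the orderly shutdown started by a short press of the power button, and "off" corresponds to the forced shutdown described in the Attention notice above.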
System-board layouts
Use this information to locate the connectors, LEDs, jumpers, and switches on the
system board.
System-board internal connectors
The following illustrations show the internal connectors on the system board.
Illustration labels: PCI riser connector 2, DIMM 7, DIMM 8, Power distribution
board connector, DIMM 3, DIMM 4, DIMM 2, DIMM 1, Microprocessor 1, DIMM 6,
DIMM 5, Microprocessor 2, Operator information panel, 10 Gb Ethernet card
connector, USB hypervisor key, 3V lithium battery, SATA connector, LED signal
connector, PCI riser connector 1
Figure 8. Internal connectors on system board
System-board external connectors
The following illustration shows the external connectors on the system board.
Illustration labels: KVM connector, Ethernet 1 connector, Ethernet 2 connector,
Management connector
Figure 9. External connectors on system board
System-board switches and jumpers
The following illustration shows the location and description of the switches and
jumpers.
Illustration labels: Lightpath button, UEFI boot recovery jumper, Clear CMOS
jumper, NMI button; pin callouts 1, 2, 3 appear twice
Figure 10. Location and description of switches and jumpers
Note: If there is a clear protective sticker on the top of the switch blocks, you must
remove and discard it to access the switches.
Note:
1. Before you change any switch settings or move any jumpers, turn off the
server. Review the information in “Safety” on page vii, “Installation guidelines”
on page 93, “Handling static-sensitive devices” on page 95, and “Turning off
the compute node” on page 16.
2. Any system-board switch or jumper block that is not shown in the illustrations
in this document is reserved.
System-board LEDs and controls
The following illustration shows the light-emitting diodes (LEDs) on the system
board.
Any error LED can be lit after ac power has been removed from the system-board
tray so that you can isolate a problem. After ac power has been removed from the
system-board tray, power remains available to these LEDs for up to 90 seconds. To
view the error LEDs, press and hold the light path button on the system board to
light the error LEDs. The error LEDs that were lit while the system-board tray was
running will be lit again while the button is pressed.
The following illustration shows the LEDs and controls on the system board.
Illustration labels: Microprocessor mismatch LED, HDD 0-3 error LEDs, DIMM 4-3
error LEDs, Microprocessor 1 error LED, DIMM 8-7 error LEDs, DIMM 2-1 error
LEDs, DIMM 6-5 error LEDs, System board error LED, Lightpath LED,
Microprocessor 2 error LED, Battery error LED, Slot 1 error LED, Ethernet card
error LED, RTMM heartbeat LED, IMM heartbeat LED
Figure 11. LEDs and controls on system board
Chapter 2. Configuration information and instructions
This chapter provides information about updating the firmware and using the
configuration utilities.
Updating the firmware
Use this information to update the system firmware.
Important:
1. Some cluster solutions require specific code levels or coordinated code updates.
If the device is part of a cluster solution, verify that the latest level of code is
supported for the cluster solution before you update the code.
2. Before you update the firmware, be sure to back up any data that is stored in
the Trusted Platform Module (TPM), in case any of the TPM characteristics are
changed by the new firmware. For instructions, see your encryption software
documentation.
3. Installing the wrong firmware or device-driver update might cause the server
to malfunction. Before you install a firmware or device-driver update, read any
readme and change history files that are provided with the downloaded
update. These files contain important information about the update and the
procedure for installing the update, including any special procedure for
updating from an early firmware or device-driver version to the latest version.
You can install code updates that are packaged as an UpdateXpress System Pack or
UpdateXpress CD image. An UpdateXpress System Pack contains an
integration-tested bundle of online firmware and device-driver updates for your
server. Use UpdateXpress System Pack Installer to acquire and apply UpdateXpress
System Packs and individual firmware and device-driver updates. For additional
information and to download the UpdateXpress System Pack Installer, go to the
ToolsCenter for System x and BladeCenter at
http://www.ibm.com/support/entry/portal/docdisplay?lndocid=TOOL-CENTER and
click UpdateXpress System Pack Installer.
When you click an update, an information page is displayed, including a list of the
problems that the update fixes. Review this list for your specific problem; however,
even if your problem is not listed, installing the update might solve the problem.
Be sure to separately install any listed critical updates that have release dates that
are later than the release date of the UpdateXpress System Pack or UpdateXpress
image.
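The rule about critical updates can be expressed as a simple date comparison. The sketch below is illustrative only. The record layout, the update names, and the dates are hypothetical and do not come from any IBM tool or catalog.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Update:
    name: str
    released: date
    critical: bool


def updates_to_install_separately(pack_release: date, updates: list) -> list:
    """Return critical updates released after the UpdateXpress System Pack."""
    return [u for u in updates if u.critical and u.released > pack_release]


if __name__ == "__main__":
    # Hypothetical data for illustration.
    pack_release = date(2014, 3, 1)
    available = [
        Update("uefi-example", date(2014, 4, 15), critical=True),
        Update("imm2-example", date(2014, 2, 10), critical=True),
        Update("nic-example", date(2014, 5, 1), critical=False),
    ]
    for update in updates_to_install_separately(pack_release, available):
        print(update.name)
```

In this example only the first entry qualifies: the second predates the pack, and the third is not marked critical.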
The firmware for the server is periodically updated and is available for download
on the IBM website. To check for the latest level of firmware, such as the UEFI
firmware, device drivers, and integrated management module (IMM) firmware, go
to http://www.ibm.com/support/fixcentral.
Download the latest firmware for the server; then, install the firmware, using the
instructions that are included with the downloaded files.
When you replace a device in the server, you might have to update the firmware
that is stored in memory on the device or restore the pre-existing firmware from a
CD or DVD image.
The following list indicates where the firmware is stored:
v UEFI firmware is stored in ROM on the system board.
v IMM2 firmware is stored in ROM on the system board.
v Ethernet firmware is stored in ROM on the Ethernet controller and on the
system board.
v ServeRAID firmware is stored in ROM on the system board and the RAID
adapter (if one is installed).
v SAS/SATA firmware is stored in ROM on the SAS/SATA controller on the
system board.
Configuring the server
The following configuration programs come with the server:
v Setup utility
The Setup utility is part of the UEFI firmware. Use it to perform configuration
tasks such as changing interrupt request (IRQ) settings, changing the
startup-device sequence, setting the date and time, and setting passwords. For
information about using this program, see “Using the Setup utility” on page 25.
v Boot Manager program
The Boot Manager is part of the UEFI firmware. Use it to override the startup
sequence that is set in the Setup utility and temporarily assign a device to be
first in the startup sequence. For more information about using this program, see
“Using the Boot Manager” on page 32.
v IBM ServerGuide Setup and Installation CD
The ServerGuide program provides software-setup tools and installation tools
that are designed for the server. Use this CD during the installation of the server
to configure basic hardware features, such as an integrated SAS/SATA controller
with RAID capabilities, and to simplify the installation of your operating system.
For information about using this CD, see “Using the ServerGuide Setup and
Installation CD” on page 24.
v Integrated management module
Use the integrated management module II (IMM2) for configuration, to update
the firmware and sensor data record/field replaceable unit (SDR/FRU) data, and
to remotely manage a network. For information about using the IMM, see
“Using the integrated management module” on page 33 and the Integrated
Management Module II User's Guide at
http://www-947.ibm.com/support/entry/portal/docdisplay?lndocid=migr-5086346.
v VMware ESXi embedded hypervisor
An optional USB flash device with VMware ESXi embedded hypervisor software
is available for purchase. Hypervisor is virtualization software that enables
multiple operating systems to run on a host system at the same time. The USB
embedded hypervisor flash device can be installed in USB connectors 3 and 4 on
the system board. For more information about using the embedded hypervisor,
see “Using the embedded hypervisor” on page 36.
v Remote presence capability and blue-screen capture
The remote presence and blue-screen capture features are integrated functions of
the integrated management module (IMM2). The remote presence feature
provides the following functions:
– Remotely viewing video with graphics resolutions up to 1600 x 1200 at 75 Hz,
regardless of the system state
– Remotely accessing the server, using the keyboard and mouse from a remote
client
– Mapping the CD or DVD drive, diskette drive, and USB flash drive on a
remote client, and mapping ISO and diskette image files as virtual drives that
are available for use by the server
– Uploading a diskette image to the IMM memory and mapping it to the server
as a virtual drive
The blue-screen capture feature captures the video display contents before the
IMM restarts the server when the IMM detects an operating-system hang
condition. A system administrator can use the blue-screen capture feature to
assist in determining the cause of the hang condition. For more information, see
“Using the remote presence and blue-screen capture features” on page 34.
v Ethernet controller configuration
For information about configuring the Ethernet controller, see “Configuring the
Ethernet controller” on page 37.
v Features on Demand software Ethernet software
The server provides Features on Demand software Ethernet support. You can
purchase a Features on Demand software upgrade key for Fibre Channel over
Ethernet (FCoE) and iSCSI storage protocols. For more information, see
“Enabling Features on Demand Ethernet software” on page 37.
v Features on Demand software RAID software
The server provides Features on Demand software RAID support. You can
purchase a Features on Demand software upgrade key for RAID. For more
information, see “Enabling Features on Demand RAID software” on page 37.
v IBM Advanced Settings Utility (ASU) program
Use this program as an alternative to the Setup utility for modifying UEFI
settings and IMM settings. Use the ASU program online or out of band to
modify UEFI settings from the command line without the need to restart the
server to run the Setup utility. For more information about using this program,
see “IBM Advanced Settings Utility program” on page 38.
v Configuring RAID arrays
For information about configuring RAID arrays, see “Configuring RAID arrays”
on page 38.
The following table lists the different server configurations and the applications
that are available for configuring and managing RAID arrays.
Table 1. Server configuration and applications for configuring and managing RAID arrays
Server configuration | RAID array configuration (before operating system is installed) | RAID array management (after operating system is installed)
... | ... | MegaRAID Storage Manager (MSM), MegaCLI (Command Line Interface), and IBM Director
ServeRAID-C100 | HII | MegaRAID Storage Manager (MSM), MegaCLI, and IBM Director
Notes:
1. For more information about the Human Interface Infrastructure (HII) and
SAS2IRCU, go to
http://www-947.ibm.com/support/entry/portal/docdisplay?lndocid=MIGR-5088601.
2. For more information about the MegaRAID, go to http://www-
Using the ServerGuide Setup and Installation CD
Use this information as an overview for using the ServerGuide Setup and
Installation CD.
The ServerGuide Setup and Installation CD provides software setup tools and
installation tools that are designed for your server. The ServerGuide program
detects the server model and optional hardware devices that are installed and uses
that information during setup to configure the hardware. The ServerGuide
simplifies the operating-system installations by providing updated device drivers
and, in some cases, installing them automatically.
You can download a free image of the ServerGuide Setup and Installation CD from
http://www.ibm.com/support/entry/portal/docdisplay?lndocid=SERV-GUIDE.
In addition to the ServerGuide Setup and Installation CD, you must have your
operating-system CD to install the operating system.
ServerGuide features
This information provides an overview of the ServerGuide features.
Features and functions can vary slightly with different versions of the ServerGuide
program. To learn more about the version that you have, start the ServerGuide
Setup and Installation CD and view the online overview. Not all features are
supported on all server models.
The ServerGuide program has the following features:
v An easy-to-use interface
v Diskette-free setup, and configuration programs that are based on detected
hardware
v Device drivers that are provided for the server model and detected hardware
v Operating-system partition size and file-system type that are selectable during
setup
The ServerGuide program performs the following tasks:
v Sets system date and time
v Detects installed hardware options and provides updated device drivers for
most adapters and devices
v Provides diskette-free installation for supported Windows operating systems
v Includes an online readme file with links to tips for your hardware and
operating-system installation
Setup and configuration overview
Use this information for the ServerGuide setup and configuration.
When you use the ServerGuide Setup and Installation CD, you do not need setup
diskettes. You can use the CD to configure any supported IBM server model. The
setup program provides a list of tasks that are required to set up your server
model. On a server with a ServeRAID adapter or SAS/SATA controller with RAID
capabilities, you can run the SAS/SATA RAID configuration program to create
logical drives.
Note: Features and functions can vary slightly with different versions of the
ServerGuide program.
Typical operating-system installation
This section details the ServerGuide typical operating-system installation.
The ServerGuide program can reduce the time it takes to install an operating
system. It provides the device drivers that are required for your hardware and for
the operating system that you are installing. This section describes a typical
ServerGuide operating-system installation.
Note: Features and functions can vary slightly with different versions of the
ServerGuide program.
1. After you have completed the setup process, the operating-system installation
program starts. (You will need your operating-system CD to complete the
installation.)
2. The ServerGuide program stores information about the server model, service
processor, hard disk drive controllers, and network adapters. Then, the
program checks the CD for newer device drivers. This information is stored
and then passed to the operating-system installation program.
3. The ServerGuide program presents operating-system partition options that are
based on your operating-system selection and the installed hard disk drives.
4. The ServerGuide program prompts you to insert your operating-system CD
and restart the server. At this point, the installation program for the operating
system takes control to complete the installation.
Installing your operating system without using ServerGuide
Use this information to install the operating system on the server without using
ServerGuide.
If you have already configured the server hardware and you are not using the
ServerGuide program to install your operating system, you can download
operating-system installation instructions for the server from
http://www.ibm.com/supportportal.
Using the Setup utility
Use these instructions to start the Setup utility.
Use the Unified Extensible Firmware Interface (UEFI) Setup Utility program to
perform the following tasks:
v View configuration information
Chapter 2. Configuration information and instructions25
v View and change assignments for devices and I/O ports
v Set the date and time
v Set and change passwords
v Set the startup characteristics of the server and the order of startup devices
v Set and change settings for advanced hardware features
v View, set, and change settings for power-management features
v View and clear error logs
v Change interrupt request (IRQ) settings
v Resolve configuration conflicts
Starting the Setup utility
Use this information to start up the Setup utility.
To start the Setup utility, complete the following steps:
1. Turn on the server.
Note: Approximately 5 to 10 seconds after the server is connected to power,
the power-control button becomes active.
2. When the prompt <F1> Setup is displayed, press F1. If you have set an
administrator password, you must type the administrator password to access
the full Setup utility menu. If you do not type the administrator password, a
limited Setup utility menu is available.
3. Select settings to view or change.
Setup utility menu choices
Use the Setup utility main menu to view and configure server configuration data
and settings.
The following choices are on the Setup utility main menu for the UEFI. Depending
on the version of the firmware, some menu choices might differ slightly from these
descriptions.
v System Information
Select this choice to view information about the server. When you make changes
through other choices in the Setup utility, some of those changes are reflected in
the system information; you cannot change settings directly in the system
information. This choice is on the full Setup utility menu only.
– System Summary
Select this choice to view configuration information, including the ID, speed,
and cache size of the microprocessors, machine type and model of the server,
the serial number, the system UUID, and the amount of installed memory.
When you make configuration changes through other options in the Setup
utility, the changes are reflected in the system summary; you cannot change
settings directly in the system summary.
– Product Data
Select this choice to view the system-board identifier, the revision level or
issue date of the firmware, the integrated management module and
diagnostics code, and the version and date.
This choice is on the full Setup utility menu only.
v System Settings
Select this choice to view or change the server component settings.
– Adapters and UEFI Drivers
Select this choice to view information about the UEFI 1.10 and UEFI 2.0
compliant adapters and drivers installed in the server.
– Processors
Select this choice to view or change the processor settings.
– Memory
Select this choice to view or change the memory settings.
– Devices and I/O Ports
Select this choice to view or change assignments for devices and
input/output (I/O) ports. You can configure the serial ports, configure remote
console redirection, enable or disable integrated Ethernet controllers, the
SAS/SATA controllers, SATA optical drive channels, PCI slots, and video
controller. If you disable a device, it cannot be configured, and the operating
system will not be able to detect it (this is equivalent to disconnecting the
device).
– Power
Select this choice to view or change power capping to control consumption,
processors, and performance states.
– Operating Modes
Select this choice to view or change the operating profile (performance and
power utilization).
– Legacy Support
Select this choice to view or set legacy support.
- Force Legacy Video on Boot
Select this choice to force INT video support, if the operating system does
not support UEFI video output standards.
- Rehook INT 19h
Select this choice to enable or disable devices from taking control of the
boot process. The default is Disable.
- Legacy Thunk Support
Select this choice to enable or disable UEFI to interact with PCI mass
storage devices that are non-UEFI compliant. The default is Enable.
- Infinite Boot Retry
Select this choice to enable or disable UEFI to infinitely retry the legacy
boot order. The default is Disable.
- BBS Boot
Select this choice to enable or disable legacy boot in BBS manner. The
default is Enable.
– System Security
Select this choice to view or configure Trusted Platform Module (TPM)
support.
– Integrated Management Module
Select this choice to view or change the settings for the integrated
management module.
- Power Restore Policy
Select this choice to set the mode of operation after power is lost.
- Commands on USB Interface
Select this choice to enable or disable the Ethernet over USB interface on
IMM. The default is Enable.
- Network Configuration
Select this choice to view the system management network interface port,
the IMM MAC address, the current IMM IP address, and host name; define
the static IMM IP address, subnet mask, and gateway address, specify
whether to use the static IP address or have DHCP assign the IMM2 IP
address, save the network changes, and reset the IMM.
- Reset IMM to Defaults
Select this choice to view or reset IMM to the default settings.
- Reset IMM
Select this choice to reset IMM.
– Recovery
Select this choice to view or change the system recovery parameters.
- POST Attempts
Select this choice to view or change the number of attempts to POST.
v POST Attempts Limit
Select this choice to view or change the Nx boot failure parameters.
- System Recovery
Select this choice to view or change system recovery settings.
v POST Watchdog Timer
Select this choice to view or enable the POST watchdog timer.
v POST Watchdog Timer Value
Select this choice to view or set the POST loader watchdog timer value.
v Reboot System on NMI
Select this choice to enable or disable restarting the system whenever a
nonmaskable interrupt (NMI) occurs. Enable is the default.
v Halt on Severe Error
Select this choice to enable or disable halting on a severe error. When this
choice is enabled, the system does not boot into the operating system and
displays the POST event viewer whenever a severe error is detected.
Disable is the default.
– Storage
Select this choice to view or change the storage device settings.
– Network
Select this choice to view or change the network device options, such as
iSCSI.
– Drive Health
Select this choice to view the status of the controllers installed in the server.
v Date and Time
Select this choice to set the date and time in the server, in 24-hour format
(hour:minute:second).
This choice is on the full Setup utility menu only.
v Start Options
Select this choice to view or change the start options, including the startup
sequence, keyboard NumLock state, PXE boot option, and PCI device boot
priority. Changes in the startup options take effect when you start the server.
The startup sequence specifies the order in which the server checks devices to
find a boot record. The server starts from the first boot record that it finds. If the
server has Wake on LAN hardware and software and the operating system
supports Wake on LAN functions, you can specify a startup sequence for the
Wake on LAN functions. For example, you can define a startup sequence that
checks for a disc in the CD-RW/DVD drive, then checks the hard disk drive,
and then checks a network adapter.
This choice is on the full Setup utility menu only.
v Boot Manager
Select this choice to view, add, delete, or change the device boot priority, boot
from a file, select a one-time boot, or reset the boot order to the default setting.
v System Event Logs
Select this choice to enter the System Event Manager, where you can view the
POST event log and the system-event log. You can use the arrow keys to move
between pages in the error log. This choice is on the full Setup utility menu only.
The POST event log contains the most recent error codes and messages that
were generated during POST.
The system-event log contains POST and system management interrupt (SMI)
events and all events that are generated by the baseboard management
controller that is embedded in the integrated management module (IMM).
Important: If the system-error LED on the front of the server is lit but there are
no other error indications, clear the system-event log. Also, after you complete a
repair or correct an error, clear the system-event log to turn off the system-error
LED on the front of the server.
– POST Event Viewer
Select this choice to enter the POST event viewer to view the POST error
messages.
– System Event Log
Select this choice to view the system event log.
– Clear System Event Log
Select this choice to clear the system event log.
v User Security
Select this choice to set, change, or clear passwords. See “Passwords” on page 30
for more information.
This choice is on the full and limited Setup utility menu.
– Set Power-on Password
Select this choice to set or change a power-on password. See “Power-on
password” on page 30 for more information.
– Clear Power-on Password
Select this choice to clear a power-on password. See “Power-on password” on
page 30 for more information.
– Set Administrator Password
Select this choice to set or change an administrator password. An
administrator password is intended to be used by a system administrator; it
limits access to the full Setup utility menu. If an administrator password is
set, the full Setup utility menu is available only if you type the administrator
password at the password prompt. See “Administrator password” on page 32
for more information.
– Clear Administrator Password
Select this choice to clear an administrator password. See “Administrator
password” on page 32 for more information.
v Save Settings
Select this choice to save the changes that you have made in the settings.
v Restore Settings
Select this choice to cancel the changes that you have made in the settings and
restore the previous settings.
v Load Default Settings
Select this choice to cancel the changes that you have made in the settings and
restore the factory settings.
v Exit Setup
Select this choice to exit from the Setup utility. If you have not saved the
changes that you have made in the settings, you are asked whether you want to
save the changes or exit without saving them.
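The startup sequence described under Start Options can be pictured as a simple
first-match search. The following Python sketch is illustrative only: the
function name, the device labels, and the has_boot_record callback are
hypothetical stand-ins, not part of the UEFI firmware or of any IBM tool.

```python
from typing import Callable, Iterable, Optional


def select_boot_device(
    startup_sequence: Iterable[str],
    has_boot_record: Callable[[str], bool],
) -> Optional[str]:
    """Return the first device in the sequence that holds a boot record.

    Mirrors the documented rule: the server checks devices in order and
    starts from the first boot record that it finds.
    """
    for device in startup_sequence:
        if has_boot_record(device):
            return device
    return None  # no device in the sequence had a boot record


# Example sequence taken from the Start Options description.
sequence = ["CD-RW/DVD drive", "hard disk drive", "network adapter"]

# Hypothetical state: no disc in the optical drive, bootable hard disk drive.
bootable = {"hard disk drive", "network adapter"}
chosen = select_boot_device(sequence, lambda device: device in bootable)
```

In this example the optical drive is skipped because it has no boot record, so
the hard disk drive is chosen even though the network adapter is also bootable.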
Passwords
From the User Security menu choice, you can set, change, and delete a power-on
password and an administrator password.
The User Security menu choice is on the full Setup utility menu only.
If you set only a power-on password, you must type the power-on password to
complete the system startup and to have access to the full Setup utility menu.
An administrator password is intended to be used by a system administrator; it
limits access to the full Setup utility menu. If you set only an administrator
password, you do not have to type a password to complete the system startup, but
you must type the administrator password to access the Setup utility menu.
If you set a power-on password for a user and an administrator password for a
system administrator, you must type the power-on password to complete the
system startup. A system administrator who types the administrator password has
access to the full Setup utility menu; the system administrator can give the user
authority to set, change, and delete the power-on password. A user who types the
power-on password has access to only the limited Setup utility menu; the user can
set, change, and delete the power-on password, if the system administrator has
given the user that authority.
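The password combinations above can be summarized as a small decision function.
This Python sketch is a reading aid under stated assumptions, not firmware
logic: the function name and the "full", "limited", and "denied" labels are
hypothetical, and "denied" simply stands for a startup that does not complete.

```python
from typing import Optional


def setup_menu_access(
    power_on: Optional[str],
    admin: Optional[str],
    typed: Optional[str],
) -> str:
    """Return the Setup utility menu level for the password that was typed.

    power_on / admin are the configured passwords (None if not set);
    typed is what the person entered at the prompt (None if nothing).
    """
    # The administrator password always gives the full menu.
    if admin is not None and typed == admin:
        return "full"
    if power_on is not None:
        # A power-on password is set, so startup needs a valid password.
        if typed != power_on:
            return "denied"
        # With both passwords set, the power-on password gives the limited menu.
        return "limited" if admin is not None else "full"
    # No power-on password: startup completes without a password, but the
    # full menu is still restricted if an administrator password is set.
    return "limited" if admin is not None else "full"
```

For example, setup_menu_access("poweron1", "admin123", "poweron1") returns
"limited", and the same call with "admin123" typed returns "full".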
Power-on password:
If a power-on password is set, when you turn on the server, you must type the
power-on password to complete the system startup. You can use any combination
of 6 to 20 printable ASCII characters for the password.
When a power-on password is set, you can enable the Unattended Start mode, in
which the keyboard and mouse remain locked but the operating system can start.
You can unlock the keyboard and mouse by typing the power-on password.
If you forget the power-on password, you can regain access to the server in any of
the following ways:
v If an administrator password is set, type the administrator password at the
password prompt. Start the Setup utility and reset the power-on password.
Attention: If you set an administrator password and then forget it, there is no
way to change, override, or remove it. You must replace the system board.
v Remove the battery from the server, wait 30 seconds, and then reinstall it.
v Change the position of the power-on password switch (enable switch 3 of the
system board switch block (SW4)) to bypass the password check (see
“System-board switches and jumpers” on page 18 for more information).
Figure 12. Power-on password switch (callouts: Lightpath button, UEFI boot
recovery jumper, Clear CMOS jumper, NMI button)
Attention: Before you change any switch settings or move any jumpers, turn
off the server; then, disconnect all power cords and external cables. See the
safety information that begins in “Safety” on page vii. Do not change settings or
move jumpers on any system-board switch or jumper blocks that are not shown
in this document.
The default for all of the switches on switch block SW3 is Off.
While the server is turned off, move switch 4 of the switch block SW3 to the On
position to enable the power-on password override. You can then start the Setup
utility and reset the power-on password. You do not have to return the switch to
the previous position.
The power-on password override switch does not affect the administrator
password.
Administrator password:
If an administrator password is set, you must type the administrator password for
access to the full Setup utility menu. You can use any combination of 6 to 20
printable ASCII characters for the password.
Attention: If you set an administrator password and then forget it, there is no
way to change, override, or remove it. You must replace the system board.
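Both passwords follow the same rule: 6 to 20 printable ASCII characters. A
candidate password can be checked before you type it into the Setup utility
with a sketch like the following. The function name is hypothetical, and the
code assumes that "printable ASCII" means code points 0x20 (space) through
0x7E (tilde); the manual does not state whether the space character is accepted.

```python
def is_valid_setup_password(password: str) -> bool:
    """Check the documented rule: 6 to 20 printable ASCII characters.

    Assumption: printable ASCII is the range 0x20..0x7E inclusive.
    """
    printable = all(0x20 <= ord(ch) <= 0x7E for ch in password)
    return 6 <= len(password) <= 20 and printable
```

For example, a five-character password or one that contains an accented
letter is rejected, while "Secr3t!pw" is accepted.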
Using the Boot Manager
Use this information for the Boot Manager.
The Boot Manager program is a built-in, menu-driven configuration utility
program that you can use to temporarily redefine the first startup device without
changing settings in the Setup utility.
To use the Boot Manager program, complete the following steps:
1. Turn off the server.
2. Restart the server.
3. When the prompt <F12> Select Boot Device is displayed, press F12.
4. Use the Up arrow and Down arrow keys to select an item from the menu and
press Enter.
The next time the server starts, it returns to the startup sequence that is set in the
Setup utility.
Starting the backup server firmware
Use this information to start the backup server firmware.
The system board contains a backup copy area for the server firmware. This is a
secondary copy of the server firmware that you update only during the process of
updating the server firmware. If the primary copy of the server firmware becomes
damaged, use this backup copy.
To force the server to start from the backup copy, turn off the server; then, change
the position of the UEFI boot backup switch (change switch 1 of the SW4 to the on
position) to enable the UEFI recovery mode.
Use the backup copy of the server firmware until the primary copy is restored.
After the primary copy is restored, turn off the server; then, change back the
position of the UEFI boot backup switch (change switch 1 of the SW4 to the off
position).
The UpdateXpress System Pack Installer
The UpdateXpress System Pack Installer detects supported and installed device
drivers and firmware in the server and installs available updates.
For additional information and to download the UpdateXpress System Pack
Installer, go to the ToolsCenter for System x and BladeCenter at
http://publib.boulder.ibm.com/infocenter/toolsctr/v1r0/ and click UpdateXpress
System Pack Installer.
Changing the Power Policy option to the default settings after
loading UEFI defaults
The default settings for the Power Policy option are set by the IMM2.
To change the Power Policy option to the default settings, complete the following
steps.
1. Turn on the server.
Note: Approximately 20 seconds after the server is connected to AC power, the
power-control button becomes active.
2. When the prompt <F1> Setup is displayed, press F1. If you have set an
administrator password, you must type the administrator password to access
the full Setup utility menu. If you do not type the administrator password, a
limited Setup utility menu is available.
3. Select System Settings > Integrated Management Module, then set Power
Restore Policy setting to Restore.
4. Go back to System Configuration and Boot Management > Save Settings.
5. Go back and check the Power Policy setting to verify that it is set to Restore
(the default).
Attention: If you set an administrator password and then forget it, there is no
way to change, override, or remove it. You must replace the system board.
Using the integrated management module
The integrated management module (IMM) is a second generation of the functions
that were formerly provided by the baseboard management controller hardware. It
combines service processor functions, video controller, and remote presence
function in a single chip.
The IMM supports the following basic systems-management features:
v Active Energy Manager.
v Alerts (in-band and out-of-band alerting, PET traps - IPMI style, SNMP, e-mail).
v Auto Boot Failure Recovery (ABR).
v Automatic microprocessor disable on failure and restart in a two-microprocessor
configuration when one microprocessor signals an internal error. When one of
the microprocessors fails, the server disables the failing microprocessor and
restarts with the other microprocessor.
v Automatic Server Restart (ASR) when POST is not complete or the operating
system hangs and the operating system watchdog timer times out. The IMM
might be configured to watch for the operating system watchdog timer and
reboot the system after a timeout, if the ASR feature is enabled. Otherwise, the
IMM allows the administrator to generate a nonmaskable interrupt (NMI) by
pressing an NMI button on the light path diagnostics panel for an
operating-system memory dump. ASR is supported by IPMI.
v Boot sequence manipulation.
v Command-line interface.
v Configuration save and restore.
v DIMM error assistance. The Unified Extensible Firmware Interface (UEFI)
disables a failing DIMM that is detected during POST, and the IMM lights the
associated system error LED and the failing DIMM error LED.
v Environmental monitor with fan speed control for temperature, voltages, fan
failure, power supply failure, and power backplane failure.
v Intelligent Platform Management Interface (IPMI) Specification V2.0 and
Intelligent Platform Management Bus (IPMB) support.
v Invalid system configuration (CONFIG) LED support.
v Light path diagnostics LED indicators to report errors that occur with fans,
power supplies, microprocessor, hard disk drives, and system errors.
v Local firmware code flash update.
v Nonmaskable interrupt (NMI) detection and reporting.
v Operating-system failure blue screen capture.
v PCI configuration data.
v Power/reset control (power-on, hard and soft shutdown, hard and soft reset,
schedule power control).
v Query power-supply input power.
v ROM-based IMM firmware flash updates.
v Serial over LAN (SOL).
v Serial port redirection over telnet or ssh.
v SMI handling.
v System event log (SEL) - user readable event log.
The IMM also provides the following remote server management capabilities
through the OSA SMBridge management utility program:
v Command-line interface (IPMI Shell)
The command-line interface provides direct access to server management
functions through the IPMI 2.0 protocol. Use the command-line interface to issue
commands to control the server power, view system information, and identify
the server. You can also save one or more commands as a text file and run the
file as a script.
v Serial over LAN
Establish a Serial over LAN (SOL) connection to manage servers from a remote
location. You can remotely view and change the UEFI settings, restart the server,
identify the server, and perform other management functions. Any standard
Telnet client application can access the SOL connection.
For more information about IMM, see the Integrated Management Module II User's
Guide at http://www-947.ibm.com/support/entry/portal/docdisplay?lndocid=migr-5086346.
Using the remote presence and blue-screen capture features
The remote presence and blue-screen capture features are integrated functions of
the integrated management module II (IMM2).
The remote presence feature provides the following functions:
v Remotely viewing video with graphics resolutions up to 1600 x 1200 at 75 Hz,
regardless of the system state
v Remotely accessing the server, using the keyboard and mouse from a remote
client
v Mapping the CD or DVD drive, diskette drive, and USB flash drive on a remote
client, and mapping ISO and diskette image files as virtual drives that are
available for use by the server
v Uploading a diskette image to the IMM memory and mapping it to the server as
a virtual drive
The blue-screen capture feature captures the video display contents before the IMM
restarts the server when the IMM detects an operating-system hang condition. A
system administrator can use the blue-screen capture to assist in determining the
cause of the hang condition.
Obtaining the IMM host name
Use this information to obtain the IMM host name.
If you are logging on to the IMM for the first time after installation, the IMM
defaults to DHCP. If a DHCP server is not available, the IMM uses a static IP
address of 192.168.70.125. The default IPv4 host name is “IMM-” (plus the last 12
characters of the IMM MAC address). The default host name also appears on the
IMM network access tag that is attached to the power supply on the rear of
the server. The IMM network access tag provides the default host name of the
IMM and does not require you to start the server.
The IPv6 link-local address (LLA) is derived from the IMM default host name. The
IMM LLA is on the IMM network access tag, which is on the power supply on the
rear of the server. To derive the link-local address, complete the following steps:
1. Take the last 12 characters on the IMM MAC address (for example,
5CF3FC5EAAD0).
2. Separate the number into pairs of hexadecimal characters (for example,
5C:F3:FC:5E:AA:D0).
3. Separate the first six and last six hexadecimal characters.
4. Add “FF” and “FE” in the middle of the 12 characters (for example, 5C F3 FC
FF FE 5E AA D0).
5. Convert the first pair of hexadecimal characters to binary (for example, 5=0101,
C=1100, which results in 01011100 F3 FC FF FE 5E AA D0).
6. Flip the 7th binary character from left (0 to 1 or 1 to 0), which results in
01011110 F3 FC FF FE 5E AA D0.
7. Convert the binary back to hexadecimal (for example, 5E F3FCFFFE5EAAD0).
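The derivation steps above can be sketched in Python. The function names are
hypothetical and this is not an IBM utility. Two details are assumptions that
go beyond the steps: the host name is rendered in uppercase, and the resulting
64-bit value is placed under the standard IPv6 link-local prefix fe80::/64,
which the steps do not show explicitly.

```python
import ipaddress


def _mac_digits(mac: str) -> str:
    """Strip common separators and return the 12 hexadecimal MAC characters."""
    digits = "".join(ch for ch in mac if ch not in ":-. ").upper()
    if len(digits) != 12 or any(ch not in "0123456789ABCDEF" for ch in digits):
        raise ValueError(f"not a 48-bit MAC address: {mac!r}")
    return digits


def default_host_name(mac: str) -> str:
    """Default host name: "IMM-" plus the 12 MAC characters (uppercase assumed)."""
    return "IMM-" + _mac_digits(mac)


def link_local_address(mac: str) -> ipaddress.IPv6Address:
    """Derive the link-local address by following steps 1 through 7."""
    octets = bytearray.fromhex(_mac_digits(mac))      # steps 1-2: six bytes
    eui64 = octets[:3] + b"\xff\xfe" + octets[3:]     # steps 3-4: insert FF FE
    eui64[0] ^= 0x02                                  # steps 5-7: flip 7th bit from left
    # Assumption: combine with the standard link-local prefix fe80::/64.
    return ipaddress.IPv6Address(b"\xfe\x80" + bytes(6) + bytes(eui64))
```

With the example MAC address 5CF3FC5EAAD0, default_host_name returns
"IMM-5CF3FC5EAAD0" and link_local_address returns fe80::5ef3:fcff:fe5e:aad0,
which matches the result of step 7.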
Obtaining the IP address for the IMM
Use this information to obtain the IP address for the IMM.
To access the web interface to use the remote presence feature, you need the IP
address or host name of the IMM. You can obtain the IMM IP address through the
Setup utility and you can obtain the IMM host name from the IMM network access
tag. The server comes with a default IP address for the IMM of 192.168.70.125.
To obtain the IP address, complete the following steps:
1. Turn on the server.
Note: Approximately 5 to 10 seconds after the server is connected to power,
the power-control button becomes active.
2. When the prompt <F1> Setup is displayed, press F1. (This prompt is displayed
on the screen for only a few seconds. You must press F1 quickly.) If you have
set both a power-on password and an administrator password, you must type
the administrator password to access the full Setup utility menu.
3. From the Setup utility main menu, select System Settings.
4. On the next screen, select Integrated Management Module.
5. On the next screen, select Network Configuration.
6. Find the IP address and write it down.
7. Exit from the Setup utility.
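Before you open a browser, it can help to confirm that the IMM answers at the
address you noted. The following Python sketch tries a list of candidate hosts
in order, for example the default host name and then the static fallback
address 192.168.70.125. The function name is hypothetical, and port 443 is an
assumption; this document does not state which port the web interface uses.

```python
import socket
from typing import Iterable, Optional


def first_reachable(
    hosts: Iterable[str],
    port: int = 443,          # assumption: web interface served over HTTPS
    timeout: float = 3.0,
) -> Optional[str]:
    """Return the first host that accepts a TCP connection, or None."""
    for host in hosts:
        try:
            with socket.create_connection((host, port), timeout=timeout):
                return host
        except OSError:
            # Covers name-resolution failures, refusals, and timeouts.
            continue
    return None


# Example call (not executed here; the host name is the sample from this chapter):
# first_reachable(["IMM-5CF3FC5EAAD0", "192.168.70.125"])
```

A result of None means that no candidate accepted a connection; check the
network cabling and the settings under Network Configuration in the Setup
utility.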
Logging on to the web interface
Use this information to log on to the web interface.
To log on to the IMM web interface, complete the following steps:
1. On a system that is connected to the server, open a web browser. In the
Address or URL field, type the IP address or host name of the IMM to which
you want to connect.
Note: If you are logging on to the IMM for the first time after installation, the
IMM defaults to DHCP. If a DHCP host is not available, the IMM assigns a
static IP address of 192.168.70.125. The IMM network access tag provides the
default host name of the IMM and does not require you to start the server.
2. On the Login page, type the user name and password. If you are using the
IMM for the first time, you can obtain the user name and password from your
system administrator. All login attempts are documented in the system-event
log.
Note: The IMM is set initially with a user name of USERID and password of
PASSW0RD (with a zero, not the letter O). You have read/write access. You
must change the default password the first time you log on.
3. Click Log in to start the session. The System Status and Health page provides a
quick view of the system status.
Note: If you boot to the operating system while in the IMM GUI and the message
“Booting OS or in unsupported OS” is displayed under System Status > System
State, disable the Windows 2008 firewall or type the following command in the
Windows 2008 console. This might also affect blue-screen capture features.
netsh firewall set icmpsetting type=8 mode=ENABLE
By default, ICMP packets are blocked by the Windows firewall. The IMM GUI will
then change to “OS booted” status after you change the setting as indicated above
in both the Web and CLI interfaces.
Using the embedded hypervisor
The VMware ESXi embedded hypervisor software is available on the optional IBM
USB flash device with embedded hypervisor.
The USB flash device can be installed in USB connectors on the system board (see
“Internal cable routing and connectors” on page 180 for the location of the
connectors). Hypervisor is virtualization software that enables multiple operating
systems to run on a host system at the same time. The USB flash device is required
to activate the hypervisor functions.
To start using the embedded hypervisor functions, you must add the USB flash
device to the startup sequence in the Setup utility.
To add the USB flash device to the startup sequence, complete the following steps:
1. Turn on the server.
Note: Approximately 5 to 10 seconds after the server is connected to power,
the power-control button becomes active.
2. When the prompt <F1> Setup is displayed, press F1.
3. From the Setup utility main menu, select Boot Manager.
5. Select Change Boot Order > Change the order. Use the Up arrow and Down
arrow keys to select Embedded Hypervisor and use the plus (+) and minus (-)
keys to move Embedded Hypervisor in the boot order. When Embedded
Hypervisor is in the correct location in the boot order, press Enter. Select
Commit Changes and press Enter.
6. Select Save Settings and then select Exit Setup.
If the embedded hypervisor flash device image becomes corrupt, you can
download the image from http://www-03.ibm.com/systems/x/os/vmware/esxi/.
For additional information and instructions, see VMware vSphere 4.1
Documentation at http://www.vmware.com/support/pubs/vs_pages/vsp_pubs_esxi41_e_vc41.html
or the VMware vSphere Installation and Setup Guide at
http://pubs.vmware.com/vsphere-50/topic/com.vmware.ICbase/PDF/vsphere-esxi-vcenter-server-50-installation-setup-guide.pdf.
Configuring the Ethernet controller
Use this information to configure the Ethernet controller.
The Ethernet controllers are integrated on the system board. They provide an
interface for connecting to a 10 Mbps, 100 Mbps, or 1 Gbps network and provide
full-duplex (FDX) capability, which enables simultaneous transmission and
reception of data on the network. If the Ethernet ports in the server support
auto-negotiation, the controllers detect the data-transfer rate (10BASE-T,
100BASE-TX, or 1000BASE-T) and duplex mode (full-duplex or half-duplex) of the
network and automatically operate at that rate and mode.
You do not have to set any jumpers or configure the controllers. However, you
must install a device driver to enable the operating system to address the
controllers.
To find device drivers and information about configuring the Ethernet controllers,
go to http://www.ibm.com/supportportal.
Enabling Features on Demand Ethernet software
Use this information to enable Features on Demand Ethernet software.
You can activate the Features on Demand (FoD) software upgrade key for Fibre
Channel over Ethernet (FCoE) and iSCSI storage protocols that is integrated in the
integrated management module. For more information and instructions for
activating the Features on Demand Ethernet software key, see the IBM Features on
Demand User’s Guide. To download the document, go to
http://www.ibm.com/systems/x/fod/, log in, and click Help.
Enabling Features on Demand RAID software
Use this information to enable Features on Demand RAID software.
You can activate the Features on Demand (FoD) software upgrade key for RAID
that is integrated in the integrated management module. For more information and
instructions for activating the Features on Demand RAID software key, see the IBM
Features on Demand User’s Guide. To download the document, go to
http://www.ibm.com/systems/x/fod/, log in, and click Help.
Configuring RAID arrays
Use the Setup utility to configure RAID arrays.
The specific procedure for configuring arrays depends on the RAID controller that
you are using. For details, see the documentation for your RAID controller. To
access the utility for your RAID controller, complete the following steps:
1. Turn on the server.
Note: Approximately 10 seconds after the server is connected to power, the
power-control button becomes active.
2. When the prompt <F1 Setup> is displayed, press F1. If you have set an
administrator password, you must type the administrator password to access
the full Setup utility menu. If you do not type the administrator password, a
limited Setup utility menu is available.
3. Select System Settings > Storage.
4. Press Enter to refresh the list of device drivers.
5. Select the device driver for your RAID controller and press Enter.
6. Follow the instructions in the documentation for your RAID controller.
IBM Advanced Settings Utility program
The IBM Advanced Settings Utility (ASU) program is an alternative to the Setup
utility for modifying UEFI settings.
Use the ASU program online or out of band to modify UEFI settings from the
command line without the need to restart the system to access the Setup utility.
You can also use the ASU program to configure the optional remote presence
features or other IMM2 settings. The remote presence features provide enhanced
systems-management capabilities.
In addition, the ASU program provides IMM LAN over USB interface
configuration through the command-line interface.
Use the command-line interface to issue setup commands. You can save any of the
settings as a file and run the file as a script. The ASU program supports scripting
environments through a batch-processing mode.
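One way to prepare such a script is sketched below in Python. The sketch only writes text: it emits one asu set line per setting, in the asu set <setting> <value> form used in this guide. The function name, file name, and sample values are hypothetical, and the resulting file is an ordinary shell script, not the ASU batch-file format; see the ASU documentation for the batch-processing syntax.

```python
import shlex
import tempfile
from pathlib import Path

def write_asu_script(settings, path):
    """Write one 'asu set <name> <value>' line per setting to a shell script."""
    lines = ["#!/bin/sh"]
    for name, value in settings.items():
        # shlex.quote protects values that contain spaces or shell characters.
        lines.append(f"asu set {name} {shlex.quote(str(value))}")
    Path(path).write_text("\n".join(lines) + "\n")
    return lines

# Placeholder values for illustration only.
script_path = Path(tempfile.gettempdir()) / "asu_settings.sh"
script = write_asu_script(
    {
        "SYSTEM_PROD_DATA.SysInfoProdName": "5455AC1",
        "SYSTEM_PROD_DATA.SysEncloseAssetTag": "rack 12 node 3",
    },
    script_path,
)
print("\n".join(script))
```

Keeping the settings in a dictionary makes it easy to review the values before any command is run against a server.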
For more information and to download the ASU program, go to
http://www.ibm.com/support/entry/portal/docdisplay?lndocid=TOOL-ASU.
Updating IBM Systems Director
Use this information to update the IBM Systems Director.
If you plan to use IBM Systems Director to manage the server, you must check for
the latest applicable IBM Systems Director updates and interim fixes.
Note: Changes are made periodically to the IBM website. The actual procedure
might vary slightly from what is described in this document.
Installing a newer version
To locate and install a newer version of IBM Systems Director, complete the
following steps:
1. Check for the latest version of IBM Systems Director:
a. Go to http://www-03.ibm.com/systems/software/director/resources.html.
b. If a newer version of IBM Systems Director than what comes with the
server is shown in the drop-down list, follow the instructions on the web
page to download the latest version.
2. Install the IBM Systems Director program.
Installing updates when your management server is connected to
the Internet
If your management server is connected to the Internet, to locate and install
updates and interim fixes, complete the following steps:
1. Make sure that you have run the Discovery and Inventory collection tasks.
2. On the Welcome page of the IBM Systems Director web interface, click View
updates.
3. Click Check for updates. The available updates are displayed in a table.
4. Select the updates that you want to install, and click Install to start the
installation wizard.
Installing updates when your management server is not
connected to the Internet
If your management server is not connected to the Internet, to locate and install
updates and interim fixes, complete the following steps:
1. Make sure that you have run the Discovery and Inventory collection tasks.
2. On a system that is connected to the Internet, go to http://www.ibm.com/
support/fixcentral.
3. From the Product family list, select IBM Systems Director.
4. From the Product list, select IBM Systems Director.
5. From the Installed version list, select the latest version, and click Continue.
6. Download the available updates.
7. Copy the downloaded files to the management server.
8. On the management server, on the Welcome page of the IBM Systems Director
web interface, click the Manage tab, and click Update Manager.
9. Click Import updates and specify the location of the downloaded files that
you copied to the management server.
10. Return to the Welcome page of the Web interface, and click View updates.
11. Select the updates that you want to install, and click Install to start the
installation wizard.
Updating the Universal Unique Identifier (UUID)
The Universal Unique Identifier (UUID) must be updated when the system board
is replaced. Use the Advanced Settings Utility to update the UUID in the
UEFI-based server.
The ASU is an online tool that supports several operating systems. Make sure that
you download the version for your operating system. You can download the ASU
from the IBM Web site. To download the ASU and update the UUID, complete the
following steps.
Note: Changes are made periodically to the IBM website. The actual procedure
might vary slightly from what is described in this document.
1. Download the Advanced Settings Utility (ASU):
a. Go to http://www.ibm.com/supportportal.
b. Click on the Downloads tab at the top of the panel.
c. Under ToolsCenter, select View ToolsCenter downloads.
d. Select Advanced Settings Utility (ASU).
e. Scroll down and click on the link and download the ASU version for your
operating system.
2. ASU sets the UUID in the Integrated Management Module (IMM). Select one of
the following methods to access the Integrated Management Module (IMM) to
set the UUID:
v Online from the target system (LAN or keyboard console style (KCS) access)
v Remote access to the target system (LAN based)
v Bootable media containing ASU (LAN or KCS, depending upon the bootable
media)
3. Copy and unpack the ASU package, which also includes other required files, to
the server. Make sure that you unpack the ASU and the required files to the
same directory. In addition to the application executable (asu or asu64), the
following files are required:
v For Windows based operating systems:
– ibm_rndis_server_os.inf
– device.cat
v For Linux based operating systems:
– cdc_interface.sh
4. After you install ASU, use the following command syntax to set the UUID:
asu set SYSTEM_PROD_DATA.SysInfoUUID <uuid_value> [access_method]
Where:
<uuid_value>
Up to 16-byte hexadecimal value assigned by you.
[access_method]
The access method that you selected to use from the following
methods:
v Online authenticated LAN access, type the command:
The IMM internal LAN/USB IP address. The default value is
169.254.95.118.
imm_user_id
The IMM account (1 of 12 accounts). The default value is USERID.
imm_password
The IMM account password (1 of 12 accounts). The default value is
PASSW0RD (with a zero 0 not an O).
Note: If you do not specify any of these parameters, ASU will use the
default values. When the default values are used and ASU is unable to access
the IMM using the online authenticated LAN access method, ASU will
automatically use the unauthenticated KCS access method.
The following commands are examples of using the userid and password
default values and not using the default values:
Example that does not use the userid and password default values:
asu set SYSTEM_PROD_DATA.SysInfoUUID <uuid_value> --user <user_id>
--password <password>
Example that does use the userid and password default values:
asu set SYSTEM_PROD_DATA.SysInfoUUID <uuid_value>
v Online KCS access (unauthenticated and user restricted):
You do not need to specify a value for access_method when you use this
access method.
Example:
asu set SYSTEM_PROD_DATA.SysInfoUUID <uuid_value>
The KCS access method uses the IPMI/KCS interface. This method requires
that the IPMI driver be installed. Some operating systems have the IPMI
driver installed by default. ASU provides the corresponding mapping layer.
See the Advanced Settings Utility Users Guide for more details. You can access
the ASU Users Guide from the IBM website.
Note: Changes are made periodically to the IBM website. The actual
procedure might vary slightly from what is described in this document.
a. Go to http://www.ibm.com/supportportal.
b. Click on the Downloads tab at the top of the panel.
c. Under ToolsCenter, select View ToolsCenter downloads.
d. Select Advanced Settings Utility (ASU).
e. Scroll down and click on the link and download the ASU version for
your operating system. Scroll down and look under Online Help to
download the Advanced Settings Utility Users Guide.
v Remote LAN access, type the command:
Note: When using the remote LAN access method to access IMM using the
LAN from a client, the host and the imm_external_ip address are required
parameters.
imm_external_ip
The external IMM LAN IP address. There is no default value. This
parameter is required.
imm_user_id
The IMM account (1 of 12 accounts). The default value is USERID.
imm_password
The IMM account password (1 of 12 accounts). The default value is
PASSW0RD (with a zero 0 not an O).
The following commands are examples of using the userid and password
default values and not using the default values:
Example that does not use the userid and password default values:
asu set SYSTEM_PROD_DATA.SysInfoUUID <uuid_value> --host <imm_ip>
--user <user_id> --password <password>
Example that does use the userid and password default values:
asu set SYSTEM_PROD_DATA.SysInfoUUID <uuid_value> --host <imm_ip>
v Bootable media:
You can also build a bootable media using the applications available through
the ToolsCenter website at http://www.ibm.com/support/entry/portal/
docdisplay?lndocid=TOOL-CENTER. From the IBM ToolsCenter page, scroll
down for the available tools.
5. Restart the server.
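The command forms in step 4 can be assembled programmatically, as in the following minimal Python sketch. It only builds the argument list from the syntax shown in this section and does not run ASU. The function name is hypothetical, and the validation assumes that the UUID is passed as plain hexadecimal digits (a 16-byte value is at most 32 digits).

```python
import re

SETTING = "SYSTEM_PROD_DATA.SysInfoUUID"

def build_uuid_command(uuid_value, host=None, user=None, password=None):
    """Return the asu argument list for setting the UUID.

    Omitting host, user, and password leaves ASU to apply its defaults.
    """
    # Up to 16 bytes means 1 to 32 hexadecimal digits (assumed plain hex).
    if not re.fullmatch(r"[0-9A-Fa-f]{1,32}", uuid_value):
        raise ValueError("UUID must be 1 to 32 hexadecimal digits")
    cmd = ["asu", "set", SETTING, uuid_value]
    for flag, value in (("--host", host), ("--user", user), ("--password", password)):
        if value is not None:
            cmd += [flag, value]
    return cmd

# 192.0.2.10 is a documentation-only address used as a placeholder.
print(" ".join(build_uuid_command("0123456789ABCDEF0123456789ABCDEF", host="192.0.2.10")))
```

The returned list can be passed to subprocess.run on a system where the asu executable is installed; building the list first allows the command to be reviewed before it changes the IMM.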
Updating the DMI/SMBIOS data
Use this information to update the DMI/SMBIOS data.
The Desktop Management Interface (DMI) must be updated when the system
board is replaced. Use the Advanced Settings Utility to update the DMI in the
UEFI-based server. The ASU is an online tool that supports several operating
systems. Make sure that you download the version for your operating system. You
can download the ASU from the IBM website. To download the ASU and update
the DMI, complete the following steps.
Note: Changes are made periodically to the IBM website. The actual procedure
might vary slightly from what is described in this document.
1. Download the Advanced Settings Utility (ASU):
a. Go to http://www.ibm.com/supportportal.
b. Click on the Downloads tab at the top of the panel.
c. Under ToolsCenter, select View ToolsCenter downloads.
d. Select Advanced Settings Utility (ASU).
e. Scroll down and click on the link and download the ASU version for your
operating system.
2. ASU sets the DMI in the Integrated Management Module (IMM). Select one of
the following methods to access the Integrated Management Module (IMM) to
set the DMI:
v Online from the target system (LAN or keyboard console style (KCS) access)
v Remote access to the target system (LAN based)
v Bootable media containing ASU (LAN or KCS, depending upon the bootable
media)
3. Copy and unpack the ASU package, which also includes other required files, to
the server. Make sure that you unpack the ASU and the required files to the
same directory. In addition to the application executable (asu or asu64), the
following files are required:
v For Windows based operating systems:
– ibm_rndis_server_os.inf
– device.cat
v For Linux based operating systems:
– cdc_interface.sh
4. After you install ASU, type the following commands to set the DMI:
asu set SYSTEM_PROD_DATA.SysInfoProdName <m/t_model> [access_method]
asu set SYSTEM_PROD_DATA.SysInfoSerialNum <s/n> [access_method]
asu set SYSTEM_PROD_DATA.SysEncloseAssetTag <asset_tag> [access_method]
Where:
<m/t_model>
The server machine type and model number. Type mtm xxxxyyy, where
xxxx is the machine type and yyy is the server model number.
<s/n>
The serial number on the server. Type sn zzzzzzz, where zzzzzzz is the
serial number.
<asset_tag>
The server asset tag number. Type asset
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa, where
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa is the asset tag number.
[access_method]
The access method that you select to use from the following methods:
v Online authenticated LAN access, type the command:
The IMM internal LAN/USB IP address. The default value is
169.254.95.118.
imm_user_id
The IMM account (1 of 12 accounts). The default value is USERID.
imm_password
The IMM account password (1 of 12 accounts). The default value is
PASSW0RD (with a zero 0 not an O).
Note: If you do not specify any of these parameters, ASU will use the
default values. When the default values are used and ASU is unable to access
the IMM using the online authenticated LAN access method, ASU will
automatically use the unauthenticated KCS access method.
The following commands are examples of using the userid and password
default values and not using the default values:
Examples that do not use the userid and password default values:
asu set SYSTEM_PROD_DATA.SysInfoProdName <m/t_model>
--user <imm_user_id> --password <imm_password>
asu set SYSTEM_PROD_DATA.SysInfoSerialNum <s/n> --user <imm_user_id>
--password <imm_password>
asu set SYSTEM_PROD_DATA.SysEncloseAssetTag <asset_tag>
--user <imm_user_id> --password <imm_password>
Examples that do use the userid and password default values:
asu set SYSTEM_PROD_DATA.SysInfoProdName <m/t_model>
asu set SYSTEM_PROD_DATA.SysInfoSerialNum <s/n>
asu set SYSTEM_PROD_DATA.SysEncloseAssetTag <asset_tag>
v Online KCS access (unauthenticated and user restricted):
You do not need to specify a value for access_method when you use this
access method.
The KCS access method uses the IPMI/KCS interface. This method requires
that the IPMI driver be installed. Some operating systems have the IPMI
driver installed by default. ASU provides the corresponding mapping layer.
To download the Advanced Settings Utility Users Guide, complete the
following steps:
Note: Changes are made periodically to the IBM website. The actual
procedure might vary slightly from what is described in this document.
a. Go to http://www.ibm.com/supportportal.
b. Click on the Downloads tab at the top of the panel.
c. Under ToolsCenter, select View ToolsCenter downloads.
d. Select Advanced Settings Utility (ASU).
e. Scroll down and click on the link and download the ASU version for
your operating system. Scroll down and look under Online Help to
download the Advanced Settings Utility Users Guide.
Examples (no access_method value is specified for KCS access):
asu set SYSTEM_PROD_DATA.SysInfoProdName <m/t_model>
asu set SYSTEM_PROD_DATA.SysInfoSerialNum <s/n>
asu set SYSTEM_PROD_DATA.SysEncloseAssetTag <asset_tag>
v Remote LAN access, type the command:
Note: When using the remote LAN access method to access IMM using the
LAN from a client, the host and the imm_external_ip address are required
parameters.
imm_external_ip
The external IMM LAN IP address. There is no default value. This
parameter is required.
imm_user_id
The IMM account (1 of 12 accounts). The default value is USERID.
imm_password
The IMM account password (1 of 12 accounts). The default value is
PASSW0RD (with a zero 0 not an O).
The following commands are examples of using the userid and password
default values and not using the default values:
Examples that do not use the userid and password default values:
asu set SYSTEM_PROD_DATA.SysInfoProdName <m/t_model> --host <imm_ip>
--user <imm_user_id> --password <imm_password>
asu set SYSTEM_PROD_DATA.SysInfoSerialNum <s/n> --host <imm_ip>
--user <imm_user_id> --password <imm_password>
asu set SYSTEM_PROD_DATA.SysEncloseAssetTag <asset_tag> --host <imm_ip>
--user <imm_user_id> --password <imm_password>
Examples that do use the userid and password default values:
asu set SYSTEM_PROD_DATA.SysInfoProdName <m/t_model> --host <imm_ip>
asu set SYSTEM_PROD_DATA.SysInfoSerialNum <s/n> --host <imm_ip>
asu set SYSTEM_PROD_DATA.SysEncloseAssetTag <asset_tag> --host <imm_ip>
v Bootable media:
You can also build a bootable media using the applications available through
the ToolsCenter website at http://www.ibm.com/support/entry/portal/
docdisplay?lndocid=TOOL-CENTER. From the IBM ToolsCenter page, scroll
down for the available tools.
5. Restart the server.
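Because the three DMI commands in step 4 share the same access options, they can be generated together. The following Python sketch is illustrative only: the function name and sample values are hypothetical, the values are passed through unchanged, and nothing is sent to the IMM.

```python
DMI_SETTINGS = (
    "SYSTEM_PROD_DATA.SysInfoProdName",    # machine type and model
    "SYSTEM_PROD_DATA.SysInfoSerialNum",   # serial number
    "SYSTEM_PROD_DATA.SysEncloseAssetTag", # asset tag
)

def build_dmi_commands(mtm, serial, asset_tag, access_options=()):
    """Return one asu argument list per DMI setting.

    access_options is appended unchanged to every command, for example
    ("--host", "192.0.2.10") for remote LAN access.
    """
    values = (mtm, serial, asset_tag)
    return [
        ["asu", "set", name, value, *access_options]
        for name, value in zip(DMI_SETTINGS, values)
    ]

# Placeholder values; 192.0.2.10 is a documentation-only address.
for command in build_dmi_commands("5455AC1", "06ABC12", "ASSET-0001",
                                  access_options=("--host", "192.0.2.10")):
    print(" ".join(command))
```

Passing an empty access_options tuple produces the default-value forms of the commands, which also apply to online KCS access.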
Chapter 3. Troubleshooting
This chapter describes the diagnostic tools and troubleshooting information that
are available to help you solve problems that might occur in the server.
If you cannot diagnose and correct a problem by using the information in this
chapter, see Appendix D, “Getting help and technical assistance,” on page 373 for
more information.
Start here
You can solve many problems without outside assistance by following the
troubleshooting procedures in this documentation and on the World Wide Web.
This document describes the diagnostic tests that you can perform, troubleshooting
procedures, and explanations of error messages and error codes. The
documentation that comes with your operating system and software also contains
troubleshooting information.
Diagnosing a problem
Before you contact IBM or an approved warranty service provider, follow these
procedures in the order in which they are presented to diagnose a problem with
your server.
1. Return the server to the condition it was in before the problem occurred. If
any hardware, software, or firmware was changed before the problem occurred,
if possible, reverse those changes. This might include any of the following
items:
v Hardware components
v Device drivers and firmware
v System software
v UEFI firmware
v System input power or network connections
2. View the light path diagnostics LEDs and event logs. The server is designed
for ease of diagnosis of hardware and software problems.
v Light path diagnostics LEDs: See Fan and power controller indicators,
controls, and connectors of the IBM NeXtScale n1200 Enclosure Type 5456
Installation and Service Guide for information about using light path
diagnostics LEDs.
v Event logs: See “Event logs” on page 55 for information about notification
events and diagnosis.
v Software or operating-system error codes: See the documentation for the
software or operating system for information about a specific error code. See
the manufacturer's website for documentation.
3. Run IBM Dynamic System Analysis (DSA) and collect system data. Run
Dynamic System Analysis (DSA) to collect information about the hardware,
firmware, software, and operating system. Have this information available
when you contact IBM or an approved warranty service provider. For
instructions for running DSA, see the Dynamic System Analysis Installation and
User's Guide.
To download the latest version of DSA code and the Dynamic System Analysis
Installation and User's Guide, go to
http://www.ibm.com/support/entry/portal/docdisplay?lndocid=SERV-DSA.
4. Check for and apply code updates. Fixes or workarounds for many problems
might be available in updated UEFI firmware, device firmware, or device
drivers. To display a list of available updates for the server, go to
http://www.ibm.com/support/fixcentral.
Attention: Installing the wrong firmware or device-driver update might cause
the server to malfunction. Before you install a firmware or device-driver
update, read any readme and change history files that are provided with the
downloaded update. These files contain important information about the
update and the procedure for installing the update, including any special
procedure for updating from an early firmware or device-driver version to the
latest version.
Important: Some cluster solutions require specific code levels or coordinated
code updates. If the device is part of a cluster solution, verify that the latest
level of code is supported for the cluster solution before you update the code.
a. Install UpdateXpress system updates. You can install code updates that are
packaged as an UpdateXpress System Pack or UpdateXpress CD image. An
UpdateXpress System Pack contains an integration-tested bundle of online
firmware and device-driver updates for your server. In addition, you can
use IBM ToolsCenter Bootable Media Creator to create bootable media that
is suitable for applying firmware updates and running preboot diagnostics.
For more information about UpdateXpress System Packs, see and “Updating
the firmware” on page 21. For more information about the Bootable Media
Creator, see http://www.ibm.com/support/entry/portal/
docdisplay?lndocid=TOOL-BOMC.
Be sure to separately install any listed critical updates that have release
dates that are later than the release date of the UpdateXpress System Pack
or UpdateXpress image (see step 4b).
b. Install manual system updates.
1) Determine the existing code levels.
In DSA, click Firmware/VPD to view system firmware levels, or click
Software to view operating-system levels.
2) Download and install updates of code that is not at the latest level.
To display a list of available updates for the server, go to
http://www.ibm.com/support/fixcentral.
When you click an update, an information page is displayed, including
a list of the problems that the update fixes. Review this list for your
specific problem; however, even if your problem is not listed, installing
the update might solve the problem.
5. Check for and correct an incorrect configuration. If the server is incorrectly
configured, a system function can fail to work when you enable it; if you make
an incorrect change to the server configuration, a system function that has been
enabled can stop working.
a. Make sure that all installed hardware and software are supported. See
http://www.ibm.com/systems/info/x86servers/serverproven/compat/us
to verify that the server supports the installed operating system, optional
devices, and software levels. If any hardware or software component is not
supported, uninstall it to determine whether it is causing the problem. You
must remove nonsupported hardware before you contact IBM or an
approved warranty service provider for support.
b. Make sure that the server, operating system, and software are installed
and configured correctly. Many configuration problems are caused by loose
power or signal cables or incorrectly seated adapters. You might be able to
solve the problem by turning off the server, reconnecting cables, reseating
adapters, and turning the server back on. For information about performing
the checkout procedure, see “About the checkout procedure” on page 50.
For information about configuring the server, see Chapter 2, “Configuration
information and instructions,” on page 21.
6. See controller and management software documentation. If the problem is
associated with a specific function (for example, if a RAID hard disk drive is
marked offline in the RAID array), see the documentation for the associated
controller and management or controlling software to verify that the controller
is correctly configured.
Problem determination information is available for many devices such as RAID
and network adapters.
For problems with operating systems or IBM software or devices, go to
http://www.ibm.com/supportportal.
7. Check for troubleshooting procedures and RETAIN tips. Troubleshooting
procedures and RETAIN tips document known problems and suggested
solutions. To search for troubleshooting procedures and RETAIN tips, go to
http://www.ibm.com/supportportal.
8. Use the troubleshooting tables. See “Troubleshooting by symptom” on page 62
to find a solution to a problem that has identifiable symptoms.
A single problem might cause multiple symptoms. Follow the troubleshooting
procedure for the most obvious symptom. If that procedure does not diagnose
the problem, use the procedure for another symptom, if possible.
If the problem remains, contact IBM or an approved warranty service provider
for assistance with additional problem determination and possible hardware
replacement. To open an online service request, go to http://www.ibm.com/
support/entry/portal/Open_service_request. Be prepared to provide
information about any error codes and collected data.
Undocumented problems
If you have completed the diagnostic procedure and the problem remains, the
problem might not have been previously identified by IBM. After you have
verified that all code is at the latest level, all hardware and software configurations
are valid, and no light path diagnostics LEDs or log entries indicate a hardware
component failure, contact IBM or an approved warranty service provider for
assistance.
To open an online service request, go to http://www.ibm.com/support/entry/
portal/Open_service_request. Be prepared to provide information about any error
codes and collected data and the problem determination procedures that you have
used.
Service bulletins
IBM continually updates the support website with the latest tips and techniques
that you can use to solve problems that you might have with the IBM NeXtScale
nx360 M4 Compute Node server.
To find service bulletins that are available for the IBM NeXtScale nx360 M4
Compute Node server, go to and search for Type 5455, and retain.
Checkout procedure
The checkout procedure is the sequence of tasks that you should follow to
diagnose a problem in the server.
About the checkout procedure
Before you perform the checkout procedure for diagnosing hardware problems,
review the following information:
v Read the safety information that begins in “Safety” on page vii.
v IBM Dynamic System Analysis (DSA) provides the primary methods of testing
the major components of the server, such as the system board, Ethernet
controller, keyboard, mouse (pointing device), serial ports, and hard disk drives.
You can also use DSA to test some external devices. If you are not sure whether
a problem is caused by the hardware or by the software, you can use the
diagnostic programs to confirm that the hardware is working correctly.
v When you run DSA, a single problem might cause more than one error message.
When this happens, correct the cause of the first error message. The other error
messages usually will not occur the next time you run DSA.
Exception: If multiple error codes or light path diagnostics LEDs indicate a
microprocessor error, the error might be in the microprocessor or in the
microprocessor socket. See “Microprocessor problems” on page 67 for
information about diagnosing microprocessor problems.
v Before you run DSA, you must determine whether the failing server is part of a
shared hard disk drive cluster (two or more servers sharing external storage
devices). If it is part of a cluster, you can run all diagnostic programs except the
ones that test the storage unit (that is, a hard disk drive in the storage unit) or
the storage adapter that is attached to the storage unit. The failing server might
be part of a cluster if any of the following conditions is true:
– You have identified the failing server as part of a cluster (two or more servers
sharing external storage devices).
– One or more external storage units are attached to the failing server and at
least one of the attached storage units is also attached to another server or
unidentifiable device.
– One or more servers are located near the failing server.
Important: If the server is part of a shared hard disk drive cluster, run one test
at a time. Do not run any suite of tests, such as “quick” or “normal” tests,
because this might enable the hard disk drive diagnostic tests.
v If the server is halted and a POST error code is displayed, see Appendix B,
“UEFI (POST) error codes,” on page 309. If the server is halted and no error
message is displayed, see “Troubleshooting by symptom” on page 62 and
“Solving undetermined problems” on page 78.
v For information about power-supply problems, see “Solving power problems”
on page 75, “Power problems” on page 72, and “Power-supply LEDs” on page
53.
v For intermittent problems, check the event log; see “Event logs” on page 55 and
Appendix C, “DSA diagnostic test results,” on page 321.
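The cluster conditions and the resulting test restriction in the list above can be expressed as a short Python sketch. The function names, parameter names, and test names are hypothetical; DSA does not expose such an interface, and the sketch only restates the rule.

```python
def might_be_in_cluster(identified_as_member,
                        shares_external_storage,
                        other_servers_nearby):
    """Return True if any of the three cluster conditions holds."""
    return any((identified_as_member,
                shares_external_storage,
                other_servers_nearby))

def allowed_tests(all_tests, storage_tests, in_cluster):
    """Exclude storage-unit and storage-adapter tests for clustered servers."""
    if not in_cluster:
        return list(all_tests)
    return [test for test in all_tests if test not in storage_tests]

# Hypothetical test names, for illustration only.
tests = ["memory", "cpu", "ethernet", "external_disk", "storage_adapter"]
storage = {"external_disk", "storage_adapter"}
clustered = might_be_in_cluster(False, True, False)
print(allowed_tests(tests, storage, clustered))  # ['memory', 'cpu', 'ethernet']
```

On a clustered server, the remaining tests are still run one at a time, as the Important note in this section requires.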
Performing the checkout procedure
Use this information to perform the checkout procedure.
To perform the checkout procedure, complete the following steps:
1. Is the server part of a cluster?
v No: Go to step 2.
v Yes: Shut down all failing servers that are related to the cluster. Go to step 2.
2. Complete the following steps:
a. Check the power supply LEDs (see “Power-supply LEDs” on page 53).
b. Turn off the server and all external devices.
c. Check all internal and external devices for compatibility at
d. Check all cables and power cords.
e. Set all display controls to the middle positions.
f. Turn on all external devices.
g. Turn on the server. If the server does not start, see “Troubleshooting by
symptom” on page 62.
h. Check the system-error LED on the operator information panel. If it is lit,
check the light path diagnostics LEDs (see “Compute node controls,
connectors, and LEDs” on page 13).
i. Check for the following results:
v Successful completion of POST (see “POST” on page 58 for more
information)
v Successful completion of startup, which is indicated by a readable display
of the operating-system desktop
3. Is there a readable image on the monitor screen?
v No: Find the failure symptom in “Troubleshooting by symptom” on page 62;
if necessary, see “Solving undetermined problems” on page 78.
v Yes: Run DSA (see “Running DSA Preboot diagnostic programs” on page 60).
– If DSA reports an error, follow the instructions in Appendix C, “DSA
diagnostic test results,” on page 321.
– If DSA does not report an error but you still suspect a problem, see
“Solving undetermined problems” on page 78.
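The decision in step 3 can be summarized as a small Python function. This is a sketch of the branching only: the function and parameter names are hypothetical, and the final return value (no further action) is an assumption, because the procedure does not state what to do when DSA reports no error and no problem is suspected.

```python
def next_checkout_action(readable_image, dsa_reports_error=None,
                         still_suspect_problem=False):
    """Return the next action for step 3 of the checkout procedure.

    dsa_reports_error is None until DSA has been run.
    """
    if not readable_image:
        return ("Find the symptom in 'Troubleshooting by symptom'; "
                "if necessary, see 'Solving undetermined problems'.")
    if dsa_reports_error is None:
        return "Run DSA Preboot diagnostic programs."
    if dsa_reports_error:
        return "Follow Appendix C, 'DSA diagnostic test results'."
    if still_suspect_problem:
        return "See 'Solving undetermined problems'."
    # Assumption: nothing further is required by the checkout procedure.
    return "No further action."

print(next_checkout_action(readable_image=True))
print(next_checkout_action(readable_image=True, dsa_reports_error=False,
                           still_suspect_problem=True))
```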
Diagnostic tools
This section introduces available tools to help you diagnose and solve
hardware-related problems.
v Light path diagnostics
Use light path diagnostics to diagnose system errors quickly. See Light path
diagnostics for more information.
v Event logs
The event logs list the error codes and messages that are generated when an
error is detected for the subsystems IMM2, POST, DSA, and the server
baseboard management controller. See “Event logs” on page 55 for more
information.
v Integrated management module II
The integrated management module II (IMM2) combines service processor
functions, video controller, and remote presence and blue-screen capture features
in a single chip. The IMM provides advanced service-processor control,
monitoring, and alerting functions. If an environmental condition exceeds a
threshold or if a system component fails, the IMM lights LEDs to help you
diagnose the problem, records the error in the IMM event log, and alerts you to
the problem. Optionally, the IMM also provides a virtual presence capability for
remote server management. The IMM provides remote server
management through the following industry-standard interfaces:
– Intelligent Platform Management Interface (IPMI) version 2.0
– Simple Network Management Protocol (SNMP) version 3
– Common Information Model (CIM)
– Web browser
For more information about the integrated management module II (IMM2), see
“Using the integrated management module” on page 33, Appendix A,
“Integrated Management Module II (IMM2) error messages,” on page 185, and
the Integrated Management Module II User's Guide at http://www-947.ibm.com/
support/entry/portal/docdisplay?lndocid=migr-5086346.
v IBM Dynamic System Analysis
Two editions of IBM Dynamic System Analysis (DSA) are available for
diagnosing problems, DSA Portable and DSA Preboot:
– DSA Portable
DSA Portable collects and analyzes system information to aid in diagnosing
server problems. DSA Portable runs on the server operating system and
collects the following information about the server:
- Drive health information
- Event logs for ServeRAID controllers and service processors
- Installed hardware, including PCI and USB information
- Installed applications and hot fixes
- Kernel modules
- Light path diagnostics status
- Microprocessor, input/output hub, and UEFI error logs
- Network interfaces and settings
- RAID controller configuration
- Service processor (integrated management module) status and
configuration
- System configuration
- Vital product data, firmware, and UEFI configuration
DSA Portable creates a DSA log, which is a chronologically ordered merge of
the system-event log (as the IPMI event log), the integrated management
module (IMM) event log (as the ASM event log), and the operating-system
event logs. You can send the DSA log as a file to IBM Support (when
requested by IBM Support) or view the information as a text file or HTML
file.
Note: Use the latest available version of DSA to make sure you are using the
most recent configuration data. For documentation and download information
for DSA, see http://www.ibm.com/systems/management.
For additional information, see “IBM Dynamic System Analysis” on page 58
and DSA messages.
– DSA Preboot
The DSA Preboot diagnostic program is stored in the integrated USB memory on
the server. DSA Preboot collects and analyzes system information to aid in
diagnosing server problems, as well as offering a rich set of diagnostic tests of
the major components of the server. DSA Preboot collects the following
information about the server:
- Drive health information
- Event logs for ServeRAID controllers and service processors
- Installed hardware, including PCI and USB information
- Light path diagnostics status
- Microprocessor, input/output hub, and UEFI error logs
- Network interfaces and settings
- RAID controller configuration
- Service processor (integrated management module) status and
configuration
- System configuration
- Vital product data, firmware, and UEFI configuration
DSA Preboot also provides diagnostics for the following system components
(when they are installed):
1. Emulex network adapter
2. IMM I2C bus
3. Light path diagnostics panel
4. Memory modules
5. Microprocessors
6. Optical devices (CD or DVD)
7. SAS or SATA drives
See “Running DSA Preboot diagnostic programs” on page 60 for more
information on running the DSA Preboot program on the server.
v Troubleshooting by symptom
These tables list problem symptoms and actions to correct the problems. See
“Troubleshooting by symptom” on page 62 for more information.
Power-supply LEDs
The following minimum configuration is required for the server to start.
v One microprocessor in microprocessor socket 1
v One 2 GB DIMM on the system board
v One power supply
v Power cord
v Four cooling fans
v One PCI riser-card assembly in PCI connector 1
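The minimum configuration above can be treated as a simple checklist. The following Python sketch is illustrative only: the dictionary keys and the function name are hypothetical and do not correspond to any IBM tool. It reports which required parts are still missing from an inventory that you supply by hand.

```python
# Hypothetical checklist built from the minimum configuration list above.
# The key names are invented for this sketch.
MINIMUM_CONFIG = {
    "microprocessor_in_socket_1": 1,
    "dimm_2gb": 1,
    "power_supply": 1,
    "power_cord": 1,
    "cooling_fan": 4,
    "pci_riser_card_assembly_in_connector_1": 1,
}


def missing_for_startup(installed):
    """Return {part: shortfall} for every part below the minimum count."""
    shortfall = {}
    for part, required in MINIMUM_CONFIG.items():
        have = installed.get(part, 0)
        if have < required:
            shortfall[part] = required - have
    return shortfall


# Example inventory (made-up values): only two of the four fans are present.
inventory = dict(MINIMUM_CONFIG, cooling_fan=2)
print(missing_for_startup(inventory))
```

An empty result means that the inventory meets the minimum configuration listed above; it does not prove that the installed parts are working.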
AC power-supply LEDs
Use this information to view AC power-supply LEDs.
The following minimum configuration is required for the DC LED on the power
supply to be lit:
v Power supply
v Power cord
Note: You must turn on the server for the DC LED on the power supply to be lit.
The following illustration shows the locations of the power-supply LEDs on the ac
power supply.
Figure 13. AC power-supply LEDs (callouts: AC power LED (green), DC power LED
(green), and power-supply error LED (yellow))
The following table describes the problems that are indicated by various
combinations of the power-supply LEDs on an ac power supply and suggested
actions to correct the detected problems.
AC power-supply LEDs
AC: On, DC: On, Error (!): Off
Description: Normal operation.
AC: Off, DC: Off, Error (!): Off
Description: No ac power to the server or a problem with the ac power source.
Action:
1. Check the ac power to the server.
2. Make sure that the power cord is connected to a functioning power source.
3. Restart the server. If the error remains, check the power-supply LEDs.
4. If the problem remains, replace the power supply.
Notes: This is a normal condition when no ac power is present.
AC: Off, DC: Off, Error (!): On
Description: The power supply has failed.
Action: Replace the power supply.
AC: Off, DC: On, Error (!): Off
Description: The power supply has failed.
Action: Replace the power supply.
AC: Off, DC: On, Error (!): On
Description: The power supply has failed.
Action: Replace the power supply.
AC: On, DC: Off, Error (!): Off
Description: Power supply not fully seated, faulty system board, or the power
supply has failed.
Action:
1. Reseat the power supply.
2. Follow actions in “Power problems” on page 72.
3. Follow actions in “Solving power problems” on page 75 until the problem is
solved.
Notes: Typically indicates that a power supply is not fully seated.
AC: On, DC: Off, Error (!): On
Description: The power supply has failed.
Action: Replace the power supply.
AC: On, DC: On, Error (!): On
Description: The power supply has failed.
Action: Replace the power supply.
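The LED combinations above amount to a lookup from three on/off states to a description and an action. The following Python sketch shows that lookup. The names PSU_LED_TABLE and diagnose_psu are hypothetical, and the action text is abbreviated from the table, so treat it as an illustration and not as a replacement for the table.

```python
# Illustrative lookup only; the names here are not part of any IBM tool.
# Keys are (AC LED lit, DC LED lit, error LED lit), as in the table above.
FAILED = "The power supply has failed."
REPLACE = "Replace the power supply."

PSU_LED_TABLE = {
    (True, True, False): ("Normal operation.", ""),
    (False, False, False): (
        "No ac power to the server or a problem with the ac power source.",
        "Check the ac power and the power cord, restart the server, "
        "and replace the power supply if the problem remains.",
    ),
    (False, False, True): (FAILED, REPLACE),
    (False, True, False): (FAILED, REPLACE),
    (False, True, True): (FAILED, REPLACE),
    (True, False, False): (
        "Power supply not fully seated, faulty system board, "
        "or the power supply has failed.",
        "Reseat the power supply, then follow the power-problem procedures.",
    ),
    (True, False, True): (FAILED, REPLACE),
    (True, True, True): (FAILED, REPLACE),
}


def diagnose_psu(ac_lit, dc_lit, error_lit):
    """Return (description, action) for one combination of LED states."""
    return PSU_LED_TABLE[(bool(ac_lit), bool(dc_lit), bool(error_lit))]


description, action = diagnose_psu(ac_lit=True, dc_lit=False, error_lit=True)
print(description)
print(action)
```

All eight combinations of the three LEDs are covered, so the lookup never fails for boolean input.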
System pulse LEDs
Use this information to view the system pulse LEDs.
The following LEDs are on the system board and monitor the system power-on
and power-off sequencing and boot progress (see “System-board LEDs and
controls” on page 19 for the location of these LEDs).
Table 2. System pulse LEDs
LED: RTMM heartbeat
Description: Power-on and power-off sequencing.
Action:
1. If the LED blinks at 1 Hz, it is functioning properly and no action is
necessary.
2. If the LED is not blinking, (trained technician only) replace the system
board.
LED: IMM2 heartbeat
Description: IMM2 heartbeat boot process.
Action: The following steps describe the different stages of the IMM2 heartbeat
sequencing process.
1. When this LED is blinking fast (approximately 4 Hz), the IMM2 code is
loading.
2. When this LED goes off momentarily, the IMM2 code has loaded completely.
3. When this LED goes off momentarily and then starts blinking slowly
(approximately 1 Hz), IMM2 is fully operational. You can now press the
power-control button to turn on the server.
4. If this LED does not blink within 30 seconds of connecting a power source to
the server, (trained technician only) replace the system board.
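The IMM2 heartbeat stages above can be summarized as a small classifier. The following Python sketch is a hedged illustration: the function name is hypothetical, and the 2.5 Hz cutoff between "fast" and "slow" blinking is an assumption chosen to separate the approximate 4 Hz and 1 Hz rates given in the table.

```python
def imm2_heartbeat_state(blink_hz, seconds_since_power):
    """Classify an observed IMM2 heartbeat LED blink rate.

    blink_hz: observed blink rate in Hz (0 if the LED is not blinking).
    seconds_since_power: time since a power source was connected.
    """
    if blink_hz <= 0:
        if seconds_since_power > 30:
            # No blinking within 30 seconds of connecting power.
            return "no heartbeat: (trained technician only) replace the system board"
        # The LED also goes off momentarily once the code has loaded.
        return "waiting: no blinking observed yet"
    if blink_hz >= 2.5:  # assumed cutoff, not taken from the guide
        return "loading: IMM2 code is loading"
    return "operational: IMM2 is fully operational"


print(imm2_heartbeat_state(4.0, 5))
print(imm2_heartbeat_state(1.0, 25))
print(imm2_heartbeat_state(0, 45))
```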
Event logs
Error codes and messages are displayed in the POST event log, system-event log,
integrated management module (IMM2) event log, and DSA event log.
v POST event log: This log contains the most recent error codes and messages
that were generated during POST. You can view the contents of the POST event
log from the Setup utility (see “Starting the Setup utility” on page 26). For more
information about POST error codes, see Appendix B, “UEFI (POST) error
codes,” on page 309.
v System-event log: This log contains POST and system management interrupt
(SMI) events and all events that are generated by the baseboard management
controller that is embedded in the integrated management module (IMM). You
can view the contents of the system-event log through the Setup utility and
through the Dynamic System Analysis (DSA) program (as IPMI event log).
The system-event log is limited in size. When it is full, new entries will not
overwrite existing entries; therefore, you must periodically clear the
system-event log through the Setup utility. When you are troubleshooting an
error, you might have to save and then clear the system-event log to make the
most recent events available for analysis. For more information about the
system-event log, see Appendix A, “Integrated Management Module II (IMM2)
error messages,” on page 185.
Messages are listed on the left side of the screen, and details about the selected
message are displayed on the right side of the screen. To move from one entry
to the next, use the Up Arrow (↑) and Down Arrow (↓) keys.
Some IMM sensors cause assertion events to be logged when their setpoints are
reached. When a setpoint condition no longer exists, a corresponding deassertion
event is logged. However, not all events are assertion-type events.
v Integrated management module II (IMM2) event log: This log contains a
filtered subset of all IMM, POST, and system management interrupt (SMI)
events. You can view the IMM event log through the IMM web interface. For
more information, see “Logging on to the web interface” on page 36. You can
also view the IMM event log through the Dynamic System Analysis (DSA)
program (as the ASM event log). For more information about IMM error
messages, see Appendix A, “Integrated Management Module II (IMM2) error
messages,” on page 185.
v DSA event log: This log is generated by the Dynamic System Analysis (DSA)
program, and it is a chronologically ordered merge of the system-event log (as
the IPMI event log), the IMM chassis-event log (as the ASM event log), and the
operating-system event logs. You can view the DSA event log through the DSA
program (see “Viewing event logs without restarting the server”). For more
information about DSA and DSA messages, see “IBM Dynamic System Analysis”
on page 58 and Appendix C, “DSA diagnostic test results,” on page 321.
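The DSA event log is described above as a chronologically ordered merge of several logs. The following Python sketch shows one way such a merge can be done, using heapq.merge from the standard library. It is not how DSA itself is implemented; the tuple layout and all sample entries are made up for illustration.

```python
import heapq
from datetime import datetime


def merge_event_logs(*logs):
    """Merge logs that are each already sorted by time into one ordered list.

    Every entry is a (timestamp, source, message) tuple.
    """
    return list(heapq.merge(*logs, key=lambda entry: entry[0]))


# Made-up sample entries; real logs have different fields and messages.
ipmi_log = [
    (datetime(2014, 1, 15, 10, 0), "IPMI", "sample system-event entry 1"),
    (datetime(2014, 1, 15, 10, 5), "IPMI", "sample system-event entry 2"),
]
asm_log = [
    (datetime(2014, 1, 15, 10, 2), "ASM", "sample IMM event entry"),
]
os_log = [
    (datetime(2014, 1, 15, 10, 1), "OS", "sample operating-system entry 1"),
    (datetime(2014, 1, 15, 10, 6), "OS", "sample operating-system entry 2"),
]

merged = merge_event_logs(ipmi_log, asm_log, os_log)
for timestamp, source, message in merged:
    print(timestamp.isoformat(), source, message)
```

Because heapq.merge assumes that each input is already sorted, a log that is out of order must be sorted before it is merged.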
Viewing event logs through the Setup utility
To view the POST event log or system-event log, complete the following steps:
1. Turn on the server.
2. When the prompt <F1> Setup is displayed, press F1. If you have set both a
power-on password and an administrator password, you must type the
administrator password to view the event logs.
3. Select System Event Logs and use one of the following procedures:
v To view the POST event log, select POST Event Viewers.
v To view the system-event log, select System Event Log.
Viewing event logs without restarting the server
If the server is not hung and the IMM is connected to a network, methods are
available for you to view one or more event logs without having to restart the
server.
If you have installed Dynamic System Analysis (DSA) Portable, you can use it to
view the system-event log (as the IPMI event log), the IMM event log (as the
ASM event log), the operating-system event logs, or the merged DSA log. You can
also use DSA Preboot to view these logs, although you must restart the server to
use DSA Preboot. To install DSA Portable or check for and download a later
version of the DSA Preboot CD image, go to http://www.ibm.com/support/entry/
portal/docdisplay?lndocid=SERV-DSA.
If IPMItool is installed in the server, you can use it to view the system-event log.
Most recent versions of the Linux operating system come with a current version of
IPMItool. For an overview of IPMI, go to http://www.ibm.com/developerworks/
linux/blueprints/ and click Using Intelligent Platform Management Interface (IPMI) on IBM Linux platforms.
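As an illustration of viewing the system-event log with IPMItool, the following Python sketch runs the ipmitool sel list command and splits each output line into fields. It assumes that IPMItool and the IPMI driver are installed and that the output uses the common pipe-separated layout; the layout can vary between IPMItool versions, and the sample line (including the sensor name) is made up.

```python
import subprocess


def parse_sel_line(line):
    """Split one pipe-separated SEL line into named fields, or return None."""
    fields = [field.strip() for field in line.split("|")]
    if len(fields) < 5:
        return None  # not an event line in the assumed layout
    return {
        "id": fields[0],
        "date": fields[1],
        "time": fields[2],
        "sensor": fields[3],
        "event": fields[4],
        # Assertion or deassertion, when the line includes it.
        "state": fields[5] if len(fields) > 5 else "",
    }


def read_system_event_log():
    """Run 'ipmitool sel list' locally and return the parsed entries."""
    result = subprocess.run(
        ["ipmitool", "sel", "list"],
        capture_output=True, text=True, check=True,
    )
    parsed = (parse_sel_line(line) for line in result.stdout.splitlines())
    return [entry for entry in parsed if entry is not None]


# Made-up sample line in the assumed layout.
SAMPLE = "   1 | 01/15/2014 | 10:00:00 | Temperature #0x30 | Upper Critical going high | Asserted"
print(parse_sel_line(SAMPLE))
```

read_system_event_log is defined but not called here, because it requires a server with IPMItool and usually elevated privileges.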
You can view the IMM event log through the Event Log link in the integrated
management module II (IMM2) web interface. For more information, see “Logging
on to the web interface” on page 36.
The following table describes the methods that you can use to view the event logs,
depending on the condition of the server. The first three conditions generally do
not require that you restart the server.
Table 3. Methods for viewing event logs
Condition: The server is not hung and is connected to a network (using
operating-system-controlled network ports).
Action: Use any of the following methods:
v Run DSA Portable to view the diagnostic event log (requires IPMI driver) or
create an output file that you can send to IBM service and support (using ftp
or local copy).
v Use IPMItool to view the system-event log (requires IPMI driver).
v Use the web browser interface to the IMM to view the system-event log locally
(requires RNDIS USB LAN driver).
Condition: The server is not hung and is not connected to a network (using
operating-system-controlled network ports).
Action:
v Run DSA Portable to view the diagnostic event log (requires IPMI driver) or
create an output file that you can send to IBM service and support (using ftp
or local copy).
v Use IPMItool to view the system-event log (requires IPMI driver).
v Use the web browser interface to the IMM to view the system-event log locally
(requires RNDIS USB LAN driver).
Condition: The server is not hung and the integrated management module II
(IMM2) is connected to a network.
Action: In a web browser, type the IP address for the IMM2 and go to the Event
Log page. For more information, see “Obtaining the IMM host name” on page 35
and “Logging on to the web interface” on page 36.
Condition: The server is hung, and no communication can be made with the IMM.
Action:
v If DSA Preboot is installed, restart the server and press F2 to start DSA
Preboot and view the event logs (see “Running DSA Preboot diagnostic programs”
on page 60 for more information).
v Alternatively, you can restart the server and press F1 to start the Setup
utility and view the POST event log or system-event log. For more information,
see “Viewing event logs through the Setup utility” on page 56.
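The conditions in Table 3 can be read as a short decision procedure. The following Python sketch expresses it that way. The function and parameter names are hypothetical, and the hung-server branch assumes, as the table does, that the IMM cannot be reached.

```python
def event_log_methods(server_hung, imm2_on_network=False):
    """Return the viewing methods that Table 3 lists for a server condition."""
    if server_hung:
        # Table 3 covers only a hung server whose IMM cannot be reached.
        return [
            "Restart the server and press F2 to start DSA Preboot (if installed).",
            "Restart the server and press F1 to start the Setup utility.",
        ]
    methods = [
        "Run DSA Portable (requires IPMI driver).",
        "Use IPMItool to view the system-event log (requires IPMI driver).",
        "Use the IMM web interface locally (requires RNDIS USB LAN driver).",
    ]
    if imm2_on_network:
        methods.append(
            "Open the IMM2 IP address in a web browser and go to the Event Log page."
        )
    return methods


for method in event_log_methods(server_hung=False, imm2_on_network=True):
    print(method)
```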
Clearing the event logs
Use this information to clear the event logs.
To clear the event logs, complete the following steps:
Note: The POST error log is automatically cleared each time the server is restarted.
1. Turn on the server.
2. When the prompt <F1> Setup is displayed, press F1. If you have set both a
power-on password and an administrator password, you must type the
administrator password to view the event logs.
3. To clear the IMM system-event log, select System Event Logs > Clear System
Event Log; then, press Enter twice.
POST
When you turn on the server, it performs a series of tests to check the operation of
the server components and some optional devices in the server. This series of tests
is called the power-on self-test, or POST.
Note: This server does not use beep codes for server status.
If a power-on password is set, you must type the password and press Enter (when
you are prompted), for POST to run.
If POST detects a problem, an error message is displayed and is sent to the POST
event log. See Appendix B, “UEFI (POST) error codes,” on page 309 and “Event
logs” on page 55 for more information.
IBM Dynamic System Analysis
IBM Dynamic System Analysis (DSA) collects and analyzes system information to
aid in diagnosing server problems.
DSA collects the following information about the server:
v Drive health information
v Event logs for ServeRAID controllers and service processors
v Hardware inventory, including PCI and USB information
v Installed applications and hot fixes (available in DSA Portable only)
v Kernel modules (available in DSA Portable only)
v Light path diagnostics status
v Network interfaces and settings
v Performance data and details about processes that are running
v RAID controller configuration
v Service processor (integrated management module) status and configuration
v System configuration
v Vital product data and firmware information
For system-specific information about the action that you should take as a result of
a message that DSA generates, see Appendix C, “DSA diagnostic test results,” on
page 321.
If you cannot find a problem by using DSA, see “Solving undetermined problems”
on page 78 for information about testing the server.
Note: DSA Preboot might appear to be unresponsive when you start the program.
This is normal operation while the program loads.
Make sure that the server has the latest version of the DSA code. To obtain DSA
code and the Dynamic System Analysis Installation and User's Guide, go to
http://www.ibm.com/support/entry/portal/docdisplay?lndocid=SERV-DSA.
DSA editions
Two editions of Dynamic System Analysis are available.
v DSA Portable
DSA Portable Edition runs within the operating system; you do not have to
restart the server to run it. It is packaged as a self-extracting file that you
download from the web. When you run the file, it self-extracts to a temporary
folder and performs comprehensive collection of hardware and operating-system
information. After it runs, it automatically deletes the temporary files and folder
and leaves the results of the data collection and diagnostics on the server.
If you are able to start the server, use DSA Portable.
v DSA Preboot
DSA Preboot runs outside of the operating system; you must restart the server to
run it. It is provided in the flash memory on the server, or you can create
bootable media such as a CD, DVD, ISO, USB, or PXE using the IBM
ToolsCenter Bootable Media Creator (BoMC). For more details, see the BoMC
Installation and User's Guide at http://www.ibm.com/support/entry/portal/
docdisplay?lndocid=TOOL-BOMC. In addition to the capabilities of the other
editions of DSA, DSA Preboot includes diagnostic routines that would be
disruptive to run within the operating-system environment (such as resetting
devices and causing loss of network connectivity). It has a graphical user
interface that you can use to specify which diagnostics to run and to view the
diagnostic and data collection results.
DSA Preboot provides diagnostics for the following system components, if they
are installed:
– Checkpoint panel
– I2C bus
– SAS and SATA drives
If you are unable to restart the server or if you need comprehensive diagnostics,
use DSA Preboot.
For more information and to download the utilities, go to http://www.ibm.com/
support/entry/portal/docdisplay?lndocid=SERV-DSA.
Running DSA Preboot diagnostic programs
Use this information to run the DSA Preboot diagnostic programs.
Note: The DSA memory test might take up to 30 minutes to run. If the problem is
not a memory problem, skip the memory test.
To run the DSA Preboot diagnostic programs, complete the following steps:
1. If the server is running, turn off the server and all attached devices.
2. Turn on all attached devices; then, turn on the server.
3. When the prompt <F2> Diagnostics is displayed, press F2.
Note: The DSA Preboot diagnostic program might appear to be unresponsive
for an unusual length of time when you start the program. This is normal
operation while the program loads. The loading process may take up to 10
minutes.
4. Optionally, select Quit to DSA to exit from the stand-alone memory diagnostic
program.
Note: After you exit from the stand-alone memory diagnostic environment,
you must restart the server to access the stand-alone memory diagnostic
environment again.
5. Type gui to display the graphical user interface, or type cmd to display the
DSA interactive menu.
6. Follow the instructions on the screen to select the diagnostic test to run.
If the diagnostic programs do not detect any hardware errors but the problem
remains during normal server operation, a software error might be the cause. If
you suspect a software problem, see the information that comes with your
software.
A single problem might cause more than one error message. When this happens,
correct the cause of the first error message. The other error messages usually will
not occur the next time you run the diagnostic programs.
If the server stops during testing and you cannot continue, restart the server and
try running the DSA Preboot diagnostic programs again. If the problem remains,
replace the component that was being tested when the server stopped.
Diagnostic text messages
Diagnostic text messages are displayed while the tests are running.
A diagnostic text message contains one of the following results:
Passed: The test was completed without any errors.
Failed: The test detected an error.
Aborted: The test could not proceed because of the server configuration.
Additional information concerning test failures is available in the extended
diagnostic results for each test.
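If you record the result of each test by hand, the three result values above are easy to tally. The following Python sketch is illustrative only: the test names are made up, and DSA does not necessarily export results in this form.

```python
from collections import Counter

# The three results that a diagnostic text message can contain.
RESULTS = ("Passed", "Failed", "Aborted")


def summarize(results):
    """Count results per category; results maps test name -> result string."""
    unknown = set(results.values()) - set(RESULTS)
    if unknown:
        raise ValueError(f"unexpected result values: {sorted(unknown)}")
    counts = Counter(results.values())
    return {result: counts.get(result, 0) for result in RESULTS}


def needs_attention(results):
    """Return the names of tests that did not pass, in alphabetical order."""
    return sorted(name for name, result in results.items() if result != "Passed")


# Made-up example run.
run = {"memory": "Passed", "sas_drive": "Failed", "optical": "Aborted"}
print(summarize(run))
print(needs_attention(run))
```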
Viewing the test log results and transferring the DSA collection
Use this information to view the test log results and transfer the DSA
collection.
To view the test log for the results when the tests are completed, use one of
the following methods:
v If you are running the DSA graphical user interface (GUI), click the Success
link in the Status column, or select Diagnostic Event Log.
v If you are running the DSA interactive menu (CLI), type :x to exit the Execute
Tests menu; then, select completed tests to view the results.
To transfer DSA Preboot collections to an external USB device, type the copy
command in the DSA interactive menu.
You can also send the DSA error log to IBM support to aid in diagnosing the
server problems.
Automated service request (call home)
IBM provides tools that can automatically collect and send data or call IBM
Support when an error is detected. These tools can help IBM Support speed up the
process of diagnosing problems.
The following sections provide information about the call home tools.
IBM Electronic Service Agent
IBM Electronic Service Agent monitors, tracks, and captures system hardware
errors and hardware and software inventory information, and reports serviceable
problems directly to IBM Support. You can also choose to collect data manually. It
uses minimal system resources, and can be downloaded from the IBM website.
For more information and to download IBM Electronic Service Agent, go to
http://www-01.ibm.com/support/esa/.
Error messages
This section provides the list of error codes and messages for UEFI/POST, IMM,
and DSA that are generated when a problem is detected.
See UEFI/POST error codes, Integrated management module II (IMM2) error
messages, and DSA messages for more information.
Troubleshooting by symptom
Use the troubleshooting tables to find solutions to problems that have identifiable
symptoms.
If you cannot find a solution to the problem in these tables, see DSA messages for
information about testing the server and “Running DSA Preboot diagnostic
programs” on page 60 for additional information about running the DSA Preboot
program. For additional information to help you solve problems, see “Start here”
on page 47.
If you have just added new software or a new optional device and the server is
not working, complete the following steps before you use the troubleshooting
tables:
1. Check the system-error LED on the operator information panel; if it is lit, check
the light path diagnostics LEDs (see Light path diagnostics).
2. Remove the software or device that you just added.
3. Run IBM Dynamic System Analysis (DSA) to determine whether the server is
running correctly (for information about using DSA, see DSA messages).
4. Reinstall the new software or new device.
General problems
Use this information to solve general problems.
v Follow the suggested actions in the order in which they are listed in the Action column until the problem is
solved.
v If an action step is preceded by “(Trained technician only),” that step must be performed only by a trained
technician.
v Go to the IBM support website at http://www.ibm.com/supportportal to check for technical information, hints,
tips, and new device drivers or to submit a request for information.
Symptom: A cover latch is broken, an LED is not working, or a similar problem
has occurred.
Action: If the part is a CRU, replace it. If the part is a microprocessor or the
system board, the part must be replaced by a trained technician.
Symptom: The server is hung while the screen is on. Cannot start the Setup
utility by pressing F1.
Action:
1. See “Nx-boot failure” on page 83 for more information.
2. See “Recovering the server firmware (UEFI update failure)” on page 80 for
more information.
Hard disk drive problems
Table 4. Hard disk drive symptoms and actions
v Follow the suggested actions in the order in which they are listed in the Action column until the problem is
solved.
v If an action step is preceded by “(Trained technician only)”, that step must be performed only by a trained
technician.
v Go to the IBM support website at http://www.ibm.com/supportportal to check for technical information, hints,
tips, and new device drivers or to submit a request for information.
Symptom: Not all drives are recognized by the hard disk drive diagnostic tests.
Action: Remove the drive that is indicated by the diagnostic tests; then, run
the hard disk drive diagnostic tests again. If the remaining drives are
recognized, replace the drive that you removed with a new one.
Symptom: The server stops responding during the hard disk drive diagnostic test.
Action: Remove the hard disk drive that was being tested when the server stopped
responding, and run the diagnostic test again. If the hard disk drive diagnostic
test runs successfully, replace the drive that you removed with a new one.
Symptom: A hard disk drive was not detected while the operating system was being
started.
Action: Reseat all hard disk drives and cables; then, run the hard disk drive
diagnostic tests again.
Symptom: A hard disk drive passes the diagnostic Fixed Disk Test, but the
problem remains.
Action: Run the diagnostic SCSI Fixed Disk Test (see “Running DSA Preboot
diagnostic programs” on page 60).
Note: This test is not available on servers that have RAID arrays or servers that
have SATA hard disk drives.
Hypervisor problems
Use this information to solve hypervisor problems.
v Follow the suggested actions in the order in which they are listed in the Action column until the problem is
solved.
v If an action step is preceded by “(Trained technician only),” that step must be performed only by a trained
technician.
v Go to the IBM support website at http://www.ibm.com/supportportal to check for technical information, hints,
tips, and new device drivers or to submit a request for information.
Symptom: An optional embedded hypervisor flash device is not listed in the
expected boot order or does not appear in the list of boot devices, or a similar
problem has occurred.
Action:
1. Make sure that the optional embedded hypervisor flash device is selected in
the boot manager (<F12> Select Boot Device) at startup.
2. Make sure that the embedded hypervisor flash device is seated in the
connector correctly (see “Removing the USB flash drive” on page 162 and
“Installing the USB flash drive” on page 163).
3. See the documentation that comes with the optional embedded hypervisor
flash device for setup and configuration information.
4. Make sure that other software works on the server.
Intermittent problems
Use this information to solve intermittent problems.
v Follow the suggested actions in the order in which they are listed in the Action column until the problem is
solved.
v If an action step is preceded by “(Trained technician only),” that step must be performed only by a trained
technician.
v Go to the IBM support website at http://www.ibm.com/supportportal to check for technical information, hints,
tips, and new device drivers or to submit a request for information.
Symptom: A problem occurs only occasionally and is difficult to diagnose.
Action:
1. Make sure that:
v All cables and cords are connected securely to the rear of the server and
attached devices.
v When the server is turned on, air is flowing from the fan grille. If there is no
airflow, the fan is not working. This can cause the server to overheat and
shut down.
2. Check the system-error log or IMM event logs (see “Event logs” on page 55).
Symptom: The server resets (restarts) occasionally.
Action:
1. If the reset occurs during POST and the POST watchdog timer is enabled (click
System Settings > Recovery > System Recovery > POST Watchdog Timer in
the Setup utility to see the POST watchdog setting), make sure that sufficient
time is allowed in the watchdog timeout value (POST Watchdog Timer). If the
server continues to reset during POST, see UEFI/POST error codes and DSA
messages.
2. If neither condition applies, check the system-error log or IMM system-event
log (see “Event logs” on page 55).
Keyboard, mouse, or USB-device problems
Use this information to solve keyboard, mouse, or USB-device problems.
v Follow the suggested actions in the order in which they are listed in the Action column until the problem is
solved.
v If an action step is preceded by “(Trained technician only),” that step must be performed only by a trained
technician.
v Go to the IBM support website at http://www.ibm.com/supportportal to check for technical information, hints,
tips, and new device drivers or to submit a request for information.
Symptom: All or some keys on the keyboard do not work.
Action:
1. Make sure that:
v The keyboard cable is securely connected.
v The server and the monitor are turned on.
2. If you are using a USB keyboard, run the Setup utility and enable keyboardless
operation.
3. If you are using a USB keyboard and it is connected to a USB hub, disconnect
the keyboard from the hub and connect it directly to the server.
4. Replace the keyboard.
Symptom: The mouse or USB device does not work.
Action:
1. Make sure that:
v The mouse or USB device cable is securely connected to the server.
v The mouse or USB device drivers are installed correctly.
v The server and the monitor are turned on.
v The mouse option is enabled in the Setup utility.
2. If you are using a USB mouse or USB device and it is connected to a USB hub,
disconnect the mouse or USB device from the hub and connect it directly to
the server.
3. Replace the mouse or USB device.
Memory problems
Use this information to solve memory problems.
v Follow the suggested actions in the order in which they are listed in the Action column until the problem is
solved.
v If an action step is preceded by “(Trained technician only),” that step must be performed only by a trained
technician.
v Go to the IBM support website at http://www.ibm.com/supportportal to check for technical information, hints,
tips, and new device drivers or to submit a request for information.
Symptom: The amount of system memory that is displayed is less than the amount
of installed physical memory.
Action:
Note: Each time you install or remove a DIMM, you must disconnect the server
from the power source; then, wait 10 seconds before restarting the server.
1. Make sure that:
v No error LEDs are lit on the operator information panel.
v No DIMM error LEDs are lit on the system board.
v Memory mirrored channel does not account for the discrepancy.
v The memory modules are seated correctly.
v You have installed the correct type of memory.
v If you changed the memory, you updated the memory configuration in the
Setup utility.
v All banks of memory are enabled. The server might have automatically
disabled a memory bank when it detected a problem, or a memory bank
might have been manually disabled.
v There is no memory mismatch when the server is at the minimum memory
configuration.
2. Reseat the DIMMs, and then restart the server.
3. Check the POST error log:
v If a DIMM was disabled by a systems-management interrupt (SMI), replace
the DIMM.
v If a DIMM was disabled by the user or by POST, reseat the DIMM; then, run
the Setup utility and enable the DIMM.
4. Check that all DIMMs are initialized in the Setup utility; then, run memory
diagnostics (see “Running DSA Preboot diagnostic programs” on page 60).
5. Reverse the DIMMs between the channels (of the same microprocessor), and
then restart the server. If the problem is related to a DIMM, replace the failing
DIMM.
6. Re-enable all DIMMs using the Setup utility, and then restart the server.
7. (Trained technician only) Install the failing DIMM into a DIMM connector for
microprocessor 2 (if installed) to verify that the problem is not the
microprocessor or the DIMM connector.
8. (Trained technician only) Replace the system board.
66IBM NeXtScale nx360 M4 Type 5455: Installation and Service Guide
Multiple DIMMs in a channel
are identified as failing.
Note: Each time you install or remove a DIMM, you must disconnect the server
from the power source; then, wait 10 seconds before restarting the server.
1. Reseat the DIMMs; then, restart the server.
2. Remove the highest-numbered DIMM of those that are identified and replace it
with an identical known good DIMM; then, restart the server. Repeat as
necessary. If the failures continue after all identified DIMMs are replaced, go to
step 4.
3. Return the removed DIMMs, one at a time, to their original connectors,
restarting the server after each DIMM, until a DIMM fails. Replace each failing
DIMM with an identical known good DIMM, restarting the server after each
DIMM replacement. Repeat step 3 until you have tested all removed DIMMs.
4. Replace the highest-numbered DIMM of those identified; then, restart the
server. Repeat as necessary.
5. Reverse the DIMMs between the channels (of the same microprocessor), and
then restart the server. If the problem is related to a DIMM, replace the failing
DIMM.
6. (Trained technician only) Install the failing DIMM into a DIMM connector for
microprocessor 2 (if installed) to verify that the problem is not the
microprocessor or the DIMM connector.
7. (Trained technician only) Replace the system board.
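For the first symptom above (displayed memory is less than installed memory), the mirrored-channel check can be done with simple arithmetic before any hardware is reseated. The following is a minimal sketch, not an IBM tool. The function names are hypothetical, and it assumes that mirrored-channel mode leaves about half of the installed capacity usable.

```python
def expected_visible_gb(dimm_sizes_gb, mirrored=False):
    """Return the memory (in GB) that the configuration is expected to show."""
    installed = sum(dimm_sizes_gb)
    # Assumption: with memory mirroring enabled, one copy of the data is kept
    # on the mirrored channel, so about half of the installed capacity is usable.
    return installed / 2 if mirrored else installed


def discrepancy_explained(dimm_sizes_gb, displayed_gb, mirrored=False):
    """Return True if the displayed amount matches the expected amount."""
    return displayed_gb == expected_visible_gb(dimm_sizes_gb, mirrored)
```

If `discrepancy_explained` returns False for the current mirroring setting, continue with the remaining checks in step 1 of that symptom.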
Microprocessor problems
Use this information to solve microprocessor problems.
Symptom Action
The server goes directly to the
POST Event Viewer when it is
turned on.
1. Correct any errors that are indicated by the light path diagnostics LEDs (see
Light path diagnostics).
2. Make sure that the server supports all the microprocessors and that the
microprocessors match in speed and cache size. To view the microprocessor
information, run the Setup utility and select System Information > System Summary > Processor Details.
3. (Trained technician only) Make sure that microprocessor 1 is seated correctly.
4. (Trained technician only) Remove microprocessor 2 and restart the server.
5. Replace the following components one at a time, in the order shown, restarting
the server each time:
a. (Trained technician only) Microprocessor
b. (Trained technician only) System board
Monitor and video problems
Use this information to solve monitor and video problems.
Some IBM monitors have their own self-tests. If you suspect a problem with your
monitor, see the documentation that comes with the monitor for instructions for
testing and adjusting the monitor. If you cannot diagnose the problem, call for
service.
Symptom Action
Testing the monitor.
1. Make sure that the monitor cables are firmly connected.
2. Try using a different monitor on the server, or try using the monitor that is
being tested on a different server.
3. Run the diagnostic programs. If the monitor passes the diagnostic programs,
the problem might be a video device driver.
4. (Trained technician only) Replace the system board.
The screen is blank.
1. If the server is attached to a KVM switch, bypass the KVM switch to eliminate
it as a possible cause of the problem: connect the monitor cable directly to the
correct connector on the rear of the server.
2. The IMM2 remote presence function is disabled if you install an optional video
adapter. To use the IMM2 remote presence function, remove the optional video
adapter.
3. If graphical adapters are installed in the server, the IBM logo is displayed on
the screen approximately 3 minutes after you turn on the server. This is
normal operation while the system loads.
4. Make sure that:
v The server is turned on. If there is no power to the server, see “Power
problems” on page 72.
v The monitor cables are connected correctly.
v The monitor is turned on and the brightness and contrast controls are
adjusted correctly.
5. Make sure that the correct server is controlling the monitor, if applicable.
6. Make sure that damaged server firmware is not affecting the video; see
“Updating the firmware” on page 21.
7. Observe the checkpoint LEDs on the system board; if the codes are changing,
go to step 6.
8. Replace the following components one at a time, in the order shown, restarting
the server each time:
a. Monitor
b. Video adapter (if one is installed)
c. (Trained technician only) System board.
9. See “Solving undetermined problems” on page 78.
The monitor works when you
turn on the server, but the
screen goes blank when you
start some application
programs.
1. Make sure that:
v The application program is not setting a display mode that is higher than
the capability of the monitor.
v You installed the necessary device drivers for the application.
2. Run video diagnostics (see “Running DSA Preboot diagnostic programs” on
page 60).
v If the server passes the video diagnostics, the video is good; see “Solving
undetermined problems” on page 78.
v (Trained technician only) If the server fails the video diagnostics, replace the
system board.
The monitor has screen jitter, or
the screen image is wavy,
unreadable, rolling, or
distorted.
1. If the monitor self-tests show that the monitor is working correctly, consider
the location of the monitor. Magnetic fields around other devices (such as
transformers, appliances, fluorescents, and other monitors) can cause screen
jitter or wavy, unreadable, rolling, or distorted screen images. If this happens,
turn off the monitor.
Attention: Moving a color monitor while it is turned on might cause screen
discoloration.
Move the device and the monitor at least 305 mm (12 in.) apart, and turn on
the monitor.
Notes:
a. To prevent diskette drive read/write errors, make sure that the distance
between the monitor and any external diskette drive is at least 76 mm (3
in.).
b. Non-IBM monitor cables might cause unpredictable problems.
2. Reseat the monitor cable.
3. Replace the following components one at a time, in the order shown,
restarting the server each time:
a. Monitor cable
b. Video adapter (if one is installed)
c. Monitor
d. (Trained technician only) System board.
Wrong characters appear on the
screen.
1. If the wrong language is displayed, update the server firmware to the latest
level (see “Updating the firmware” on page 21) with the correct language.
2. Reseat the monitor cable.
3. Replace the following components one at a time, in the order shown,
restarting the server each time:
a. Monitor cable
b. Video adapter (if one is installed)
c. Monitor
d. (Trained technician only) System board.
Network connection problems
Use this information to solve network connection problems.
Symptom Action
Unable to wake the server using the Wake on LAN feature.
1. If you are using the dual-port network adapter and the server is connected to
the network using the Ethernet 5 connector, check the system-error log or IMM2
system event log (see “Event logs” on page 55), and make sure that:
a. Fan 3 is running in standby mode, if the Emulex dual port 10GBase-T
embedded adapter is installed.
b. The room temperature is not too high (see “Features and specifications” on
page 5).
c. The air vents are not blocked.
d. The air baffle is installed securely.
2. Reseat the dual-port network adapter.
3. Turn off the server and disconnect it from the power source; then, wait 10
seconds before restarting the server.
4. If the problem still remains, replace the dual-port network adapter.
Login fails when using an LDAP account with SSL enabled.
1. Make sure the license key is valid.
2. Generate a new license key and log in again.
Optional-device problems
Use this information to solve optional-device problems.
Symptom Action
An IBM optional device that
was just installed does not
work.
1. Make sure that:
v The device is designed for the server (see http://www.ibm.com/systems/
info/x86servers/serverproven/compat/us).
v You followed the installation instructions that came with the device and the
device is installed correctly.
v You have not loosened any other installed devices or cables.
v You updated the configuration information in the Setup utility. Whenever
memory or any other device is changed, you must update the configuration.
2. Reseat the device that you just installed.
3. Replace the device that you just installed.
An IBM optional device that
worked previously does not
work now.
1. Make sure that all of the cable connections for the device are secure.
2. If the device comes with test instructions, use those instructions to test the
device.
3. If the failing device is a SCSI device, make sure that:
v The cables for all external SCSI devices are connected correctly.
v The last device in each SCSI chain, or the end of the SCSI cable, is
terminated correctly.
v Any external SCSI device is turned on. You must turn on an external SCSI
device before you turn on the server.
4. Reseat the failing device.
5. Replace the failing device.
Power problems
Use this information to solve power problems.
Symptom Action
The power-control button does
not work, and the reset button
does not work (the server does
not start).
Note: The power-control button
will not function until
approximately 5 to 10 seconds
after the server has been
connected to power.
1. Make sure that the power-control button is working correctly:
a. Disconnect the server power cords.
b. Reconnect the power cords.
c. (Trained technician only) Reseat the operator information panel cable, and
then repeat steps 1a and 1b.
v (Trained technician only) If the server starts, reseat the operator
information panel. If the problem remains, replace the operator
information panel.
v If the server does not start, bypass the power-control button by using the
force power-on jumper. If the server starts, reseat the operator
information panel. If the problem remains, replace the operator
information panel.
2. Make sure that the reset button is working correctly:
a. Disconnect the server power cords.
b. Reconnect the power cords.
c. (Trained technician only) Reseat the operator information panel cable, and
then repeat steps 2a and 2b.
v (Trained technician only) If the server starts, replace the operator
information panel.
v If the server does not start, go to step 3.
3. Make sure that both power supplies installed in the server are of the same
type. Mixing different power supplies in the server will cause a system error
(the system-error LED on the front panel turns on).
4. Make sure that:
v The power cords are correctly connected to the server and to a working
electrical outlet.
v The type of memory that is installed is correct.
v The DIMMs are fully seated.
v The LEDs on the power supply do not indicate a problem.
v The microprocessors are installed in the correct sequence.
5. Reseat the following components:
a. Operator information panel connector
b. Power supplies
6. Replace the components listed in step 5 one at a time, in the order shown,
restarting the server each time.
7. If you just installed an optional device, remove it, and restart the server. If the
server now starts, you might have installed more devices than the power
supply supports.
8. See “Power-supply LEDs” on page 53.
9. See “Solving undetermined problems” on page 78.
The server does not turn off.
1. Determine whether you are using an Advanced Configuration and Power
Interface (ACPI) or a non-ACPI operating system. If you are using a non-ACPI
operating system, complete the following steps:
a. Press Ctrl+Alt+Delete.
b. Turn off the server by pressing the power-control button and holding it down
for 5 seconds.
c. Restart the server.
d. If the server fails POST and the power-control button does not work,
disconnect the power cord for 20 seconds; then, reconnect the power cord
and restart the server.
2. If the problem remains or if you are using an ACPI-aware operating system,
suspect the system board.
The server unexpectedly shuts down, and the LEDs on the operator information
panel are not lit.
See “Solving undetermined problems” on page 78.
Serial-device problems
Use this information to solve serial-device problems.
Symptom Action
The number of serial ports that
are identified by the operating
system is less than the number
of installed serial ports.
1. Make sure that:
v Each port is assigned a unique address in the Setup utility and none of the
serial ports is disabled.
v The serial-port adapter (if one is present) is seated correctly.
2. Reseat the serial port adapter.
3. Replace the serial port adapter.
A serial device does not work.
1. Make sure that:
v The device is compatible with the server.
v The serial port is enabled and is assigned a unique address.
v The device is connected to the correct connector (see “System-board internal
connectors” on page 16).
2. Reseat the following components:
a. Failing serial device
b. Serial cable
3. Replace the components listed in step 2 one at a time, in the order shown,
restarting the server each time.
4. (Trained technician only) Replace the system board.
ServerGuide problems
Use this information to solve ServerGuide problems.
Symptom Action
The MegaRAID Storage Manager program cannot view all installed drives, or the
operating system cannot be installed.
1. Make sure that the hard disk drive is connected correctly.
2. Make sure that the SAS/SATA hard disk drive cables are securely connected.
The operating-system installation program continuously loops.
Make more space available on the hard disk.
The ServerGuide program will not start the operating-system CD.
Make sure that the operating-system CD is supported by the ServerGuide
program. For a list of supported operating-system versions, go to
http://www.ibm.com/support/entry/portal/docdisplay?lndocid=SERV-GUIDE,
click the link for your ServerGuide version, and scroll down to the list of
supported Microsoft Windows operating systems.
The operating system cannot be installed; the option is not available.
Make sure that the server supports the operating system. If it does, either no
logical drive is defined (SCSI RAID servers), or the ServerGuide System Partition
is not present. Run the ServerGuide program and make sure that setup is
complete.
Software problems
Use this information to solve software problems.
Symptom Action
You suspect a software
problem.
1. To determine whether the problem is caused by the software, make sure that:
v The server has the minimum memory that is needed to use the software. For
memory requirements, see the information that comes with the software. If
you have just installed an adapter or memory, the server might have a
memory-address conflict.
v The software is designed to operate on the server.
v Other software works on the server.
v The software works on another server.
2. If you received any error messages when using the software, see the
information that comes with the software for a description of the messages and
suggested solutions to the problem.
3. Contact the software vendor.
Universal Serial Bus (USB) port problems
Use this information to solve Universal Serial Bus (USB) port problems.
Symptom Action
A USB device does not work.
1. Make sure that:
v The correct USB device driver is installed.
v The operating system supports USB devices.
2. Make sure that the USB configuration options are set correctly in the Setup
utility (see “Using the Setup utility” on page 25 for more information).
3. If you are using a USB hub, disconnect the USB device from the hub and
connect it directly to the server.
Video problems
Use this information to solve video problems.
See “Monitor and video problems” on page 68.
Solving power problems
Use this information to solve power problems.
Power problems can be difficult to solve. For example, a short circuit can exist
anywhere on any of the power distribution buses. Usually, a short circuit will
cause the power subsystem to shut down because of an overcurrent condition. To
diagnose a power problem, use the following general procedure:
1. Turn off the server and disconnect all power cords.
2. Check for loose cables in the power subsystem. Also check for short circuits;
for example, a loose screw might be causing a short circuit on a circuit board.
3. Check the lit LEDs on the operator information panel (see Light path
diagnostics).
4. If the check log LED on the light path diagnostics panel is lit, check the IMM
event log for faulty Pwr rail and complete the following steps. Table 5 identifies
the components that are associated with each Pwr rail and the order in which
to troubleshoot the components.
a. Disconnect the cables and power cords to all internal and external devices
(see “Internal cable routing and connectors” on page 180). Leave the
power-supply cords connected.
b. For Pwr rail A error, complete the following steps:
1) (Trained technician only) Replace the system board.
2) (Trained technician only) Replace the microprocessor.
c. For other rail errors (for a Pwr rail A error, see step 4b), remove each component
that is associated with the faulty Pwr rail, one at a time, in the sequence
indicated in Table 5, restarting the server each time, until the cause of the
overcurrent condition is identified.
Table 5. Components associated with power rail errors
Pwr rail error in the IMM event log Components
Pwr rail A error
Pwr rail B error
Pwr rail C error
Pwr rail D error
Pwr rail E error
Pwr rail F error
Pwr rail G error
v Microprocessor 1
v Microprocessor 2
v Adapter (if one is installed) in PCI riser-card
assembly 1
v PCI riser-card assembly 1
v Fan 1
v DIMMs 1 through 6
v Dual-port network adapter
v Fan 2
v DIMMs 7 through 12
v Hard disk drives
v DIMMs 13 through 18
v Adapter (if one is installed) in PCI riser-card
assembly 1
v PCI riser-card assembly 1
v Fan 4
v DIMMs 19 through 24
v PCI adapter power cable (if one is present)
v Fan 3
v Hard disk drives
v Hard disk drive backplane assembly
Pwr rail H error
v Hard disk drive power cable
v Hard disk drives
v Hard disk drive backplane
or
v PCI adapter power cable
v Adapter installed in PCI riser-card assembly 2
v PCI riser-card assembly 2
d. Replace the identified component.
5. Remove the adapters and disconnect the cables and power cords to all internal
and external devices until the server is at the minimum configuration that is
required for the server to start (see “Power-supply LEDs” on page 53 for the
minimum configuration).
6. Reconnect all power cords and turn on the server. If the server starts
successfully, reseat the adapters and devices one at a time until the problem is
isolated.
If the server does not start from the minimum configuration, see “Power-supply
LEDs” on page 53 to replace the components in the minimum configuration one at
a time until the problem is isolated.
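The branching in steps 4b and 4c can be sketched in code. This is an illustrative outline only, not part of any IBM utility. The names `RAIL_COMPONENTS` and `troubleshooting_plan` are hypothetical, and the mapping holds only the first Pwr rail H alternative from Table 5; the entries for the other rails must be taken from the table.

```python
# Hypothetical excerpt of Table 5: only the first Pwr rail H alternative is
# shown. Add the remaining rails from the table before relying on this.
RAIL_COMPONENTS = {
    "H": [
        "Hard disk drive power cable",
        "Hard disk drives",
        "Hard disk drive backplane",
    ],
}


def troubleshooting_plan(rail, rail_components=RAIL_COMPONENTS):
    """Return the ordered actions for a faulty Pwr rail (steps 4b and 4c)."""
    rail = rail.strip().upper()
    if rail == "A":
        # Step 4b: both actions are for trained technicians only.
        return [
            "(Trained technician only) Replace the system board",
            "(Trained technician only) Replace the microprocessor",
        ]
    # Step 4c: remove one component at a time, restarting after each removal.
    return [
        f"Remove {component}; restart the server"
        for component in rail_components.get(rail, [])
    ]
```

A rail that is missing from the mapping returns an empty plan, which signals that the table entry has not been filled in.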
Solving Ethernet controller problems
Use this information to solve Ethernet controller problems.
The method that you use to test the Ethernet controller depends on which
operating system you are using. See the operating-system documentation for
information about Ethernet controllers, and see the Ethernet controller
device-driver readme file.
Try the following procedures:
v Make sure that the correct device drivers, which come with the server, are
installed and that they are at the latest level.
v Make sure that the Ethernet cable is installed correctly.
– The cable must be securely attached at all connections. If the cable is attached
but the problem remains, try a different cable.
– If you set the Ethernet controller to operate at 100 Mbps, you must use
Category 5 cabling.
– If you directly connect two servers (without a hub), or if you are not using a
hub with X ports, use a crossover cable. To determine whether a hub has an X
port, check the port label. If the label contains an X, the hub has an X port.
v Determine whether the hub supports auto-negotiation. If it does not, try
configuring the integrated Ethernet controller manually to match the speed and
duplex mode of the hub.
v Check the Ethernet controller LEDs on the rear panel of the server. These LEDs
indicate whether there is a problem with the connector, cable, or hub.
– The Ethernet link status LED is lit when the Ethernet controller receives a link
pulse from the hub. If the LED is off, there might be a defective connector or
cable or a problem with the hub.
– The Ethernet transmit/receive activity LED is lit when the Ethernet controller
sends or receives data over the Ethernet network. If the Ethernet
transmit/receive activity is off, make sure that the hub and network are
operating and that the correct device drivers are installed.
v Check the LAN activity LED on the rear of the server. The LAN activity LED is
lit when data is active on the Ethernet network. If the LAN activity LED is off,
make sure that the hub and network are operating and that the correct device
drivers are installed.
v Check for operating-system-specific causes of the problem.
v Make sure that the device drivers on the client and server are using the same
protocol.
If the Ethernet controller still cannot connect to the network but the hardware
appears to be working, the network administrator must investigate other possible
causes of the error.
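The cable rule in the procedure above (use a crossover cable for a direct server-to-server connection or when the hub has no X port) reduces to a small decision. The following sketch uses hypothetical function names and assumes that the port label is available as a text string.

```python
def hub_has_x_port(port_label):
    """Per the procedure, a port label that contains an X marks an X port."""
    return "X" in port_label


def needs_crossover_cable(direct_connection, port_label=""):
    """Return True when the procedure calls for a crossover cable.

    direct_connection: True if two servers are connected without a hub.
    port_label: text printed next to the hub port (ignored for direct links).
    """
    return direct_connection or not hub_has_x_port(port_label)
```

For example, `needs_crossover_cable(False, "1X")` evaluates to False, because the hub port is an X port.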
Solving undetermined problems
If Dynamic System Analysis (DSA) did not diagnose the failure or if the server is
inoperative, use the information in this section.
If you suspect that a software problem is causing failures (continuous or
intermittent), see “Software problems” on page 75.
Corrupted data in CMOS memory or corrupted UEFI firmware can cause
undetermined problems. To reset the CMOS data, use the CMOS clear jumper (JP1)
to clear the CMOS memory and override the power-on password; see
“System-board switches and jumpers” on page 18 for more information. If you
suspect that the UEFI firmware is corrupted, see “Recovering the server firmware
(UEFI update failure)” on page 80.
If the power supplies are working correctly, complete the following steps:
1. Turn off the server.
2. Make sure that the server is cabled correctly.
3. Remove or disconnect the following devices, one at a time, until you find the
failure. Turn on the server and reconfigure it each time.
v Any external devices.
v Surge-suppressor device (on the server).
v Printer, mouse, and non-IBM devices.
v Each adapter.
v Hard disk drives.
v Memory modules. The minimum configuration requirement is 2 GB DIMM
in slot 1.
4. Turn on the server.
If the problem is solved when you remove an adapter from the server but the
problem recurs when you reinstall the same adapter, suspect the adapter; if the
problem recurs when you replace the adapter with a different one, suspect the riser
card.
If you suspect a networking problem and the server passes all the system tests,
suspect a network cabling problem that is external to the server.
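The one-at-a-time removal in step 3 is a linear isolation search. The following sketch models it under stated assumptions: `isolate_failure` is a hypothetical name, and `problem_persists` stands for the manual check that you make after you turn on and reconfigure the server; it is not a real diagnostic call.

```python
# Removal order taken from step 3 of the procedure above.
REMOVAL_ORDER = [
    "Any external devices",
    "Surge-suppressor device (on the server)",
    "Printer, mouse, and non-IBM devices",
    "Each adapter",
    "Hard disk drives",
    "Memory modules (down to the minimum configuration)",
]


def isolate_failure(devices, problem_persists):
    """Remove devices one at a time until the problem disappears.

    problem_persists(removed) is a hypothetical callback: it receives the
    devices removed so far and returns True if the server still fails.
    Returns the device whose removal cleared the problem, or None.
    """
    removed = []
    for device in devices:
        removed.append(device)
        if not problem_persists(tuple(removed)):
            return device
    return None
```

A result of None means that the problem remained at the end of the list, in which case the remaining guidance in this section applies.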
Problem determination tips
Because of the variety of hardware and software combinations that you can encounter,
use the following information to assist you in problem determination.
If possible, have this information available when requesting assistance from IBM.
The model name and serial number are located on the ID label on the front of the
server as shown in the following illustration.
Note: The illustrations in this document might differ slightly from your hardware.
Figure 14. ID label
v Machine type and model
v Microprocessor or hard disk drive upgrades
v Failure symptom
– Does the server fail the diagnostic tests?
– What occurs? When? Where?
– Does the failure occur on a single server or on multiple servers?
– Is the failure repeatable?
– Has this configuration ever worked?
– What changes, if any, were made before the configuration failed?
– Is this the original reported failure?
v Diagnostic program type and version level
v Hardware configuration (print screen of the system summary)
v UEFI firmware level
v IMM firmware level
v Operating system software
You can solve some problems by comparing the configuration and software setups
between working and nonworking servers. When you compare servers to each
other for diagnostic purposes, consider them identical only if all the following
factors are exactly the same in all the servers:
v Machine type and model
v UEFI firmware level
v IMM firmware level
v Adapters and attachments, in the same locations
v Address jumpers, terminators, and cabling
v Software versions and levels
v Diagnostic program type and version level
v Configuration option settings
v Operating-system control-file setup
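The comparison rule above (servers are identical only if every listed factor matches) can be sketched as a field-by-field check. The key names and function names below are hypothetical labels for the factors in the list; how the values are collected is outside the scope of this sketch.

```python
# Hypothetical keys, one for each factor in the list above.
COMPARISON_FACTORS = (
    "machine_type_and_model",
    "uefi_firmware_level",
    "imm_firmware_level",
    "adapters_and_attachments",
    "jumpers_terminators_cabling",
    "software_versions",
    "diagnostic_program_version",
    "configuration_option_settings",
    "os_control_file_setup",
)


def differing_factors(server_a, server_b):
    """Return the factors that differ or are missing on either server."""
    return [
        factor
        for factor in COMPARISON_FACTORS
        if factor not in server_a
        or factor not in server_b
        or server_a[factor] != server_b[factor]
    ]


def are_identical(server_a, server_b):
    """Servers count as identical only if no factor differs."""
    return not differing_factors(server_a, server_b)
```

A missing factor is treated as a difference, so an incomplete inventory never reports two servers as identical.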
See Appendix D, “Getting help and technical assistance,” on page 373 for
information about calling IBM for service.
Recovering the server firmware (UEFI update failure)
Use this information to recover the server firmware.
Important: Some cluster solutions require specific code levels or coordinated code
updates. If the device is part of a cluster solution, verify that the latest level of
code is supported for the cluster solution before you update the code.
If the server firmware has become corrupted, such as from a power failure during
an update, you can recover the server firmware in one of the following ways:
v In-band method: Recover the server firmware by using either the boot block jumper
or Automated Boot Recovery, together with a server Firmware Update Package Service
Pack.
v Out-of-band method: Use the IMM web interface to update the firmware, using
the latest server firmware update package.
Note: You can obtain a server update package from one of the following sources:
v Download the server firmware update from the World Wide Web.
v Contact your IBM service representative.
To download the server firmware update package from the World Wide Web, go to
.
The flash memory of the server consists of a primary bank and a backup bank. You
must maintain a bootable UEFI firmware image in the backup bank. If the server
firmware in the primary bank becomes corrupted, you can manually boot the backup
bank with the UEFI boot backup jumper (JP2); in the case of image corruption, the
Automated Boot Recovery function boots the backup bank automatically.
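The relationship between the two banks can be summarized with a simplified model. This sketch is an assumption-laden illustration of the behavior described above, not the firmware's actual logic; the names Bank and select_boot_bank are hypothetical.

```python
from enum import Enum


class Bank(Enum):
    PRIMARY = "primary"
    BACKUP = "backup"


def select_boot_bank(jumper_on_pins_2_3, primary_image_ok):
    """Model which flash bank the server boots from.

    jumper_on_pins_2_3: True when JP2 has been moved from pins 1 and 2 to
        pins 2 and 3 (the manual recovery position).
    primary_image_ok: False when the primary image is detected as corrupted,
        which is the case that Automated Boot Recovery handles.
    """
    if jumper_on_pins_2_3:
        return Bank.BACKUP  # manual selection with the jumper
    if not primary_image_ok:
        return Bank.BACKUP  # automatic fallback (Automated Boot Recovery)
    return Bank.PRIMARY
```

In both recovery cases the server runs from the backup bank, which is why a bootable image must always be kept there.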
In-band manual recovery method
Use this information to recover the server firmware and restore the server
operation to the primary bank.
To recover the server firmware and restore the server operation to the primary
bank, complete the following steps:
1. Read the safety information that begins on page vii and
“Installation guidelines” on page 93.
2. Turn off the server, and disconnect all power cords and external cables.
3. Remove the cover (see “Removing the compute node cover” on page 110).
4. Locate the UEFI boot backup jumper (JP2) on the system board.
Figure 15. UEFI boot backup jumper (JP2) location
(Illustration callouts: Lightpath button, UEFI boot recovery jumper, Clear CMOS jumper, NMI button; jumper pins are numbered 1 to 3.)
5. Move the UEFI boot backup jumper (JP2) from pins 1 and 2 to pins 2 and 3 to
enable the UEFI recovery mode.
6. Reinstall the server cover; then, reconnect all power cords.
7. Restart the server. The system begins the power-on self-test (POST).
8. Boot the server to an operating system that is supported by the firmware
update package that you downloaded.
9. Perform the firmware update by following the instructions that are in the
firmware update package readme file.
10. Turn off the server and disconnect all power cords and external cables, and
then remove the cover (see “Removing the compute node cover” on page 110).
11. Move the UEFI boot backup jumper (JP2) from pins 2 and 3 back to the
primary position (pins 1 and 2).
12. Reinstall the cover (see “Installing the compute node cover” on page 111).
13. Reconnect the power cord and any cables that you removed.
14. Restart the server. The system begins the power-on self-test (POST). If this
does not recover the primary bank, continue with the following steps.
15. Remove the cover (see “Removing the compute node cover” on page 110).
16. Reset the CMOS by removing the system battery (see “Removing the system
battery” on page 130).
17. Leave the system battery out of the server for approximately 5 to 15 minutes.
18. Reinstall the system battery (see “Replacing the system battery” on page 131).
19. Reinstall the cover (see “Installing the compute node cover” on page 111).
20. Reconnect the power cord and any cables that you removed.
21. Restart the server. The system begins the power-on self-test (POST).
22. If these recovery efforts fail, contact your IBM service representative for
support.
In-band automated boot recovery method
Use this information to recover the server firmware with the in-band automated
boot recovery method.
Note: Use this method if the system-error LED on the operator information panel
is lit and there is a log entry or Booting Backup Image is displayed on the firmware
splash screen; otherwise, use the in-band manual recovery method.
1. Boot the server to an operating system that is supported by the firmware
update package that you downloaded.
2. Perform the firmware update by following the instructions that are in the
firmware update package readme file.
3. Restart the server.
4. At the firmware splash screen, press F3 when prompted to restore to the
primary bank. The server boots from the primary bank.
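The note at the start of this method describes a choice between the two in-band methods. The sketch below expresses that choice as a function. It assumes one particular reading of the note: the system-error LED must be lit, and in addition either a log entry exists or Booting Backup Image is displayed. The function and parameter names are hypothetical.

```python
def choose_in_band_method(system_error_led_lit, log_entry_present,
                          backup_image_message_shown):
    """Pick the in-band recovery method from the observed symptoms.

    Assumed reading of the note: LED lit AND (log entry OR splash-screen
    message "Booting Backup Image"). Any other combination falls back to
    the in-band manual recovery method.
    """
    if system_error_led_lit and (log_entry_present or backup_image_message_shown):
        return "automated boot recovery"
    return "manual recovery"
```

For example, a lit system-error LED with neither a log entry nor the splash-screen message leads to the manual method under this reading.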
Out-of-band method
Use this information to recover the server firmware with the out-of-band method.
See the IMM2 documentation (Integrated Management Module II User's Guide) at
http://www-947.ibm.com/support/entry/portal/docdisplay?lndocid=migr-5086346.
Automated boot recovery (ABR)
While the server is starting, if the integrated management module II detects
problems with the server firmware in the primary bank, the server automatically
switches to the backup firmware bank and gives you the opportunity to recover
the firmware in the primary bank.
For instructions for recovering the UEFI firmware, see “Recovering the server
firmware (UEFI update failure)” on page 80. After you have recovered the
firmware in the primary bank, complete the following steps:
1. Restart the server.
2. When the prompt Press F3 to restore to primary is displayed, press F3 to
start the server from the primary bank.