IBM Power System, Power System 9006-22P, Power System 5104-22C, Power System 9006-12P, Power System 9006-22C Problem Analysis, System Parts, And Locations

Power Systems
Problem analysis, system parts, and locations for the 5104-22C, 9006-12P, 9006-22C, and 9006-22P
IBM
Note
Before using this information and the product it supports, read the information in “Safety notices” on page v, “Notices” on page 109, the IBM Systems Safety Notices manual, G229-9054, and the IBM Environmental Notices and User Guide, Z125–5823.
This edition applies to IBM® Power Systems servers that contain the POWER9™ processor and to all associated models.
© Copyright International Business Machines Corporation 2017, 2019.
US Government Users Restricted Rights – Use, duplication or disclosure restricted by GSA ADP Schedule Contract with IBM Corp.

Contents

Safety notices
Beginning troubleshooting and problem analysis
Determining the problem analysis procedure to perform
Resolving a BMC access problem
Resolving a power problem
Resolving a system firmware boot failure
Resolving a VGA monitor problem
Resolving an operating system boot failure
Resolving a sensor indicator problem
Resolving a hardware problem
Resolving a PCIe adapter or device problem
Resolving a network adapter problem
Resolving an NVMe Flash adapter problem
Resolving a storage device problem
Identifying the location of the PCIe adapter by using the slot number
Identifying the location of the NVMe Flash adapter
Identifying the location of the storage device
User guides for PCIe adapters
Identifying a service action
Identifying a service action by using system event logs
Identifying service action keywords in system event logs
Identifying a service action by using sensor and event information
Isolation procedures
EPUB_PRC_FIND_DECONFIGURE_PART isolation procedure
EPUB_PRC_SP_CODE isolation procedure
EPUB_PRC_PHYP_CODE isolation procedure
EPUB_PRC_ALL_PROCS isolation procedure
EPUB_PRC_ALL_MEMCRDS isolation procedure
EPUB_PRC_LVL_SUPPORT isolation procedure
EPUB_PRC_MEMORY_PLUGGING_ERROR isolation procedure
EPUB_PRC_FSI_PATH isolation procedure
EPUB_PRC_PROC_AB_BUS isolation procedure
EPUB_PRC_PROC_XYZ_BUS isolation procedure
EPUB_PRC_EIBUS_ERROR isolation procedure
EPUB_PRC_POWER_ERROR isolation procedure
EPUB_PRC_MEMORY_UE isolation procedure
EPUB_PRC_HB_CODE isolation procedure
EPUB_PRC_TOD_CLOCK_ERR isolation procedure
EPUB_PRC_COOLING_SYSTEM_ERR isolation procedure
Verifying a repair
Collecting diagnostic data
Contacting IBM service and support
Finding parts and locations
5104-22C or 9006-22C locations
5104-22C or 9006-22C parts
Finding parts and locations
9006-12P locations
9006-12P parts
Finding parts and locations
9006-22P locations
9006-22P parts
Notices
Accessibility features for IBM Power Systems servers
Privacy policy considerations
Trademarks
Electronic emission notices
Class A Notices
Class B Notices
Terms and conditions

Safety notices

Safety notices may be printed throughout this guide:
DANGER notices call attention to a situation that is potentially lethal or extremely hazardous to people.
CAUTION notices call attention to a situation that is potentially hazardous to people because of some existing condition.
Attention notices call attention to the possibility of damage to a program, device, system, or data.
World Trade safety information
Several countries require the safety information contained in product publications to be presented in their national languages. If this requirement applies to your country, safety information documentation is included in the publications package (such as in printed documentation, on DVD, or as part of the product) shipped with the product. The documentation contains the safety information in your national language with references to the U.S. English source. Before using a U.S. English publication to install, operate, or service this product, you must first become familiar with the related safety information documentation. You should also refer to the safety information documentation any time you do not clearly understand any safety information in the U.S. English publications.
Replacement or additional copies of safety information documentation can be obtained by calling the IBM Hotline at 1-800-300-8751.
German safety information
Das Produkt ist nicht für den Einsatz an Bildschirmarbeitsplätzen im Sinne § 2 der Bildschirmarbeitsverordnung geeignet.
Laser safety information
IBM servers can use I/O cards or features that are fiber-optic based and that utilize lasers or LEDs.
Laser compliance
IBM servers may be installed inside or outside of an IT equipment rack.
DANGER: When working on or around the system, observe the following precautions:
Electrical voltage and current from power, telephone, and communication cables are hazardous. To avoid a shock hazard:
• If IBM supplied the power cord(s), connect power to this unit only with the IBM provided power cord. Do not use the IBM provided power cord for any other product.
• Do not open or service any power supply assembly.
• Do not connect or disconnect any cables or perform installation, maintenance, or reconfiguration of this product during an electrical storm.
• The product might be equipped with multiple power cords. To remove all hazardous voltages, disconnect all power cords.
– For AC power, disconnect all power cords from their AC power source.
– For racks with a DC power distribution panel (PDP), disconnect the customer’s DC power source to the PDP.
• When connecting power to the product, ensure all power cables are properly connected.
– For racks with AC power, connect all power cords to a properly wired and grounded electrical outlet. Ensure that the outlet supplies proper voltage and phase rotation according to the system rating plate.
– For racks with a DC power distribution panel (PDP), connect the customer’s DC power source to the PDP. Ensure that the proper polarity is used when attaching the DC power and DC power return wiring.
• Connect any equipment that will be attached to this product to properly wired outlets.
• When possible, use one hand only to connect or disconnect signal cables.
• Never turn on any equipment when there is evidence of fire, water, or structural damage.
• Do not attempt to switch on power to the machine until all possible unsafe conditions are corrected.
• Assume that an electrical safety hazard is present. Perform all continuity, grounding, and power checks specified during the subsystem installation procedures to ensure that the machine meets safety requirements.
• Do not continue with the inspection if any unsafe conditions are present.
• Before you open the device covers, unless instructed otherwise in the installation and configuration procedures: Disconnect the attached AC power cords, turn off the applicable circuit breakers located in the rack power distribution panel (PDP), and disconnect any telecommunications systems, networks, and modems.
DANGER:
• Connect and disconnect cables as described in the following procedures when installing, moving, or opening covers on this product or attached devices.
To Disconnect:
1. Turn off everything (unless instructed otherwise).
2. For AC power, remove the power cords from the outlets.
3. For racks with a DC power distribution panel (PDP), turn off the circuit breakers located in the PDP and remove the power from the Customer's DC power source.
4. Remove the signal cables from the connectors.
5. Remove all cables from the devices.
To Connect:
1. Turn off everything (unless instructed otherwise).
2. Attach all cables to the devices.
3. Attach the signal cables to the connectors.
4. For AC power, attach the power cords to the outlets.
5. For racks with a DC power distribution panel (PDP), restore the power from the Customer's DC power source and turn on the circuit breakers located in the PDP.
6. Turn on the devices.
Sharp edges, corners and joints may be present in and around the system. Use care when handling equipment to avoid cuts, scrapes and pinching. (D005)
(R001 part 1 of 2):
DANGER: Observe the following precautions when working on or around your IT rack system:
• Heavy equipment–personal injury or equipment damage might result if mishandled.
• Always lower the leveling pads on the rack cabinet.
• Always install stabilizer brackets on the rack cabinet unless the earthquake option is to be installed.
• To avoid hazardous conditions due to uneven mechanical loading, always install the heaviest devices in the bottom of the rack cabinet. Always install servers and optional devices starting from the bottom of the rack cabinet.
• Rack-mounted devices are not to be used as shelves or work spaces. Do not place objects on top of rack-mounted devices. In addition, do not lean on rack-mounted devices and do not use them to stabilize your body position (for example, when working from a ladder).
• Stability hazard:
– The rack may tip over causing serious personal injury.
– Before extending the rack to the installation position, read the installation instructions.
– Do not put any load on the slide-rail mounted equipment mounted in the installation position.
– Do not leave the slide-rail mounted equipment in the installation position.
• Each rack cabinet might have more than one power cord.
– For AC powered racks, be sure to disconnect all power cords in the rack cabinet when directed to disconnect power during servicing.
– For racks with a DC power distribution panel (PDP), turn off the circuit breaker that controls the power to the system unit(s), or disconnect the customer’s DC power source, when directed to disconnect power during servicing.
• Connect all devices installed in a rack cabinet to power devices installed in the same rack cabinet. Do not plug a power cord from a device installed in one rack cabinet into a power device installed in a different rack cabinet.
• An electrical outlet that is not correctly wired could place hazardous voltage on the metal parts of the system or the devices that attach to the system. It is the responsibility of the customer to ensure that the outlet is correctly wired and grounded to prevent an electrical shock. (R001 part 1 of 2)
(R001 part 2 of 2):
CAUTION:
• Do not install a unit in a rack where the internal rack ambient temperatures will exceed the manufacturer's recommended ambient temperature for all your rack-mounted devices.
• Do not install a unit in a rack where the air flow is compromised. Ensure that air flow is not blocked or reduced on any side, front, or back of a unit used for air flow through the unit.
• Consideration should be given to the connection of the equipment to the supply circuit so that overloading of the circuits does not compromise the supply wiring or overcurrent protection. To provide the correct power connection to a rack, refer to the rating labels located on the equipment in the rack to determine the total power requirement of the supply circuit.
(For sliding drawers.) Do not pull out or install any drawer or feature if the rack stabilizer brackets are not attached to the rack or if the rack is not bolted to the floor. Do not pull out more than one drawer at a time. The rack might become unstable if you pull out more than one drawer at a time.
(For xed drawers.) This drawer is a xed drawer and must not be moved for servicing unless specied by the manufacturer. Attempting to move the drawer partially or completely out of the rack might cause the rack to become unstable or cause the drawer to fall out of the rack. (R001 part 2 of 2)
Safety notices
vii
CAUTION: Removing components from the upper positions in the rack cabinet improves rack
stability during relocation. Follow these general guidelines whenever you relocate a populated rack cabinet within a room or building.
• Reduce the weight of the rack cabinet by removing equipment starting at the top of the rack cabinet. When possible, restore the rack cabinet to the conguration of the rack cabinet as you received it. If this conguration is not known, you must observe the following precautions:
– Remove all devices in the 32U position (compliance ID RACK-001 or 22U (compliance ID
RR001) and above. – Ensure that the heaviest devices are installed in the bottom of the rack cabinet. – Ensure that there are little-to-no empty U-levels between devices installed in the rack cabinet
below the 32U (compliance ID RACK-001 or 22U (compliance ID RR001) level, unless the
received conguration specically allowed it.
• If the rack cabinet you are relocating is part of a suite of rack cabinets, detach the rack cabinet from the suite.
• If the rack cabinet you are relocating was supplied with removable outriggers they must be reinstalled before the cabinet is relocated.
• Inspect the route that you plan to take to eliminate potential hazards.
• Verify that the route that you choose can support the weight of the loaded rack cabinet. Refer to the documentation that comes with your rack cabinet for the weight of a loaded rack cabinet.
• Verify that all door openings are at least 760 x 2030 mm (30 x 80 in.).
• Ensure that all devices, shelves, drawers, doors, and cables are secure.
• Ensure that the four leveling pads are raised to their highest position.
• Ensure that there is no stabilizer bracket installed on the rack cabinet during movement.
• Do not use a ramp inclined at more than 10 degrees.
• When the rack cabinet is in the new location, complete the following steps:
– Lower the four leveling pads.
– Install stabilizer brackets on the rack cabinet or, in an earthquake environment, bolt the rack to the floor.
– If you removed any devices from the rack cabinet, repopulate the rack cabinet from the lowest position to the highest position.
• If a long-distance relocation is required, restore the rack cabinet to the configuration of the rack cabinet as you received it. Pack the rack cabinet in the original packaging material, or equivalent. Also lower the leveling pads to raise the casters off of the pallet and bolt the rack cabinet to the pallet.
(R002)
(L001)
DANGER: Hazardous voltage, current, or energy levels are present inside any component that has this label attached. Do not open any cover or barrier that contains this label. (L001)
(L002)
DANGER: Rack-mounted devices are not to be used as shelves or work spaces. Do not place objects on top of rack-mounted devices. In addition, do not lean on rack-mounted devices and do not use them to stabilize your body position (for example, when working from a ladder). Stability hazard:
• The rack may tip over causing serious personal injury.
• Before extending the rack to the installation position, read the installation instructions.
• Do not put any load on the slide-rail mounted equipment mounted in the installation position.
• Do not leave the slide-rail mounted equipment in the installation position.
(L002)
(L003)
DANGER: Multiple power cords. The product might be equipped with multiple AC power cords or multiple DC power cables. To remove all hazardous voltages, disconnect all power cords and power cables. (L003)
(L007)
CAUTION: A hot surface nearby. (L007)
(L008)
CAUTION: Hazardous moving parts nearby. (L008)
All lasers are certied in the U.S. to conform to the requirements of DHHS 21 CFR Subchapter J for class 1 laser products. Outside the U.S., they are certied to be in compliance with IEC 60825 as a class 1 laser product. Consult the label on each part for laser certication numbers and approval information.
CAUTION: This product might contain one or more of the following devices: CD-ROM drive, DVD­ROM drive, DVD-RAM drive, or laser module, which are Class 1 laser products. Note the following information:
• Do not remove the covers. Removing the covers of the laser product could result in exposure to hazardous laser radiation. There are no serviceable parts inside the device.
• Use of the controls or adjustments or performance of procedures other than those specied herein might result in hazardous radiation exposure.
(C026)
CAUTION: Data processing environments can contain equipment transmitting on system links with laser modules that operate at greater than Class 1 power levels. For this reason, never look into the end of an optical ber cable or open receptacle. Although shining light into one end and looking into the other end of a disconnected optical ber to verify the continuity of optic bers may not injure the eye, this procedure is potentially dangerous. Therefore, verifying the continuity of optical bers by shining light into one end and looking at the other end is not recommended. To verify continuity of a ber optic cable, use an optical light source and power meter. (C027)
CAUTION: This product contains a Class 1M laser. Do not view directly with optical instruments. (C028)
CAUTION: Some laser products contain an embedded Class 3A or Class 3B laser diode. Note the following information:
• Laser radiation when open.
• Do not stare into the beam, do not view directly with optical instruments, and avoid direct exposure to the beam. (C030)
CAUTION: The battery contains lithium. To avoid possible explosion, do not burn or charge the battery.
Do Not:
• Throw or immerse into water
• Heat to more than 100 degrees C (212 degrees F)
• Repair or disassemble
Exchange only with the IBM-approved part. Recycle or discard the battery as instructed by local regulations. In the United States, IBM has a process for the collection of this battery. For information, call 1-800-426-4333. Have the IBM part number for the battery unit available when you call. (C003)
CAUTION: Regarding IBM provided VENDOR LIFT TOOL:
• Operation of LIFT TOOL by authorized personnel only.
• LIFT TOOL intended for use to assist, lift, install, remove units (load) up into rack elevations. It is not to be used loaded transporting over major ramps nor as a replacement for such designated tools like pallet jacks, walkies, fork trucks and such related relocation practices. When this is not practicable, specially trained persons or services must be used (for instance, riggers or movers).
• Read and completely understand the contents of LIFT TOOL operator's manual before using. Failure to read, understand, obey safety rules, and follow instructions may result in property damage and/or personal injury. If there are questions, contact the vendor's service and support. Local paper manual must remain with machine in provided storage sleeve area. Latest revision manual available on vendor's web site.
• Test verify stabilizer brake function before each use. Do not over-force moving or rolling the LIFT TOOL with stabilizer brake engaged.
• Do not raise, lower or slide platform load shelf unless stabilizer (brake pedal jack) is fully engaged. Keep stabilizer brake engaged when not in use or motion.
• Do not move LIFT TOOL while platform is raised, except for minor positioning.
• Do not exceed rated load capacity. See LOAD CAPACITY CHART regarding maximum loads at center versus edge of extended platform.
• Only raise load if properly centered on platform. Do not place more than 200 lb (91 kg) on edge of sliding platform shelf also considering the load's center of mass/gravity (CoG).
• Do not corner load the platforms, tilt riser, angled unit install wedge or other such accessory options. Secure such platforms -- riser tilt, wedge, etc options to main lift shelf or forks in all four (4x or all other provisioned mounting) locations with provided hardware only, prior to use. Load objects are designed to slide on/off smooth platforms without appreciable force, so take care not to push or lean. Keep riser tilt [adjustable angling platform] option flat at all times except for final minor angle adjustment when needed.
• Do not stand under overhanging load.
• Do not use on uneven surface, incline or decline (major ramps).
• Do not stack loads.
• Do not operate while under the influence of drugs or alcohol.
• Do not support ladder against LIFT TOOL (unless the specific allowance is provided for one following qualified procedures for working at elevations with this TOOL).
• Tipping hazard. Do not push or lean against load with raised platform.
• Do not use as a personnel lifting platform or step. No riders.
• Do not stand on any part of lift. Not a step.
• Do not climb on mast.
• Do not operate a damaged or malfunctioning LIFT TOOL machine.
• Crush and pinch point hazard below platform. Only lower load in areas clear of personnel and obstructions. Keep hands and feet clear during operation.
• No Forks. Never lift or move bare LIFT TOOL MACHINE with pallet truck, jack or fork lift.
• Mast extends higher than platform. Be aware of ceiling height, cable trays, sprinklers, lights, and other overhead objects.
• Do not leave LIFT TOOL machine unattended with an elevated load.
• Watch and keep hands, fingers, and clothing clear when equipment is in motion.
• Turn Winch with hand power only. If winch handle cannot be cranked easily with one hand, it is probably over-loaded. Do not continue to turn winch past top or bottom of platform travel. Excessive unwinding will detach handle and damage cable. Always hold handle when lowering, unwinding. Always assure self that winch is holding load before releasing winch handle.
• A winch accident could cause serious injury. Not for moving humans. Make certain clicking sound is heard as the equipment is being raised. Be sure winch is locked in position before releasing handle. Read instruction page before operating this winch. Never allow winch to unwind freely. Freewheeling will cause uneven cable wrapping around winch drum, damage cable, and may cause serious injury.
• This TOOL must be maintained correctly for IBM Service personnel to use it. IBM shall inspect condition and verify maintenance history before operation. Personnel reserve the right not to use TOOL if inadequate. (C048)
Power and cabling information for NEBS (Network Equipment-Building System) GR-1089-CORE
The following comments apply to the IBM servers that have been designated as conforming to NEBS (Network Equipment-Building System) GR-1089-CORE:
The equipment is suitable for installation in the following:
• Network telecommunications facilities
• Locations where the NEC (National Electrical Code) applies
The intrabuilding ports of this equipment are suitable for connection to intrabuilding or unexposed wiring or cabling only. The intrabuilding ports of this equipment must not be metallically connected to the interfaces that connect to the OSP (outside plant) or its wiring. These interfaces are designed for use as intrabuilding interfaces only (Type 2 or Type 4 ports as described in GR-1089-CORE) and require isolation from the exposed OSP cabling. The addition of primary protectors is not sufficient protection to connect these interfaces metallically to OSP wiring.
Note: All Ethernet cables must be shielded and grounded at both ends.
The ac-powered system does not require the use of an external surge protection device (SPD).
The dc-powered system employs an isolated DC return (DC-I) design. The DC battery return terminal shall not be connected to the chassis or frame ground.
The dc-powered system is intended to be installed in a common bonding network (CBN) as described in GR-1089-CORE.

Beginning troubleshooting and problem analysis

This information provides a starting point for analyzing problems.
This information is the starting point for diagnosing and repairing systems. From this point, you are guided to the appropriate information to help you diagnose problems, determine the appropriate repair action, and then complete the necessary steps to repair the system.
Note: Update the system firmware to the latest level before you start problem analysis. If you update the system firmware, you will have the latest available fixes and improvements for error handling, reporting, and isolation. For instructions about updating the system firmware, see Getting fixes.
What type of problem are you dealing with? | Problem analysis procedure
You do not know the type of problem. | Go to “Determining the problem analysis procedure to perform” on page 1.
A baseboard management controller (BMC) access problem occurred. | Go to “Resolving a BMC access problem” on page 2.
The system does not power on (the power button or the BMC power on command does not power on the system). | Go to “Resolving a power problem” on page 5.
A system firmware boot failure occurred (the system started but was not able to boot to the Petitboot menu). | Go to “Resolving a system firmware boot failure” on page 5.
A video graphics array (VGA) monitor problem occurred (the system started but no video is displayed on the monitor). | Go to “Resolving a VGA monitor problem” on page 7.
An operating system boot failure occurred (the system booted to the Petitboot menu but the operating system did not start). | Go to “Resolving an operating system boot failure” on page 7.
A sensor on the sensor readings GUI display is red. | Go to “Resolving a sensor indicator problem” on page 9.
A processor, memory, power, or cooling hardware failure occurred. | Go to “Resolving a hardware problem” on page 10.
Missing or faulty PCIe adapter or device. | Go to Resolving a PCIe adapter or device problem.
You have an FQPSPxxxxxxx event code. | Go to FQPSPxxxxxxx Event Codes.

Determining the problem analysis procedure to perform

Learn how to identify the correct problem analysis procedure to perform.
About this task
To determine the correct problem analysis procedure to perform, complete the following steps:
Procedure
1. After you apply power to the system, are the power supply LEDs green (either steady or flashing)?
If Then
Yes: Continue with the next step.
No: Go to “Resolving a power problem” on page 5.
2. Can you access the baseboard management controller (BMC) across the network?
If Then
Yes: Continue with the next step.
No: Go to “Resolving a BMC access problem” on page 2.
3. Can you boot the system to the Petitboot menu?
If Then
Yes: Continue with the next step.
No: Go to “Resolving a system firmware boot failure” on page 5.
4. Is video displayed on the video graphics array (VGA) monitor?
If Then
Yes: Continue with the next step.
No: Go to “Resolving a VGA monitor problem” on page 7.
5. Can you start the operating system?
If Then
Yes: Continue with the next step.
No: Go to “Resolving an operating system boot failure” on page 7.
6. On the sensor readings GUI display, are any sensors red?
If Then
Yes: Go to “Resolving a sensor indicator problem” on page 9.
No: Continue with the next step.
7. Go to “Resolving a hardware problem” on page 10. This ends the procedure.
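The procedure above is a chain of yes/no checks in which the first failed check selects the procedure to follow. The following Python sketch is one way to express that chain; the Observations class and its field names are illustrative assumptions and are not part of any IBM tool.

```python
from typing import NamedTuple


class Observations(NamedTuple):
    """Answers to the six yes/no questions, in procedure order (illustrative names)."""
    power_supply_leds_green: bool   # step 1
    bmc_reachable: bool             # step 2
    boots_to_petitboot: bool        # step 3
    vga_video_displayed: bool       # step 4
    os_starts: bool                 # step 5
    any_sensor_red: bool            # step 6


def choose_procedure(obs: Observations) -> str:
    """Return the title of the procedure that the first failed check points to."""
    if not obs.power_supply_leds_green:
        return "Resolving a power problem"
    if not obs.bmc_reachable:
        return "Resolving a BMC access problem"
    if not obs.boots_to_petitboot:
        return "Resolving a system firmware boot failure"
    if not obs.vga_video_displayed:
        return "Resolving a VGA monitor problem"
    if not obs.os_starts:
        return "Resolving an operating system boot failure"
    if obs.any_sensor_red:
        return "Resolving a sensor indicator problem"
    # Step 7: every check passed and no sensor is red.
    return "Resolving a hardware problem"
```

Because the checks are evaluated in order, a system that fails several checks is always routed to the earliest applicable procedure, which matches the order of the steps.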

Resolving a BMC access problem

Learn how to identify the service action that is needed to resolve a baseboard management controller (BMC) access problem.
Procedure
1. Ensure that the BMC password is not set to the default password. For information about changing the default password, see Logging on to the BMC GUI. Does the problem persist?
If Then
Yes: Continue with the next step.
No: This ends the procedure.
2. Are both ends of the network cable seated securely?
If Then
Yes: Continue with the next step.
No: Seat both ends of the cable securely. If the problem persists, continue with the next step.
3. Power off the system and disconnect all AC power cords for 30 seconds. Then, reconnect the AC power cords and power on the system. Does the BMC access problem persist?
If Then
Yes: Continue with the next step.
No: This ends the procedure.
4. Verify that the BMC network settings are correct.
a) Power on the system by using the power button on the front of the system. Wait 1 - 2 minutes for the system to display the Petitboot menu.
b) When the Petitboot menu is displayed, press any key to interrupt the boot process. Then, select Exit to Shell.
c) Type the following command and press Enter:
ipmitool lan print 1
d) Verify that the MAC address and the IP address settings are correct. Then, continue with the next step.
Note: If the IP address setting is incorrect, go to the Configuring the firmware IP address website (http://www.ibm.com/support/knowledgecenter/linuxonibm/liabw/liabwenablenetwork.htm). If the MAC address is 00:00:00:00:00:00, go to “Contacting IBM service and support” on page 61.
5. Are you able to log in to the BMC web interface?
If Then
Yes: To update the BMC firmware, go to Updating the system firmware by using the BMC. If the problem persists, go to step “12” on page 4.
No: Continue with the next step.
6. Complete the following steps:
a. Connect a VGA monitor to the system.
b. Press the power button to power on the system.
c. Boot the system to the Petitboot menu. From the Petitboot menu, select Exit to shell.
7. Are you mounting the storage that contains the pUpdate utility and the BMC firmware file from a network storage location?
If Then
Yes: Continue with the next step.
No: Go to step “9” on page 4.
8. To update the BMC firmware by using a network storage location, complete the following steps:
a) Type mkdir /tmp/media and press Enter.
b) Type the following command and press Enter:
mount -t nfs xxx.xxx.xx.xx:/path/of/files /tmp/media, where xxx.xxx.xx.xx is the IP address of the system to which you want to establish the connection.
c) Type cd /tmp/media and press Enter.
d) To update the BMC firmware, type the following command and press Enter:
./pUpdate -f bmc.bin -i bt, where bmc.bin is the name of the BMC image file.
e) Allow at least 2 minutes for the BMC to reboot. Does the problem persist?
If Then
Yes: Go to step “12” on page 4.
No: This ends the procedure.
9. Update the BMC firmware by using a USB device. Complete the following steps:
a) Ensure that the USB device is formatted by using the VFAT file system.
b) Insert the USB device into the system if you have not already done so.
c) Type mount and press Enter.
Is the following output displayed?
/dev/mapper/sdb1 mounted on /var/petitboot/mnt/dev/sdb1
If Then
Yes: Continue with the next step.
No: Go to step “11” on page 4.
10. Complete the following steps:
a) Type cd /var/petitboot/mnt/dev/sdb1 and press Enter.
b) To update the BMC firmware, type the following command and press Enter:
./pUpdate -f bmc.bin -i bt, where bmc.bin is the name of the BMC image file.
c) Allow at least 2 minutes for the BMC to reboot. Does the problem persist?
If Then
Yes: Go to step “12” on page 4.
No: This ends the procedure.
11. Complete the following steps:
a) Type mkdir /tmp/media and press Enter.
b) Type mount /dev/mapper/sdb1 /tmp/media and press Enter.
c) Type cd /tmp/media and press Enter.
d) To update the BMC firmware, type the following command and press Enter:
./pUpdate -f bmc.bin -i bt, where bmc.bin is the name of the BMC image file.
e) Allow at least 2 minutes for the BMC to reboot. Does the problem persist?
If Then
Yes: Go to step “12” on page 4.
No: This ends the procedure.
12. Replace the system backplane.
• If your system is a 5104-22C or 9006-22C, go to “5104-22C or 9006-22C locations” on page 63 to identify the physical location and the removal and replacement procedure.
• If your system is a 9006-12P, go to “9006-12P locations” on page 75 to identify the physical location and the removal and replacement procedure.
• If your system is a 9006-22P, go to “9006-22P locations” on page 91 to identify the physical location and the removal and replacement procedure.
This ends the procedure.
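Step 4 of this procedure asks you to check the MAC address and IP address that ipmitool lan print 1 reports. The following Python sketch shows one way to pull those fields out of captured output. It assumes the usual "Name : value" layout of ipmitool output; the sample text and its addresses are invented for illustration and do not come from a real system.

```python
ZERO_MAC = "00:00:00:00:00:00"


def parse_lan_print(output: str) -> dict:
    """Map each 'Name : value' line of 'ipmitool lan print' output to a dict entry."""
    fields = {}
    for line in output.splitlines():
        # Split at the first colon only, so MAC addresses stay intact in the value.
        key, sep, value = line.partition(":")
        key = key.strip()
        # Skip continuation lines (empty name) and keep the first value per name.
        if sep and key and key not in fields:
            fields[key] = value.strip()
    return fields


def mac_needs_service(fields: dict) -> bool:
    """True when the MAC address is all zeros, the case that is referred to IBM service."""
    return fields.get("MAC Address") == ZERO_MAC


# Illustrative sample only; real output contains many more fields.
SAMPLE_OUTPUT = """\
IP Address Source       : Static Address
IP Address              : 192.0.2.10
Subnet Mask             : 255.255.255.0
MAC Address             : 00:00:00:00:00:00
"""

fields = parse_lan_print(SAMPLE_OUTPUT)
print(fields["IP Address"], mac_needs_service(fields))
```

Whether the reported IP address is correct still depends on your network plan; the sketch only makes the values easy to compare.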

Resolving a power problem

Learn how to identify the service action that is needed to resolve a power problem.
Procedure
1. Is the identify LED on the front of the system flashing red slowly at 0.25 Hz? For more information about LEDs, see LEDs on the 9006-12P system or LEDs on the 5104-22C, 9006-22C, or 9006-22P system.
If Then
Yes: Continue with the next step.
No: No service action is required. This ends the procedure.
2. Perform the following actions, one at a time until the problem is resolved:
a. Ensure that all of the power cords are fully seated in the power supplies.
b. Ensure that the power supply is fully seated in the system.
c. Ensure that the power supply fan is not blocked.
d. Ensure that all of the power cords are fully seated in the power distribution units (PDUs) or wall outlets.
e. If the power cords are plugged into PDUs, ensure that the PDUs are turned on.
f. Replace the power cords.
g. Replace the power supplies.
• If your system is a 5104-22C or 9006-22C, go to “5104-22C or 9006-22C locations” on page 63 to identify the physical location and the removal and replacement procedure.
• If your system is a 9006-12P, go to “9006-12P locations” on page 75 to identify the physical location and the removal and replacement procedure.
• If your system is a 9006-22P, go to “9006-22P locations” on page 91 to identify the physical location and the removal and replacement procedure.
This ends the procedure.

Resolving a system firmware boot failure

Learn how to identify the service action that is needed to resolve a failure while booting your system firmware.
Procedure
1. Does the baseboard management controller (BMC) respond to commands and are you able to access the BMC web interface?
Note: To determine whether the BMC responds to commands, run the following ipmitool command:
ipmitool -I lanplus -U <username> -P <password> -H <bmc ip or bmc hostname> chassis status
If Then
Yes: Continue with step “3” on page 6.
No: Continue with the next step.
2. Complete the following actions, one at a time, until the problem is resolved:
a. Reset the BMC remotely by entering the following command:
ipmitool -I lanplus -U <username> -P <password> -H <bmc ip or bmc hostname> mc reset cold
b. Disconnect the power cords from the system for 30 seconds. Reconnect the power cords, wait 5 minutes, and then go to step “1” on page 5.
c. Update the BMC firmware by using the pUpdate command with the block transfer (BT) option:
1) Type mkdir /tmp/media and press Enter.
2) Type the following command and press Enter:
mount -t nfs xxx.xxx.xx.xx:/path/of/files /tmp/media, where xxx.xxx.xx.xx is the IP address of the system to which you want to establish the connection.
3) Type cd /tmp/media and press Enter.
4) To update the BMC firmware, type the following command and press Enter:
./pUpdate -f bmc.bin -i bt, where bmc.bin is the name of the BMC image file.
5) Allow at least 2 minutes for the BMC to reboot.
d. Replace the system backplane.
• If your system is a 5104-22C or 9006-22C, go to “5104-22C or 9006-22C locations” on page 63 to identify the physical location and the removal and replacement procedure.
• If your system is a 9006-12P, go to “9006-12P locations” on page 75 to identify the physical location and the removal and replacement procedure.
• If your system is a 9006-22P, go to “9006-22P locations” on page 91 to identify the physical location and the removal and replacement procedure.
This ends the procedure.
3. After you pressed the power button, did the system turn on but fail to display the Petitboot menu?
If Then
Yes: Continue with the next step.
No: This ends the procedure.
4. Complete the following actions, one at a time, until the problem is resolved:
a. Ensure that the TPM card is fully seated.
• If your system is a 5104-22C or 9006-22C, go to “5104-22C or 9006-22C locations” on page 63 to identify the physical location.
• If your system is a 9006-12P, go to “9006-12P locations” on page 75 to identify the physical location.
• If your system is a 9006-22P, go to “9006-22P locations” on page 91 to identify the physical location.
b. Disconnect the power cords from the system for 30 seconds. Reconnect the power cords, wait 5 minutes, and then go to step “3” on page 6.
c. Update the PNOR firmware. For instructions, see Getting fixes.
Note: If your system is a 9006-12P or 9006-22P, the PNOR firmware level must be V2.12-20190404, or later.
d. Replace the system backplane.
• If your system is a 5104-22C or 9006-22C, go to “5104-22C or 9006-22C locations” on page 63 to identify the physical location and the removal and replacement procedure.
• If your system is a 9006-12P, go to “9006-12P locations” on page 75 to identify the physical location and the removal and replacement procedure.
• If your system is a 9006-22P, go to “9006-22P locations” on page 91 to identify the physical location and the removal and replacement procedure.
This ends the procedure.
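Both remote ipmitool commands in this procedure (chassis status and mc reset cold) share the same connection options. The following Python sketch builds the argument list for either command so that the options are written only once. The function name, host name, and credentials are hypothetical placeholders; only the ipmitool options themselves come from the commands shown in the procedure.

```python
import shlex


def remote_ipmitool_args(host, username, password, *subcommand):
    """Build the argument list for a remote (lanplus) ipmitool invocation."""
    return ["ipmitool", "-I", "lanplus",
            "-U", username, "-P", password,
            "-H", host, *subcommand]


# Placeholder host and credentials, for illustration only.
status_cmd = remote_ipmitool_args("bmc.example.com", "admin", "secret", "chassis", "status")
reset_cmd = remote_ipmitool_args("bmc.example.com", "admin", "secret", "mc", "reset", "cold")

# shlex.join renders the list as a single shell-style command line for review.
print(shlex.join(status_cmd))
print(shlex.join(reset_cmd))
```

An argument list of this form can be passed to subprocess.run on a workstation where ipmitool is installed. Keeping the arguments as a list avoids shell quoting problems with unusual passwords.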

Resolving a VGA monitor problem

Learn how to identify the service action that is needed to resolve a video graphics array (VGA) monitor problem.
Procedure
1. Is the system powered on and is the VGA monitor connected to the VGA display port, but no video is displayed?
If Then
Yes: Continue with the next step.
No: This ends the procedure.
2. Complete the following steps, one at a time until the problem is resolved:
a) Ensure that the VGA cable is properly seated to the server port and to the monitor port.
b) Verify that your monitor and your VGA cable are working properly by testing them on a system that is known to be working properly. If the monitor or the VGA cable does not work properly, replace it.
c) Verify that the system is powered on by activating a serial over LAN (SOL) session through the baseboard management controller (BMC). If the system is not active, go to “Resolving a system firmware boot failure” on page 5.
d) Replace the system backplane.
• If your system is a 5104-22C or 9006-22C, go to “5104-22C or 9006-22C locations” on page 63 to identify the physical location and the removal and replacement procedure.
• If your system is a 9006-12P, go to “9006-12P locations” on page 75 to identify the physical location and the removal and replacement procedure.
• If your system is a 9006-22P, go to “9006-22P locations” on page 91 to identify the physical location and the removal and replacement procedure.
This ends the procedure.

Resolving an operating system boot failure

Learn how to identify the service action that is needed to resolve a failure while booting your operating system.
Procedure
1. Was the system recently installed, serviced, moved, or upgraded?
If Then
Yes: Ensure that all cables are properly seated in the connection path to the designated boot device. This ends the procedure.
No: Continue with the next step.
2. Are you booting the operating system from a network location?
If Then
Yes: Continue with the next step.
No: Continue with step “4” on page 8.
3. Complete the following actions, one at a time until the problem is resolved:
a. Ensure that a problem does not exist with the connection to the network location.
b. Ensure that the adapter has a valid IP address for the network.
c. Replace the network adapter.
• If your system is a 5104-22C or 9006-22C, go to “5104-22C or 9006-22C locations” on page 63 to identify the physical location and the removal and replacement procedure.
• If your system is a 9006-12P, go to “9006-12P locations” on page 75 to identify the physical location and the removal and replacement procedure.
• If your system is a 9006-22P, go to “9006-22P locations” on page 91 to identify the physical location and the removal and replacement procedure.
4. Petitboot displays all recognized bootable images to use by default. Is the boot image recognized by Petitboot?
If Then
Yes: Continue with step “10” on page 9.
No: Select the Petitboot menu option to refresh the boot images. If the problem persists, continue with the next step.
5. To determine the command to type on the Petitboot command line to verify that the boot drive is recognized and in optimal status, use Table 1 on page 8.
Table 1. Determine the command to verify that the boot drive is recognized and in optimal status
Boot drive configuration | Commands
Virtual drive connected directly to the system backplane | arcconf getconfig 1 LD
Physical drive connected directly to the system backplane | arcconf getconfig 1 PD
Is the boot drive recognized and in optimal status?
If Then
Yes: Reinstall the operating system on the boot drive. This ends the procedure.
No: Continue with the next step.
6. Are the drives properly seated in their respective drive bays?
Note:
• If your system is a 5104-22C or 9006-22C, go to “5104-22C or 9006-22C locations” on page 63 to identify the physical location and the removal and replacement procedure.
• If your system is a 9006-12P, go to “9006-12P locations” on page 75 to identify the physical location and the removal and replacement procedure.
• If your system is a 9006-22P, go to “9006-22P locations” on page 91 to identify the physical location and the removal and replacement procedure.
If Then
Yes: Continue with the next step.
No: Properly seat the drives in the drive bays. Then, go to step “4” on page 8.
7. Refresh the Petitboot boot options. Is the boot image on the boot drive recognized?
If Then
Yes: Boot the operating system. Then, continue with step “10” on page 9.
No: Continue with the next step.
8. To determine the command to type on the Petitboot command line to verify that the drives that are known to be in a RAID array are recognized, use Table 2 on page 9.
Table 2. Determine the command to verify that the drives that are known to be in a RAID array are recognized
Drive configuration | Commands
Drive connected directly to the system backplane | arcconf getconfig 1 LD, arcconf getconfig 1 PD
Are the drives that are known to be in the RAID array recognized?
If Then
Yes: Reinstall the operating system on the boot drive. This ends the procedure.
No: Continue with the next step.
9. Complete the following actions, one at a time until the physical drives are recognized in the RAID array:
Note:
• If your system is a 5104-22C or 9006-22C, go to “5104-22C or 9006-22C locations” on page 63 to identify the physical location and the removal and replacement procedure.
• If your system is a 9006-12P, go to “9006-12P locations” on page 75 to identify the physical location and the removal and replacement procedure.
• If your system is a 9006-22P, go to “9006-22P locations” on page 91 to identify the physical location and the removal and replacement procedure.
a. If the drive is connected directly to the system backplane, ensure that the mini-SAS cable and SATA cables are properly seated in the disk drive backplane and system backplane.
b. Replace the SAS or SATA cable.
c. If the drive is connected directly to the system backplane, replace the system backplane.
This ends the procedure.
10. Does an operating system error occur during the boot?
If Then
Yes: Recover the operating system with the tools for the operating system. If that does not resolve the problem, reinstall the operating system. This ends the procedure.
No: Reinstall the operating system. This ends the procedure.

Resolving a sensor indicator problem

Learn how to resolve a sensor indicator problem.
About this task
To determine whether a service action is required, complete the following procedure:
Note: For more information about sensors, see Sensor readings GUI display.
Procedure
1. If the system is not powered on, boot the system to the operational state. Log in to the BMC web interface. Then, click Server Health > Sensor Readings.
Are any of the sensor indicator LEDs red?
Yes: Continue with the next step.
No: This ends the procedure.
2. Record the names of any sensors that have a red LED indicator status.
Note: Repeat steps 3 - 6 for every sensor that you record in this step.
3. Use one of the following commands to list the system event logs (SELs).
• To list SELs by using an in-band network, enter the following command:
ipmitool sel elist
• To list SELs remotely over the LAN, enter the following command:
ipmitool -I lanplus -U <username> -P <password> -H <BMC IP address or BMC hostname> sel elist
4. Review the list of SELs and locate the log entry that meets the following criteria:
• The name of any of the sensors that you recorded in step 2.
• A service action keyword is present. For a list of service action keywords, see “Identifying service action keywords in system event logs” on page 23.
• Asserted is in the description.
Did you identify a log entry that meets the above criteria?
Yes: Continue with the next step.
No: Go to “Collecting diagnostic data” on page 60. Then, go to “Contacting IBM service and support” on page 61. This ends the procedure.
5. Use one of the following options to display the SEL details for the sensor:
Note: You must specify the SEL record ID in hexadecimal format. For example: 0x1a.
• To display SEL details by using an in-band network, enter the following command:
ipmitool sel get <SEL record ID>
• To display SEL details remotely over the LAN, enter the following command:
ipmitool -I lanplus -U <username> -P <password> -H <BMC IP address or BMC hostname> sel get <SEL record ID>
6. The sensor ID field contains sensor information in the sensor name (sensor ID) format. Record the sensor name, sensor ID, and event description. Then, use this information to determine the service action to perform:
• If your system is a 5104-22C, 9006-12P, 9006-22C, or 9006-22P, go to “Identifying a service action by using sensor and event information for the 5104-22C, 9006-12P, 9006-22C, or 9006-22P” on page 24 to determine the service action to perform. This ends the procedure.
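Steps 3 - 5 amount to filtering the SEL listing by sensor name, service action keyword, and the Asserted state, and then passing the record ID to sel get in hexadecimal format. The following Python sketch shows that filter on captured text. It assumes the pipe-separated layout that ipmitool sel elist normally prints, with the record ID in the first column in hexadecimal without a 0x prefix. The sensor names, the sample entries, and the keyword "Critical" are placeholders; the actual keyword list is in “Identifying service action keywords in system event logs”.

```python
def find_asserted_entries(elist_output, sensor_names, keywords):
    """Return (record_id, sensor, event) for asserted entries that match a recorded sensor.

    record_id is returned with a 0x prefix, the form that 'ipmitool sel get' expects.
    """
    matches = []
    for line in elist_output.splitlines():
        cols = [col.strip() for col in line.split("|")]
        if len(cols) < 6:
            continue  # not a complete SEL entry
        record_id, sensor, event, state = cols[0], cols[3], cols[4], cols[5]
        if state != "Asserted":
            continue  # skips Deasserted entries as well
        if not any(name in sensor for name in sensor_names):
            continue
        if not any(word.lower() in event.lower() for word in keywords):
            continue
        matches.append(("0x" + record_id.lower(), sensor, event))
    return matches


# Invented sample entries, for illustration only.
SAMPLE_ELIST = """\
   1 | 04/04/2019 | 10:15:32 | Temperature CPU1 Temp | Upper Critical going high | Asserted
  1a | 04/04/2019 | 10:16:02 | Fan FAN1 | Lower Critical going low | Asserted
  1b | 04/04/2019 | 10:17:40 | Fan FAN1 | Lower Critical going low | Deasserted
"""

for record_id, sensor, event in find_asserted_entries(SAMPLE_ELIST, ["FAN1"], ["Critical"]):
    print(record_id, sensor, event)
```

Each returned record ID can be used directly as the <SEL record ID> argument in step 5.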

Resolving a hardware problem

Learn how to identify the service action that is needed to resolve a hardware problem.
Procedure
1. If you have not already done so, manually boot the system.
2. Go to “Identifying a service action by using system event logs” on page 18. Then, continue with the next step.
3. Was a service action identified?
If Then
Yes: Continue with the next step.
No: Go to step “5” on page 11.
4. Did the service action fix the problem?
If Then
Yes: This ends the procedure.
No: Go to step “5” on page 11.
5. Go to “Resolving a PCIe adapter or device problem” on page 11. Then, continue with the next step.
6. Was a service action identified?
If Then
Yes: Continue with the next step.
No: Go to “Collecting diagnostic data” on page 60. Then, go to “Contacting IBM service and support” on page 61. This ends the procedure.
7. Did the service action fix the problem?
If Then
Yes: This ends the procedure.
No: Go to “Collecting diagnostic data” on page 60. Then, go to “Contacting IBM service and support” on page 61. This ends the procedure.

Resolving a PCIe adapter or device problem

Learn how to access log files, information to identify types of events, and a list of potential problems and service actions.
About this task
Procedure
1. To identify the correct service procedure to perform by using operating system log information, complete the following steps:
a) Log in as the root user.
b) At the command prompt, type dmesg and press Enter.
2. Scan the operating system logs for the first occurrence of keywords, such as fail, failure, or failed. When you find a keyword that accompanies one or more of the resource names in Table 3 on page 12, a service action is required.
Did you find an operating system log that requires a service action?
If Then
Yes: Use Table 3 on page 12 to determine the service procedure to perform for your type of problem. This ends the procedure.
No: Continue with the next step.
Table 3. Resource names, examples, and service procedures for different types of operating system logs.
Resource name | Example of a log requiring a service action | Type of problem | Service procedure
eth1, eth2, eth3, enPxxxxx, where xxxxx indicates the network port. | Failed to re-initialize device | Network | Go to “Resolving a network adapter problem” on page 13.
mlx5_core | Link Down health_care: handling bad device here | Network | Go to “Resolving a network adapter problem” on page 13.
tg3 | PCI I/O error detected. Link is Down | Network | Go to “Resolving a network adapter problem” on page 13.
nvme | Failed status: ffffffff, reset controller | NVMe Flash adapter | Go to “Resolving an NVMe Flash adapter problem” on page 15.
sda, sdb, sdc | FAILED Result | Storage | Go to “Resolving a storage device problem” on page 15.
EEH | Detected error on PHB#xxx, where xxx is the PHB number. | PCIe bus or adapter | Resolve any device driver errors that are related to I/O and that occurred near the time of this operating system log entry.
EEH | xxx has failed 6 times in the last hour and has been permanently disabled, where xxx is the PCI bus number. | PCIe bus or adapter | Ensure that the correct device drivers are properly installed for the device. If the problem persists, replace the adapter in the PCIe slot that is specified in the operating system log entry.
3. Are all of the adapters in the system missing or failed?
If Then
Yes: Perform the following actions, one at a time, until the problem is resolved:
a. Ensure that the PCIe risers are fully seated in the system.
b. Replace system processor CPU 1.
c. Replace the system backplane.
• If your system is a 5104-22C or 9006-22C, go to “5104-22C or 9006-22C locations” on page 63 to identify the physical location and the removal and replacement procedure.
• If your system is a 9006-12P, go to “9006-12P locations” on page 75 to identify the physical location and the removal and replacement procedure.
• If your system is a 9006-22P, go to “9006-22P locations” on page 91 to identify the physical location and the removal and replacement procedure.
No: Go to “Collecting diagnostic data” on page 60. Then, go to “Contacting IBM service and support” on page 61.
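Step 2 of this procedure can be approximated with a small script that scans saved dmesg output for the first line that contains one of the keywords (fail, failure, or failed) together with a resource name from Table 3. The following Python sketch is one such approximation. The patterns are assumptions: they generalize eth1 - eth3 to any ethN name and sda - sdc to any sdX name, and the sample log text is assembled for illustration.

```python
import re

# Keywords named in step 2, matched as whole words and without regard to case.
KEYWORD = re.compile(r"\bfail(ed|ure)?\b", re.IGNORECASE)

# Resource names from Table 3, paired with the type of problem from the same table.
RESOURCE_TYPES = [
    (re.compile(r"\b(eth\d+|enP\w+|mlx5_core|tg3)\b"), "Network"),
    (re.compile(r"\bnvme\b"), "NVMe Flash adapter"),
    (re.compile(r"\bsd[a-z]\b"), "Storage"),
    (re.compile(r"\bEEH\b"), "PCIe bus or adapter"),
]


def first_service_hit(log_text):
    """Return (line, type_of_problem) for the first matching log line, or None."""
    for line in log_text.splitlines():
        if not KEYWORD.search(line):
            continue
        for pattern, problem_type in RESOURCE_TYPES:
            if pattern.search(line):
                return line, problem_type
    return None


# Sample log text, for illustration only.
SAMPLE_LOG = """\
[   12.000000] usb 1-1: new high-speed USB device
[131779.752714] nvme 0006:01:00.0: Failed status: ffffffff, reset controller
[131780.000000] EEH: PHB#0 failure detected, location: WIO-R Slot
"""

print(first_service_hit(SAMPLE_LOG))
```

The type of problem that is returned selects the row of Table 3 to use. Log entries that do not contain one of the keywords, such as a link-down message, are not caught by this sketch and still need a manual review.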

Resolving a network adapter problem

Learn about the possible problems and service actions that you can perform to resolve a network adapter problem.
About this task
Note: To determine the location of the PCIe adapter, see “Identifying the location of the PCIe adapter by using the slot number” on page 16.
Table 4. Network adapter problems and service actions
Problem: System is unable to find the adapter or the negotiated PCIe bandwidth of the adapter is less than expected
Service action:
1. Verify that the adapter is properly seated in a compatible slot.
2. Install the adapter in a different compatible slot.
3. Verify that the drivers for the adapter are installed.
4. Verify that the most recent firmware is installed on the system, or install the most recent firmware if it is not already installed.
5. Restart the system.
6. Replace the adapter.
7. If the adapter is connected to a PCIe riser, replace the PCIe riser.
8. If the adapter is in UIO slot 1, UIO slot 2, or UIO slot 3, replace CPU 1. Otherwise, replace CPU 2.
9. Replace the system backplane.
Problem: Adapter suddenly stops working
Service action:
1. If the system was recently installed, moved, serviced, or upgraded, verify that the adapter is seated properly and all associated cables are correctly connected.
2. Inspect the PCIe socket and verify that there is no dirt or debris in the socket.
3. Inspect the card and verify that it is not physically damaged.
4. Verify that all cables are properly seated and are not physically damaged. If you recently added one or more new adapters, remove them and then test to determine whether the failing adapter is functioning properly again. If the network adapter is functioning again, review the IBM support tips to confirm that there are no PCI address, driver, or firmware conflicts. Then, reinstall the new adapters one at a time until all adapters function properly.
5. Replace the adapter.
6. If the adapter is connected to a PCIe riser, replace the PCIe riser.
7. If the adapter is in UIO slot 1, UIO slot 2, or UIO slot 3, replace CPU 1. Otherwise, replace CPU 2.
8. Replace the system backplane.
Problem: Link indicator light on the adapter is off
Service action:
1. Verify that the cable functions properly by testing it with a known working connection.
2. Verify that the port or ports on the switch are enabled and functional.
3. Verify that the switch and adapter are compatible.
4. Replace the adapter.
Problem: Link light on the adapter is on, but there is no communication from the adapter
Service action:
1. Verify that the most recent driver is installed, or install the most recent driver if it is not already installed.
2. Verify that the adapter and its link have compatible settings, such as speed and duplex configuration.
Problem: Other problems
Service action: For information about adapter diagnostics, see Supporting diagnostics. For information about adapter user information, see User guides for PCIe adapters.

Resolving an NVMe Flash adapter problem

Learn about the possible problems and service actions that you can perform to resolve a Non-Volatile Memory Express (NVMe) Flash adapter problem.
About this task
Note: To determine the location of the NVMe Flash adapter, see “Identifying the location of the NVMe Flash adapter” on page 17.
Table 5. NVMe Flash adapter problems and service actions
Problem: System is unable to find the NVMe Flash adapter
Service action:
1. If the system was recently installed, moved, serviced, or upgraded, verify that the NVMe Flash adapter is seated and installed properly.
2. Verify that the NVMe Flash adapter is compatible with the system.
3. Verify that the most recent firmware is installed on the system, or install the most recent firmware if it is not already installed.
4. Replace the NVMe Flash adapter.
Problem: NVMe Flash adapter stops working suddenly
Service action:
1. Check the system logs to verify whether the system detected a problem.
2. Replace the NVMe Flash adapter.
Problem: Other problems
Service action: Check the messages and resolve any other problems that are detected. Then, test the NVMe Flash adapter again.

Resolving a storage device problem

Learn about the possible problems and service actions that you can perform to resolve a storage device problem.
About this task
Note: To determine the location of the storage device, see “Identifying the location of the storage device” on page 17.
Table 6. Storage device problems and service actions
Problem: System is unable to find more than one storage device
Service action:
1. If the system was recently installed, moved, serviced, or upgraded, verify that the devices are seated and installed properly.
2. Verify that the devices are compatible with your system.
3. Verify that all internal cables are properly seated and are not physically damaged.
4. Verify that the most recent firmware is installed on the system, or install the most recent firmware if it is not already installed.
5. If the devices are part of a RAID configuration, ensure that the devices have been enabled and are part of an array.
6. Replace the cable that connects the disk drive backplane to the system backplane.
Problem: System is unable to find a storage device
Service action:
1. If the system was recently installed, moved, serviced, or upgraded, verify that the device is seated and installed properly.
2. Verify that the device is compatible with your system.
3. Verify that all internal cables are properly seated and are not physically damaged.
4. Verify that the most recent firmware is installed on the system, or install the most recent firmware if it is not already installed.
5. If the device is part of a RAID configuration, ensure that the device has been enabled and is part of an array.
6. Install the device in an open or free slot. If the device is found, replace the component that has the failing connector.
7. Replace the storage device.
8. Replace any applicable attached cable.
Problem: More than one storage device suddenly stops working
Service action:
1. If the system was recently installed, moved, serviced, or upgraded, verify that the devices are seated and installed properly.
2. Check the system logs to verify whether the system detected a problem.
3. Replace the cable that connects the disk drive backplane to the system backplane.
Problem: One storage device suddenly stops working
Service action:
1. Verify that all internal cables are properly seated and are not physically damaged.
2. Check the system logs to verify whether the system detected a problem.
3. Replace the drive.
4. Replace the system backplane.
5. Replace the cable.
Problem: Other problems
Service action: Check the messages and resolve any other problems that were detected. Then, test the drive again. If the drive still does not function, refer to the documentation for the drive.

Identifying the location of the PCIe adapter by using the slot number

The error message provides information to help you to determine the location of the PCIe adapter.
About this task
For example, the log might contain an error similar to the following text:
[131779.752714] EEH: PHB#0 failure detected, location: WIO-R Slot
Replace the PCIe adapter. Go to “5104-22C or 9006-22C locations” on page 63, “9006-12P locations” on page 75, or “9006-22P locations” on page 91 and use the slot number information in the operating system log to identify the physical location and the removal and replacement procedure.
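The lookup above starts from the location string in the operating system log. The following Python sketch shows one way to pull that string out of a log line. It assumes that the message always follows the layout of the example in this topic; the function name parse_eeh_failure is hypothetical and is not part of any IBM tool.

```python
import re

# Layout taken from the example message in this topic; real logs may differ.
EEH_PATTERN = re.compile(
    r"EEH: PHB#(?P<phb>[0-9A-Fa-f]+) failure detected, location: (?P<location>.+)$"
)

def parse_eeh_failure(line):
    """Return (phb, location) from an EEH failure line, or None if no match."""
    match = EEH_PATTERN.search(line.strip())
    if match is None:
        return None
    return match.group("phb"), match.group("location").strip()

sample = "[131779.752714] EEH: PHB#0 failure detected, location: WIO-R Slot"
print(parse_eeh_failure(sample))
```

The location value that is returned is the text to compare with the slot information in the locations topics.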

Identifying the location of the NVMe Flash adapter

Use this procedure to identify the location of a Non-Volatile Memory Express (NVMe) Flash adapter.
Procedure
1. Does the operating system log contain the slot number? For example, the log might contain an error message similar to the following text:
[131779.752714] EEH: PHB#0 failure detected, location: WIO-R Slot
If Then
Yes: Replace the adapter. Go to “5104-22C or 9006-22C locations” on page 63, “9006-12P locations” on page 75, or “9006-22P locations” on page 91 and use the slot number information to identify the physical location and the removal and replacement procedure. This ends the procedure.
No: Continue with the next step.
2. Locate the NVMe Flash adapter by using the PCI address:
a) The operating system log contains information about the NVMe Flash adapter in the form of a PCI address. Record the PCI address information for the NVMe Flash adapter that has failed. For example, in the operating system log message nvme 0006:01:00.0: Failed status: ffffffff, reset controller, the PCI address of the failing NVMe Flash adapter is 0006:01:00.0.
b) At the command line, type lscfg -vl pciaddress, where pciaddress is the NVMe Flash adapter information that you recorded in step 2.a. Then, press Enter.
c) Record the slot number information that is in the location code field.
d) Replace the adapter. Go to “5104-22C or 9006-22C locations” on page 63, “9006-12P locations” on page 75, or “9006-22P locations” on page 91 and use the slot number information to identify the physical location and the removal and replacement procedure. This ends the procedure.
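Step 2.a can be illustrated with a short Python sketch that extracts the PCI address from an nvme log message and prints the lscfg command from step 2.b. The sketch assumes that the message has the same layout as the example in step 2.a; the function name nvme_pci_address is hypothetical.

```python
import re

# PCI address layout: domain:bus:device.function (hexadecimal digits).
PCI_ADDRESS = re.compile(
    r"\bnvme ([0-9a-fA-F]{4}:[0-9a-fA-F]{2}:[0-9a-fA-F]{2}\.[0-7])"
)

def nvme_pci_address(line):
    """Return the PCI address found in an nvme log message, or None."""
    match = PCI_ADDRESS.search(line)
    return match.group(1) if match else None

sample = "nvme 0006:01:00.0: Failed status: ffffffff, reset controller"
address = nvme_pci_address(sample)
print(address)
# Command from step 2.b, with the recorded address filled in.
print("lscfg -vl " + address)
```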

Identifying the location of the storage device

Use this procedure to identify the location of a storage device.
About this task
The storage device location is determined in the drive removal and replacement procedures for your system. See Removing a disk drive from the 5104-22C or 9006-22C system, Removing and replacing a storage drive in the 9006-12P, or Removing and replacing a disk drive in the 9006-22P.

User guides for PCIe adapters

Use this information to find the user guide for your PCIe adapter.
About this task
Use the following table to find the user guide for the PCIe adapter that you are using.
Table 7. PCIe adapter user guides
Name User guide
Broadcom Broadcom website (http://www.broadcom.com)
Mellanox Mellanox Technologies website (http://mymellanox.force.com/support/VF_SerialSearch)
Microsemi Microsemi website (http://www.microsemi.com)
QLogic QLogic website (http://driverdownloads.qlogic.com/QLogicDriverDownloads_UI/IBM_Search.aspx)

Identifying a service action

Use the following procedures to help you identify the service action that is needed.

Identifying a service action by using system event logs

Use the Intelligent Platform Management Interface (IPMI) program to examine system event logs (SELs) to identify a service action.
Procedure
1. Use the ipmitool command to examine SELs.
• To list SELs by using an in-band network, use the following command:
ipmitool sel elist
• To list SELs remotely over the LAN, use the following command:
ipmitool -I lanplus -U <username> -P <password> -H <BMC IP address or BMC hostname> sel elist
2. Scan the SELs for an event with the value OEM record de. Did you find a SEL with the value OEM record de?
If Then
Yes: Continue with the next step.
No: Go to step “4” on page 20.
3. The OEM record de specific log information is indicated by the rightmost digits of the SEL with the value OEM record de. Use Table 8 to determine the service action to perform.
Table 8. OEM record de specific log information and service action
OEM record de specific log information Service action
00xxxxxxxxxx Go to Getting fixes and update the system firmware to the most recent level of firmware that is available. If this SEL event continues to be logged, go to “Collecting diagnostic data” on page 60. Then, go to “Contacting IBM service and support” on page 61.
01xxxxxxxxxx Go to “EPUB_PRC_FIND_DECONFIGURE_PART isolation procedure” on page 49.
04xxxxxxxxxx Go to “EPUB_PRC_SP_CODE isolation procedure” on page 50.
05xxxxxxxxxx Go to “EPUB_PRC_PHYP_CODE isolation procedure” on page 50.
08xxxxxxxxxx Go to “EPUB_PRC_ALL_PROCS isolation procedure” on page 50.
09xxxxxxxxxx Go to “EPUB_PRC_ALL_MEMCRDS isolation procedure” on page 51.
0Axxxxxxxxxx Go to Getting fixes and update the system firmware to the most recent level of firmware that is available. If this SEL event continues to be logged, go to “Collecting diagnostic data” on page 60. Then, go to “Contacting IBM service and support” on page 61.
10xxxxxxxxxx Go to “EPUB_PRC_LVL_SUPPORT isolation procedure” on page 52.
11xxxxxxxxxx Go to Getting fixes and update the system firmware to the most recent level of firmware that is available. If this SEL event continues to be logged, go to “Collecting diagnostic data” on page 60. Then, go to “Contacting IBM service and support” on page 61.
16xxxxxxxxxx Go to Getting fixes and update the system firmware to the most recent level of firmware that is available. If this SEL event continues to be logged, go to “Collecting diagnostic data” on page 60. Then, go to “Contacting IBM service and support” on page 61.
1Cxxxxxxxxxx Go to Getting fixes and update the system firmware to the most recent level of firmware that is available. If this SEL event continues to be logged, go to “Collecting diagnostic data” on page 60. Then, go to “Contacting IBM service and support” on page 61.
22xxxxxxxxxx Go to “EPUB_PRC_MEMORY_PLUGGING_ERROR isolation procedure” on page 52.
2Dxxxxxxxxxx Go to “EPUB_PRC_FSI_PATH isolation procedure” on page 52.
30xxxxxxxxxx Go to “EPUB_PRC_PROC_AB_BUS isolation procedure” on page 53.
31xxxxxxxxxx Go to “EPUB_PRC_PROC_XYZ_BUS isolation procedure” on page 54.
34xxxxxxxxxx Go to Getting fixes and update the system firmware to the most recent level of firmware that is available. If this SEL event continues to be logged, go to “Collecting diagnostic data” on page 60. Then, go to “Contacting IBM service and support” on page 61.
37xxxxxxxxxx Go to “EPUB_PRC_EIBUS_ERROR isolation procedure” on page 55.
3Fxxxxxxxxxx Go to “EPUB_PRC_POWER_ERROR isolation procedure” on page 56.
4Dxxxxxxxxxx Go to Getting fixes and update the system firmware to the most recent level of firmware that is available. If this SEL event continues to be logged, go to “Collecting diagnostic data” on page 60. Then, go to “Contacting IBM service and support” on page 61.
4Fxxxxxxxxxx Go to “EPUB_PRC_MEMORY_UE isolation procedure” on page 56.
55xxxxxxxxxx Go to “EPUB_PRC_HB_CODE isolation procedure” on page 57.
56xxxxxxxxxx Go to “EPUB_PRC_TOD_CLOCK_ERR isolation procedure” on page 58.
5Cxxxxxxxxxx Go to “EPUB_PRC_COOLING_SYSTEM_ERR isolation procedure” on page 59.
5Dxxxxxxxxxx Go to Getting fixes and update the system firmware to the most recent level of firmware that is available. If this SEL event continues to be logged, go to “Collecting diagnostic data” on page 60. Then, go to “Contacting IBM service and support” on page 61.
5Exxxxxxxxxx Go to Getting fixes and update the system firmware to the most recent level of firmware that is available. If this SEL event continues to be logged, go to “Collecting diagnostic data” on page 60. Then, go to “Contacting IBM service and support” on page 61.
This ends the procedure.
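The Table 8 lookup in step 3 can be sketched in Python as a dictionary keyed on the first two hexadecimal digits of the specific log information. The sketch assumes that the x characters in the table stand for digits that do not affect the lookup, and that the value is passed in as the 12-digit string read from the SEL. The names in the code are hypothetical.

```python
# Summary of the firmware update rows in Table 8.
FIRMWARE_UPDATE = (
    "Update the system firmware; if the event recurs, collect diagnostic "
    "data and contact IBM service and support"
)
FIRMWARE_UPDATE_PREFIXES = {"00", "0A", "11", "16", "1C", "34", "4D", "5D", "5E"}

# Leading two digits -> isolation procedure name, copied from Table 8.
ISOLATION_PROCEDURES = {
    "01": "EPUB_PRC_FIND_DECONFIGURE_PART",
    "04": "EPUB_PRC_SP_CODE",
    "05": "EPUB_PRC_PHYP_CODE",
    "08": "EPUB_PRC_ALL_PROCS",
    "09": "EPUB_PRC_ALL_MEMCRDS",
    "10": "EPUB_PRC_LVL_SUPPORT",
    "22": "EPUB_PRC_MEMORY_PLUGGING_ERROR",
    "2D": "EPUB_PRC_FSI_PATH",
    "30": "EPUB_PRC_PROC_AB_BUS",
    "31": "EPUB_PRC_PROC_XYZ_BUS",
    "37": "EPUB_PRC_EIBUS_ERROR",
    "3F": "EPUB_PRC_POWER_ERROR",
    "4F": "EPUB_PRC_MEMORY_UE",
    "55": "EPUB_PRC_HB_CODE",
    "56": "EPUB_PRC_TOD_CLOCK_ERR",
    "5C": "EPUB_PRC_COOLING_SYSTEM_ERR",
}

def service_action_for_de(specific_info):
    """Map OEM record de specific log information to a Table 8 service action."""
    prefix = specific_info.strip()[:2].upper()
    if prefix in ISOLATION_PROCEDURES:
        return ISOLATION_PROCEDURES[prefix] + " isolation procedure"
    if prefix in FIRMWARE_UPDATE_PREFIXES:
        return FIRMWARE_UPDATE
    return None  # Value is not listed in Table 8.

print(service_action_for_de("4f0000000000"))
```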
4. Scan the SELs for an event with the value OEM record df. Did you find a SEL with the value OEM record df?
If Then
Yes: Continue with the next step.
No: Go to step “10” on page 21.
5. One or more events might be logged around the same time as the event with the value OEM record df. These events require a service action if they meet the following criteria:
• A service action keyword is present. For a list of service action keywords, see “Identifying service action keywords in system event logs” on page 23.
• Asserted is in the description.
• OEM record is not in the description.
• The event has a time stamp close to the time stamp of the event with the value OEM record df.
6. Did you find any SEL events that require a service action as defined in step “5” on page 20?
If Then
Yes: Continue with the next step.
No: Go to “Collecting diagnostic data” on page 60. Then, go to “Contacting IBM service and support” on page 61.
7. Did you find only one SEL event that requires a service action as defined in step “5” on page 20?
If Then
Yes: Continue with the next step.
No: Go to step “9” on page 21.
8. Record the SEL record ID for the event you identified in step “5” on page 20. The SEL record ID is indicated by the leftmost digits of the SEL. Use the ipmitool command to display the SEL details.
• To display SEL details by using an in-band network, use the following command:
ipmitool sel get <SEL record ID>
Note: The SEL record ID must be entered in hexadecimal format. For example: 0x1a.
• To display SEL details remotely over the LAN, use the following command:
ipmitool -I lanplus -U <username> -P <password> -H <BMC IP address or BMC hostname> sel get <SEL record ID>
Note: The SEL record ID must be entered in hexadecimal format. For example: 0x1a.
The sensor ID field contains sensor information in the format sensor name (sensor ID). Record the sensor name, sensor ID, and event description. Then, use the following information to determine the service action to perform:
• If your system is a 5104-22C, 9006-12P, 9006-22C, or 9006-22P, go to “Identifying a service action by using sensor and event information for the 5104-22C, 9006-12P, 9006-22C, or 9006-22P” on page 24.
This ends the procedure.
9. You identified more than one event in step “5” on page 20. The service actions for all of the events that were identified in step “5” on page 20 must be performed to successfully complete the repair. Record the SEL record IDs for the events that you identified in step “5” on page 20. The SEL record ID is indicated by the leftmost digits of the SEL. Use the ipmitool command to display SEL details for each SEL record ID that you recorded.
• To display SEL details by using an in-band network, use the following command:
ipmitool sel get <SEL record ID>
Note: The SEL record ID must be entered in hexadecimal format. For example: 0x1a.
• To display SEL details remotely over the LAN, use the following command:
ipmitool -I lanplus -U <username> -P <password> -H <BMC IP address or BMC hostname> sel get <SEL record ID>
Note: The SEL record ID must be entered in hexadecimal format. For example: 0x1a.
The sensor ID field contains sensor information in the format sensor name (sensor ID). Record the sensor name, sensor ID, and event description. Then, use this information to determine the service action to perform:
• If your system is a 5104-22C, 9006-12P, 9006-22C, or 9006-22P, go to “Identifying a service action by using sensor and event information for the 5104-22C, 9006-12P, 9006-22C, or 9006-22P” on page 24.
This ends the procedure.
10. Scan the SEL for an event with the value OEM record c0.
11. Did you find an event with the value OEM record c0?
If Then
Yes: Continue with the next step.
No: Go to step “13” on page 22.
12. The OEM record c0 specific log information is indicated by the rightmost digits of the SEL with the value OEM record c0. Use Table 9 on page 22 to determine the service action to perform.
Table 9. OEM record c0 specific log information, description, and service action
OEM record c0 specific log information: 2aff6ffxxxxx
Description: A session audit event occurred
Service action: No service action is required.
OEM record c0 specific log information: cdxx6fffffff
Description: An automatic shutdown event occurred due to high system temperature
Service action:
• Search for SEL events that are related to high system temperature and resolve them.
• Ensure that the room temperature meets the requirements that are specified for the system.
• Ensure that there are no air flow obstructions at the front or at the rear of the system.
OEM record c0 specific log information: ceff6fffffff
Description: A machine check event occurred
Service action: Search for serviceable SEL events and resolve them.
OEM record c0 specific log information: cfff6fffffff
Description: An unexpected problem occurred with the voltage regulator output
Service action: If a machine check event is present with a time stamp close to the time stamp of this event, search for serviceable SEL events and resolve them. If a machine check event is not present with a time stamp close to the time stamp of this event, reboot the system to recover from the system hang. If the problem persists, replace the system backplane.
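The values in Table 9 contain x characters in some positions. The following Python sketch matches a 12-digit value against those patterns. It assumes that each x stands for any single hexadecimal digit, which this topic does not state explicitly, and it uses hypothetical function names.

```python
HEX_DIGITS = "0123456789abcdef"

# Patterns and descriptions copied from Table 9.
C0_TABLE = [
    ("2aff6ffxxxxx", "A session audit event occurred"),
    ("cdxx6fffffff", "An automatic shutdown event occurred due to high system temperature"),
    ("ceff6fffffff", "A machine check event occurred"),
    ("cfff6fffffff", "An unexpected problem occurred with the voltage regulator output"),
]

def matches(pattern, value):
    """Return True if value fits pattern; x matches any hexadecimal digit (assumed)."""
    value = value.strip().lower()
    if len(value) != len(pattern):
        return False
    return all(
        (p == "x" and v in HEX_DIGITS) or p == v
        for p, v in zip(pattern, value)
    )

def describe_c0(value):
    """Return the Table 9 description for an OEM record c0 value, or None."""
    for pattern, description in C0_TABLE:
        if matches(pattern, value):
            return description
    return None

print(describe_c0("cd016fffffff"))
```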
13. One or more SEL events might require a service action. These events require a service action if they meet the following criteria:
• A service action keyword is present. For a list of service action keywords, see “Identifying service action keywords in system event logs” on page 23.
• Asserted is in the description.
• OEM record is not in the description.
14. Did you find one or more SEL events that require a service action as defined in step “13” on page 22?
If Then
Yes: Continue with the next step.
No: This ends the procedure.
15. The service actions for all of the events that were identified in step “13” on page 22 must be performed to successfully complete the repair. Record the SEL record IDs for the events that you identified in step “13” on page 22. The SEL record ID is indicated by the leftmost digits of the SEL. Use the ipmitool command to display SEL details for each SEL record ID that you recorded.
• To display SEL details by using an in-band network, use the following command:
ipmitool sel get <SEL record ID>
Note: The SEL record ID must be entered in hexadecimal format. For example: 0x1a.
• To display SEL details remotely over the LAN, use the following command:
ipmitool -I lanplus -U <username> -P <password> -H <BMC IP address or BMC hostname> sel get <SEL record ID>
Note: The SEL record ID must be entered in hexadecimal format. For example: 0x1a.
The sensor ID field contains sensor information in the format sensor name (sensor ID). Record the sensor name, sensor ID, and event description. Then, use this information to determine the service action to perform:
• If your system is a 5104-22C, 9006-12P, 9006-22C, or 9006-22P, go to “Identifying a service action by using sensor and event information for the 5104-22C, 9006-12P, 9006-22C, or 9006-22P” on page 24.
This ends the procedure.
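The sel get commands in this procedure require the record ID in hexadecimal format, for example 0x1a. The following Python sketch builds the argument list for either the in-band or the LAN form of the command. It assumes that the listing separates fields with a | character and that the leftmost field is the record ID in hexadecimal; the sample line, host name, and credentials are hypothetical.

```python
def record_id_from_sel_line(line):
    """Read the leftmost field of a SEL listing line as a hexadecimal record ID."""
    first_field = line.split("|", 1)[0].strip()
    return int(first_field, 16)

def sel_get_command(record_id, host=None, username=None, password=None):
    """Return the ipmitool argument list that displays one SEL record."""
    command = ["ipmitool"]
    if host is not None:
        # Remote form over the LAN, as shown in this procedure.
        command += ["-I", "lanplus", "-U", username, "-P", password, "-H", host]
    # hex() yields the 0x-prefixed format that the notes require.
    command += ["sel", "get", hex(record_id)]
    return command

# Hypothetical listing line, for illustration only.
line = "  1a | 04/12/2019 | 10:15:02 | Fan FAN1 | Transition to Critical from Less Severe | Asserted"
record_id = record_id_from_sel_line(line)
print(" ".join(sel_get_command(record_id)))
```

The argument list can be passed to subprocess.run on a system where ipmitool is installed.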

Identifying service action keywords in system event logs

System event logs (SELs) that have Asserted and any of the keywords indicated below in the description require a service action.
Temperature and voltage service action keywords
• Transition to Critical from Less Severe
• Transition to Critical from Non-recoverable
• Transition to Non-recoverable
• Transition to Non-recoverable from Less Severe
Backplane service action keywords
• State Asserted
Chassis service action keywords
• General Chassis intrusion
Fan service action keywords
• Transition to Critical from Less Severe
• Transition to Non-recoverable from Less Severe
• Transition to Critical from Non-recoverable
• Device Removed / Device Absent
• Transition to degraded
• Install error
• Redundancy lost
• Non-redundant insufficient resources
Memory service action keywords
• Configuration Error
• Transition to Non-recoverable
• Predictive Failure
Processor service action keywords
• IERR
• Transition to Non-recoverable
• Predictive Failure
• Device Disabled
Power supply service action keywords
• Power Supply Failure Detected
• Predictive Failure
• Power Supply Input Lost or AC DC
• Power Supply Input Lost Or Out of Range
• Power Supply Input Out of Range But Present
System event service action keywords
• Undetermined system hardware failure
Watchdog service action keywords
• Hard Reset
• Power Down
• Power Cycle
• Timer Interrupt
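As a first-pass filter, the criteria in this topic can be sketched in Python: the description must contain Asserted as a whole word, must contain one of the keywords, and must not contain OEM record. The sketch ignores the sensor categories, so a match still has to be checked against the sensor tables, and it assumes that the wording in the SEL matches the keywords apart from letter case. The function name is hypothetical.

```python
import re

# Keywords copied from the lists in this topic (duplicates removed).
SERVICE_ACTION_KEYWORDS = (
    "Transition to Critical from Less Severe",
    "Transition to Critical from Non-recoverable",
    "Transition to Non-recoverable",
    "State Asserted",
    "General Chassis intrusion",
    "Device Removed",
    "Device Absent",
    "Transition to degraded",
    "Install error",
    "Redundancy lost",
    "Non-redundant insufficient resources",
    "Configuration Error",
    "Predictive Failure",
    "IERR",
    "Device Disabled",
    "Power Supply Failure Detected",
    "Power Supply Input Lost",
    "Power Supply Input Out of Range But Present",
    "Undetermined system hardware failure",
    "Hard Reset",
    "Power Down",
    "Power Cycle",
    "Timer Interrupt",
)

# Whole-word match so that "Deasserted" does not count as "Asserted".
ASSERTED = re.compile(r"\bAsserted\b", re.IGNORECASE)

def requires_service_action(description):
    """Apply the keyword, Asserted, and OEM record criteria to one SEL line."""
    text = description.lower()
    if "oem record" in text:
        return False
    if ASSERTED.search(description) is None:
        return False
    return any(keyword.lower() in text for keyword in SERVICE_ACTION_KEYWORDS)

print(requires_service_action("Fan FAN1 | Transition to Critical from Less Severe | Asserted"))
```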

Identifying a service action by using sensor and event information

You can use sensor and event information from the system event log (SEL) to determine a service action.
Identifying a service action by using sensor and event information for the 5104-22C, 9006-12P, 9006-22C, or 9006-22P
You can use the sensor and event information from the system event log to determine a service action to perform for the 5104-22C, 9006-12P, 9006-22C, or 9006-22P.
Procedure
If you have not done so already, complete “Identifying a service action by using system event logs” on page 18. Then, use the following table to determine the service action to perform.
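The sensor ID field that you recorded has the format sensor name (sensor ID). The following Python sketch splits such a value into the name and the numeric ID so that it can be compared with the first column of the table. The function name is hypothetical, and the sketch assumes that the ID is written in hexadecimal with a 0x prefix.

```python
import re

# Matches "sensor name (0xNN)" as described in this procedure.
SENSOR_FIELD = re.compile(r"^\s*(?P<name>.+?)\s*\((?P<id>0x[0-9A-Fa-f]+)\)\s*$")

def parse_sensor_field(field):
    """Return (sensor name, sensor ID as an integer), or None if no match."""
    match = SENSOR_FIELD.match(field)
    if match is None:
        return None
    return match.group("name"), int(match.group("id"), 16)

print(parse_sensor_field("System Temp (0x01)"))
print(parse_sensor_field("P2-DIMMB1 Func (0x1A)"))
```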
Table 10. Sensor information, event description, and service action for the 5104-22C, 9006-12P, 9006-22C, or 9006-22P
Sensor name (Sensor ID) Event description Service action
System Temp (0x01)
• Transition to Critical from Less Severe
• Transition to Non-recoverable from Less Severe
• Transition to Critical from Non-recoverable
• Lower Non-critical – going low
• Lower Non-critical – going high
• Lower Critical – going low
• Lower Critical – going high
• Lower Non-recoverable – going low
• Lower Non-recoverable – going high
• Upper Non-critical – going low
• Upper Non-critical – going high
• Upper Critical - going low
• Upper Critical - going high
• Upper Non-recoverable – going low
• Upper Non-recoverable – going high
Service action for the Transition events: Ensure that there are no air flow obstructions at the front or at the rear of the system. Ensure that the fans are operating properly.
Service action for the Lower and Upper threshold events: No service action is required.
Peripheral Temp (0x02)
• Transition to Critical from Less Severe
• Transition to Non-recoverable from Less Severe
• Transition to Critical from Non-recoverable
• Lower Non-critical – going low
• Lower Non-critical – going high
• Lower Critical – going low
• Lower Critical – going high
• Lower Non-recoverable – going low
• Lower Non-recoverable – going high
• Upper Non-critical – going low
• Upper Non-critical – going high
• Upper Critical - going low
• Upper Critical - going high
• Upper Non-recoverable – going low
• Upper Non-recoverable – going high
Service action for the Transition events: Ensure that the room temperature meets the requirements that are specified for the system. Ensure that there are no air flow obstructions at the front or at the rear of the system.
Service action for the Lower and Upper threshold events: No service action is required.
Backplane Fault (0x03) State Deasserted No service action is required.
State Asserted Replace the system backplane. Go to “5104-22C or 9006-22C locations” on page 63, “9006-12P locations” on page 75, or “9006-22P locations” on page 91 to identify the physical location and removal and replacement procedure.
Unknown Backplane Fault (0x03) Transition to Non-recoverable Power on the system. If a message is displayed that indicates that the TPM card is missing, reseat the TPM card. If the problem persists, replace the TPM card. Go to “5104-22C or 9006-22C locations” on page 63, “9006-12P locations” on page 75, or “9006-22P locations” on page 91 to identify the physical location and removal and replacement procedure.
System Event (0x04) Undetermined system hardware failure
• System Reconfigured
• OEM System boot event
• Entry added to auxiliary log
• PEF Action
• Timestamp Clock Sync
Boot Progress (0x05)
• PCIE CPU1 Pwr (0x06)
• PCIE CPU2 Pwr (0x07)
• Unknown Error
• Unknown Hang
• Unknown Progress
• Lower Non-critical – going low
• Lower Non-critical – going high
• Lower Critical – going low
• Lower Critical – going high
• Lower Non-recoverable – going low
• Lower Non-recoverable – going high
• Upper Non-critical – going low
• Upper Non-critical – going high
• Upper Critical - going low
• Upper Critical - going high
• Upper Non-recoverable – going low
• Upper Non-recoverable – going high
Go to “Collecting diagnostic data” on page 60. Then, go to “Contacting IBM service and support” on page 61.
No service action is required.
No service action required.
• OCC Active 1 (0x08)
• OCC Active 2 (0x09)
Device Disabled If the sensor name is OCC Active 1, replace CPU 1. If the sensor name is OCC Active 2, replace CPU 2. Go to “5104-22C or 9006-22C locations” on page 63, “9006-12P locations” on page 75, or “9006-22P locations” on page 91 to identify the physical location and removal and replacement procedure.
• State Deasserted
• Device Enabled
No service action is required.
• CPU1 Temp (0x0B)
• CPU2 Temp (0x0D)
• Transition to Critical from Less Severe
• Transition to Non-recoverable from Less Severe
• Transition to Critical from Non-recoverable
• Lower Non-critical – going low
• Lower Non-critical – going high
• Lower Critical - going low
• Lower Critical – going high
• Lower Non-recoverable – going low
• Lower Non-recoverable – going high
• Upper Non-critical – going low
• Upper Non-critical – going high
• Upper Critical - going low
• Upper Critical - going high
• Upper Non-recoverable – going low
• Upper Non-recoverable – going high
Service action for the Transition events: Ensure that there are no air flow obstructions at the front or at the rear of the system. Ensure that the fans are operating properly.
Service action for the Lower and Upper threshold events: No service action is required.
• CPU Func 1 (0x0C)
• CPU Func 2 (0x0E)
• IERR
• Transition to Non-recoverable
• Predictive Failure
• Thermal Trip
• FRB1 BIST Failure
• FRB2 Hang In POST Failure
• FRB3 Processor Startup Initialization Failure
• Configuration Error
• SMBIOS Uncorrectable CPU Complex Error
• Processor Disabled
• Terminator Presence Detected
• Processor Automatically Throttled
• Machine Check Exception
• Correctable Machine Check Error
• State Deasserted
• Device Disabled
• Transition to Critical from Less Severe
• Transition to Non-recoverable from Less Severe
• Transition to Critical from Non-recoverable
• Processor Presence Detected
• State Asserted
• Device Enabled
• Transition to OK
• Transition to Non-Critical from OK
• Transition to Non-Critical from More Severe
• Monitor
• Informational
If the sensor name is CPU Func 1, replace CPU 1. If the sensor name is CPU Func 2, replace CPU 2. Go to “5104-22C or 9006-22C locations” on page 63, “9006-12P locations” on page 75, or “9006-22P locations” on page 91 to identify the physical location and removal and replacement procedure.
No service action is required.
• P1-DIMMA1 Func (0x10)
• P1-DIMMA2 Func (0x11)
• P1-DIMMB1 Func (0x12)
• P1-DIMMB2 Func (0x13)
• P1-DIMMC1 Func (0x14)
• P1-DIMMC2 Func (0x15)
• P1-DIMMD1 Func (0x16)
• P1-DIMMD2 Func (0x17)
• P2-DIMMA1 Func (0x18)
• P2-DIMMA2 Func (0x19)
• P2-DIMMB1 Func (0x1A)
• P2-DIMMB2 Func (0x1B)
• P2-DIMMC1 Func (0x1C)
• P2-DIMMC2 Func (0x1D)
• P2-DIMMD1 Func (0x1E)
• P2-DIMMD2 Func (0x1F)
• Memory Device Disabled
• Uncorrectable Memory Error
• Memory Scrub Failed
• State Deasserted
• Device Disabled
• Transition to Critical from Less Severe
• Transition to Non-recoverable from Less Severe
• Transition to Critical from Non-recoverable
• Correctable Memory Error
• Parity
• Correctable Memory Error Logging Limit Reached
• Memory Automatically Throttled
• Critical Over temperature
• Presence Detected
• Spare
• State Asserted
• Device Enabled
• Transition to OK
• Transition to Non-Critical from OK
• Transition to Non-Critical from More Severe
• Monitor
• Informational
No service action is required.
• Transition to Non-recoverable
• Predictive Failure
If the sensor name is P1-DIMMA1 Func, replace P1-DIMMA1. If the sensor name is P1-DIMMA2 Func, replace P1-DIMMA2. And so on. Go to “5104-22C or 9006-22C locations” on page 63, “9006-12P locations” on page 75, or “9006-22P locations” on page 91 to identify the physical location and removal and replacement procedure.
• P1-DIMMA1 Func (0x10)
• P1-DIMMA2 Func (0x11)
• P1-DIMMB1 Func (0x12)
• P1-DIMMB2 Func (0x13)
• P1-DIMMC1 Func (0x14)
• P1-DIMMC2 Func (0x15)
• P1-DIMMD1 Func (0x16)
• P1-DIMMD2 Func (0x17)
• P2-DIMMA1 Func (0x18)
• P2-DIMMA2 Func (0x19)
• P2-DIMMB1 Func (0x1A)
• P2-DIMMB2 Func (0x1B)
• P2-DIMMC1 Func (0x1C)
• P2-DIMMC2 Func (0x1D)
• P2-DIMMD1 Func (0x1E)
• P2-DIMMD2 Func (0x1F)
Configuration Error Complete the following steps:
a. If the sensor name is P1-DIMMA1 Func, ensure that P1-DIMMA1 is seated properly. If the sensor name is P1-DIMMA2 Func, ensure that P1-DIMMA2 is seated properly. And so on.
b. If you recently installed or replaced memory DIMMs, ensure that the DIMMs are plugged in the correct memory slots.
c. If the sensor name is P1-DIMMA1 Func, replace P1-DIMMA1. If the sensor name is P1-DIMMA2 Func, replace P1-DIMMA2. And so on. Go to “5104-22C or 9006-22C locations” on page 63, “9006-12P locations” on page 75, or “9006-22P locations” on page 91 to identify the physical location and removal and replacement procedure.
• P1-DIMMA1 Temp (0x20)
• P1-DIMMA2 Temp (0x21)
• P1-DIMMB1 Temp (0x22)
• P1-DIMMB2 Temp (0x23)
• P1-DIMMC1 Temp (0x24)
• P1-DIMMC2 Temp (0x25)
• P1-DIMMD1 Temp (0x26)
• P1-DIMMD2 Temp (0x27)
• P2-DIMMA1 Temp (0x28)
• P2-DIMMA2 Temp (0x29)
• P2-DIMMB1 Temp (0x2A)
• P2-DIMMB2 Temp (0x2B)
• P2-DIMMC1 Temp (0x2C)
• P2-DIMMC2 Temp (0x2D)
• P2-DIMMD1 Temp (0x2E)
• P2-DIMMD2 Temp (0x2F)
• Transition to Critical from Less Severe
• Transition to Non-recoverable from Less Severe
• Transition to Critical from Non-recoverable
• Lower Non-critical – going low
• Lower Non-critical – going high
• Lower Critical – going low
• Lower Critical – going high
• Lower Non-recoverable – going low
• Lower Non-recoverable – going high
• Upper Non-critical – going low
• Upper Non-critical – going high
• Upper Critical - going low
• Upper Critical - going high
• Upper Non-recoverable – going low
• Upper Non-recoverable – going high
Service action for the Transition events: Ensure that there are no air flow obstructions at the front or at the rear of the system. Ensure that the fans are operating properly.
Service action for the Lower and Upper threshold events: No service action is required.
• CPU Core Temp 25 (0x30)
• CPU Core Temp 26 (0x31)
• CPU Core Temp 27 (0x32)
• CPU Core Temp 28 (0x33)
• CPU Core Temp 29 (0x34)
• CPU Core Temp 30 (0x35)
• CPU Core Temp 31 (0x36)
• CPU Core Temp 32 (0x37)
• CPU Core Temp 33 (0x38)
• CPU Core Temp 34 (0x39)
• CPU Core Temp 35 (0x3A)
• CPU Core Temp 36 (0x3B)
• Lower Non-critical – going low
• Lower Non-critical – going high
• Lower Critical – going low
• Lower Critical – going high
• Lower Non-recoverable – going low
• Lower Non-recoverable – going high
• Upper Non-critical – going low
• Upper Non-critical – going high
• Upper Critical - going low
• Upper Critical - going high
• Upper Non-recoverable – going low
• Upper Non-recoverable – going high
No service action is required.
• CPU Core Temp 37 (0x3C)
• CPU Core Temp 38 (0x3D)
• CPU Core Temp 39 (0x3E)
• CPU Core Temp 40 (0x3F)
• CPU Core Temp 41 (0x40)
• CPU Core Temp 42 (0x41)
• CPU Core Temp 43 (0x42)
• CPU Core Temp 44 (0x43)
• CPU Core Temp 45 (0x44)
• CPU Core Temp 46 (0x45)
• CPU Core Temp 47 (0x46)
• CPU Core Temp 48 (0x47)
• Turbo Allowed (0x48)
• TPM Required (0x49)
• Lower Non-critical – going low
• Lower Non-critical – going high
• Lower Critical – going low
• Lower Critical – going high
• Lower Non-recoverable – going low
• Lower Non-recoverable – going high
• Upper Non-critical – going low
• Upper Non-critical – going high
• Upper Critical - going low
• Upper Critical - going high
• Upper Non-recoverable – going low
• Upper Non-recoverable – going high
• State Deasserted
• State Asserted
No service action is required.
No service action is required.
• SAS Temp (0x4A)
• HDD Temp (0x4B)
• Transition to Critical from Less Severe
• Transition to Non-recoverable from Less Severe
• Transition to Critical from Non-recoverable
• Lower Non-critical – going low
• Lower Non-critical – going high
• Lower Critical – going low
• Lower Critical – going high
• Lower Non-recoverable – going low
• Lower Non-recoverable – going high
• Upper Non-critical – going low
• Upper Non-critical – going high
• Upper Critical - going low
• Upper Critical - going high
• Upper Non-recoverable – going low
• Upper Non-recoverable – going high
Service action for the Transition events: Ensure that the ambient temperature is within operating specifications. Ensure that there are no blockages to the air inlet and outlets. If blockages are found, remove them. Ensure that all of the fans are working properly by looking for serviceable events related to fans and resolving them.
Service action for the Lower and Upper threshold events: No service action is required.
HDD Status (0x4C)
• State Deasserted
• State Asserted
No service action is required.
Total Power (0x4D)
• Lower Non-critical – going low
• Lower Non-critical – going high
• Lower Critical – going low
• Lower Critical – going high
• Lower Non-recoverable – going low
• Lower Non-recoverable – going high
• Upper Non-critical – going low
• Upper Non-critical – going high
• Upper Critical – going low
• Upper Critical – going high
• Upper Non-recoverable – going low
• Upper Non-recoverable – going high
No service action is required.
• CPU1 Power (0x4E)
• CPU2 Power (0x4F)
• Lower Non-critical – going low
• Lower Non-critical – going high
• Lower Critical – going low
• Lower Critical – going high
• Lower Non-recoverable – going low
• Lower Non-recoverable – going high
• Upper Non-critical – going low
• Upper Non-critical – going high
• Upper Critical – going low
• Upper Critical – going high
• Upper Non-recoverable – going low
• Upper Non-recoverable – going high
No service action is required.
• CPU Core Func 25 (0x50)
• CPU Core Func 26 (0x51)
• CPU Core Func 27 (0x52)
• CPU Core Func 28 (0x53)
• CPU Core Func 29 (0x54)
• CPU Core Func 30 (0x55)
• CPU Core Func 31 (0x56)
• CPU Core Func 32 (0x57)
• CPU Core Func 33 (0x58)
• CPU Core Func 34 (0x59)
• CPU Core Func 35 (0x5A)
• CPU Core Func 36 (0x5B)
• IERR
• Transition to Non-recoverable
• Predictive Failure
• FRB1 BIST Failure
• FRB2 Hang In POST Failure
• FRB3 Processor Startup Initialization Failure
Conguration Error
• SMBIOS Uncorrectable CPU Complex Error
• Processor Disabled
• Terminator Presence Detected
• Machine Check Exception
• Correctable Machine Check Error
• State Deasserted
• Device Disabled
• Transition to Critical from Less Severe
• Transition to Non-recoverable from Less Severe
• Transition to Critical from Non-recoverable
• Thermal Trip
• Processor Automatically Throttled
• Processor Presence Detected
• State Asserted
• Device Enabled
• Transition to OK
• Transition to Non-Critical from OK
• Transition to Non-Critical from More Severe
• Monitor
• Informational
Replace system processor CPU 1. Go to “5104-22C or 9006-22C locations” on page 63, “9006-12P locations” on page 75, or “9006-22P locations” on page 91 to identify the physical location and removal and replacement procedure.
No service action is required.
• CPU Core Func 37 (0x5C)
• CPU Core Func 38 (0x5D)
• CPU Core Func 39 (0x5E)
• CPU Core Func 40 (0x5F)
• CPU Core Func 41 (0x60)
• CPU Core Func 42 (0x61)
• CPU Core Func 43 (0x62)
• CPU Core Func 44 (0x63)
• CPU Core Func 45 (0x64)
• CPU Core Func 46 (0x65)
• CPU Core Func 47 (0x66)
• CPU Core Func 48 (0x67)
• IERR
• Transition to Non-recoverable
• Predictive Failure
• FRB1 BIST Failure
• FRB2 Hang In POST Failure
• FRB3 Processor Startup Initialization Failure
Conguration Error
• SMBIOS Uncorrectable CPU Complex Error
• Processor Disabled
• Terminator Presence Detected
• Machine Check Exception
• Correctable Machine Check Error
• State Deasserted
• Device Disabled
• Transition to Critical from Less Severe
• Transition to Non-recoverable from Less Severe
• Transition to Critical from Non-recoverable
• Thermal Trip
• Processor Automatically Throttled
• Processor Presence Detected
• State Asserted
• Device Enabled
• Transition to OK
• Transition to Non-Critical from OK
• Transition to Non-Critical from More Severe
• Monitor
• Informational
Replace system processor CPU 2. Go to “5104-22C or 9006-22C locations” on page 63, “9006-12P locations” on page 75, or “9006-22P locations” on page 91 to identify the physical location and removal and replacement procedure.
No service action is required.
• Freq Limit OT 1 (0x68)
• Mem Thrttl OT 1 (0x6A)
• Freq Limit OT 2 (0x6C)
• Mem Thrttl OT 2 (0x6E)
Performance Met
If Asserted is in the event description, no service action is required. If Deasserted is in the event description, ensure that the ambient temperature is within operating specifications. Ensure that there are no blockages to the air inlet and outlets. If blockages are found, remove them. Ensure that all of the fans are working properly by looking for serviceable events related to fans and resolving them.
Performance Lags
If Deasserted is in the event description, no service action is required. If Asserted is in the event description, ensure that the ambient temperature is within operating specifications. Ensure that there are no blockages to the air inlet and outlets. If blockages are found, remove them. Ensure that all of the fans are working properly by looking for serviceable events related to fans and resolving them.
• Freq Limit Pwr 1 (0x69)
• Freq Limit Pwr 2 (0x6D)
Performance Met
If Asserted is in the event description, no service action is required. If Deasserted is in the event description, ensure that both power supplies are working properly. Search for serviceable events related to system power and voltage and resolve them. Ensure that all fans are working properly by looking for serviceable events related to fans and resolving them.
Performance Lags
If Deasserted is in the event description, no service action is required. If Asserted is in the event description, ensure that both power supplies are working properly. Search for serviceable events related to system power and voltage and resolve them. Ensure that all fans are working properly by looking for serviceable events related to fans and resolving them.
VBAT (0x9C)
• Transition to Critical from Less Severe
• Transition to Non-recoverable from Less Severe
• Transition to Critical from Non-recoverable
• Lower Non-critical – going low
• Lower Non-critical – going high
• Lower Critical – going low
• Lower Critical – going high
• Lower Non-recoverable – going low
• Lower Non-recoverable – going high
• Upper Non-critical – going low
• Upper Non-critical – going high
• Upper Critical – going low
• Upper Critical – going high
• Upper Non-recoverable – going low
• Upper Non-recoverable – going high
Replace the time-of-day battery. Go to “5104-22C or 9006-22C locations” on page 63, “9006-12P locations” on page 75, or “9006-22P locations” on page 91 to identify the physical location and removal and replacement procedure.
No service action is required.
• GPU1 Temp (0xA0)
• GPU2 Temp (0xA1)
• Transition to Critical from Less Severe
• Transition to Non-recoverable from Less Severe
• Transition to Critical from Non-recoverable
• Lower Non-critical – going low
• Lower Non-critical – going high
• Lower Critical – going low
• Lower Critical – going high
• Lower Non-recoverable – going low
• Lower Non-recoverable – going high
• Upper Non-critical – going low
• Upper Non-critical – going high
• Upper Critical – going low
• Upper Critical – going high
• Upper Non-recoverable – going low
• Upper Non-recoverable – going high
• Ensure that there are no air flow obstructions at the front or at the rear of the system.
• Ensure that the fans are operating properly.
No service action is required.
• CPU Core Temp 1 (0xB0)
• CPU Core Temp 2 (0xB1)
• CPU Core Temp 3 (0xB2)
• CPU Core Temp 4 (0xB3)
• CPU Core Temp 5 (0xB4)
• CPU Core Temp 6 (0xB5)
• CPU Core Temp 7 (0xB6)
• CPU Core Temp 8 (0xB7)
• CPU Core Temp 9 (0xB8)
• CPU Core Temp 10 (0xB9)
• CPU Core Temp 11 (0xBA)
• CPU Core Temp 12 (0xBB)
• Lower Non-critical – going low
• Lower Non-critical – going high
• Lower Critical – going low
• Lower Critical – going high
• Lower Non-recoverable – going low
• Lower Non-recoverable – going high
• Upper Non-critical – going low
• Upper Non-critical – going high
• Upper Critical – going low
• Upper Critical – going high
• Upper Non-recoverable – going low
• Upper Non-recoverable – going high
No service action is required.
• CPU Core Temp 13 (0xBC)
• CPU Core Temp 14 (0xBD)
• CPU Core Temp 15 (0xBE)
• CPU Core Temp 16 (0xBF)
• CPU Core Temp 17 (0xC0)
• CPU Core Temp 18 (0xC1)
• CPU Core Temp 19 (0xC2)
• CPU Core Temp 20 (0xC3)
• CPU Core Temp 21 (0xC4)
• CPU Core Temp 22 (0xC5)
• CPU Core Temp 23 (0xC6)
• CPU Core Temp 24 (0xC7)
• Lower Non-critical – going low
• Lower Non-critical – going high
• Lower Critical – going low
• Lower Critical – going high
• Lower Non-recoverable – going low
• Lower Non-recoverable – going high
• Upper Non-critical – going low
• Upper Non-critical – going high
• Upper Critical – going low
• Upper Critical – going high
• Upper Non-recoverable – going low
• Upper Non-recoverable – going high
No service action is required.
• CPU Core Func 1 (0xC8)
• CPU Core Func 2 (0xC9)
• CPU Core Func 3 (0xCA)
• CPU Core Func 4 (0xCB)
• CPU Core Func 5 (0xCC)
• CPU Core Func 6 (0xCD)
• CPU Core Func 7 (0xCE)
• CPU Core Func 8 (0xCF)
• CPU Core Func 9 (0xD0)
• CPU Core Func 10 (0xD1)
• CPU Core Func 11 (0xD2)
• CPU Core Func 12 (0xD3)
• IERR
• Transition to Non-recoverable
• Predictive Failure
• FRB1 BIST Failure
• FRB2 Hang In POST Failure
• FRB3 Processor Startup Initialization Failure
Conguration Error
• SMBIOS Uncorrectable CPU Complex Error
• Processor Disabled
• Terminator Presence Detected
• Machine Check Exception
• Correctable Machine Check Error
• State Deasserted
• Device Disabled
• Transition to Critical from Less Severe
• Transition to Non-recoverable from Less Severe
• Transition to Critical from Non-recoverable
• Thermal Trip
• Processor Automatically Throttled
• Processor Presence Detected
• State Asserted
• Device Enabled
• Transition to OK
• Transition to Non-Critical from OK
• Transition to Non-Critical from More Severe
• Monitor
• Informational
Replace system processor CPU 1. Go to “5104-22C or 9006-22C locations” on page 63, “9006-12P locations” on page 75, or “9006-22P locations” on page 91 to identify the physical location and removal and replacement procedure.
No service action is required.
• CPU Core Func 13 (0xD4)
• CPU Core Func 14 (0xD5)
• CPU Core Func 15 (0xD6)
• CPU Core Func 16 (0xD7)
• CPU Core Func 17 (0xD8)
• CPU Core Func 18 (0xD9)
• CPU Core Func 19 (0xDA)
• CPU Core Func 20 (0xDB)
• CPU Core Func 21 (0xDC)
• CPU Core Func 22 (0xDD)
• CPU Core Func 23 (0xDE)
• CPU Core Func 24 (0xDF)
• IERR
• Transition to Non-recoverable
• Predictive Failure
• FRB1 BIST Failure
• FRB2 Hang In POST Failure
• FRB3 Processor Startup Initialization Failure
Conguration Error
• SMBIOS Uncorrectable CPU Complex Error
• Processor Disabled
• Terminator Presence Detected
• Machine Check Exception
• Correctable Machine Check Error
• State Deasserted
• Device Disabled
• Transition to Critical from Less Severe
• Transition to Non-recoverable from Less Severe
• Transition to Critical from Non-recoverable
• Thermal Trip
• Processor Automatically Throttled
• Processor Presence Detected
• State Asserted
• Device Enabled
• Transition to OK
• Transition to Non-Critical from OK
• Transition to Non-Critical from More Severe
• Monitor
• Informational
Replace system processor CPU 2. Go to “5104-22C or 9006-22C locations” on page 63, “9006-12P locations” on page 75, or “9006-22P locations” on page 91 to identify the physical location and removal and replacement procedure.
No service action is required.
MB_10G Temp (0xE0)
• Transition to Critical from Less Severe
• Transition to Non-recoverable from Less Severe
• Transition to Critical from Non-recoverable
• Lower Non-critical – going low
• Lower Non-critical – going high
• Lower Critical – going low
• Lower Critical – going high
• Lower Non-recoverable – going low
• Lower Non-recoverable – going high
• Upper Non-critical – going low
• Upper Non-critical – going high
• Upper Critical – going low
• Upper Critical – going high
• Upper Non-recoverable – going low
• Upper Non-recoverable – going high
Ensure that there are no air flow obstructions at the front or at the rear of the system. Ensure that the fans are operating properly.
No service action is required.
NVMe_SSD Temp (0xE1)
• Transition to Critical from Less Severe
• Transition to Non-recoverable from Less Severe
• Transition to Critical from Non-recoverable
• Lower Non-critical – going low
• Lower Non-critical – going high
• Lower Critical – going low
• Lower Critical – going high
• Lower Non-recoverable – going low
• Lower Non-recoverable – going high
• Upper Non-critical – going low
• Upper Non-critical – going high
• Upper Critical – going low
• Upper Critical – going high
• Upper Non-recoverable – going low
• Upper Non-recoverable – going high
Ensure that there are no air flow obstructions at the front or at the rear of the system. Ensure that the fans are operating properly.
No service action is required.
Chassis Intru (0xE2)
• Drive Bay intrusion
• I/O Card area intrusion
• Processor area intrusion
• System unplugged from LAN
• Unauthorized dock
• FAN area intrusion
No service action is required.
General Chassis intrusion
Ensure that the top cover is properly installed on the system. See Installing the service access cover on an 5104-22C, 9006-22C, or 9006-22P system or Installing the service access cover on an 9006-12P system.
• FAN1 (0xE3)
• FAN2 (0xE4)
• FAN3 (0xE5)
• FAN4 (0xE6)
• FAN5 (0xE7)
• FAN6 (0xE8)
• FAN7 (0xE9)
• FAN8 (0xEA)
• Transition to Critical from Less Severe
• Transition to Non-recoverable from Less Severe
• Transition to Critical from Non-recoverable
• Lower Non-critical – going low
• Lower Non-critical – going high
• Lower Critical – going low
• Lower Critical – going high
• Lower Non-recoverable – going low
• Lower Non-recoverable – going high
• Upper Non-critical – going low
• Upper Non-critical – going high
• Upper Critical – going low
• Upper Critical – going high
• Upper Non-recoverable – going low
• Upper Non-recoverable – going high
If the sensor name is FAN1, FAN4, FAN5, or FAN8, no service action is required. If the sensor name is FAN2, replace Fan 2. If the sensor name is FAN3, replace Fan 3. If the sensor name is FAN6, replace Fan 6. If the sensor name is FAN7, replace Fan 7. Go to “5104-22C or 9006-22C locations” on page 63, “9006-12P locations” on page 75, or “9006-22P locations” on page 91 to identify the physical location and removal and replacement procedure.
• Device Inserted/Device Present
No service action is required.
• Device Removed/Device Absent
• Transition to degraded
• Install error
• Redundancy lost
• Non-redundant insufficient resources
Ensure that all fans are seated securely. Go to “5104-22C or 9006-22C locations” on page 63, “9006-12P locations” on page 75, or “9006-22P locations” on page 91 to identify the physical location and removal and replacement procedure.
• PS1 Status (0xF3)
• PS2 Status (0xF4)
• Predictive Failure
• Power Supply Input Out of Range But Present
If the sensor name is PS1 Status, replace PSU 1. If the sensor name is PS2 Status, replace PSU 2. Go to “5104-22C or 9006-22C locations” on page 63, “9006-12P locations” on page 75, or “9006-22P locations” on page 91 to identify the physical location and removal and replacement procedure.
Power Supply Failure Detected
An assert event immediately followed by a deassert event indicates that a power cycle of the system occurred. No service action is required. If there is no deassert event immediately following the assert event, replace the power supply. If the sensor name is PS1 Status, replace PSU 1. If the sensor name is PS2 Status, replace PSU 2. Go to “5104-22C or 9006-22C locations” on page 63, “9006-12P locations” on page 75, or “9006-22P locations” on page 91 to identify the physical location and removal and replacement procedure.
• Power Supply Input Lost or AC DC
• Power Supply Input Lost Or Out Of Range
• State Deasserted
• State Asserted
• Presence Detected
• Ensure that AC power is supplied to the rack.
• Ensure that the system power cords are plugged tightly into both the power supply and the rack power distribution unit (PDU) for both system power supplies.
No service action is required.
Watchdog (0xFF)
• Timer Expired
• Reserved1
• Reserved2
• Reserved3
• Reserved4
• Hard Reset
• Power Down
• Power Cycle
• Timer Interrupt
No service action is required.
Search for serviceable SEL events that have a time stamp close to the time stamp of this SEL event. If you find a serviceable SEL event, perform the service action that is indicated in this table for the SEL event. If you cannot boot the system to the Petitboot menu, go to “Resolving a system firmware boot failure” on page 5.

Isolation procedures

Use this information to isolate problems that might occur with your system.

EPUB_PRC_FIND_DECONFIGURE_PART isolation procedure

A part vital to the system has been deconfigured.
Procedure
1. Use the ipmitool command to examine system event logs (SELs).
• To list SELs by using an in-band network, use the following command:
ipmitool sel elist
• To list SELs remotely over the LAN, use the following command:
ipmitool -I lanplus -U <username> -P <password> -H <BMC IP address or BMC hostname> sel elist
2. Identify all SELs with the value OEM record df and Correctable Machine Check Error or Transition to Non-recoverable in the description. Did you find one or more such SELs?
If Then
Yes: Continue with the next step.
No: Go to “Contacting IBM service and support” on page 61. This ends the procedure.
3. For each of the SELs that you identified in step “2” on page 49, determine the sensor name that is associated with each SEL. Replace the following items, one at a time, until the problem is resolved:
Note:
• If your system is a 5104-22C or 9006-22C, go to “5104-22C or 9006-22C locations” on page 63 to identify the physical location and removal and replacement procedure.
• If your system is a 9006-12P, go to “9006-12P locations” on page 75 to identify the physical location and the removal and replacement procedure.
• If your system is a 9006-22P, go to “9006-22P locations” on page 91 to identify the physical location and the removal and replacement procedure.
• If the sensor name is CPU Func 1 or CPU Core Func x, where x is 1 - 12, replace system processor CPU 1.
• If the sensor name is CPU Func 2 or CPU Core Func x, where x is 13 - 24, replace system processor CPU 2.
Does the problem persist?
If Then
Yes: Replace the system backplane. If the replacement of the system backplane does not resolve the problem, go to “Contacting IBM service and support” on page 61. This ends the procedure.
No: This ends the procedure.
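The sensor-name rules in step 3 can be sketched as a small lookup. The following Python sketch is illustrative only and is not part of the IBM procedure: the function name processor_for_sensor and the CORE_RANGES table are hypothetical. Cores 1 - 24 follow step 3, and cores 25 - 48 follow the CPU Core Func rows of Table 10.

```python
import re

# Core-number ranges and the processor to replace.
# Cores 1-24 come from step 3 of this procedure; cores 25-48 come from
# the CPU Core Func rows of Table 10. The table name is hypothetical.
CORE_RANGES = [
    (range(1, 13), "CPU 1"),
    (range(13, 25), "CPU 2"),
    (range(25, 37), "CPU 1"),
    (range(37, 49), "CPU 2"),
]


def processor_for_sensor(sensor_name):
    """Return 'CPU 1' or 'CPU 2' for a sensor name, or None if unrecognized."""
    name = sensor_name.strip()

    # "CPU Func 1" and "CPU Func 2" name the processor directly.
    match = re.fullmatch(r"CPU Func (\d+)", name)
    if match:
        return {"1": "CPU 1", "2": "CPU 2"}.get(match.group(1))

    # "CPU Core Func x" maps to a processor through the core number.
    match = re.fullmatch(r"CPU Core Func (\d+)", name)
    if match:
        core = int(match.group(1))
        for cores, cpu in CORE_RANGES:
            if core in cores:
                return cpu

    return None
```

A sensor name that the sketch does not recognize returns None, which corresponds to the "No" branch of step 2 (contact IBM service and support) rather than to a guessed replacement.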

EPUB_PRC_SP_CODE isolation procedure

A problem was detected in the system firmware.
About this task
Update the system firmware image. Go to Getting fixes and update the system firmware with the most recent level of firmware. Then, reboot the system. If the system firmware update does not resolve the problem, go to “Contacting IBM service and support” on page 61. This ends the procedure.

EPUB_PRC_PHYP_CODE isolation procedure

A problem was detected in the system firmware.
About this task
Update the system firmware image. Go to Getting fixes and update the system firmware with the most recent level of firmware. Then, reboot the system. If the system firmware update does not resolve the problem, go to “Contacting IBM service and support” on page 61. This ends the procedure.

EPUB_PRC_ALL_PROCS isolation procedure

A problem was detected with a system processor.
About this task
Use the following table to determine the service action:
Table 11. EPUB_PRC_ALL_PROCS service actions
System Service action
5104-22C or 9006-22C
Replace the following items, one at a time, in the order that is shown until the problem is resolved:
1. System processor CPU 1
2. System processor CPU 2
3. System backplane
Go to “5104-22C or 9006-22C locations” on page 63 to identify the physical location and removal and replacement procedure. If the replacement of the system processors and the system backplane does not resolve the problem, go to “Contacting IBM service and support” on page 61. This ends the procedure.
9006-12P
Replace the following items, one at a time, in the order that is shown until the problem is resolved:
1. System processor CPU 1
2. System processor CPU 2
3. System backplane
Go to “9006-12P locations” on page 75 to identify the physical location and removal and replacement procedure. If the replacement of the system processors and the system backplane does not resolve the problem, go to “Contacting IBM service and support” on page 61. This ends the procedure.
9006-22P
Replace the following items, one at a time, in the order that is shown until the problem is resolved:
1. System processor CPU 1
2. System processor CPU 2
3. System backplane
Go to “9006-22P locations” on page 91 to identify the physical location and removal and replacement procedure. If the replacement of the system processors and the system backplane does not resolve the problem, go to “Contacting IBM service and support” on page 61. This ends the procedure.

EPUB_PRC_ALL_MEMCRDS isolation procedure

A problem was detected with a memory DIMM, but it cannot be isolated to a specific memory DIMM.
Procedure
1. Use the ipmitool command to examine system event logs (SELs).
• To list SELs by using an in-band network, use the following command:
ipmitool sel elist
• To list SELs remotely over the LAN, use the following command:
ipmitool -I lanplus -U <username> -P <password> -H <BMC IP address or BMC hostname> sel elist
2. Identify all SELs with the value OEM record df and Transition to Non-recoverable in the description. Did you find one or more such SELs?
If Then
Yes: Continue with the next step.
No: Go to “Contacting IBM service and support” on page 61. This ends the procedure.
3. For each of the SELs that you identified in step “2” on page 52, determine the sensor name that is associated with each SEL. Replace the following items, one at a time, until the problem is resolved:
Note:
• If your system is a 5104-22C or 9006-22C, go to “5104-22C or 9006-22C locations” on page 63 to identify the physical location and removal and replacement procedure.
• If your system is a 9006-12P, go to “9006-12P locations” on page 75 to identify the physical location and the removal and replacement procedure.
• If your system is a 9006-22P, go to “9006-22P locations” on page 91 to identify the physical location and the removal and replacement procedure.
• If the sensor name is Membuf Func x, replace the system backplane.
• If the sensor name is P1-DIMMA1 Func, replace P1-DIMMA1. If the sensor name is P1-DIMMA2 Func, replace P1-DIMMA2. Follow the same pattern for the other DIMM sensor names.
Does the problem persist?
If Then
Yes: If you have not already done so, replace the system backplane. If the replacement of the system backplane does not resolve the problem, go to “Contacting IBM service and support” on page 61. This ends the procedure.
No: This ends the procedure.
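The memory sensor rules in step 3 can be sketched in the same way. This Python sketch is illustrative only: the function name part_for_memory_sensor is hypothetical, and the regular expression that generalizes P1-DIMMA1 and P1-DIMMA2 to other DIMM names is an assumption about the naming pattern, not something the procedure states.

```python
import re


def part_for_memory_sensor(sensor_name):
    """Return the part to replace for a memory sensor name, or None."""
    name = sensor_name.strip()

    # "Membuf Func x" points at the system backplane (step 3).
    if re.fullmatch(r"Membuf Func \S+", name):
        return "system backplane"

    # "P1-DIMMA1 Func" points at the DIMM named in the sensor.
    # Assumed pattern: P<digits>-DIMM<letter><digits>, based only on the
    # two examples given in step 3.
    match = re.fullmatch(r"(P\d+-DIMM[A-Z]\d+) Func", name)
    if match:
        return match.group(1)

    return None
```

If replacing the indicated DIMM does not resolve the problem, the procedure continues with the system backplane; the sketch covers only the first lookup.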

EPUB_PRC_LVL_SUPPORT isolation procedure

Contact your next level of support for assistance.
About this task
Go to “Contacting IBM service and support” on page 61.

EPUB_PRC_MEMORY_PLUGGING_ERROR isolation procedure

Memory DIMMs are plugged in a configuration that is not valid.

EPUB_PRC_FSI_PATH isolation procedure

The system detected an error with the FSI path.
About this task
Use the following table to determine the service action:
Table 12. EPUB_PRC_FSI_PATH service actions
System Service action
5104-22C or 9006-22C
Replace the following items, one at a time, in the order that is shown until the problem is resolved:
1. System processor CPU 1
2. System processor CPU 2
3. System backplane
Go to “5104-22C or 9006-22C locations” on page 63 to identify the physical location and removal and replacement procedure. If the replacement of the system processors and the system backplane does not resolve the problem, go to “Contacting IBM service and support” on page 61. This ends the procedure.
9006-12P
Replace the following items, one at a time, in the order that is shown until the problem is resolved:
1. System processor CPU 1
2. System processor CPU 2
3. System backplane
Go to “9006-12P locations” on page 75 to identify the physical location and removal and replacement procedure. If the replacement of the system processors and the system backplane does not resolve the problem, go to “Contacting IBM service and support” on page 61. This ends the procedure.
9006-22P
Replace the following items, one at a time, in the order that is shown until the problem is resolved:
1. System processor CPU 1
2. System processor CPU 2
3. System backplane
Go to “9006-22P locations” on page 91 to identify the physical location and removal and replacement procedure. If the replacement of the system processors and the system backplane does not resolve the problem, go to “Contacting IBM service and support” on page 61. This ends the procedure.

EPUB_PRC_PROC_AB_BUS isolation procedure

A diagnostic function detected an external processor interface problem.
About this task
Use the following table to determine the service action:
Table 13. EPUB_PRC_PROC_AB_BUS service actions
System Service action
5104-22C or 9006-22C
Replace the system backplane. If replacing the system backplane does not resolve the problem, replace system processor CPU 1. If replacing system processor CPU 1 does not resolve the problem, replace system processor CPU 2. Go to “5104-22C or 9006-22C locations” on page 63 to identify the physical location and removal and replacement procedure.
If replacing the system backplane and both system processors does not resolve the problem, go to “Contacting IBM service and support” on page 61. This ends the procedure.
9006-12P
Replace the system backplane. If replacing the system backplane does not resolve the problem, replace system processor CPU 1. If replacing system processor CPU 1 does not resolve the problem, replace system processor CPU 2. Go to “9006-12P locations” on page 75 to identify the physical location and removal and replacement procedure.
If replacing the system backplane and both system processors does not resolve the problem, go to “Contacting IBM service and support” on page 61. This ends the procedure.
9006-22P
Replace the system backplane. If replacing the system backplane does not resolve the problem, replace system processor CPU 1. If replacing system processor CPU 1 does not resolve the problem, replace system processor CPU 2. Go to “9006-22P locations” on page 91 to identify the physical location and removal and replacement procedure.
If replacing the system backplane and both system processors does not resolve the problem, go to “Contacting IBM service and support” on page 61. This ends the procedure.

EPUB_PRC_PROC_XYZ_BUS isolation procedure

A diagnostic function detected an internal processor interface problem.
About this task
Use the following table to determine the service action:
Table 14. EPUB_PRC_PROC_XYZ_BUS service actions
System Service action
5104-22C or 9006-22C
Replace system processor CPU 1. If replacing system processor CPU 1 does not resolve the problem, replace system processor CPU 2. If replacing both system processors does not resolve the problem, replace the system backplane. Go to “5104-22C or 9006-22C locations” on page 63 to identify the physical location and removal and replacement procedure.
If replacing the system backplane and both system processors does not resolve the problem, go to “Contacting IBM service and support” on page 61. This ends the procedure.
9006-12P
Replace system processor CPU 1. If replacing system processor CPU 1 does not resolve the problem, replace system processor CPU 2. If replacing both system processors does not resolve the problem, replace the system backplane. Go to “9006-12P locations” on page 75 to identify the physical location and removal and replacement procedure.
If replacing the system backplane and both system processors does not resolve the problem, go to “Contacting IBM service and support” on page 61. This ends the procedure.
9006-22P
Replace system processor CPU 1. If replacing system processor CPU 1 does not resolve the problem, replace system processor CPU 2. If replacing both system processors does not resolve the problem, replace the system backplane. Go to “9006-22P locations” on page 91 to identify the physical location and removal and replacement procedure.
If replacing the system backplane and both system processors does not resolve the problem, go to “Contacting IBM service and support” on page 61. This ends the procedure.

EPUB_PRC_EIBUS_ERROR isolation procedure

A bus error occurred.
Procedure
1. Use the ipmitool command to examine system event logs (SELs).
• To list SELs by using an in-band network, use the following command:
ipmitool sel elist
• To list SELs remotely over the LAN, use the following command:
Beginning troubleshooting and problem analysis
55
ipmitool -I lanplus -U <username> -P <password> -H <BMC IP addres or BMC hostname> sel elist
2. Identify all SELs with the value OEM record df and Correctable Machine Check Error or Transition to Non-recoverable in the description. Did you find one or more SELs with the value OEM record df and Correctable Machine Check Error or Transition to Non-recoverable in the description?
If Then
Yes: Continue with the next step.
No: Go to “Contacting IBM service and support” on page 61. This ends the procedure.
3. For each of the SELs that you identified in step “2” on page 56, determine the sensor name that is associated with each SEL. Replace the following items, one at a time, until the problem is resolved:
Note:
• If your system is a 5104-22C or 9006-22C, go to “5104-22C or 9006-22C locations” on page 63 to identify the physical location and removal and replacement procedure.
• If your system is a 9006-12P, go to “9006-12P locations” on page 75 to identify the physical location and the removal and replacement procedure.
• If your system is a 9006-22P, go to “9006-22P locations” on page 91 to identify the physical location and the removal and replacement procedure.
• If the sensor name is CPU Func 1 or CPU Core Func x, where x is 1 - 12, replace system processor CPU 1.
• If the sensor name is CPU Func 2 or CPU Core Func x, where x is 13 - 24, replace system processor CPU 2.
Does the problem persist?
If Then
Yes: Replace the system backplane. If the replacement of the system backplane does not resolve the problem, go to “Contacting IBM service and support” on page 61. This ends the procedure.
No: This ends the procedure.
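The SEL filter in step 2 and the sensor-name rule in step 3 can be sketched in Python as follows. This is only an illustration: the function names are hypothetical, and the filter assumes nothing about the layout of an ipmitool SEL line beyond the presence of the strings that the procedure names.

```python
import re

# Description strings named in step 2 of the procedure.
SEL_DESCRIPTIONS = (
    "Correctable Machine Check Error",
    "Transition to Non-recoverable",
)


def is_relevant_sel(line):
    """Return True if a SEL line has 'OEM record df' and a step 2 description."""
    return "OEM record df" in line and any(d in line for d in SEL_DESCRIPTIONS)


def processor_for_sensor(sensor_name):
    """Map a sensor name to the processor to replace, or None if no rule applies."""
    name = sensor_name.strip()
    if name == "CPU Func 1":
        return "CPU 1"
    if name == "CPU Func 2":
        return "CPU 2"
    match = re.fullmatch(r"CPU Core Func (\d+)", name)
    if match:
        core = int(match.group(1))
        if 1 <= core <= 12:
            return "CPU 1"
        if 13 <= core <= 24:
            return "CPU 2"
    return None
```

A sensor name that matches neither rule returns None, so the caller can fall back to the written procedure instead of guessing.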

EPUB_PRC_POWER_ERROR isolation procedure

A power problem occurred.
About this task
Perform the service action that is indicated for any system event logs that are related to power and occurred prior to the problem that you are working on. Go to “Identifying a service action by using system event logs” on page 18. This ends the procedure.

EPUB_PRC_MEMORY_UE isolation procedure

An uncorrectable memory problem occurred.
Procedure
1. Look for system event logs that are related to memory and occurred around the same time as the problem that you are working on. Go to “Identifying a service action by using system event logs” on page 18. Did you find any system event logs that are related to memory?
If Then
Yes: Perform the service actions that are indicated for the system event logs that are related to memory. This ends the procedure.
No: Continue with the next step.
2. Use the following table to determine the service action:
Table 15. EPUB_PRC_MEMORY_UE service actions
System Service action
5104-22C or 9006-22C Replace system processor CPU 1. If replacing system processor CPU 1 does not resolve the problem, replace system processor CPU 2. Go to “5104-22C or 9006-22C locations” on page 63 to identify the physical location and removal and replacement procedure. This ends the procedure.
9006-12P Replace system processor CPU 1. If replacing system processor CPU 1 does not resolve the problem, replace system processor CPU 2. Go to “9006-12P locations” on page 75 to identify the physical location and removal and replacement procedure. This ends the procedure.
9006-22P Replace system processor CPU 1. If replacing system processor CPU 1 does not resolve the problem, replace system processor CPU 2. Go to “9006-22P locations” on page 91 to identify the physical location and removal and replacement procedure. This ends the procedure.

EPUB_PRC_HB_CODE isolation procedure

The service processor detected a problem during the early boot process.
Procedure
1. Update the system firmware image. Go to Getting fixes and update the system firmware with the most recent level of firmware. Then, reboot the system. Does the problem persist?
If Then
Yes: Continue with the next step.
No: This ends the procedure.
2. Use the ipmitool command to examine system event logs (SELs).
• To list SELs by using an in-band network, use the following command:
ipmitool sel elist
• To list SELs remotely over the LAN, use the following command:
ipmitool -I lanplus -U <username> -P <password> -H <BMC IP address or BMC hostname> sel elist
3. Identify all SELs with the value OEM record df and Correctable Machine Check Error or Transition to Non-recoverable in the description. Did you find one or more SELs with the value OEM record df and Correctable Machine Check Error or Transition to Non-recoverable in the description?
If Then
Yes: Continue with the next step.
No: Go to “Contacting IBM service and support” on page 61. This ends the procedure.
4. For each of the SELs that you identified in step “3” on page 58, determine the sensor name that is associated with each SEL. Replace the following items, one at a time, until the problem is resolved:
Note:
• If your system is a 5104-22C or 9006-22C, go to “5104-22C or 9006-22C locations” on page 63 to identify the physical location and removal and replacement procedure.
• If your system is a 9006-12P, go to “9006-12P locations” on page 75 to identify the physical location and the removal and replacement procedure.
• If your system is a 9006-22P, go to “9006-22P locations” on page 91 to identify the physical location and the removal and replacement procedure.
• If the sensor name is CPU Func 1 or CPU Core Func x, where x is 1 - 12, replace system processor CPU 1.
• If the sensor name is CPU Func 2 or CPU Core Func x, where x is 13 - 24, replace system processor CPU 2.
Does the problem persist?
If Then
Yes: Replace the system backplane. If the replacement of the system backplane does not resolve the problem, go to “Contacting IBM service and support” on page 61. This ends the procedure.
No: This ends the procedure.
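The two ipmitool invocations used in this procedure differ only in the LAN options. A minimal Python sketch that builds either argument list is shown here. The function name and the example host name are hypothetical; the ipmitool options are the ones given in the procedure.

```python
def sel_elist_command(host=None, username=None, password=None):
    """Return the ipmitool argument list for listing SELs.

    With no host, the in-band form is returned. With a host, the
    remote form over the LAN (lanplus interface) is returned.
    """
    if host is None:
        return ["ipmitool", "sel", "elist"]
    if username is None or password is None:
        raise ValueError("remote access needs a username and a password")
    return [
        "ipmitool", "-I", "lanplus",
        "-U", username, "-P", password,
        "-H", host,
        "sel", "elist",
    ]


# Hypothetical usage: run the command and capture the SEL listing.
# import subprocess
# cmd = sel_elist_command("bmc.example.com", "<username>", "<password>")
# output = subprocess.run(cmd, capture_output=True, text=True, check=True).stdout
```

Passing the arguments as a list avoids shell quoting problems when the password contains special characters.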

EPUB_PRC_TOD_CLOCK_ERR isolation procedure

A diagnostic function detected a problem with the time of day or clock function.
About this task
Use the following table to determine the service action:
Table 16. EPUB_PRC_TOD_CLOCK_ERR service actions
System Service action
5104-22C or 9006-22C Replace the system backplane. If replacing the system backplane does not resolve the problem, replace system processor CPU 1. If replacing system processor CPU 1 does not resolve the problem, replace system processor CPU 2. Go to “5104-22C or 9006-22C locations” on page 63 to identify the physical location and removal and replacement procedure. If replacing the system backplane and both system processors does not resolve the problem, go to “Contacting IBM service and support” on page 61. This ends the procedure.
9006-12P Replace the system backplane. If replacing the system backplane does not resolve the problem, replace system processor CPU 1. If replacing system processor CPU 1 does not resolve the problem, replace system processor CPU 2. Go to “9006-12P locations” on page 75 to identify the physical location and removal and replacement procedure. If replacing the system backplane and both system processors does not resolve the problem, go to “Contacting IBM service and support” on page 61. This ends the procedure.
9006-22P Replace the system backplane. If replacing the system backplane does not resolve the problem, replace system processor CPU 1. If replacing system processor CPU 1 does not resolve the problem, replace system processor CPU 2. Go to “9006-22P locations” on page 91 to identify the physical location and removal and replacement procedure. If replacing the system backplane and both system processors does not resolve the problem, go to “Contacting IBM service and support” on page 61. This ends the procedure.
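Tables 14 and 16 list the same three parts in a different replacement order, and both end by contacting IBM service and support. As a hedged sketch, the order can be held as data and walked one part at a time. The dictionary and function names are hypothetical, and the order is the same for every system type listed in the tables.

```python
# Replacement order per isolation procedure, taken from Table 14 and Table 16.
REPLACEMENT_ORDER = {
    "EPUB_PRC_PROC_XYZ_BUS": [
        "system processor CPU 1",
        "system processor CPU 2",
        "system backplane",
    ],
    "EPUB_PRC_TOD_CLOCK_ERR": [
        "system backplane",
        "system processor CPU 1",
        "system processor CPU 2",
    ],
}


def next_action(procedure, already_replaced):
    """Return the next service action, given the parts replaced so far."""
    for part in REPLACEMENT_ORDER[procedure]:
        if part not in already_replaced:
            return "Replace " + part
    return "Contact IBM service and support"
```

For example, next_action("EPUB_PRC_TOD_CLOCK_ERR", []) returns "Replace system backplane", because that table starts with the backplane instead of the processors.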

EPUB_PRC_COOLING_SYSTEM_ERR isolation procedure

One or more processor sensors detected an over temperature condition.
About this task
To resolve the over temperature condition, complete the following steps:
Procedure
1. Is the room temperature less than 35°C (95°F)?
If Then
No: Bring the room temperature to within the allowable operating range. This ends the procedure.
Yes: Continue with the next step.
2. Are the system front and rear doors free of obstructions?
If Then
No: The system must be free of obstructions for proper air flow. Remove any obstructions. This ends the procedure.
Yes: Continue with the next step.
3. Perform the service action that is indicated for any system event logs that are related to fans and occurred prior to the problem that you are working on. Go to “Identifying a service action by using system event logs” on page 18. This ends the procedure.
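The three checks above form a short decision chain, which can be sketched as a single Python function. The function and constant names are hypothetical; the 35°C limit and the order of the checks come from the procedure.

```python
MAX_ROOM_TEMP_C = 35.0  # step 1: the room temperature must be less than 35 degrees C


def cooling_next_step(room_temp_c, doors_unobstructed):
    """Return the action that the cooling isolation procedure calls for."""
    if not room_temp_c < MAX_ROOM_TEMP_C:
        return "Bring the room temperature to within the allowable operating range."
    if not doors_unobstructed:
        return "Remove any obstructions from the system front and rear doors."
    return "Perform the service action indicated for fan-related system event logs."
```

Note that a reading of exactly 35°C does not pass the first check, because the procedure asks for a temperature that is less than 35°C.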

Verifying a repair

Learn how to verify hardware operation after you make repairs to the system.
Procedure
1. Power on the system.
2. Did you replace a PCIe adapter or device?
If Then
Yes: Go to step “5” on page 60.
No: Continue with the next step.
3. Scan the system event logs (SELs) for serviceable events that occurred after system hardware was replaced. For information about SELs that require a service action, see “Identifying a service action by using system event logs” on page 18.
4. Did any serviceable SEL events occur after hardware was replaced?
If Then
Yes: The problem is not resolved. Go to “Identifying a service action by using system event logs” on page 18 and complete the service actions indicated. This ends the procedure.
No: The problem is resolved. This ends the procedure.
5. Use the following table to determine the verification action to complete:
Table 17. Determining a verification action for PCIe adapters and devices
Adapter type Verification action
Devices that are not controlled by a RAID adapter If the device is a SAS or SATA drive, complete the following steps:
a. Install the arcconf utility.
b. Type arcconf getsmartstats 1 and press Enter.
c. Verify that the SMART health assessment passed.
Network adapter Complete the following steps:
a. At the command line, type ethtool ethx, where x is the number of the physical port that you are testing. Verify that the connection speed that is indicated in the output is correct.
b. Perform a ping test to verify the network connectivity.

Collecting diagnostic data

Learn how to collect diagnostic data to send to IBM service and support.
About this task
To collect diagnostic data, complete the following steps:
Procedure
1. Is the operating system available?
If Then
Yes: Continue with step “2” on page 61.
No: Continue with step “3” on page 61.
2. To collect diagnostic data from the operating system, complete the following steps:
a) Log in as root user.
b) At the command prompt, type sosreport and press Enter.
c) You are prompted for additional information. When the command is complete, the location of the output file is displayed. Note the location of the output file. Then, continue with the next step.
3. To collect system event logs, complete the following steps:
a) Go to the IBM Support Portal (http://www.ibm.com/support/entry/portal/support).
b) In the search field, enter your machine type and model. Then, click the correct product support entry for your system.
c) From the Downloads list, click the Scale-out LC System Event Log Collection Tool for your machine type and model.
d) Follow the instructions to install and run the system event log collection tool. Then, continue with the next step.
4. Send the data that you collected during this procedure to IBM service and support. This ends the procedure.

Contacting IBM service and support

You can contact IBM service and support by telephone or through the IBM Support Portal.
Before you contact IBM service and support, go to “Beginning troubleshooting and problem analysis” on page 1 and complete all of the service actions indicated. If the service actions do not resolve the problem, or if you are directed to contact support, go to “Collecting diagnostic data” on page 60. Then, use the information below to contact IBM service and support.
Customers in the United States, United States territories, or Canada can place a hardware service request online. To place a hardware service request online, go to the IBM Support Portal (http://www.ibm.com/support/entry/portal/product/power/scale-out_lc).
For up-to-date telephone contact information, go to the Directory of worldwide contacts website (www.ibm.com/planetwide/).
Table 18. Service and support contacts
Type of problem Call
• Advice
• Migrating
• "How to"
• Operating
• Configuring
• Ordering
• Performance
• General information
Call:
• 1-800-IBM-CALL (1–800–426–2255)
• 1-800-IBM-4YOU (1–800–426–4968)
Software:
• Fix information
• Operating system problem
• IBM application program
• Loop, hang, or message
Hardware:
• IBM system hardware broken
• Hardware reference code
• IBM input/output (I/O) problem
• Upgrade
Call: 1-800-IBM-SERV (1–800–426–7378)

Finding parts and locations

Locate physical part locations and identify parts with system diagrams.
Locate the FRU
Use the graphics and tables to locate the field-replaceable unit (FRU) and identify the FRU part number.

5104-22C or 9006-22C locations

Use this information to find the location of a FRU in the system unit.
Rack views
The following diagrams show field-replaceable unit (FRU) layouts in the system. Use these diagrams with the following tables.
Figure 1. Front view
Table 19. Front view locations
Index number FRU description FRU removal and replacement procedures
1 HDD 0 See Removing and replacing a disk drive in the 5104-22C or 9006-22C.
2 HDD 1 See Removing and replacing a disk drive in the 5104-22C or 9006-22C.
3 HDD 2 See Removing and replacing a disk drive in the 5104-22C or 9006-22C.
4 HDD 3 See Removing and replacing a disk drive in the 5104-22C or 9006-22C.
5 HDD 4 See Removing and replacing a disk drive in the 5104-22C or 9006-22C.
6 HDD 5 See Removing and replacing a disk drive in the 5104-22C or 9006-22C.
7 HDD 6 See Removing and replacing a disk drive in the 5104-22C or 9006-22C.
8 HDD 7 See Removing and replacing a disk drive in the 5104-22C or 9006-22C.
9 HDD 8 See Removing and replacing a disk drive in the 5104-22C or 9006-22C.
10 HDD 9 See Removing and replacing a disk drive in the 5104-22C or 9006-22C.
11 HDD 10 See Removing and replacing a disk drive in the 5104-22C or 9006-22C.
12 HDD 11 See Removing and replacing a disk drive in the 5104-22C or 9006-22C.
Figure 2. Top view
Table 20. Top view locations
Index number FRU description FRU removal and replacement procedures
13 Disk drive backplane See Removing and replacing the disk drive backplane in the 5104-22C or 9006-22C.
14 Fan 2 See Removing and replacing fans in the 5104-22C or 9006-22C.
15 Fan 3 See Removing and replacing fans in the 5104-22C or 9006-22C.
16 Fan 6 See Removing and replacing fans in the 5104-22C or 9006-22C.
17 Fan 7 See Removing and replacing fans in the 5104-22C or 9006-22C.
18 CPU 1 See Removing and replacing a system processor module for the 5104-22C or 9006-22C.
19 CPU 2 See Removing and replacing a system processor module for the 5104-22C or 9006-22C.
20 Time-of-day battery See Removing and replacing the time-of-day battery in the 5104-22C or 9006-22C.
21 System backplane See Removing and replacing the system backplane in the 5104-22C or 9006-22C.
22 PSU 1 See Removing and replacing a power supply in the 5104-22C or 9006-22C.
23 PSU 2 See Removing and replacing a power supply in the 5104-22C or 9006-22C.
Figure 3. Rear view
Table 21. Rear view locations
Index number FRU description FRU removal and replacement procedures
22 PSU 1 See Removing and replacing a power supply in the 5104-22C or 9006-22C.
23 PSU 2 See Removing and replacing a power supply in the 5104-22C or 9006-22C.
25 PCIe adapter 0 (UIO Slot2) See Removing and replacing PCIe adapters in the 5104-22C or 9006-22C.
26 PCIe adapter 1 (UIO Slot3) See Removing and replacing PCIe adapters in the 5104-22C or 9006-22C.
27 PCIe adapter 2 (UIO Slot1) See Removing and replacing PCIe adapters in the 5104-22C or 9006-22C.
28 PCIe adapter 3 (WIO Slot3) See Removing and replacing PCIe adapters in the 9006-22C.
29 PCIe adapter 4 (WIO-R Slot) See Removing and replacing PCIe adapters in the 9006-22C.
30 PCIe adapter 5 (WIO Slot2) See Removing and replacing PCIe adapters in the 5104-22C or 9006-22C.
31 PCIe adapter 6 (WIO Slot4) See Removing and replacing PCIe adapters in the 5104-22C or 9006-22C.
32 PCIe adapter 7 (WIO Slot1) See Removing and replacing PCIe adapters in the 5104-22C or 9006-22C.
33 PCIe riser and network adapter (UIO Network) See Removing and replacing PCIe adapters in the 5104-22C or 9006-22C.
Memory locations
The following diagram shows memory DIMMs and their corresponding field-replaceable unit (FRU) layouts in the system. Use this diagram with the following table.
Figure 4. Memory locations
The following table provides the memory locations.
Table 22. Memory locations
Index number FRU description FRU removal and replacement procedures
34 P1-DIMMA1 See Removing and replacing memory in the 5104-22C or 9006-22C.
35 P1-DIMMA2 See Removing and replacing memory in the 5104-22C or 9006-22C.
36 P1-DIMMB1 See Removing and replacing memory in the 5104-22C or 9006-22C.
37 P1-DIMMB2 See Removing and replacing memory in the 5104-22C or 9006-22C.
38 P1-DIMMC1 See Removing and replacing memory in the 5104-22C or 9006-22C.
39 P1-DIMMC2 See Removing and replacing memory in the 5104-22C or 9006-22C.
40 P1-DIMMD1 See Removing and replacing memory in the 5104-22C or 9006-22C.
41 P1-DIMMD2 See Removing and replacing memory in the 5104-22C or 9006-22C.
42 P2-DIMMA1 See Removing and replacing memory in the 5104-22C or 9006-22C.
43 P2-DIMMA2 See Removing and replacing memory in the 5104-22C or 9006-22C.
44 P2-DIMMB1 See Removing and replacing memory in the 5104-22C or 9006-22C.
45 P2-DIMMB2 See Removing and replacing memory in the 5104-22C or 9006-22C.
46 P2-DIMMC1 See Removing and replacing memory in the 5104-22C or 9006-22C.
47 P2-DIMMC2 See Removing and replacing memory in the 5104-22C or 9006-22C.
48 P2-DIMMD1 See Removing and replacing memory in the 5104-22C or 9006-22C.
49 P2-DIMMD2 See Removing and replacing memory in the 5104-22C or 9006-22C.

5104-22C or 9006-22C parts

Use this information to find the field-replaceable unit (FRU) part number.
After you identify the part number of the part that you want to order, go to Advanced Part Exchange Warranty Service. Registration is required. If you are not able to identify the part number, go to Contacting IBM service and support.
Rack final assembly
Figure 5. Rack final assembly
Table 23. Rack final assembly part numbers
Index number IBM part number (Supermicro part number) Units per assembly Description
1 01EM628 (MCP-290-00057-0N) 1 Slide rail kit - contains left and right slide rails and attaching screws
2 01EM628 (MCP-290-00057-0N) 1 Slide rail kit - contains left and right slide rails and attaching screws
System parts
Figure 6. System parts
Table 24. System parts
Index number IBM part number (Supermicro part number) Units per assembly Description
1 1 Top cover assembly
2 1 CPU air baffle
2 Screws
3 01EM725 (SNK-P0053P-IB001) 2 Heat sink kit (includes heat sink and thermal interface material)
4 01EM276 (PP9-MP02AA780-K-IB001) 1-2 16 core 2.9 GHz DD2.01 system processor module kit (includes system processor module tray and removal tool)
Note: System processor modules with version DD2.01 are not compatible with system processor modules with version DD2.1 or DD2.11.
02CL503 (PP9-MP02AA880-K-IB001) 1-2 16 core 2.9 GHz DD2.1 system processor module kit (includes system processor module tray and removal tool)
Note: System processor modules with version DD2.1 or DD2.11 are not compatible with system processor modules with version DD2.01.
01EM360 (PP9-MP02AA798-K-IB001) 1-2 16 core 2.7 GHz DD2.01 system processor module kit (includes system processor module tray and removal tool) (9006-22C)
Note: System processor modules with version DD2.01 are not compatible with system processor modules with version DD2.1 or DD2.11.
02CL674 (PP9-MP02AA986-K-IB001) 1-2 16 core 2.9 GHz DD2.11 system processor module kit (includes system processor module tray and removal tool)
Note: System processor modules with version DD2.1 or DD2.11 are not compatible with system processor modules with version DD2.01.
5 01EM644 (MEM-DR480L-SL02-ER26) 16 8 GB, 2666 MHz 1RX4 DDR4 RDIMM* (9006-22C)
01EM738 (MEM-DR416L-SL02-ER26) 16 16 GB, 2666 MHz 1RX4 DDR4 RDIMM* (5104-22C)
01EM739 (MEM-DR432L-SL02-ER26) 16 32 GB, 2666 MHz 2RX4 DDR4 RDIMM*
6 01EM416 (MBD-P9DSU-K0-IB001-B) 1 System backplane kit
7 10 Screws
8 01EM604 (FAN-0166L4) 4 Fan
9 01EM614 (BPN-SAS3-826A-N4) 1 Disk drive backplane (supports 12 SAS or SATA drives)
10 7 Screws
11 01EM652 (HDD-A2000-ST2000NM0135) 12 2 TB 3.5 inch SAS disk drive
01EM654 (HDD-A8000-ST8000NM0075) 8 TB 3.5 inch SAS disk drive (9006-22C)
12 01EM619 (PWS-1K22A-1R) 2 1.2KW power supply
13 01EM722 (RSC-W2R-A8P) 1 PCIe riser for PCIe adapter 4 (WIO-R Slot)
14 1 PCI adapter. Use the feature type of the adapter to find the FRU number in PCIe adapter information by feature type for the 5104-22C or 9006-22C
15 1 PCI adapter. Use the feature type of the adapter to find the FRU number in PCIe adapter information by feature type for the 5104-22C or 9006-22C
16 01EM721 (AOC-2UR688-i4XTF-IB001) 1 2U UIO NIC PCIe adapter with integrated 4-port 10 GbE Base-T, Intel XL710, and CAPI
Note: This PCIe adapter is also a PCIe riser.
17 1 PCIe cage
18 4 PCI adapters. Use the feature type of the adapter to find the FRU number in PCIe adapter information by feature type for the 5104-22C or 9006-22C.
19 1 PCIe riser
20 01EM723 (RSC-W2-6688P) 1 PCIe riser for PCIe adapter 3 (WIO Slot3), PCIe adapter 5 (WIO Slot2), PCIe adapter 6 (WIO Slot4) and PCIe adapter 7 (WIO Slot1).
21 1 PCI adapter. Use the feature type of the adapter to find the FRU number in PCIe adapter information by feature type for the 5104-22C or 9006-22C
*All of the memory in a 5104-22C or 9006-22C system must be the same size. The 5104-22C and 9006-22C systems do not support mixing different sizes of memory.
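The notes in Table 24 state two mixing rules: DD2.01 processor modules cannot be combined with DD2.1 or DD2.11 modules, and all memory in a 5104-22C or 9006-22C system must be the same size. A minimal Python sketch of both checks follows. The function names are hypothetical, and the processor check encodes only the incompatibility that the table states.

```python
def processors_compatible(versions):
    """Return False if DD2.01 modules are mixed with DD2.1 or DD2.11 modules."""
    has_dd201 = "DD2.01" in versions
    has_newer = any(v in ("DD2.1", "DD2.11") for v in versions)
    return not (has_dd201 and has_newer)


def memory_sizes_valid(dimm_sizes_gb):
    """Return True if every installed DIMM has the same size."""
    return len(set(dimm_sizes_gb)) <= 1
```

For example, processors_compatible(["DD2.01", "DD2.11"]) returns False, and memory_sizes_valid([16, 16, 32]) returns False.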
Miscellaneous parts
Table 25. Miscellaneous parts
Description IBM part number (Supermicro part number)
50 cm OCuLink-OCuLink X8 cable 01EM626 (CBL-SAST-0934-12)
Slide rail kit for round hole racks - contains left and right slide rails and attaching screws (5104-22C) 01EM630 (MCP-290-82914-0N)
CR2032 Lithium time-of-day battery
Units per assembly
1

Finding parts and locations

Locate physical part locations and identify parts with system diagrams.
Locate the FRU
Use the graphics and tables to locate the field-replaceable unit (FRU) and identify the FRU part number.

9006-12P locations

Use this information to find the location of a FRU in the system unit.
Rack views
The following diagrams show field-replaceable unit (FRU) layouts in the system. Use these diagrams with the following tables.
Figure 7. Front view
Table 26. Front view locations
Index number FRU description FRU removal and replacement procedures
1 HDD 0 or NVMe 0 See Removing and replacing a storage drive in the 9006-12P.
2 HDD 1 or NVMe 1 See Removing and replacing a storage drive in the 9006-12P.
3 HDD 2 or NVMe 2 See Removing and replacing a storage drive in the 9006-12P.
4 HDD 3 or NVMe 3 See Removing and replacing a storage drive in the 9006-12P.
Figure 8. Top view
Table 27. Top view locations
Index number FRU description FRU removal and replacement procedures
5 Disk drive backplane See Removing and replacing the disk drive backplane in the 9006-12P.
6 Fan 1 See Removing and replacing fans in the 9006-12P.
7 Fan 2 See Removing and replacing fans in the 9006-12P.
8 Fan 3 See Removing and replacing fans in the 9006-12P.
9 Fan 4 See Removing and replacing fans in the 9006-12P.
10 Fan 5 See Removing and replacing fans in the 9006-12P.
11 Fan 6 See Removing and replacing fans in the 9006-12P.
12 Fan 7 See Removing and replacing fans in the 9006-12P.
13 Fan 8 See Removing and replacing fans in the 9006-12P.
14 CPU 1 See Removing and replacing a system processor module for the 9006-12P.
15 CPU 2 See Removing and replacing a system processor module for the 9006-12P.
16 Time-of-day battery See Removing and replacing the time-of-day battery in the 9006-12P.
17 System backplane See Removing and replacing the system backplane in the 9006-12P.
18 PSU 1 See Removing and replacing a power supply in the 9006-12P.
19 PSU 2 See Removing and replacing a power supply in the 9006-12P.
20 Trusted platform module (TPM) card See Removing and replacing the TPM card in the 9006-12P.
Figure 9. Rear view
Table 28. Rear view locations
Index number FRU description FRU removal and replacement procedures
18 PSU 1 See Removing and replacing a power supply in the 9006-12P.
19 PSU 2 See Removing and replacing a power supply in the 9006-12P.
21 Network adapter and PCIe riser (UIO Network) See Removing and replacing PCIe adapters in the 9006-12P.
22 PCIe adapter 1 (UIO Slot1) Note: This adapter does not have any external connectors. See Removing and replacing PCIe adapters in the 9006-12P.
23 PCIe adapter 2 (WIO Slot1) See Removing and replacing PCIe adapters in the 9006-12P.
24 PCIe adapter 3 (WIO Slot2) See Removing and replacing PCIe adapters in the 9006-12P.
25 PCIe adapter 4 (WIO-R Slot) See Removing and replacing PCIe adapters in the 9006-12P.
Memory locations
The following diagram shows memory DIMMs and their corresponding field-replaceable unit (FRU) layouts in the system. Use this diagram with the following table.
Figure 10. Memory locations
The following table provides the memory locations.
Table 29. Memory locations
Index number FRU description FRU removal and replacement procedures
26 P1-DIMMA1 See Removing and replacing memory in the 9006-12P.
27 P1-DIMMA2 See Removing and replacing memory in the 9006-12P.
28 P1-DIMMB1 See Removing and replacing memory in the 9006-12P.
29 P1-DIMMB2 See Removing and replacing memory in the 9006-12P.
30 P1-DIMMC1 See Removing and replacing memory in the 9006-12P.
31 P1-DIMMC2 See Removing and replacing memory in the 9006-12P.
32 P1-DIMMD1 See Removing and replacing memory in the 9006-12P.
33 P1-DIMMD2 See Removing and replacing memory in the 9006-12P.
34 P2-DIMMA1 See Removing and replacing memory in the 9006-12P.
35 P2-DIMMA2 See Removing and replacing memory in the 9006-12P.
36 P2-DIMMB1 See Removing and replacing memory in the 9006-12P.
37 P2-DIMMB2 See Removing and replacing memory in the 9006-12P.
38 P2-DIMMC1 See Removing and replacing memory in the 9006-12P.
39 P2-DIMMC2 See Removing and replacing memory in the 9006-12P.
40 P2-DIMMD1 See Removing and replacing memory in the 9006-12P.
41 P2-DIMMD2 See Removing and replacing memory in the 9006-12P.
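The DIMM names in the memory locations table follow a regular pattern: processor (P1 or P2), then a letter (A to D), then a slot number (1 or 2), with index numbers assigned in that order. As a sketch under that observation, the table can be regenerated from its first index number. The function name is hypothetical; the starting index for the 9006-12P is 26.

```python
from itertools import product


def dimm_locations(first_index):
    """Return a mapping of index number to DIMM location name."""
    names = [
        f"P{proc}-DIMM{letter}{slot}"
        for proc, letter, slot in product((1, 2), "ABCD", (1, 2))
    ]
    return {first_index + offset: name for offset, name in enumerate(names)}


locations_9006_12p = dimm_locations(26)
```

With a starting index of 26, index 33 maps to P1-DIMMD2 and index 41 maps to P2-DIMMD2, which matches the table.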
Drive on module (DOM) locations
The following diagram shows drive on modules (DOMs) and their corresponding field-replaceable unit (FRU) layouts in the system. Use this diagram with the following table.
Figure 11. Drive on module (DOM) locations
The following table provides the drive on module (DOM) locations.
Table 30. Drive on module (DOM) locations
Index number FRU description FRU removal and replacement procedures
42 Drive on module (DOM) 1 See Removing and replacing a storage drive in the 9006-12P.
43 Drive on module (DOM) 2 See Removing and replacing a storage drive in the 9006-12P.

9006-12P parts

Use this information to find the field-replaceable unit (FRU) part number.
After you identify the part number of the part that you want to order, go to Advanced Part Exchange Warranty Service. Registration is required. If you are not able to identify the part number, go to Contacting IBM service and support.
Rack final assembly
Figure 12. Rack final assembly
Table 31. Rack final assembly part numbers
Index number IBM part number (Supermicro part number) Units per assembly Description
1 02CM137 (MCP-290-00052-0N) 1 Slide rail kit - contains left and right slide rails and attaching screws
2 02CM137 (MCP-290-00052-0N) 1 Slide rail kit - contains left and right slide rails and attaching screws
System parts
Figure 13. System parts
Table 32. System parts
Index number
1 1 Top cover assembly
2 2 PCIe adapter. Use the feature type of the adapter to nd the
3 01EM611 (RSC-
4 1 PCIe cage
5 02CM139 (RSC-
6 1 PCIe adapter. Use the feature type of the adapter to nd the
IBM part number (Supermicro part number)
W-66P-IB001)
R1UW-E8R-IB001)
Units per assembly
2 Screws
1 PCIe riser for PCIe adapters. Use the feature type of the
1 PCIe riser
Description
FRU number in PCIe adapter information by feature type for the 9006-12P.
adapter to nd the FRU number in PCIe adapter information by feature type for the 9006-12P.
FRU number in PCIe adapter information by feature type for the 9006-12P
82 Power Systems: Problem analysis, system parts, and locations for the 5104-22C, 9006-12P, 9006-22C, and 9006-22P
Table 32. System parts (continued)
Index number | IBM part number (Supermicro part number) | Units per assembly | Description
7 | - | 1 | PCIe adapter. Use the feature type of the adapter to find the FRU number in PCIe adapter information by feature type for the 9006-12P.
8 | 01EM608 (AOC-UR-i4XTF-IB001) | 1 | 1U UIO NIC PCIe adapter with integrated 4-port 10 GbE Base-T, Intel XL710, and CAPI. Note: This PCIe adapter is also a PCIe riser.
9 | 02CM144 (PWS-1K02A-1R) | 2 | Power supply
Table 32. System parts (continued)
Index number | IBM part number (Supermicro part number) | Units per assembly | Description
10 | 01EM682 (HDD-KIT-2A-ST1200-IB001) | 4 | 1.2 TB 10k 2.5 inch SAS disk drive
10 | 01EM683 (HDD-KIT-2A-ST1800-IB001) | 4 | 1.8 TB 10k 2.5 inch SAS disk drive
10 | 01EM652 (HDD-A2000-ST2000NM0135) | 4 | 2.0 TB 7.2K (512 block size) 3.5 inch SAS disk drive
10 | 01EM653 (HDD-A4000-ST4000NM0125) | 4 | 4.0 TB 7.2K (512 block size) 3.5 inch SAS disk drive
10 | 01EM654 (HDD-A8000-ST8000NM0075) | 4 | 8.0 TB 7.2K (512 block size) 3.5 inch SAS disk drive
10 | 01EM655 (HDD-A10T-ST10000NM0096) | 4 | 10.0 TB 7.2K (512 block size) 3.5 inch SAS disk drive
10 | 01EM656 (HDD-A4000-ST4000NM0075) | 4 | 4.0 TB 7.2K (4k block size) 3.5 inch self-encrypting SAS disk drive
10 | 01EM657 (HDD-A8000-ST8000NM0095) | 4 | 8.0 TB 7.2K (4k block size) 3.5 inch self-encrypting SAS disk drive
10 | 02CM136 (HDD-T2000-ST2000NM0125) | 4 | 2.0 TB 7.2K (512 block size) 3.5 inch SATA disk drive
10 | 01EM659 (HDD-T4000-ST4000NM0115) | 4 | 4.0 TB 7.2K (512 block size) 3.5 inch SATA disk drive
10 | 01EM660 (HDD-T8000-ST8000NM0055) | 4 | 8.0 TB 7.2K (512 block size) 3.5 inch SATA disk drive
10 | 01EM661 (HDD-A10T-ST10000NM0096) | 4 | 10.0 TB 7.2K (512 block size) 3.5 inch SATA disk drive
Table 32. System parts (continued)
Index number | IBM part number (Supermicro part number) | Units per assembly | Description
10 | 01EM671 (HDS-KIT-2A-1920-IB001) | 4 | 1.92 TB 2.5 inch SAS solid-state drive (1 drive write per day)
10 | 01EM672 (HDS-KIT-2A-3840-IB001) | 4 | 3.84 TB 2.5 inch SAS solid-state drive (1 drive write per day)
10 | 01EM673 (HDS-KIT-2A3D-960-IB001) | 4 | 960 GB 2.5 inch SAS solid-state drive (3 drive writes per day)
10 | 01EM674 (HDS-KIT-2A3D-1920-IB001) | 4 | 1.92 TB 2.5 inch SAS solid-state drive (3 drive writes per day)
10 | 01EM675 (HDS-KIT-2A-1920S-IB001) | 4 | 1.92 TB 2.5 inch self-encrypting SAS solid-state drive (1 drive write per day)
10 | 01EM676 (HDS-KIT-2A-3840S-IB001) | 4 | 3.84 TB 2.5 inch self-encrypting SAS solid-state drive (1 drive write per day)
10 | 01EM664 (HDS-KIT-2T-240-IB001) | 4 | 240 GB 2.5 inch self-encrypting SATA solid-state drive (0.78 drive writes per day)
10 | 01EM665 (HDS-KIT-2T-960-IB001) | 4 | 960 GB 2.5 inch SATA solid-state drive (0.6 drive writes per day)
10 | 01EM667 (HDS-KIT-2T-3800-IB001) | 4 | 3.84 TB 2.5 inch self-encrypting SATA solid-state drive (0.78 drive writes per day)
10 | 01EM666 (HDS-KIT-2T-1900-IB001) | 4 | 1.92 TB 2.5 inch self-encrypting SATA solid-state drive (0.78 drive writes per day)
10 | 01EM684 (HDS-KIT-2T-480-IB001) | 4 | 480 GB 2.5 inch self-encrypting SATA solid-state drive (3.5 drive writes per day)
Table 32. System parts (continued)
Index number | IBM part number (Supermicro part number) | Units per assembly | Description
10 | 01EM685 (HDS-KIT-2T-960S-IB001) | 4 | 960 GB 2.5 inch self-encrypting SATA solid-state drive (3.5 drive writes per day)
10 | 01EM686 (HDS-KIT-2T-1920-IB001) | 4 | 1.92 TB 2.5 inch self-encrypting SATA solid-state drive (3.5 drive writes per day)
10 | 01EM679 (HDS-KIT-08N-960-IB001) | 4 | 960 GB 2.5 inch NVMe solid-state drive (0.8 drive writes per day)
10 | 01EM680 (HDS-KIT-08N-1920-IB001) | 4 | 1.92 TB 2.5 inch NVMe solid-state drive (0.8 drive writes per day)
10 | 01EM681 (HDS-KIT-08N-3840-IB001) | 4 | 3.84 TB 2.5 inch NVMe solid-state drive (0.8 drive writes per day)
10 | 01EM668 (HDS-KIT-5N-800-IB001) | 4 | 800 GB 2.5 inch NVMe solid-state drive (5 drive writes per day)
10 | 01EM669 (HDS-KIT-5N-1600-IB001) | 4 | 1.6 TB 2.5 inch NVMe solid-state drive (5 drive writes per day)
10 | 01EM670 (HDS-KIT-5N-3200-IB001) | 4 | 3.2 TB 2.5 inch NVMe solid-state drive (5 drive writes per day)
11 | 02CM140 (BPN-SAS3-815TQ-N4) | 1 | Disk drive backplane
12 | - | 2 | Screws
13 | 02CM138 (FAN-0141L4) | 8 | Fan
14 | - | 2 | Fan holder
15 | 01EM607 (MCP-310-81915-0B-OEM) | 2 | CPU air baffle