IBM Power System, Power System 9006-22P, Power System 5104-22C, Power System 9006-12P, Power System 9006-22C Problem Analysis, System Parts, And Locations

Power Systems
Problem analysis, system parts, and locations for the 5104-22C, 9006-12P, 9006-22C, and 9006-22P
IBM
Note
Before using this information and the product it supports, read the information in “Safety notices” on page v, “Notices” on page 109, the IBM Systems Safety Notices manual, G229-9054, and the IBM Environmental Notices and User Guide, Z125–5823.
This edition applies to IBM® Power Systems servers that contain the POWER9™ processor and to all associated models.
© Copyright International Business Machines Corporation 2017, 2019.
US Government Users Restricted Rights – Use, duplication or disclosure restricted by GSA ADP Schedule Contract with IBM Corp.

Contents

Safety notices........................................................................................................v
Beginning troubleshooting and problem analysis....................................................1
Determining the problem analysis procedure to perform.......................................................................... 1
Resolving a BMC access problem................................................................................................................2
Resolving a power problem......................................................................................................................... 5
Resolving a system rmware boot failure...................................................................................................5
Resolving a VGA monitor problem...............................................................................................................7
Resolving an operating system boot failure................................................................................................7
Resolving a sensor indicator problem......................................................................................................... 9
Resolving a hardware problem..................................................................................................................10
Resolving a PCIe adapter or device problem............................................................................................11
Resolving a network adapter problem.................................................................................................13
Resolving an NVMe Flash adapter problem........................................................................................ 15
Resolving a storage device problem....................................................................................................15
Identifying the location of the PCIe adapter by using the slot number..............................................16
Identifying the location of the NVMe Flash adapter........................................................................... 17
Identifying the location of the storage device.....................................................................................17
User guides for PCIe adapters............................................................................................................. 17
Identifying a service action....................................................................................................................... 18
Identifying a service action by using system event logs.....................................................................18
Identifying service action keywords in system event logs..................................................................23
Identifying a service action by using sensor and event information ................................................. 24
Isolation procedures..................................................................................................................................49
EPUB_PRC_FIND_DECONFIGURE_PART isolation procedure...........................................................49
EPUB_PRC_SP_CODE isolation procedure.......................................................................................... 50
EPUB_PRC_PHYP_CODE isolation procedure..................................................................................... 50
EPUB_PRC_ALL_PROCS isolation procedure......................................................................................50
EPUB_PRC_ALL_MEMCRDS isolation procedure................................................................................51
EPUB_PRC_LVL_SUPPORT isolation procedure................................................................................. 52
EPUB_PRC_MEMORY_PLUGGING_ERROR isolation procedure........................................................52
EPUB_PRC_FSI_PATH isolation procedure.........................................................................................52
EPUB_PRC_PROC_AB_BUS isolation procedure................................................................................ 53
EPUB_PRC_PROC_XYZ_BUS isolation procedure..............................................................................54
EPUB_PRC_EIBUS_ERROR isolation procedure.................................................................................55
EPUB_PRC_POWER_ERROR isolation procedure...............................................................................56
EPUB_PRC_MEMORY_UE isolation procedure....................................................................................56
EPUB_PRC_HB_CODE isolation procedure......................................................................................... 57
EPUB_PRC_TOD_CLOCK_ERR isolation procedure.............................................................................58
EPUB_PRC_COOLING_SYSTEM_ERR isolation procedure................................................................. 59
Verifying a repair........................................................................................................................................60
Collecting diagnostic data......................................................................................................................... 60
Contacting IBM service and support.........................................................................................................61
Finding parts and locations ................................................................................. 63
5104-22C or 9006-22C locations.............................................................................................................63
5104-22C or 9006-22C parts................................................................................................................... 68
Finding parts and locations ................................................................................. 75
9006-12P locations................................................................................................................................... 75
9006-12P parts..........................................................................................................................................80
Finding parts and locations ................................................................................. 91
9006-22P locations................................................................................................................................... 91
9006-22P parts..........................................................................................................................................97
Notices..............................................................................................................109
Accessibility features for IBM Power Systems servers..........................................................................110
Privacy policy considerations .................................................................................................................111
Trademarks..............................................................................................................................................111
Electronic emission notices.....................................................................................................................111
Class A Notices...................................................................................................................................112
Class B Notices...................................................................................................................................115
Terms and conditions.............................................................................................................................. 117

Safety notices

Safety notices may be printed throughout this guide:
DANGER notices call attention to a situation that is potentially lethal or extremely hazardous to people.
CAUTION notices call attention to a situation that is potentially hazardous to people because of some existing condition.
Attention notices call attention to the possibility of damage to a program, device, system, or data.
World Trade safety information
Several countries require the safety information contained in product publications to be presented in their national languages. If this requirement applies to your country, safety information documentation is included in the publications package (such as in printed documentation, on DVD, or as part of the product) shipped with the product. The documentation contains the safety information in your national language with references to the U.S. English source. Before using a U.S. English publication to install, operate, or service this product, you must first become familiar with the related safety information documentation. You should also refer to the safety information documentation any time you do not clearly understand any safety information in the U.S. English publications.
Replacement or additional copies of safety information documentation can be obtained by calling the IBM Hotline at 1-800-300-8751.
German safety information
Das Produkt ist nicht für den Einsatz an Bildschirmarbeitsplätzen im Sinne § 2 der Bildschirmarbeitsverordnung geeignet.
Laser safety information
IBM servers can use I/O cards or features that are fiber-optic based and that utilize lasers or LEDs.
Laser compliance
IBM servers may be installed inside or outside of an IT equipment rack.
DANGER: When working on or around the system, observe the following precautions:
Electrical voltage and current from power, telephone, and communication cables are hazardous. To avoid a shock hazard:
• If IBM supplied the power cord(s), connect power to this unit only with the IBM provided power cord. Do not use the IBM provided power cord for any other product.
• Do not open or service any power supply assembly.
• Do not connect or disconnect any cables or perform installation, maintenance, or reconfiguration of this product during an electrical storm.
• The product might be equipped with multiple power cords. To remove all hazardous voltages, disconnect all power cords.
– For AC power, disconnect all power cords from their AC power source.
– For racks with a DC power distribution panel (PDP), disconnect the customer’s DC power source to the PDP.
• When connecting power to the product ensure all power cables are properly connected.
– For racks with AC power, connect all power cords to a properly wired and grounded electrical outlet. Ensure that the outlet supplies proper voltage and phase rotation according to the system rating plate.
– For racks with a DC power distribution panel (PDP), connect the customer’s DC power source to the PDP. Ensure that the proper polarity is used when attaching the DC power and DC power return wiring.
• Connect any equipment that will be attached to this product to properly wired outlets.
• When possible, use one hand only to connect or disconnect signal cables.
• Never turn on any equipment when there is evidence of fire, water, or structural damage.
• Do not attempt to switch on power to the machine until all possible unsafe conditions are corrected.
• Assume that an electrical safety hazard is present. Perform all continuity, grounding, and power checks specified during the subsystem installation procedures to ensure that the machine meets safety requirements.
• Do not continue with the inspection if any unsafe conditions are present.
• Before you open the device covers, unless instructed otherwise in the installation and configuration procedures: Disconnect the attached AC power cords, turn off the applicable circuit breakers located in the rack power distribution panel (PDP), and disconnect any telecommunications systems, networks, and modems.
DANGER:
• Connect and disconnect cables as described in the following procedures when installing, moving, or opening covers on this product or attached devices.
To Disconnect:
1. Turn off everything (unless instructed otherwise).
2. For AC power, remove the power cords from the outlets.
3. For racks with a DC power distribution panel (PDP), turn off the circuit breakers located in the PDP and remove the power from the Customer's DC power source.
4. Remove the signal cables from the connectors.
5. Remove all cables from the devices.
To Connect:
1. Turn off everything (unless instructed otherwise).
2. Attach all cables to the devices.
3. Attach the signal cables to the connectors.
4. For AC power, attach the power cords to the outlets.
5. For racks with a DC power distribution panel (PDP), restore the power from the Customer's DC power source and turn on the circuit breakers located in the PDP.
6. Turn on the devices.
Sharp edges, corners and joints may be present in and around the system. Use care when handling equipment to avoid cuts, scrapes and pinching. (D005)
(R001 part 1 of 2):
DANGER: Observe the following precautions when working on or around your IT rack system:
• Heavy equipment–personal injury or equipment damage might result if mishandled.
• Always lower the leveling pads on the rack cabinet.
• Always install stabilizer brackets on the rack cabinet unless the earthquake option is to be installed.
• To avoid hazardous conditions due to uneven mechanical loading, always install the heaviest devices in the bottom of the rack cabinet. Always install servers and optional devices starting from the bottom of the rack cabinet.
• Rack-mounted devices are not to be used as shelves or work spaces. Do not place objects on top of rack-mounted devices. In addition, do not lean on rack mounted devices and do not use them to stabilize your body position (for example, when working from a ladder).
• Stability hazard:
– The rack may tip over causing serious personal injury.
– Before extending the rack to the installation position, read the installation instructions.
– Do not put any load on the slide-rail mounted equipment mounted in the installation position.
– Do not leave the slide-rail mounted equipment in the installation position.
• Each rack cabinet might have more than one power cord.
– For AC powered racks, be sure to disconnect all power cords in the rack cabinet when directed to disconnect power during servicing.
– For racks with a DC power distribution panel (PDP), turn off the circuit breaker that controls the power to the system unit(s), or disconnect the customer’s DC power source, when directed to disconnect power during servicing.
• Connect all devices installed in a rack cabinet to power devices installed in the same rack cabinet. Do not plug a power cord from a device installed in one rack cabinet into a power device installed in a different rack cabinet.
• An electrical outlet that is not correctly wired could place hazardous voltage on the metal parts of the system or the devices that attach to the system. It is the responsibility of the customer to ensure that the outlet is correctly wired and grounded to prevent an electrical shock. (R001 part 1 of 2)
(R001 part 2 of 2):
CAUTION:
• Do not install a unit in a rack where the internal rack ambient temperatures will exceed the manufacturer's recommended ambient temperature for all your rack-mounted devices.
• Do not install a unit in a rack where the air flow is compromised. Ensure that air flow is not blocked or reduced on any side, front, or back of a unit used for air flow through the unit.
• Consideration should be given to the connection of the equipment to the supply circuit so that overloading of the circuits does not compromise the supply wiring or overcurrent protection. To provide the correct power connection to a rack, refer to the rating labels located on the equipment in the rack to determine the total power requirement of the supply circuit.
(For sliding drawers.) Do not pull out or install any drawer or feature if the rack stabilizer brackets are not attached to the rack or if the rack is not bolted to the floor. Do not pull out more than one drawer at a time. The rack might become unstable if you pull out more than one drawer at a time.
(For xed drawers.) This drawer is a xed drawer and must not be moved for servicing unless specied by the manufacturer. Attempting to move the drawer partially or completely out of the rack might cause the rack to become unstable or cause the drawer to fall out of the rack. (R001 part 2 of 2)
Safety notices
vii
CAUTION: Removing components from the upper positions in the rack cabinet improves rack stability during relocation. Follow these general guidelines whenever you relocate a populated rack cabinet within a room or building.
• Reduce the weight of the rack cabinet by removing equipment starting at the top of the rack cabinet. When possible, restore the rack cabinet to the configuration of the rack cabinet as you received it. If this configuration is not known, you must observe the following precautions:
– Remove all devices in the 32U position (compliance ID RACK-001) or 22U (compliance ID RR001) and above.
– Ensure that the heaviest devices are installed in the bottom of the rack cabinet.
– Ensure that there are little-to-no empty U-levels between devices installed in the rack cabinet below the 32U (compliance ID RACK-001) or 22U (compliance ID RR001) level, unless the received configuration specifically allowed it.
• If the rack cabinet you are relocating is part of a suite of rack cabinets, detach the rack cabinet from the suite.
• If the rack cabinet you are relocating was supplied with removable outriggers they must be reinstalled before the cabinet is relocated.
• Inspect the route that you plan to take to eliminate potential hazards.
• Verify that the route that you choose can support the weight of the loaded rack cabinet. Refer to the documentation that comes with your rack cabinet for the weight of a loaded rack cabinet.
• Verify that all door openings are at least 760 x 2030 mm (30 x 80 in.).
• Ensure that all devices, shelves, drawers, doors, and cables are secure.
• Ensure that the four leveling pads are raised to their highest position.
• Ensure that there is no stabilizer bracket installed on the rack cabinet during movement.
• Do not use a ramp inclined at more than 10 degrees.
• When the rack cabinet is in the new location, complete the following steps:
– Lower the four leveling pads.
– Install stabilizer brackets on the rack cabinet or, in an earthquake environment, bolt the rack to the floor.
– If you removed any devices from the rack cabinet, repopulate the rack cabinet from the lowest position to the highest position.
• If a long-distance relocation is required, restore the rack cabinet to the configuration of the rack cabinet as you received it. Pack the rack cabinet in the original packaging material, or equivalent. Also lower the leveling pads to raise the casters off of the pallet and bolt the rack cabinet to the pallet.
(R002)
DANGER: Hazardous voltage, current, or energy levels are present inside any component that has this label attached. Do not open any cover or barrier that contains this label. (L001)
DANGER: Rack-mounted devices are not to be used as shelves or work spaces. Do not place objects on top of rack-mounted devices. In addition, do not lean on rack-mounted devices and do not use them to stabilize your body position (for example, when working from a ladder). Stability hazard:
• The rack may tip over causing serious personal injury.
• Before extending the rack to the installation position, read the installation instructions.
• Do not put any load on the slide-rail mounted equipment mounted in the installation position.
• Do not leave the slide-rail mounted equipment in the installation position.
(L002)
DANGER: Multiple power cords. The product might be equipped with multiple AC power cords or multiple DC power cables. To remove all hazardous voltages, disconnect all power cords and power cables. (L003)
CAUTION: A hot surface nearby. (L007)
CAUTION: Hazardous moving parts nearby. (L008)
All lasers are certied in the U.S. to conform to the requirements of DHHS 21 CFR Subchapter J for class 1 laser products. Outside the U.S., they are certied to be in compliance with IEC 60825 as a class 1 laser product. Consult the label on each part for laser certication numbers and approval information.
CAUTION: This product might contain one or more of the following devices: CD-ROM drive, DVD­ROM drive, DVD-RAM drive, or laser module, which are Class 1 laser products. Note the following information:
• Do not remove the covers. Removing the covers of the laser product could result in exposure to hazardous laser radiation. There are no serviceable parts inside the device.
• Use of the controls or adjustments or performance of procedures other than those specified herein might result in hazardous radiation exposure.
(C026)
CAUTION: Data processing environments can contain equipment transmitting on system links with laser modules that operate at greater than Class 1 power levels. For this reason, never look into the end of an optical fiber cable or open receptacle. Although shining light into one end and looking into the other end of a disconnected optical fiber to verify the continuity of optic fibers may not injure the eye, this procedure is potentially dangerous. Therefore, verifying the continuity of optical fibers by shining light into one end and looking at the other end is not recommended. To verify continuity of a fiber optic cable, use an optical light source and power meter. (C027)
CAUTION: This product contains a Class 1M laser. Do not view directly with optical instruments. (C028)
CAUTION: Some laser products contain an embedded Class 3A or Class 3B laser diode. Note the following information:
• Laser radiation when open.
• Do not stare into the beam, do not view directly with optical instruments, and avoid direct exposure to the beam. (C030)
CAUTION: The battery contains lithium. To avoid possible explosion, do not burn or charge the battery.
Do Not:
• Throw or immerse into water
• Heat to more than 100 degrees C (212 degrees F)
• Repair or disassemble
Exchange only with the IBM-approved part. Recycle or discard the battery as instructed by local regulations. In the United States, IBM has a process for the collection of this battery. For information, call 1-800-426-4333. Have the IBM part number for the battery unit available when you call. (C003)
CAUTION: Regarding IBM provided VENDOR LIFT TOOL:
• Operation of LIFT TOOL by authorized personnel only.
• LIFT TOOL intended for use to assist, lift, install, remove units (load) up into rack elevations. It is not to be used loaded transporting over major ramps nor as a replacement for such designated tools like pallet jacks, walkies, fork trucks and such related relocation practices. When this is not practicable, specially trained persons or services must be used (for instance, riggers or movers).
• Read and completely understand the contents of LIFT TOOL operator's manual before using. Failure to read, understand, obey safety rules, and follow instructions may result in property damage and/or personal injury. If there are questions, contact the vendor's service and support. Local paper manual must remain with machine in provided storage sleeve area. Latest revision manual available on vendor's web site.
• Test verify stabilizer brake function before each use. Do not over-force moving or rolling the LIFT TOOL with stabilizer brake engaged.
• Do not raise, lower or slide platform load shelf unless stabilizer (brake pedal jack) is fully engaged. Keep stabilizer brake engaged when not in use or motion.
• Do not move LIFT TOOL while platform is raised, except for minor positioning.
• Do not exceed rated load capacity. See LOAD CAPACITY CHART regarding maximum loads at center versus edge of extended platform.
• Only raise load if properly centered on platform. Do not place more than 200 lb (91 kg) on edge of sliding platform shelf also considering the load's center of mass/gravity (CoG).
• Do not corner load the platforms, tilt riser, angled unit install wedge or other such accessory options. Secure such platforms -- riser tilt, wedge, etc options to main lift shelf or forks in all four (4x or all other provisioned mounting) locations with provided hardware only, prior to use. Load objects are designed to slide on/off smooth platforms without appreciable force, so take care not to push or lean. Keep riser tilt [adjustable angling platform] option flat at all times except for final minor angle adjustment when needed.
• Do not stand under overhanging load.
• Do not use on uneven surface, incline or decline (major ramps).
• Do not stack loads.
• Do not operate while under the influence of drugs or alcohol.
• Do not support ladder against LIFT TOOL (unless the specific allowance is provided for one following qualified procedures for working at elevations with this TOOL).
• Tipping hazard. Do not push or lean against load with raised platform.
• Do not use as a personnel lifting platform or step. No riders.
• Do not stand on any part of lift. Not a step.
• Do not climb on mast.
• Do not operate a damaged or malfunctioning LIFT TOOL machine.
• Crush and pinch point hazard below platform. Only lower load in areas clear of personnel and obstructions. Keep hands and feet clear during operation.
• No Forks. Never lift or move bare LIFT TOOL MACHINE with pallet truck, jack or fork lift.
• Mast extends higher than platform. Be aware of ceiling height, cable trays, sprinklers, lights, and other overhead objects.
• Do not leave LIFT TOOL machine unattended with an elevated load.
• Watch and keep hands, fingers, and clothing clear when equipment is in motion.
• Turn Winch with hand power only. If winch handle cannot be cranked easily with one hand, it is probably over-loaded. Do not continue to turn winch past top or bottom of platform travel. Excessive unwinding will detach handle and damage cable. Always hold handle when lowering, unwinding. Always assure self that winch is holding load before releasing winch handle.
• A winch accident could cause serious injury. Not for moving humans. Make certain clicking sound is heard as the equipment is being raised. Be sure winch is locked in position before releasing handle. Read instruction page before operating this winch. Never allow winch to unwind freely. Freewheeling will cause uneven cable wrapping around winch drum, damage cable, and may cause serious injury.
• This TOOL must be maintained correctly for IBM Service personnel to use it. IBM shall inspect condition and verify maintenance history before operation. Personnel reserve the right not to use TOOL if inadequate. (C048)
Power and cabling information for NEBS (Network Equipment-Building System) GR-1089-CORE
The following comments apply to the IBM servers that have been designated as conforming to NEBS (Network Equipment-Building System) GR-1089-CORE:
The equipment is suitable for installation in the following:
• Network telecommunications facilities
• Locations where the NEC (National Electrical Code) applies
The intrabuilding ports of this equipment are suitable for connection to intrabuilding or unexposed wiring or cabling only. The intrabuilding ports of this equipment must not be metallically connected to the interfaces that connect to the OSP (outside plant) or its wiring. These interfaces are designed for use as intrabuilding interfaces only (Type 2 or Type 4 ports as described in GR-1089-CORE) and require isolation from the exposed OSP cabling. The addition of primary protectors is not sufficient protection to connect these interfaces metallically to OSP wiring.
Note: All Ethernet cables must be shielded and grounded at both ends.
The ac-powered system does not require the use of an external surge protection device (SPD).
The dc-powered system employs an isolated DC return (DC-I) design. The DC battery return terminal shall not be connected to the chassis or frame ground.
The dc-powered system is intended to be installed in a common bonding network (CBN) as described in GR-1089-CORE.

Beginning troubleshooting and problem analysis

This information provides a starting point for analyzing problems.
This information is the starting point for diagnosing and repairing systems. From this point, you are guided to the appropriate information to help you diagnose problems, determine the appropriate repair action, and then complete the necessary steps to repair the system.
Note: Update the system rmware to the latest level before you start problem analysis. If you update the system rmware, you will have the latest available xes and improvements for error handling, reporting, and isolation. For instructions about updating the system rmware, see Getting xes.
What type of problem are you dealing with? Problem analysis procedure
You do not know the type of problem. Go to “Determining the problem analysis procedure to perform” on page 1.
A baseboard management controller (BMC) access problem occurred. Go to “Resolving a BMC access problem” on page 2.
The system does not power on (the power button or the BMC power on command does not power on the system). Go to “Resolving a power problem” on page 5.
A system firmware boot failure occurred (the system started but was not able to boot to the Petitboot menu). Go to “Resolving a system firmware boot failure” on page 5.
A video graphics array (VGA) monitor problem occurred (the system started but no video is displayed on the monitor). Go to “Resolving a VGA monitor problem” on page 7.
An operating system boot failure occurred (the system booted to the Petitboot menu but the operating system did not start). Go to “Resolving an operating system boot failure” on page 7.
A sensor on the sensor readings GUI display is red. Go to “Resolving a sensor indicator problem” on page 9.
A processor, memory, power, or cooling hardware failure occurred. Go to “Resolving a hardware problem” on page 10.
Missing or faulty PCIe adapter or device. Go to Resolving a PCIe adapter or device problem.
You have an FQPSPxxxxxxx event code. Go to FQPSPxxxxxxx Event Codes.

Determining the problem analysis procedure to perform

Learn how to identify the correct problem analysis procedure to perform.
About this task
To determine the correct problem analysis procedure to perform, complete the following steps:
Procedure
1. After you apply power to the system, are the power supply LEDs green (either steady or flashing)?
If Then
Yes: Continue with the next step.
No: Go to “Resolving a power problem” on page 5.
2. Can you access the baseboard management controller (BMC) across the network?
If Then
Yes: Continue with the next step.
No: Go to “Resolving a BMC access problem” on page 2.
3. Can you boot the system to the Petitboot menu?
If Then
Yes: Continue with the next step.
No: Go to “Resolving a system firmware boot failure” on page 5.
4. Is video displayed on the video graphics array (VGA) monitor?
If Then
Yes: Continue with the next step.
No: Go to “Resolving a VGA monitor problem” on page 7.
5. Can you start the operating system?
If Then
Yes: Continue with the next step.
No: Go to “Resolving an operating system boot failure” on page 7.
6. On the sensor readings GUI display, are any sensors red?
If Then
Yes: Go to “Resolving a sensor indicator problem” on page 9.
No: Continue with the next step.
7. Go to “Resolving a hardware problem” on page 10. This ends the procedure.
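The checks in this procedure are applied in a fixed order, and the first check that fails selects the procedure to follow. As a minimal sketch of that ordering (the dictionary keys and the function name are illustrative assumptions, not part of any IBM tool):

```python
# Ordered checks from steps 1 - 5. Each entry pairs an answer key with the
# procedure to follow when the answer is "No". The key names are hypothetical.
CHECKS = [
    ("power_supply_leds_green", "Resolving a power problem"),
    ("bmc_reachable", "Resolving a BMC access problem"),
    ("boots_to_petitboot", "Resolving a system firmware boot failure"),
    ("vga_video_displayed", "Resolving a VGA monitor problem"),
    ("os_starts", "Resolving an operating system boot failure"),
]


def choose_procedure(answers):
    """Return the name of the procedure selected by the yes/no answers.

    A missing answer is treated as "No", so the walk stops at the first
    check that has not been confirmed.
    """
    for key, procedure in CHECKS:
        if not answers.get(key, False):
            return procedure
    # Step 6: a red sensor takes priority over the generic hardware path.
    if answers.get("any_sensor_red", False):
        return "Resolving a sensor indicator problem"
    # Step 7: nothing else matched.
    return "Resolving a hardware problem"
```

For example, a system whose power supply LEDs are green but whose BMC cannot be reached over the network is directed to the BMC access procedure, regardless of the later answers.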

Resolving a BMC access problem

Learn how to identify the service action that is needed to resolve a baseboard management controller (BMC) access problem.
Procedure
1. Ensure that the BMC password is not set to the default password. For information about changing the default password, see Logging on to the BMC GUI. Does the problem persist?
If Then
Yes: Continue with the next step.
No: This ends the procedure.
2. Are both ends of the network cable seated securely?
If Then
Yes: Continue with the next step.
No: Seat both ends of the cable securely. If the problem persists, continue with the next step.
3. Power off the system and disconnect all AC power cords for 30 seconds. Then, reconnect the AC power cords and power on the system. Does the BMC access problem persist?
If Then
Yes: Continue with the next step.
No: This ends the procedure.
4. Verify that the BMC network settings are correct.
a) Power on the system by using the power button on the front of the system. Wait 1 - 2 minutes for the system to display the Petitboot menu.
b) When the Petitboot menu is displayed, press any key to interrupt the boot process. Then, select Exit to Shell.
c) Type the following command and press Enter:
ipmitool lan print 1
d) Verify that the MAC address and the IP address settings are correct. Then, continue with the next step.
Note: If the IP address setting is incorrect, go to the Configuring the firmware IP address website (http://www.ibm.com/support/knowledgecenter/linuxonibm/liabw/liabwenablenetwork.htm). If the MAC address is 00:00:00:00:00:00, go to “Contacting IBM service and support” on page 61.
5. Are you able to log in to the BMC web interface?
If Then
Yes: To update the BMC firmware, go to Updating the system firmware by using the BMC. If the problem persists, go to step “12” on page 4.
No: Continue with the next step.
6. Complete the following steps:
a. Connect a VGA monitor to the system.
b. Press the power button to power on the system.
c. Boot the system to the Petitboot menu. From the Petitboot menu, select Exit to shell.
7. Are you mounting the storage that contains the pUpdate utility and the BMC firmware file from a network storage location?
If Then
Yes: Continue with the next step.
No: Go to step “9” on page 4.
8. To update the BMC firmware by using a network storage location, complete the following steps:
a) Type mkdir /tmp/media and press Enter.
b) Type the following command and press Enter:
mount -t nfs xxx.xxx.xx.xx:/path/of/files /tmp/media, where xxx.xxx.xx.xx is the IP address of the system to which you want to establish the connection.
c) Type cd /tmp/media and press Enter.
d) To update the BMC firmware, type the following command and press Enter:
./pUpdate -f bmc.bin -i bt, where bmc.bin is the name of the BMC image file.
e) Allow at least 2 minutes for the BMC to reboot. Does the problem persist?
If Then
Yes: Go to step “12” on page 4.
No: This ends the procedure.
9. Update the BMC firmware by using a USB device. Complete the following steps:
a) Ensure that the USB device is formatted by using the VFAT file system.
b) Insert the USB device into the system if you have not already done so.
c) Type mount and press Enter.
Is the following output displayed?
/dev/mapper/sdb1 mounted on /var/petitboot/mnt/dev/sdb1
If Then
Yes: Continue with the next step.
No: Go to step “11” on page 4.
10. Complete the following steps:
a) Type cd /var/petitboot/mnt/dev/sdb1 and press Enter.
b) To update the BMC firmware, type the following command and press Enter:
./pUpdate -f bmc.bin -i bt, where bmc.bin is the name of the BMC image file.
c) Allow at least 2 minutes for the BMC to reboot. Does the problem persist?
If Then
Yes: Go to step “12” on page 4.
No: This ends the procedure.
11. Complete the following steps:
a) Type mkdir /tmp/media and press Enter.
b) Type mount /dev/mapper/sdb1 /tmp/media and press Enter.
c) Type cd /tmp/media and press Enter.
d) To update the BMC firmware, type the following command and press Enter:
./pUpdate -f bmc.bin -i bt, where bmc.bin is the name of the BMC image file.
e) Allow at least 2 minutes for the BMC to reboot. Does the problem persist?
If Then
Yes: Go to step “12” on page 4.
No: This ends the procedure.
12. Replace the system backplane.
• If your system is a 5104-22C or 9006-22C, go to “5104-22C or 9006-22C locations” on page 63 to identify the physical location and the removal and replacement procedure.
• If your system is a 9006-12P, go to “9006-12P locations” on page 75 to identify the physical location and the removal and replacement procedure.
• If your system is a 9006-22P, go to “9006-22P locations” on page 91 to identify the physical location and the removal and replacement procedure.
This ends the procedure.
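The verification in step 4 of this procedure (checking the MAC address and IP address that ipmitool lan print 1 reports) can be sketched in code. The sketch assumes that the output consists of "Name : value" lines with fields named "IP Address" and "MAC Address"; the function names, the expected_ip parameter, and the wording of the findings are hypothetical.

```python
ZERO_MAC = "00:00:00:00:00:00"


def parse_lan_print(text):
    """Parse 'Name : value' lines into a dict, keeping the first value per name."""
    fields = {}
    for line in text.splitlines():
        # Split on the first colon only, so MAC addresses stay intact.
        key, sep, value = line.partition(":")
        key = key.strip()
        if sep and key and key not in fields:
            fields[key] = value.strip()
    return fields


def check_bmc_lan(text, expected_ip):
    """Return a list of findings that mirror the note in step 4."""
    fields = parse_lan_print(text)
    findings = []
    if fields.get("MAC Address") == ZERO_MAC:
        findings.append("MAC address is all zeros: contact IBM service and support")
    if fields.get("IP Address") != expected_ip:
        findings.append("IP address setting is incorrect: configure the firmware IP address")
    return findings
```

An empty list means that both settings match what you expect, so you can continue with step 5.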

Resolving a power problem

Learn how to identify the service action that is needed to resolve a power problem.
Procedure
1. Is the identify LED on the front of the system flashing red slowly at 0.25 Hz? For more information about LEDs, see LEDs on the 9006-12P system or LEDs on the 5104-22C, 9006-22C, or 9006-22P system.
If Then
Yes: Continue with the next step.
No: No service action is required. This ends the procedure.
2. Perform the following actions, one at a time until the problem is resolved:
a. Ensure that all of the power cords are fully seated in the power supplies.
b. Ensure that the power supply is fully seated in the system.
c. Ensure that the power supply fan is not blocked.
d. Ensure that all of the power cords are fully seated in the power distribution units (PDUs) or wall outlets.
e. If the power cords are plugged into PDUs, ensure that the PDUs are turned on.
f. Replace the power cords.
g. Replace the power supplies.
• If your system is a 5104-22C or 9006-22C, go to “5104-22C or 9006-22C locations” on page 63 to identify the physical location and the removal and replacement procedure.
• If your system is a 9006-12P, go to “9006-12P locations” on page 75 to identify the physical location and the removal and replacement procedure.
• If your system is a 9006-22P, go to “9006-22P locations” on page 91 to identify the physical location and the removal and replacement procedure.
This ends the procedure.

Resolving a system firmware boot failure

Learn how to identify the service action that is needed to resolve a failure while booting your system firmware.
Procedure
1. Does the baseboard management controller (BMC) respond to commands and are you able to access the BMC web interface?
Note: To determine whether the BMC responds to commands, run the following ipmitool command:
ipmitool -I lanplus -U <username> -P <password> -H <bmc ip or bmc hostname> chassis status
If Then
Yes: Continue with step “3” on page 6.
No: Continue with the next step.
2. Complete the following actions, one at a time, until the problem is resolved:
a. Reset the BMC remotely by entering the following command:
ipmitool -I lanplus -U <username> -P <password> -H <bmc ip or bmc hostname> mc reset cold
b. Disconnect the power cords from the system for 30 seconds. Reconnect the power cords, wait 5 minutes, and then go to step “1” on page 5.
c. Update the BMC firmware by using the pUpdate command with the block transfer (BT) option:
1) Type mkdir /tmp/media and press Enter.
2) Type the following command and press Enter:
mount -t nfs xxx.xxx.xx.xx:/path/of/files /tmp/media, where xxx.xxx.xx.xx is the IP address of the system to which you want to establish the connection.
3) Type cd /tmp/media and press Enter.
4) To update the BMC firmware, type the following command and press Enter:
./pUpdate -f bmc.bin -i bt, where bmc.bin is the name of the BMC image file.
5) Allow at least 2 minutes for the BMC to reboot.
d. Replace the system backplane.
• If your system is a 5104-22C or 9006-22C, go to “5104-22C or 9006-22C locations” on page 63 to identify the physical location and the removal and replacement procedure.
• If your system is a 9006-12P, go to “9006-12P locations” on page 75 to identify the physical location and the removal and replacement procedure.
• If your system is a 9006-22P, go to “9006-22P locations” on page 91 to identify the physical location and the removal and replacement procedure.
This ends the procedure.
3. After you pressed the power button, did the system turn on but fail to display the Petitboot menu?
If Then
Yes: Continue with the next step.
No: This ends the procedure.
4. Complete the following actions, one at a time, until the problem is resolved:
a. Ensure that the TPM card is fully seated.
• If your system is a 5104-22C or 9006-22C, go to “5104-22C or 9006-22C locations” on page 63 to identify the physical location.
• If your system is a 9006-12P, go to “9006-12P locations” on page 75 to identify the physical location.
• If your system is a 9006-22P, go to “9006-22P locations” on page 91 to identify the physical location.
b. Disconnect the power cords from the system for 30 seconds. Reconnect the power cords, wait 5 minutes, and then go to step “3” on page 6.
c. Update the PNOR firmware. For instructions, see Getting fixes.
Note: If your system is a 9006-12P or 9006-22P, the PNOR firmware level must be V2.12-20190404, or later.
d. Replace the system backplane.
• If your system is a 5104-22C or 9006-22C, go to “5104-22C or 9006-22C locations” on page 63 to identify the physical location and the removal and replacement procedure.
• If your system is a 9006-12P, go to “9006-12P locations” on page 75 to identify the physical location and the removal and replacement procedure.
• If your system is a 9006-22P, go to “9006-22P locations” on page 91 to identify the physical location and the removal and replacement procedure.
This ends the procedure.
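The two remote ipmitool commands in this procedure (chassis status in step 1 and mc reset cold in step 2) differ only in their final subcommand. As a minimal sketch, a helper can build both argument lists from the same connection details. The helper name, the host name, and the credentials are placeholders; only the flags shown in the procedure are used.

```python
import shlex


def ipmitool_lanplus(host, username, password, *subcommand):
    """Build the argument list for an ipmitool command sent over the LAN."""
    return [
        "ipmitool", "-I", "lanplus",
        "-U", username,
        "-P", password,
        "-H", host,
        *subcommand,
    ]


# Placeholder values; substitute your own BMC address and credentials.
status_cmd = ipmitool_lanplus("bmc.example.com", "USERNAME", "PASSWORD", "chassis", "status")
reset_cmd = ipmitool_lanplus("bmc.example.com", "USERNAME", "PASSWORD", "mc", "reset", "cold")

# Print the commands in a form that can be pasted into a shell.
print(shlex.join(status_cmd))
print(shlex.join(reset_cmd))
```

Either list can be passed to subprocess.run to run the command from a management workstation where ipmitool is installed.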

Resolving a VGA monitor problem

Learn how to identify the service action that is needed to resolve a video graphics array (VGA) monitor problem.
Procedure
1. Is the system powered on and is the VGA monitor connected to the VGA display port, but no video is displayed?
If Then
Yes: Continue with the next step.
No: This ends the procedure.
2. Complete the following steps, one at a time until the problem is resolved:
a) Ensure that the VGA cable is properly seated in the server port and in the monitor port.
b) Verify that your monitor and your VGA cable are working properly by testing them on a system that is known to be working properly. If the monitor or the VGA cable does not work properly, replace it.
c) Verify that the system is powered on by activating a serial over LAN (SOL) session through the baseboard management controller (BMC). If the system is not active, go to “Resolving a system firmware boot failure” on page 5.
d) Replace the system backplane.
• If your system is a 5104-22C or 9006-22C, go to “5104-22C or 9006-22C locations” on page 63 to identify the physical location and the removal and replacement procedure.
• If your system is a 9006-12P, go to “9006-12P locations” on page 75 to identify the physical location and the removal and replacement procedure.
• If your system is a 9006-22P, go to “9006-22P locations” on page 91 to identify the physical location and the removal and replacement procedure.
This ends the procedure.

Resolving an operating system boot failure

Learn how to identify the service action that is needed to resolve a failure while booting your operating system.
Procedure
1. Was the system recently installed, serviced, moved, or upgraded?
If Then
Yes: Ensure that all cables are properly seated in the connection path to the designated boot device. This ends the procedure.
No: Continue with the next step.
2. Are you booting the operating system from a network location?
If Then
Yes: Continue with the next step.
No: Continue with step “4” on page 8.
3. Complete the following actions, one at a time until the problem is resolved:
a. Ensure that a problem does not exist with the connection to the network location.
b. Ensure that the adapter has a valid IP address for the network.
c. Replace the network adapter.
• If your system is a 5104-22C or 9006-22C, go to “5104-22C or 9006-22C locations” on page 63 to identify the physical location and the removal and replacement procedure.
• If your system is a 9006-12P, go to “9006-12P locations” on page 75 to identify the physical location and the removal and replacement procedure.
• If your system is a 9006-22P, go to “9006-22P locations” on page 91 to identify the physical location and the removal and replacement procedure.
4. Petitboot displays all recognized bootable images to use by default. Is the boot image recognized by Petitboot?
If Then
Yes: Continue with step “10” on page 9.
No: Select the Petitboot menu option to refresh the boot images. If the problem persists, continue with the next step.
5. To determine the command to type on the Petitboot command line to verify that the boot drive is recognized and in optimal status, use Table 1 on page 8.
Table 1. Determine the command to verify that the boot drive is recognized and in optimal status
Boot drive configuration | Commands
Virtual drive connected directly to the system backplane | arcconf getconfig 1 LD
Physical drive connected directly to the system backplane | arcconf getconfig 1 PD
Is the boot drive recognized and in optimal status?
If Then
Yes: Reinstall the operating system on the boot drive. This ends the procedure.
No: Continue with the next step.
6. Are the drives properly seated in their respective drive bays?
Note:
• If your system is a 5104-22C or 9006-22C, go to “5104-22C or 9006-22C locations” on page 63 to identify the physical location and the removal and replacement procedure.
• If your system is a 9006-12P, go to “9006-12P locations” on page 75 to identify the physical location and the removal and replacement procedure.
• If your system is a 9006-22P, go to “9006-22P locations” on page 91 to identify the physical location and the removal and replacement procedure.
If Then
Yes: Continue with the next step.
No: Properly seat the drives in the drive bays. Then, go to step “4” on page 8.
7. Refresh the Petitboot boot options. Is the boot image on the boot drive recognized?
If Then
Yes: Boot the operating system. Then, continue with step “10” on page 9.
No: Continue with the next step.
8. To determine the command to type on the Petitboot command line to verify that the drives that are known to be in a RAID array are recognized, use Table 2 on page 9.
Table 2. Determine the command to verify that the drives that are known to be in a RAID array are recognized
Drive configuration | Commands
Drive connected directly to the system backplane | arcconf getconfig 1 LD and arcconf getconfig 1 PD
Are the drives that are known to be in the RAID array recognized?
If Then
Yes: Reinstall the operating system on the boot drive. This ends the procedure.
No: Continue with the next step.
9. Complete the following actions, one at a time, until the physical drives are recognized in the RAID array:
Note:
• If your system is a 5104-22C or 9006-22C, go to “5104-22C or 9006-22C locations” on page 63 to identify the physical location and the removal and replacement procedure.
• If your system is a 9006-12P, go to “9006-12P locations” on page 75 to identify the physical location and the removal and replacement procedure.
• If your system is a 9006-22P, go to “9006-22P locations” on page 91 to identify the physical location and the removal and replacement procedure.
a. If the drive is connected directly to the system backplane, ensure that the mini-SAS cable and SATA cables are properly seated in the disk drive backplane and system backplane.
b. Replace the SAS or SATA cable.
c. If the drive is connected directly to the system backplane, replace the system backplane.
This ends the procedure.
10. Does an operating system error occur during the boot?
If Then
Yes: Recover the operating system with the tools for the operating system. If that does not resolve the problem, reinstall the operating system. This ends the procedure.
No: Reinstall the operating system. This ends the procedure.
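Steps 5 and 8 of this procedure select arcconf commands by drive configuration. The lookup can be sketched as follows, assuming that the LD form of the command reports virtual (logical) drives and the PD form reports physical drives, and that 1 is the controller number as shown in Table 1 and Table 2. The dictionary keys and function names are hypothetical.

```python
# Commands from Table 1, keyed by the kind of boot drive (keys are illustrative).
CHECK_COMMANDS = {
    "virtual": "arcconf getconfig 1 LD",
    "physical": "arcconf getconfig 1 PD",
}


def boot_drive_check(drive_kind):
    """Return the Table 1 command for a boot drive on the system backplane."""
    try:
        return CHECK_COMMANDS[drive_kind]
    except KeyError:
        raise ValueError(f"unknown drive kind: {drive_kind!r}") from None


def raid_member_checks():
    """Return both Table 2 commands, which are run for drives in a RAID array."""
    return [CHECK_COMMANDS["virtual"], CHECK_COMMANDS["physical"]]
```

The commands themselves are typed on the Petitboot command line; the sketch only records which command applies to which configuration.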

Resolving a sensor indicator problem

Learn how to resolve a sensor indicator problem.
About this task
To determine whether a service action is required, complete the following procedure:
Note: For more information about sensors, see Sensor readings GUI display.
Procedure
1. If the system is not powered on, boot the system to the operational state. Log in to the BMC web interface. Then, click Server Health > Sensor Readings.
Are any of the sensor indicator LEDs red?
Yes: Continue with the next step.
No: This ends the procedure.
2. Record the names of any sensors that have a red LED indicator status.
Note: Repeat steps 3 - 6 for every sensor that you record in this step.
3. Use one of the following commands to list the system event logs (SELs).
• To list SELs by using an in-band network, enter the following command:
ipmitool sel elist
• To list SELs remotely over the LAN, enter the following command:
ipmitool -I lanplus -U <username> -P <password> -H <BMC IP address or BMC hostname> sel elist
4. Review the list of SELs and locate the log entry that meets the following criteria:
• The name of any of the sensors that you recorded in step 2 is present.
• A service action keyword is present. For a list of service action keywords, see “Identifying service action keywords in system event logs” on page 23.
• Asserted is in the description.
Did you identify a log entry that meets the above criteria?
Yes: Continue with the next step.
No: Go to “Collecting diagnostic data” on page 60. Then, go to “Contacting IBM service and support” on page 61. This ends the procedure.
5. Use one of the following options to display the SEL details for the sensor:
Note: You must specify the SEL record ID in hexadecimal format. For example: 0x1a.
• To display SEL details by using an in-band network, enter the following command:
ipmitool sel get <SEL record ID>
• To display SEL details remotely over the LAN, enter the following command:
ipmitool -I lanplus -U <username> -P <password> -H <BMC IP address or BMC hostname> sel get <SEL record ID>
6. The sensor ID field contains sensor information in the sensor name (sensor ID) format. Record the sensor name, sensor ID, and event description. Then, use this information to determine the service action to perform:
• If your system is a 5104-22C, 9006-12P, 9006-22C, or 9006-22P, go to “Identifying a service action by using sensor and event information for the 5104-22C, 9006-12P, 9006-22C, or 9006-22P” on page 24 to determine the service action to perform. This ends the procedure.
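The filtering in step 4 and the hexadecimal record ID that step 5 requires can be sketched together. This is an illustration under stated assumptions: the ipmitool sel elist output is taken to be one record per line with columns separated by "|", the first column is taken to be the record ID printed in hexadecimal without a 0x prefix, and the function names are hypothetical. The list of service action keywords comes from the section on page 23 and is passed in as a parameter.

```python
def sel_get_argument(record_id_column):
    """Convert the record ID column of an elist line to the 0x form."""
    return hex(int(record_id_column.strip(), 16))


def matching_sel_records(elist_output, sensor_names, service_keywords):
    """Return record IDs (as 0x strings) of entries that meet all three criteria."""
    record_ids = []
    for line in elist_output.splitlines():
        columns = [column.strip() for column in line.split("|")]
        if len(columns) < 2:
            continue
        try:
            record_id = sel_get_argument(columns[0])
        except ValueError:
            continue  # Not a record line.
        names_sensor = any(name in line for name in sensor_names)
        has_keyword = any(k.lower() in line.lower() for k in service_keywords)
        # Exact column match, so that "Deasserted" entries are not selected.
        asserted = "Asserted" in columns
        if names_sensor and has_keyword and asserted:
            record_ids.append(record_id)
    return record_ids
```

Each returned value, such as 0x1a, is in the form that the ipmitool sel get command in step 5 expects.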

Resolving a hardware problem

Learn how to identify the service action that is needed to resolve a hardware problem.
Procedure
1. If you have not already done so, manually boot the system.
2. Go to “Identifying a service action by using system event logs” on page 18. Then, continue with the next step.
3. Was a service action identified?
If Then
Yes: Continue with the next step.
No: Go to step “5” on page 11.
4. Did the service action fix the problem?
If Then
Yes: This ends the procedure.
No: Go to step “5” on page 11.
5. Go to “Resolving a PCIe adapter or device problem” on page 11. Then, continue with the next step.
6. Was a service action identified?
If Then
Yes: Continue with the next step.
No: Go to “Collecting diagnostic data” on page 60. Then, go to “Contacting IBM service and support” on page 61. This ends the procedure.
7. Did the service action fix the problem?
If Then
Yes: This ends the procedure.
No: Go to “Collecting diagnostic data” on page 60. Then, go to “Contacting IBM service and support” on page 61. This ends the procedure.

Resolving a PCIe adapter or device problem

Learn how to access log files, information to identify types of events, and a list of potential problems and service actions.
About this task
Procedure
1. To identify the correct service procedure to perform by using operating system log information, complete the following steps:
a) Log in as the root user.
b) At the command prompt, type dmesg and press Enter.
2. Scan the operating system logs for the first occurrence of keywords, such as fail, failure, or failed. When you find a keyword that accompanies one or more of the resource names in Table 3 on page 12, a service action is required.
Did you find an operating system log that requires a service action?
If Then
Yes: Use Table 3 on page 12 to determine the service procedure to perform for your type of problem. This ends the procedure.
No: Continue with the next step.
Table 3. Resource names, examples, and service procedures for different types of operating system logs
Resource name | Example of a log requiring a service action | Type of problem | Service procedure
eth1, eth2, eth3, enPxxxxx, where xxxxx indicates the network port. | Failed to re-initialize device | Network | Go to “Resolving a network adapter problem” on page 13.
mlx5_core | Link Down health_care: handling bad device here | Network | Go to “Resolving a network adapter problem” on page 13.
tg3 | PCI I/O error detected. Link is Down | Network | Go to “Resolving a network adapter problem” on page 13.
nvme | Failed status: ffffffff, reset controller | NVMe Flash adapter | Go to “Resolving an NVMe Flash adapter problem” on page 15.
sda, sdb, sdc | FAILED Result | Storage | Go to “Resolving a storage device problem” on page 15.
EEH | Detected error on PHB#xxx, where xxx is the PHB number. | PCIe bus or adapter | Resolve any device driver errors that are related to I/O and that occurred near the time of this operating system log entry.
EEH | xxx has failed 6 times in the last hour and has been permanently disabled, where xxx is the PCI bus number. | PCIe bus or adapter | Ensure that the correct device drivers are properly installed for the device. If the problem persists, replace the adapter in the PCIe slot that is specified in the operating system log entry.
3. Are all of the adapters in the system missing or failed?
If Then
Yes: Perform the following actions, one at a time, until the problem is resolved:
a. Ensure that the PCIe risers are fully seated in the system.
b. Replace system processor CPU 1.
c. Replace the system backplane.
• If your system is a 5104-22C or 9006-22C, go to “5104-22C or 9006-22C locations” on page 63 to identify the physical location and the removal and replacement procedure.
• If your system is a 9006-12P, go to “9006-12P locations” on page 75 to identify the physical location and the removal and replacement procedure.
• If your system is a 9006-22P, go to “9006-22P locations” on page 91 to identify the physical location and the removal and replacement procedure.
No: Go to “Collecting diagnostic data” on page 60. Then, go to “Contacting IBM service and support” on page 61.
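The scan in step 2 (find the first log line in which a keyword such as fail, failure, or failed accompanies a Table 3 resource name) can be sketched with regular expressions. The patterns below are assumptions that generalize the listed names: eth followed by digits, enP followed by a port identifier, and sd followed by one letter. The function name is hypothetical.

```python
import re

# Keywords named in step 2, matched as whole words in any letter case.
KEYWORD = re.compile(r"\bfail(?:ed|ure)?\b", re.IGNORECASE)

# Resource names from Table 3, paired with the type of problem.
RESOURCES = [
    (re.compile(r"\b(?:eth\d+|enP\w+)\b"), "Network"),
    (re.compile(r"\bmlx5_core\b"), "Network"),
    (re.compile(r"\btg3\b"), "Network"),
    (re.compile(r"\bnvme"), "NVMe Flash adapter"),
    (re.compile(r"\bsd[a-z]\b"), "Storage"),
    (re.compile(r"\bEEH\b"), "PCIe bus or adapter"),
]


def first_service_action_log(lines):
    """Return (line, type of problem) for the first matching line, or None."""
    for line in lines:
        if not KEYWORD.search(line):
            continue
        for pattern, problem_type in RESOURCES:
            if pattern.search(line):
                return line, problem_type
    return None
```

The input can be the lines of dmesg output, for example the standard output of subprocess.run(["dmesg"], capture_output=True, text=True) split into lines. A result of None corresponds to the No branch of step 2.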

Resolving a network adapter problem

Learn about the possible problems and service actions that you can perform to resolve a network adapter problem.
About this task
Note: To determine the location of the PCIe adapter, see “Identifying the location of the PCIe adapter by using the slot number” on page 16.
Table 4. Network adapter problems and service actions
Problem Service action
System is unable to find the adapter, or the negotiated PCIe bandwidth of the adapter is less than expected
1. Verify that the adapter is properly seated in a compatible slot.
2. Install the adapter in a different compatible slot.
3. Verify that the drivers for the adapter are installed.
4. Verify that the most recent firmware is installed on the system, or install the most recent firmware if it is not already installed.
5. Restart the system.
6. Replace the adapter.
7. If the adapter is connected to a PCIe riser, replace the PCIe riser.
8. If the adapter is in UIO slot 1, UIO slot 2, or UIO slot 3, replace CPU 1. Otherwise, replace CPU 2.
9. Replace the system backplane.
Adapter suddenly stops working
1. If the system was recently installed, moved, serviced, or upgraded, verify that the adapter is seated properly and all associated cables are correctly connected.
2. Inspect the PCIe socket and verify that there is no dirt or debris in the socket.
3. Inspect the card and verify that it is not physically damaged.
4. Verify that all cables are properly seated and are not physically damaged. If you recently added one or more new adapters, remove them and then test to determine whether the failing adapter is functioning properly again. If the network adapter is functioning again, review the IBM support tips to confirm that there are no PCI address, driver, or firmware conflicts. Then, reinstall the new adapters one at a time until all adapters function properly.
5. Replace the adapter.
6. If the adapter is connected to a PCIe riser, replace the PCIe riser.
7. If the adapter is in UIO slot 1, UIO slot 2, or UIO slot 3, replace CPU 1. Otherwise, replace CPU 2.
8. Replace the system backplane.
Link indicator light on the adapter is off
1. Verify that the cable functions properly by testing it with a known working connection.
2. Verify that the port or ports on the switch are enabled and functional.
3. Verify that the switch and adapter are compatible.
4. Replace the adapter.
Link light on the adapter is on, but there is no communication from the adapter
1. Verify that the most recent driver is installed, or install the most recent driver if it is not already installed.
2. Verify that the adapter and its link have compatible settings, such as speed and duplex configuration.
Other problems
For information about adapter diagnostics, see Supporting diagnostics. For information about adapter user information, see User guides for PCIe adapters.

Resolving an NVMe Flash adapter problem

Learn about the possible problems and service actions that you can perform to resolve a Non-Volatile Memory Express (NVMe) Flash adapter problem.
About this task
Note: To determine the location of the NVMe Flash adapter, see “Identifying the location of the NVMe Flash adapter” on page 17.
Table 5. NVMe Flash adapter problems and service actions
Problem Service action
System is unable to find the NVMe Flash adapter
1. If the system was recently installed, moved, serviced, or upgraded, verify that the NVMe Flash adapter is seated and installed properly.
2. Verify that the NVMe Flash adapter is compatible with the system.
3. Verify that the most recent firmware is installed on the system, or install the most recent firmware if it is not already installed.
4. Replace the NVMe Flash adapter.
NVMe Flash adapter stops working suddenly
1. Check the system logs to verify whether the system detected a problem.
2. Replace the NVMe Flash adapter.
Other problems
Check the messages and resolve any other problems that are detected. Then, test the NVMe Flash adapter again.

Resolving a storage device problem

Learn about the possible problems and service actions that you can perform to resolve a storage device problem.
About this task
Note: To determine the location of the storage device, see “Identifying the location of the storage device” on page 17.
Table 6. Storage device problems and service actions
Problem Service action
System is unable to find more than one storage device
1. If the system was recently installed, moved, serviced, or upgraded, verify that the device is seated and installed properly.
2. Verify that the device is compatible with your system.
3. Verify that all internal cables are properly seated and are not physically damaged.
4. Verify that the most recent firmware is installed on the system, or install the most recent firmware if it is not already installed.
5. If the devices are part of a RAID configuration, ensure that the device has been enabled and is part of an array.
6. Replace the cable that connects the disk drive backplane to the system backplane.
System is unable to find a storage device
1. If the system was recently installed, moved, serviced, or upgraded, verify that the device is seated and installed properly.
2. Verify that the device is compatible with your system.
3. Verify that all internal cables are properly seated and are not physically damaged.
4. Verify that the most recent firmware is installed on the system, or install the most recent firmware if it is not already installed.
5. If the device is part of a RAID configuration, ensure that the device has been enabled and is part of an array.
6. Install the device in an open or free slot. If the device is found, replace the component with the failing connector.
7. Replace the storage device.
8. Replace any applicable attached cable.
More than one storage device suddenly stops working
1. If the system was recently installed, moved, serviced, or upgraded, verify that the device is seated and installed properly.
2. Check the system logs to verify whether the system detected a problem.
3. Replace the cable that connects the disk drive backplane to the system backplane.
One storage device suddenly stops working
1. Verify that all internal cables are properly seated and are not physically damaged.
2. Check the system logs to verify whether the system detected a problem.
3. Replace the drive.
4. Replace the system backplane.
5. Replace the cable.
Other problems
Check the messages and resolve any other problems that were detected. Then, test the drive again. If the drive still does not function, refer to the documentation for the drive.

Identifying the location of the PCIe adapter by using the slot number

The error message provides information to help you to determine the location of the PCIe adapter.
About this task
For example, the log might contain an error similar to the following text:
[131779.752714] EEH: PHB#0 failure detected, location: WIO-R Slot
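An error line of the kind shown above can be split into its PHB number and location text with a regular expression. This is a minimal sketch that assumes the message keeps the "EEH: PHB#<number> failure detected, location: <text>" shape of the example; the function name is hypothetical, and the location is returned exactly as it appears in the log line.

```python
import re

# Pattern modeled on the example log line; other EEH messages may differ.
EEH_FAILURE = re.compile(
    r"EEH: PHB#(?P<phb>[0-9a-fA-F]+) failure detected, location: (?P<location>.+)"
)


def parse_eeh_failure(line):
    """Return the PHB number and location from an EEH failure line, or None."""
    match = EEH_FAILURE.search(line)
    if match is None:
        return None
    return {"phb": match.group("phb"), "location": match.group("location").strip()}
```

The location value is the part of the message that identifies the slot of the PCIe adapter.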