Use this supplement with the
Hardware Maintenance Manual for the PC Server
IBM
IBM PC Servers
IBM SerialRAID Adapter for PC Servers
Hardware Maintenance
Manual Supplement
October 1998
SY33-0193-00
Note
Before using this information and the product it supports, be sure to read the general information under “Notices”
in the product documentation.
First Edition (October 1998)
The following paragraph does not apply to any country where such provisions are inconsistent with local law:
INTERNATIONAL BUSINESS MACHINES CORPORATION PROVIDES THIS PUBLICATION “AS IS” WITHOUT
WARRANTY OF ANY KIND, EITHER EXPRESS OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. Some states do not allow
disclaimer of express or implied warranties in certain transactions; therefore, this statement may not apply to you.
This publication could include technical inaccuracies or typographical errors. Changes are periodically made to the
information herein; these changes will be incorporated in new editions of the publication. IBM may make
improvements or changes in the products or the programs described in this publication at any time.
It is possible that this publication may contain reference to, or information about, IBM products (machines and
programs), programming, or services that are not announced in your country. Such references or information must not
be construed to mean that IBM intends to announce such IBM products, programming, or services in your country.
Requests for technical information about IBM products should be made to your IBM Authorized Dealer or your
Marketing Representative.
Copyright International Business Machines Corporation 1998. All rights reserved.
Note to U.S. Government Users — Documentation related to restricted rights — Use, duplication or disclosure is
subject to restrictions set forth in GSA ADP Schedule Contract with IBM Corp.
About This Supplement
This book is intended for service representatives who maintain PC servers that use the
IBM SerialRAID Adapter.
How This Book Is Organized
“Introducing the IBM SerialRAID Adapter” on page 1 introduces the IBM SerialRAID
Adapter.
“Service Request Numbers (SRNs)” on page 3 provides a table of service request
numbers (SRNs) that are related to the IBM SerialRAID Adapter.
“Removing and Replacing FRUs” on page 19 describes how to exchange disk drives,
DRAMs and the Fast-Write cache.
“Service Aids and Other Utilities” on page 29 describes the SSA service aids.
“Maintenance Tasks” on page 41 describes a number of tasks involving the
configurator utilities that are called during the maintenance analysis procedures (MAPs).
“Maintenance Analysis Procedures (MAPs)” on page 49 provides maintenance analysis
procedures for the IBM SerialRAID Adapter.
Important
This manual is intended for trained service personnel who are familiar with IBM PC
Server products.
Before servicing an IBM product, be sure to review “Safety Information” in the
product documentation.
Related Publications
Other manuals that you might find useful are:
IBM SerialRAID Adapter: Installation and User's Guide, S33-3283-00
IBM SerialRAID Adapter: Technical Reference, SA33-3275-01
For more information, contact IBM or your IBM Authorized Dealer.
Introducing the IBM SerialRAID Adapter
The IBM SerialRAID Adapter is a Peripheral Component Interconnect (PCI) adapter that
serves as the interface between systems based on PCI architecture and devices that
use Serial Storage Architecture (SSA). The adapter has four ports, which can be
connected in pairs to drive two SSA loops. Each loop can contain a maximum of 48
disk drives. Two adapters, each one located in a different PC Server, can drive the
same SSA loop. This arrangement is referred to as a Cluster Configuration. (See the
IBM SerialRAID Adapter: Installation and User's Guide for more details.)
The four SSA ports on the adapter can operate at 20 MB/s full-duplex over point-to-point
copper cables up to 25 meters long. SSA uses an industry-standard interface based on
SCSI-2 commands, queuing model, status and sense bytes.
[Figure: The adapter card, showing (1) SSA Loop B Port 2, (2) SSA Loop B Port 1,
(3) SSA Loop A Port 2, (4) SSA Loop A Port 1, the internal connectors, and the
Fast-Write Cache card.]
The four SSA connectors on the adapter card are arranged in two pairs; connectors A1
and A2 are one pair, B1 and B2 are the other. Next to each pair of connectors is a
light that functions as follows:
ON continuously: Power is turned on to the adapter and both ports for that loop
are operational; that is, the devices in the loop have power turned on, are
connected correctly to the adapter, and are operational.
Flashing continuously: One of the ports is not operational. This condition occurs
if the cable to the port is not connected correctly, or if the device in the loop
connected next to the adapter is not operational.
Off: Both ports are non-operational.
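The three light states above can be summarized as a small lookup. The following is a minimal sketch only; the function name and the boolean inputs are illustrative assumptions and are not part of any IBM utility. It assumes that power is turned on to the adapter.

```python
def loop_light_state(port1_operational: bool, port2_operational: bool) -> str:
    """Return the expected state of the light next to one pair of SSA connectors.

    Hypothetical helper: the two arguments say whether each port of the pair
    (for example A1 and A2) is operational.
    """
    if port1_operational and port2_operational:
        return "on"          # both ports for that loop are operational
    if port1_operational or port2_operational:
        return "flashing"    # one of the ports is not operational
    return "off"             # both ports are non-operational


if __name__ == "__main__":
    for ports in [(True, True), (True, False), (False, False)]:
        print(ports, "->", loop_light_state(*ports))
```

Read in the other direction, a flashing light tells you to check the cabling of that pair of ports and the device that is connected next to the adapter.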
General Information
For general information regarding the IBM SerialRAID Adapter, RAID technology and
SSA loops, see the IBM SerialRAID Adapter: Installation and User's Guide.
The IBM SerialRAID Adapter also contains array management software that provides
RAID-5 functions to control the arrays of the RAID system. An array can have from 3 to
16 member disk drives and is handled as one large disk drive by the operating system.
The array management software translates requests to the single large disk into
requests to the individual member disk drives. Configuration software is available that
allows the user to define which disk drives in the loop, if any, are to be included in an
array.
Up to three adapters can be present in one system unit. For performance reasons it is
recommended that they are all placed on the same bus.
A module on the IBM SerialRAID Adapter contains a lithium battery.
CAUTION:
A lithium battery can cause fire, explosion, or a severe burn. Do not recharge,
disassemble, heat above 100°C (212°F), solder directly to the cell, incinerate, or
expose cell contents to water. Keep away from children. Replace only with the
part number specified with your system. Use of another battery might present a
risk of fire or explosion.
The battery connector is polarized; do not try to reverse the polarity.
Dispose of the battery according to local regulations.
Service Request Numbers (SRNs)
Service request numbers (SRNs) are generated by the error logging facility and by the
diagnostics. SRNs help you to identify the cause of a problem, the failing
field-replaceable units (FRUs), and the service actions that might be needed to solve
the problem.
Displaying SRNs
To see the SRNs, run the Remote Systems Management (RSM) Configurator (see the
Installation and User Guide for details of how to load and start the RSM
configurator).
Use the configurator to display the SRNs as follows:
1. On the opening page, select Event Logger.
2. On the second page, select Analyse.
The error log is analysed and all errors with a severity level that calls for service
intervention are displayed.
The SRN Table
The table in this section lists the SRNs and describes the actions you should take. The
table columns are:
SRN       The service request number.
FRU list  The FRU or FRUs that might be causing the problem, and how likely it is
          (by percentage) that the FRU is causing the problem.
Problem   A description of the problem and the action you must take.
Abbreviations used in the table are:
DMA  Direct memory access
FRU  Field-replaceable unit
PAA  P = Adapter port number
     AA = SSA address
     (see also “Finding the Device When No Service Aids Are Available” on
     page 38)
Important: You should have been sent here from either diagnostics or a START MAP.
Do not start problem determination from the SRN table; always go to the START MAP
for the unit in which the device is installed.
1. Find the SRN in the table. If you cannot find the SRN, refer to the
documentation for the subsystem or device. If you still cannot find the SRN, you
have a problem with the diagnostics, the microcode, or the documentation. Call
your support center for assistance.
2. Read carefully the “Action” you must do for the problem. Do not exchange
adapters unless you are instructed to do so.
3. Normally exchange only one adapter at a time. Always use instructions provided
with the system unit when exchanging adapters. After each adapter is exchanged,
go to “MAP 2410: SSA Repair Verification” on page 2410-1 to verify the repair.
SRN: 20PAA
FRU list: Device (45%)
Description: An open SSA link has been detected.
Action: Run the Disk service aid to isolate the failure (see
“Service Aids and Other Utilities” on page 29).
If the SSA service aids are not available, go to the service
information for the unit in which the device is installed.

SRN: 21PAA to 29PAA
Description: An SSA ‘Threshold exceeded’ link error has been detected.
Action: Go to “MAP 2010: START” on page 2010-1.

SRN: 2A002
FRU list: Device (50%)
Description: Async code 02 has been received. Probably, a software error has occurred.
Action: Go to “Software and Microcode Errors” on page 12 before exchanging any FRUs.

SRN: 2A003
FRU list: Device (50%)
Description: Async code 03 has been received. Probably, a software error has occurred.
Action: Go to “Software and Microcode Errors” on page 12 before exchanging any FRUs.

SRN: 2A004
FRU list: Device (50%)
Description: Async code 04 has been received. Probably, a software error has occurred.
Action: Go to “Software and Microcode Errors” on page 12 before exchanging any FRUs.

SRN: 2FFFF
FRU list: None
Description: An async code that is not valid has been received.
Action: Go to “Software and Microcode Errors” on page 12.

SRN: 303FF
FRU list: Device (100%)
Description: A SCSI status that is not valid has been received.
Action: Go to “Software and Microcode Errors” on page 12.

SRN: 40000
FRU list: SSA adapter card (100%)
Description: The SSA adapter card has failed.
Action: Exchange the FRU for a new FRU.

FRU list: (“Exchanging DRAMs on the IBM SerialRAID Adapter” on page 22).
Description: A 4 MB DRAM in adapter card module 0 has failed.
Action: Exchange the FRU for a new FRU.

FRU list: (“Exchanging DRAMs on the IBM SerialRAID Adapter” on page 22).
Description: An 8 MB DRAM in adapter card module 0 has failed.
Action: Exchange the FRU for a new FRU.

FRU list: (“Exchanging DRAMs on the IBM SerialRAID Adapter” on page 22).
Description: A 16 MB DRAM in adapter card module 0 has failed.
Action: Exchange the FRU for a new FRU.
SRN: 40032
FRU list: 32 MB DRAM module 0 (100%)
(“Exchanging DRAMs on the IBM SerialRAID Adapter” on page 22).
Description: A 32 MB DRAM in adapter card module 0 has failed.
Action: Exchange the FRU for a new FRU.

SRN: 40064
FRU list: 64 MB DRAM module 0 (100%)
(“Exchanging DRAMs on the IBM SerialRAID Adapter” on page 22).
Description: A 64 MB DRAM in adapter card module 0 has failed.
Action: Exchange the FRU for a new FRU.

SRN: 40128
FRU list: 128 MB DRAM module 0 (100%)
(“Exchanging DRAMs on the IBM SerialRAID Adapter” on page 22).
Description: A 128 MB DRAM in adapter card module 0 has failed.
Action: Exchange the FRU for a new FRU.

SRN: 41004
FRU list: 4 MB DRAM module 1 (100%)
(“Exchanging DRAMs on the IBM SerialRAID Adapter” on page 22).
Description: A 4 MB DRAM in adapter card module 1 has failed.
Action: Exchange the FRU for a new FRU.

SRN: 41008
FRU list: 8 MB DRAM module 1 (100%)
(“Exchanging DRAMs on the IBM SerialRAID Adapter” on page 22).
Description: An 8 MB DRAM in adapter card module 1 has failed.
Action: Exchange the FRU for a new FRU.

SRN: 41016
FRU list: 16 MB DRAM module 1 (100%)
(“Exchanging DRAMs on the IBM SerialRAID Adapter” on page 22).
Description: A 16 MB DRAM in adapter card module 1 has failed.
Action: Exchange the FRU for a new FRU.

SRN: 41032
FRU list: 32 MB DRAM module 1 (100%)
(“Exchanging DRAMs on the IBM SerialRAID Adapter” on page 22).
Description: A 32 MB DRAM in adapter card module 1 has failed.
Action: Exchange the FRU for a new FRU.

SRN: 41064
FRU list: 64 MB DRAM module 1 (100%)
(“Exchanging DRAMs on the IBM SerialRAID Adapter” on page 22).
Description: A 64 MB DRAM in adapter card module 1 has failed.
Action: Exchange the FRU for a new FRU.

SRN: 41128
FRU list: 128 MB DRAM module 1 (100%)
(“Exchanging DRAMs on the IBM SerialRAID Adapter” on page 22).
Description: A 128 MB DRAM in adapter card module 1 has failed.
Action: Exchange the FRU for a new FRU.

SRN: 42000
FRU list: SSA adapter card (50%)
DRAM modules (50%)
(“Exchanging DRAMs on the IBM SerialRAID Adapter” on page 22).
Description: The SSA adapter has detected that both DRAM modules are failing.
Action:
1. Check whether both DRAM modules are correctly installed
on the adapter card. Make any necessary corrections.
2. If this problem has occurred immediately after an upgrade
to the adapter card, check whether the correct type of
DRAM modules have been installed. Make any necessary
corrections.
3. If the problem remains, exchange the adapter card FRU for
a new one. Do not exchange any DRAM modules yet.
4. Install the DRAM modules from the original adapter card
onto the new adapter card, then install the new adapter
card.
5. If the problem remains, exchange the DRAM modules for
new modules.
6. Install the new DRAM modules onto the original adapter
card. Reinstall the original adapter card.

SRN: 42200
FRU list: None
Description: Other adapters on the SSA loop are using levels of
microcode that are not compatible.
Action: Install the latest level of microcode on all other adapters
in this SSA loop.
First refer to “Software and Microcode Errors” on page 12 and if
necessary “Download Microcode Function” on page 37.
SRN: 42500
FRU list: Fast-Write Cache Card (98%)
(“Exchanging the Fast-Write Cache Card” on page 23)
SSA Adapter Card (2%) (Installation and User Guide)
Description: The Fast-Write Cache Card has failed.
Action:
1. Exchange the Cache Card for a new one.
2. Switch on power to the using system.
3. New error codes are produced if the original cache card
contained data that was not moved to a disk drive. Run
diagnostics on the adapter and if a SRN is produced, do
the actions for that SRN.

SRN: 42510
FRU list: None
Description: Not enough DRAM available to run the fast-write
cache operation.
Action:
1. Start the using-system service aids.
2. Select Display or Change Configuration or Vital Product Data (VPD).
3. Select Display Vital Product Data.
4. Find the VPD for the SSA adapter that is logging the error.
5. Note the DRAM and cache sizes (Device Specifics Z0 and
Z1).
6. For fast-write operations, you must have a 32 MB DRAM.
Check that you have the correct size of DRAM.

SRN: 42515
FRU list: Fast-Write Cache Card (90%)
(“Exchanging the Fast-Write Cache Card” on page 23)
SSA adapter card (10%) (using system Installation and Users Guide)
Description: A fast-write disk is installed, but no Fast-Write
Cache Card has been detected. This problem can be caused
because:
  • The cache card is not installed correctly.
  • The Fast-Write feature is not installed on this machine, but
    a disk drive that is configured for fast-write operations has
    been added to the subsystem.
Action:
1. If you have not already done so, run diagnostics on the
adapter. If a different SRN is generated, solve that problem
first.
2. Do the following actions as appropriate:
  • If the cache card is not installed correctly, remove it
    from the adapter and reinstall it correctly.
  • If the cache card is installed correctly, it might have
    failed. Exchange, for new FRUs, the FRUs that are
    shown in the FRU list for this SRN.
  • If the Fast-Write feature is not installed, and you want
    to delete the fast-write function for one or more disk
    drives that have been added to this subsystem:
3. Verify with the customer that the fast-write function can be
deleted for the disk drives configured for fast-write.
4. Using the RSM configurator, select the resource in question
and select Delete FW. See “Dealing with Fast-Write
Problems” on page 12 for more details.
SRN: 42520
FRU list: Fast-Write Cache Card (100%)
Description: A Fast-Write Cache Card has failed. Data has been
written to the cache card and cannot be recovered. The location
of the lost data is not known. The disk drive is offline.
Action:
1. Ask the customer to determine:
  • Which disk drives are affected by this error
  • How much data has been lost
  • Which data recovery procedures can be done
2. Ask the customer to disable the Fast-Write feature for:
  • Each device for which Fast-Write is offline
  • All other devices that are connected to the failing
    adapter and have Fast-Write enabled
For details of how to disable Fast-Write see “Dealing
with Fast-Write Problems” on page 12.
3. Exchange the Fast-Write Cache Card for a new one.
4. Ask the customer to re-enable Fast-Write for the devices
that are attached to the new Fast-Write Cache Card.

SRN: 42521
FRU list: Fast-Write Cache Card (100%)
(“Exchanging the Fast-Write Cache Card” on page 23)
Description: A Fast-Write Cache Card has failed. Data has been
written to the card and cannot be recovered. The disk drives
that have lost data cannot be identified. All unsynchronized
fast-write disk drives that are attached to this adapter are
offline.
Action:
1. Ask the customer to determine:
  • Which disk drives are affected by this error
  • How much data has been lost
  • Which data recovery procedures can be done
2. Ask the customer to disable Fast-Write for:
  • Each device for which Fast-Write is offline
  • All other devices that are connected to the failing
    adapter and have Fast-Write enabled
For details of how to disable Fast-Write see “Dealing
with Fast-Write Problems” on page 12.
3. Exchange the Fast-Write Cache Card for a new one.
4. Ask the customer to re-enable Fast-Write for the devices
that are attached to the new Fast-Write Cache Card.
SRN: 42522
FRU list: Fast-Write Cache Card (100%)
(“Exchanging the Fast-Write Cache Card” on page 23)
Description: A Fast-Write Cache Card has failed. Data has been
written to the card and cannot be recovered. One or more 4 KB
blocks of data for a known disk have been lost and cannot be
read.
Action:
1. Ask the customer to determine:
  • Which disk drives are affected by this error
  • How much data has been lost
  • Which data recovery procedures can be done
2. Ask the customer to disable Fast-Write for:
  • Each device for which Fast-Write is offline
  • All other devices that are connected to the failing
    adapter and have Fast-Write enabled
For details of how to disable Fast-Write see “Dealing
with Fast-Write Problems” on page 12.
3. Exchange the Fast-Write Cache Card for a new one.
4. Ask the customer to re-enable Fast-Write for the devices
that are attached to the new Fast-Write Cache Card.

SRN: 42523
FRU list: None
Description: The Fast-Write Cache Card has a bad version
number.
Action: Install the correct adapter microcode for this card.

SRN: 42524
FRU list: Fast-Write Cache Option Card (100%)
Description: A fast-write disk drive (or drives) that does not
contain synchronized data has been detected. The Fast-Write
Cache Card, however, cannot be detected. The disk drive (or
drives) is offline.
Action:
  • If the Fast-Write Cache Card has been removed, reinstall it.
  • If the Fast-Write Cache Card has failed:
    1. Ask the customer to disable Fast-Write for:
       – Each device for which Fast-Write is offline
       – All other devices that are connected to the failing
         adapter, and have Fast-Write enabled
       For details of how to disable Fast-Write see
       “Dealing with Fast-Write Problems” on page 12.
    2. Exchange the Fast-Write Cache Card for a new one.
    3. Ask the customer to re-enable Fast-Write for the
       devices that are attached to the new Fast-Write Cache
       Card.
SRN: 42525
FRU list: None
Description: The wrong Fast-Write Cache Card has been
detected by a fast-write disk drive that contains unsynchronized
data.
Action: The failing disk drive is offline. If the disk drive has just
been moved from another adapter, do either of the following
actions:
  • Return the disk drive to its original adapter.
  • Move the original Fast-Write Cache Card to this adapter so
    that the data can be synchronized.
If you cannot do either action, or the data on the disk drive has
no value:
1. Ask the customer to disable Fast-Write for:
  • Each device for which the Fast-Write option is offline
  • All other devices that are connected to the failing
    adapter, and have Fast-Write enabled.
For details of how to disable Fast-Write see “Dealing
with Fast-Write Problems” on page 12.
2. Ask the customer to re-enable Fast-Write for the devices
that are attached to the new Fast-Write Cache Card.

SRN: 42526
FRU list: SSA adapter card (100%) (Installation and User Guide).
Description: This adapter card does not provide support for the
Fast-Write Cache.
Action: Install the correct SSA adapter (if applicable).

SRN: 42527
FRU list: None
Description: A dormant fast-write cache entry exists.
Action: The fast-write cache contains unsynchronized data for a
disk drive that is no longer available. If possible, reconnect the
disk drive to the adapter to enable the data to be synchronized.
If you cannot reconnect the disk drive (for example, because the
disk drive has failed), the user should delete the dormant
fast-write cache entry.
Although the resource is no longer available, the RSM
configurator will show the resource. Go to the Resource View
page of the RSM and select Detach or Delete as appropriate.

SRN: 42528
FRU list: None
Description: A fast-write disk drive has been detected that was
previously unsynchronized, but has since been configured on a
different adapter.
Action: If this disk drive contains data that should be kept, return
the disk drive to the adapter to which it was previously
connected.
If the disk drive does not contain data that should be kept, ask
the user to delete all offline items:
1. Open the RSM Configurator and go to the Resource View.
2. Select Detach or Delete as appropriate.
When the items have been deleted the disk drive becomes free.

SRN: 43PAA
FRU list: Device (90%)
(“Exchanging Disk Drives” on page 19).
SSA adapter card (10%)
Description: An SSA device on the link is preventing the
completion of the loop configuration.
Action: Go to “MAP 2010: START” on page 2010-1.
SRN: 44PAA
FRU list: Device (100%)
(“Exchanging Disk Drives” on page 19).
Description: An SSA device has a ‘Failed’ status.
Action: If the SSA service aids are available, run the Disk
service aid (see “Service Aids and Other Utilities” on page 29)
to find the failing device. If no device is listed as Rejected, use
the PAA part of the SRN to determine which device is failing.
Before you exchange the failing device, run nonconcurrent
diagnostics to that device to determine the cause of the
problem.
If the SSA service aids are not available, note the value of PAA
in this SRN, and go to “Finding the Physical Location of a
Device” on page 38. Exchange the failing FRU for a new FRU.

SRN: 45PAA
FRU list: Device (40%)
(“Exchanging Disk Drives” on page 19).
SSA Adapter card (40%)
SSA cables, or other SSA connections
in the device enclosure (20%).
(Hardware Maintenance Manual).
Description: The SSA adapter has detected an open SSA loop.
Action: If the SSA service aids are available, run the Disk
service aid (see “Service Aids and Other Utilities” on page 29)
to determine which part of the loop is failing.
If the SSA service aids are not available, note the value of PAA
in this SRN, and go to “Finding the Physical Location of a
Device” on page 38.

SRN: 46000
FRU list: None
Description: An array is in the Offline state because more than one
disk drive is not available. At least one member disk drive of
the array is present, but more than one member disk drive is
missing.
Action: Go to “MAP 2010: START” on page 2010-1.

SRN: 47000
FRU list: None
Description: An attempt has been made to store in the SSA
adapter the details of more than 32 arrays.
Action: Go to “MAP 2010: START” on page 2010-1.

SRN: 47500
FRU list: None
Description: Part of the array data might have been lost.
Action: Go to “MAP 2010: START” on page 2010-1.

SRN: 48000
FRU list: None
Description: The SSA adapter has detected a link configuration
that is not valid.
Action: See “SSA Loop Configurations that Are Not Valid” on
page 12.

SRN: 49000
FRU list: None
Description: An array is in the Degraded state because a disk
drive is not available to the array, and a write command has
been sent to that array.
Action: A disk drive might not be available for one of the
following reasons:
  • The disk drive has failed.
  • The disk drive has been removed from the subsystem.
  • An SSA link has failed.
  • A power failure has occurred.
Go to “MAP 2010: START” on page 2010-1.

SRN: 49100
FRU list: None
Description: An array is in the Exposed state because a disk
drive is not available to the array.
Action: A disk drive can become unavailable for several
reasons:
  • The disk drive has failed.
  • The disk drive has been removed from the subsystem.
  • An SSA link has failed.
  • A power failure has occurred.
Go to “MAP 2010: START” on page 2010-1.

SRN: 49500
FRU list: None
Description: No hot-spare disk drives are available for an array
that is configured for hot-spare disk drives.
Action: Go to “MAP 2010: START” on page 2010-1.
SRN: 49700
FRU list: None
Description: The parity for the array is not complete.
Action: Go to “MAP 2010: START” on page 2010-1.

SRN: 50000
FRU list: SSA adapter card (100%)
Description: The SSA adapter failed to respond to the device
driver.
Action: Exchange the FRU for a new FRU.

SRN: 50001
FRU list: SSA adapter card (100%)
Description: A data parity error has occurred.
Action: Exchange the FRU for a new FRU.

SRN: 50002
FRU list: SSA adapter card (100%)
Description: An SSA adapter DMA error has occurred.
Action: Exchange the FRU for a new FRU.

SRN: 50005
FRU list: SSA adapter card (100%)
Description: A software error has occurred.
Action: Go to “Software and Microcode Errors” on page 12
before exchanging the FRU.

SRN: 50006
FRU list: SSA adapter card (100%)
Description: A channel check has occurred.
Action: Exchange the FRU for a new FRU.

SRN: 50008
FRU list: SSA adapter card (100%)
Description: Unable to read or write the PCI registers.
Action: Exchange the FRU for a new FRU.

SRN: 50010
FRU list: SSA adapter card (100%)
Description: An SSA adapter or device drive protocol error has
occurred.
Action: Go to “Software and Microcode Errors” on page 12
before exchanging the FRU.

SRN: 50012
FRU list: SSA adapter card (100%)
Description: The SSA adapter microcode has hung.
Action: Run nonconcurrent diagnostics to the SSA adapter.
If the diagnostics fail, exchange the FRU for a new FRU.
If the diagnostics do not fail, go to “Software and Microcode
Errors” on page 12 before exchanging the FRU.

SRN: D4000
FRU list: SSA adapter card (100%)
Description: The diagnostics cannot configure the SSA adapter.
Action: Exchange the FRU for a new FRU.

SRN: D4100
FRU list: SSA adapter card (100%)
Description: The diagnostics cannot open the SSA adapter.
Action: Exchange the FRU for a new FRU.

SRN: D4300
FRU list: SSA adapter card (100%)
Description: The diagnostics have detected an SSA adapter
POST failure.
Action: Exchange the FRU for a new FRU.

SRN: DFFFF
FRU list: SSA adapter card (100%)
Note: The description and action for this SRN are valid only if
you have run diagnostics to the SSA attachment. If this SRN
has occurred because you have run diagnostics on some other
device, see the service information for that device.
Description: A command or parameter that has been sent or received is not
valid. This problem is caused either by the SSA adapter, or by
an error in the microcode.
Action: Go to “Software and Microcode Errors” on page 12
before exchanging the FRU.
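The DRAM entries in the SRN table that show an SRN (40032 through 41128) appear to follow a regular pattern: the second digit matches the DRAM module number and the last three digits match the DRAM size in MB. The sketch below decodes an SRN on that basis. The pattern is inferred from the table rows and is not a documented encoding, and the function name is hypothetical.

```python
from typing import Optional, Tuple

# DRAM sizes (in MB) that the table descriptions mention.
DRAM_SIZES_MB = (4, 8, 16, 32, 64, 128)


def decode_dram_srn(srn: str) -> Optional[Tuple[int, int]]:
    """Return (module number, size in MB) for a DRAM SRN, otherwise None.

    Assumed layout, inferred from the table: '4', module digit (0 or 1),
    then the DRAM size as three decimal digits.
    """
    if len(srn) != 5 or not srn.isdigit():
        return None
    if srn[0] != "4" or srn[1] not in "01":
        return None
    size_mb = int(srn[2:])
    if size_mb not in DRAM_SIZES_MB:
        return None  # for example 40000, which reports the adapter card itself
    return int(srn[1]), size_mb


if __name__ == "__main__":
    for srn in ("40032", "41128", "40000", "42000"):
        print(srn, "->", decode_dram_srn(srn))
```

Always confirm the failing FRU against the table entry itself; the sketch is only a reading aid.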
Software and Microcode Errors
Some SRNs indicate that a problem might have been caused by a software error or by
a microcode error. If you have one of these SRNs, do the following actions:
1. Make a note of the contents of the error log for the device that has the problem.
2. Go to the system service aids and select Display Vital Product Data to display
the VPD of the failing system. Make a note of the VPD for all the SSA adapters
and disk drives.
3. Report the problem to your support center. The center can tell you whether you
have a known problem, and can, if necessary, provide you with a correction for the
software or microcode.
SSA Loop Configurations that Are Not Valid
Note: This section is related to SRN 48000.
SRN 48000 shows that the SSA loop contains more devices or adapters than are
allowed. The maximum numbers allowed depend on the adapter. Refer to the IBM
SerialRAID Adapter: Installation and User's Guide for details.
If the SRN occurred when you or the customer turned on the system:
1. Turn off the system.
2. Review the configuration that you are trying to make, and determine why that
configuration is not valid.
3. Correct your configuration by reconfiguring the SSA cables or by removing the
excess devices or adapters from the loop.
4. Turn on the system.
If the SRN occurred because additional devices or adapters were added to a working
SSA loop:
1. Remove the additional devices or adapters that are causing the problem, and put
the loop back into its original, working configuration.
Note: It is important that you do these actions, because they enable the
configuration code to reset itself from the effects of the error.
2. Review the configuration that you are trying to make, and determine why that
configuration is not valid.
3. Correct your configuration by reconfiguring the SSA cables or by removing the
excess devices or adapters from the loop.
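When reviewing a planned configuration, it can help to compare the device counts for a loop with the allowed maximums before recabling. The following is a minimal sketch under stated assumptions: the function name is hypothetical, the default of 48 disk drives per loop is the figure given in “Introducing the IBM SerialRAID Adapter”, and the adapter limit must be supplied by the caller from the Installation and User's Guide because it is not stated here.

```python
from typing import Dict


def excess_in_loop(disk_drives: int, adapters: int,
                   max_adapters: int, max_disk_drives: int = 48) -> Dict[str, int]:
    """Return how many disk drives and adapters exceed the allowed maximums.

    max_adapters is an assumption supplied by the caller; an empty result
    means the counts are within the limits that were given.
    """
    if min(disk_drives, adapters, max_adapters, max_disk_drives) < 0:
        raise ValueError("counts and limits must not be negative")
    excess = {}
    if disk_drives > max_disk_drives:
        excess["disk drives"] = disk_drives - max_disk_drives
    if adapters > max_adapters:
        excess["adapters"] = adapters - max_adapters
    return excess


if __name__ == "__main__":
    # Illustrative values only: 50 disk drives and 2 adapters on one loop.
    print(excess_in_loop(disk_drives=50, adapters=2, max_adapters=2))
```

A non-empty result tells you how many devices or adapters to remove from the loop, or to move to the other loop, before you turn on the system again.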
Dealing with Fast-Write Problems
Fast-Write problems are indicated by Service Request Numbers (SRNs) in the series
425xx.
The procedure, using the RSM configurator, for removing the Fast-Write function from a
resource when advised to do this is as follows:
1. Start the RSM Configurator and select all the resources that have Fast-Write
enabled
2. Perform the following actions on each resource in turn:
a. Go to the Resource View page for each resource you want to change
b. Select Delete FW
Note: All Fast-Write resources are identified by a lightning-flash symbol next to
them.
SSA Link Errors
SSA link errors can have a number of causes, for example:
Power is removed from an SSA device
An SSA device is failing
An SSA device is removed
A cable is disconnected.
Errors might be indicated in various ways, such as:
SRN 45PAA
A flashing link status (Ready) light on the SSA device at each end of the failing link
The indication of an open link when using the Disk Service Aid.
SSA Link Error Problem Determination
Instead of using the normal MAPs to solve a link error problem, you can refer directly to
the link status lights to isolate the failing FRU. The descriptions given here show you
how to do this.
In an SSA loop, devices are connected through two or more SSA links to an SSA RAID
Adapter. Each SSA link is the connection between two SSA nodes (devices or
adapters); for example:
Disk drive to disk drive
Adapter to disk drive module
Adapter to adapter
An SSA link can contain several parts. When doing problem determination, think of the
link and all its parts as one complete item.
Here are some examples of SSA links. Each link contains more than one part.
Example 1
This link is between two disk drives that are in the same subsystem. It has three parts.
[Figure: Within one SSA subsystem: Disk Drive 1, Internal Connection, Disk Drive 2.]
Example 2
This link is between two disk drives that are in the same subsystem. It has five parts.
[Figure: Within one SSA subsystem: Disk Drive 1, Internal Connection, Dummy Disk
Drive, Internal Connection, Disk Drive 2.]
Example 3
This link is between two disk drives that are not in the same subsystem. It has seven
parts.
[Figure: Across two SSA subsystems: Disk Drive, Internal Connection, SSA Connector
Card, Cable, SSA Connector Card, Internal Connection, Disk Drive.]
Example 4
This link is between a disk drive and an SSA RAID Adapter. It has five parts.
[Figure: one SSA Subsystem cabled to an adapter: Disk Drive, Internal Connection, SSA Connector Card, Cable, Adapter]
Link Status Lights
If a fault occurs that prevents the operation of a particular link, the link status lights of
the various parts of the complete link show that the error has occurred.
You can find the failing link by looking for the flashing green status light at each end of
the affected link. Some configurations might have other indicators along the link (for
example, SSA connector cards) to help with FRU isolation.
The meanings of the disk drive and adapter lights are summarized here.
Status of Light                                 Meaning
Off                                             Both SSA links are inactive.
Permanently on                                  Both SSA links are active.
Slow flash (two seconds on, two seconds off)    Only one SSA link is active.
If you need more information about the lights, see:
For adapter lights, “Introducing the IBM SerialRAID Adapter” on page 1 in this
book.
For other lights, the service information for the device that contains the lights.
Locating a Broken Loop
Using the RSM configurator, go to the Physical View of the selected adapter and look
for the symbol Break. This indicates a broken SSA loop.
Using the DOS configurator you can access the disk service aids to show the SSA loop
that is broken.
[Screen: SSA Configurator and Service Aids, DOS Version]
This example screen shows a break (the dotted line) in the SSA loop between the
second and third disk drives. In the condition shown by the display, the Ready lights
on the second and third disk drives are both flashing.
To help locate these disk drives, select the disk drive, and press F9 (FlashOn). The
Check light on the selected disk drive flashes. This action does not affect the
customer’s operations.
For more information about the service aids, see “Service Aids and Other Utilities” on
page 29.
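The link status light rules in this section can be sketched as a short Python function. This is an illustrative model only and is not part of the configurator or of any IBM tool: the names LightState and find_broken_links, and the idea of listing the Ready lights in loop order, are assumptions made for the example. It assumes that the nodes at both ends of a failing link show a slow flash.

```python
from enum import Enum


class LightState(Enum):
    OFF = "off"                # both SSA links are inactive
    ON = "on"                  # both SSA links are active
    SLOW_FLASH = "slow flash"  # only one SSA link is active


def find_broken_links(lights):
    """Return (i, i + 1) index pairs of adjacent nodes that both show a slow flash.

    `lights` is a hypothetical list of Ready light states in loop order.
    """
    broken = []
    for i in range(len(lights) - 1):
        if (lights[i] is LightState.SLOW_FLASH
                and lights[i + 1] is LightState.SLOW_FLASH):
            broken.append((i, i + 1))
    return broken


# Four disk drives; the Ready lights on the second and third are flashing.
loop = [LightState.ON, LightState.SLOW_FLASH, LightState.SLOW_FLASH, LightState.ON]
print(find_broken_links(loop))
```

The indexes are zero-based, so the pair for the second and third disk drives is (1, 2). The sketch treats the loop as a simple ordered list and does not model the adapter ports at each end.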
Removing and Replacing FRUs
Exchanging Disk Drives
When a maintenance procedure requires you to replace a faulty disk drive with a new
one, first check whether the disk drive to be removed is a member of a RAID array.
If the disk drive to be changed IS NOT a member of an array, go to “Exchanging a
Non-Array Disk Drive.”
If the disk drive to be changed IS a member of an array, go to “Exchanging
an Array Disk” on page 20.
Replacement Disk Drives
There are two points to note about a disk drive you are installing to replace a faulty
unit.
If the replacement disk drive is a new unit from the factory, or one previously used
in an AIX machine, it will be placed on the list of New Resources. It must be
converted to a Free Resource before it can be used by the PC.
If the replacement disk was previously formatted as a member of a RAID array in a
different system, it will be identified as a Pre-Configured disk. It must be converted
to a Free Resource before it can be used in the new system.
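The two points above can be summarized in a small lookup. This is a sketch for illustration only: the origin labels and the function names are hypothetical, and only the list names (New Resources, Pre-Configured, Free Resources) come from the text.

```python
# Hypothetical origin labels mapped to the list the text says the disk appears on.
INITIAL_LIST = {
    "factory_new": "New Resources",
    "used_in_aix": "New Resources",
    "raid_member_elsewhere": "Pre-Configured",
}


def initial_list(origin):
    """Return the resource list a replacement disk is first placed on."""
    return INITIAL_LIST[origin]


def must_convert(resource_list):
    """True for the two lists that the text says need conversion to a Free Resource."""
    return resource_list in ("New Resources", "Pre-Configured")


for origin in INITIAL_LIST:
    placed_on = initial_list(origin)
    print(origin, "->", placed_on, "| convert first:", must_convert(placed_on))
```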
Exchanging a Non-Array Disk Drive
The procedure depends on whether you are using the DOS configurator or the RSM
configurator.
Using the DOS Configurator
1. From the Main Menu, select SSA Adapter List.
2. Select the required adapter from the list displayed.
3. From the Adapter Menu, select Disk Service Aids.
4. Select the disk drive that you want to change.
If necessary use the Identify function to find the disk drive. Press F9 (FlashOn);
the Check light flashes on the selected disk drive; press F10 (FlashOff) to remove
the function.
5. Put the disk drive into Service Mode.
Place the cursor on the disk drive entry and press F4.
6. Remove the old drive and replace it with a new one (see the unit Hardware
Maintenance Manual).
7. Press Esc to exit from the Service Aids window.
This action automatically resets Service Mode on the new disk drive.
8. Repeat the procedure given above for any other disk drive that you are changing.
9. If necessary, convert the newly-installed disk drive into a free resource (see the
configurator information in the IBM SerialRAID Adapter: Installation and User's
Guide).
Using the RSM Configurator
1. Start the RSM configurator and select the appropriate adapter from the Adapter
List.
2. On the Adapter View page, select the Physical View.
3. On the Physical View page, select the disk drive you want to change.
4. On the Disk View page, click on the Service Mode button at the bottom of the page
to put the disk into service mode.
If necessary, use the Identify function to find the actual disk drive. Click FlashOn;
the Check light flashes on the selected disk drive. Click FlashOff to remove the
function.
5. You can now remove the faulty disk drive and insert the replacement.
6. To reset Service Mode on the new disk, move back to the Physical View page and
click on the Reset Service Mode button at the base of the page.
If necessary, convert the newly-installed disk drive into a free resource (see the
configurator information in the IBM SerialRAID Adapter: Installation and User's
Guide).
Exchanging an Array Disk
This section describes how a disk is logically removed from an array and replaced by a
compatible Free Resource. Such action could be necessary, for example, to check a
disk drive that is giving a high level of read/write errors but has not been rejected from
the array.
This action is also necessary if an array disk develops a hard fault and there is no
hot-spare available. If this happens, at the next write operation the faulty disk is
automatically de-configured and moved to the Rejected list. In the array, the faulty disk
is replaced by a Blank Reserved. To restore the array to its full operational status you
need to replace the Blank Reserved in the array with a suitable Free Resource.
Note: If there was a Hot-Spare available when the array disk became faulty, the
hot-spare is automatically integrated into the array and the faulty disk is moved to the list
of Rejected disks. The procedure in this event is to convert the Rejected disk to a Free
resource, then change it as described in “Exchanging a Non-Array Disk Drive” on
page 19. The new disk can then be reassigned as a Hot-Spare to replace the one that
was used.
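The behavior described above when an array disk develops a hard fault can be modeled as follows. This is a simplified sketch, not the adapter's actual logic: the function name, the use of Python lists for the array members, hot-spares, and Rejected list, and the returned messages are all assumptions made for illustration.

```python
BLANK_RESERVED = "Blank Reserved"


def handle_hard_fault(array_members, faulty, hot_spares, rejected):
    """Model the outcome of a hard fault on one array disk (illustrative only).

    The faulty disk always moves to the Rejected list. Its place in the array
    is taken by a hot-spare if one is available, otherwise by a Blank Reserved.
    """
    slot = array_members.index(faulty)
    rejected.append(faulty)
    if hot_spares:
        array_members[slot] = hot_spares.pop(0)
        return "hot-spare integrated; convert the Rejected disk and exchange it"
    array_members[slot] = BLANK_RESERVED
    return "replace the Blank Reserved with a suitable Free Resource"


# Example: no hot-spare is available.
members = ["disk1", "disk2", "disk3"]
rejected_list = []
print(handle_hard_fault(members, "disk2", [], rejected_list))
print(members, rejected_list)
```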
Exchanging an Array Disk Using the DOS Configurator
1. Start the DOS Configurator.
2. From the Main Menu, select SSA Adapter List, then select the required adapter
from the list.
3. From the Adapter Menu, select RAID 5 Resources.
4. Select the array from which you want to remove a disk drive.
5. Select View Members.
6. Select the disk drive that you want to remove.
7. Press F7 (Exchange Members).
This displays a list of Free Resources (disk drives) that are compatible as
replacements for the array disk drive.
The list also includes the item Blank Reserved. If there are no Free Resources
available and you intend to physically remove the array disk, you must exchange
the array disk with the Blank Reserved.
8. Select an appropriate Free Resource (or the Blank Reserved).
The selected item replaces the disk drive in the array. The array disk drive is
logically removed and returned to the list of Free Resources.
9. If you need to perform maintenance on the disk drive removed from the array:
a. Go to the list of Free Resources.
b. Select the disk drive that you logically removed from the array.
c. Set Service Mode.
You can now physically remove the disk drive for maintenance.
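The effect of the Exchange Members step can be sketched in Python. This is an illustrative model under stated assumptions, not the configurator's implementation: the function name and the list-based data structures are hypothetical, and the compatibility check that the configurator applies to the Free Resources is not modeled.

```python
BLANK_RESERVED = "Blank Reserved"


def exchange_member(array_members, free_resources, old_disk, replacement=None):
    """Logically swap old_disk out of the array (illustrative only).

    With no replacement given, the Blank Reserved takes the array slot.
    The removed disk is returned to the list of Free Resources.
    """
    if replacement is None:
        replacement = BLANK_RESERVED
    elif replacement not in free_resources:
        raise ValueError("replacement must be a Free Resource")
    else:
        free_resources.remove(replacement)
    slot = array_members.index(old_disk)
    array_members[slot] = replacement
    free_resources.append(old_disk)


# Example: exchange disk2 for the Free Resource disk9.
array = ["disk1", "disk2", "disk3"]
free = ["disk9"]
exchange_member(array, free, "disk2", "disk9")
print(array, free)
```

After the exchange, the removed disk appears among the Free Resources, which is where the procedure tells you to select it before setting Service Mode.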
Exchanging an Array Disk Using the RSM Configurator
1. Start the RSM configurator and select the appropriate adapter from the adapter list.
2. On the Adapter View page, select the Logical View.
3. On the Logical View page, select the RAID type.
This opens the Resource list, listing the defined arrays.
4. Select the required array.
5. On the Array View page scroll down to the list of array members.
6. Select the disk drive that you want to remove.
This will take you to the Disk View page for this disk.
7. On the Disk View page, click on the Comp.Exchange button.
A list is displayed showing the Free Resource candidates that are suitable as
replacements for the array disk drive.
8. Select an appropriate Free Resource.