Dell OpenManage Server Administrator Version 2.3 Messages Reference Guide

www.dell.com | support.dell.com
Notes and Notices
NOTE: A NOTE indicates important information that helps you make better use of your computer.
NOTICE: A NOTICE indicates either potential damage to hardware or loss of data and tells you how to avoid the problem.
Information in this document is subject to change without notice. © 2003–2005 Dell Inc. All rights reserved.
Reproduction in any manner whatsoever without the written permission of Dell Inc. is strictly forbidden.
Trademarks used in this text: The DELL logo and Dell OpenManage are trademarks of Dell Inc.; Microsoft and Windows are registered trademarks of Microsoft Corporation; Novell and NetWare are registered trademarks of Novell, Inc.; Red Hat is a registered trademark of Red Hat, Inc.
Other trademarks and trade names may be used in this document to refer to either the entities claiming the marks and names or their products. Dell Inc. disclaims any proprietary interest in trademarks and trade names other than its own.
November 2005
Contents
1 Introduction
    What’s New in this Release
    Messages Not Described in This Guide
    Understanding Event Messages
    Viewing Alerts and Event Messages
2 Event Message Reference
    Miscellaneous Messages
    Temperature Sensor Messages
    Cooling Device Messages
    Voltage Sensor Messages
    Current Sensor Messages
    Chassis Intrusion Messages
    Redundancy Unit Messages
    Power Supply Messages
    Memory Device Messages
    Fan Enclosure Messages
    AC Power Cord Messages
    Hardware Log Sensor Messages
    Processor Sensor Messages
    Pluggable Device Messages
3 System Event Log Messages for IPMI Systems
    Temperature Sensor Events
    Voltage Sensor Events
    Fan Sensor Events
    Processor Status Events
    Power Supply Events
    Memory ECC Events
    BMC Watchdog Events
    Memory Events
    Hardware Log Sensor Events
    Drive Events
    Intrusion Events
    BIOS Generated System Events
4 Storage Management Message Reference
    Alert Monitoring and Logging
    Alert Descriptions and Corrective Actions
Index

Introduction

Dell OpenManage™ Server Administrator produces event messages stored primarily in the operating system or Server Administrator event logs and sometimes in SNMP traps. This document describes the event messages created by Server Administrator version 2.0 or later and displayed in the Server Administrator Alert log.
Server Administrator creates events in response to sensor status changes and other monitored parameters. The Server Administrator event monitor uses these status change events to add descriptive messages to the operating system event log or the Server Administrator Alert log.
Each event message that Server Administrator adds to the alert log consists of a unique identifier called the event ID for a specific event source category and a descriptive message. The event message includes the severity, cause of the event, and other relevant information, such as the event location and the monitored item’s previous state.
Tables provided in this guide list all Server Administrator event IDs in numeric order. Each entry includes the event ID’s corresponding description, severity level, and cause. Message text in angle brackets (for example, <State>) describes the event-specific information provided by Server Administrator.

What’s New in this Release

The following changes in Server Administrator are documented in this guide:
• Support for additional Storage Management messages
• Removed support for Novell® NetWare®

Messages Not Described in This Guide

This guide describes only event messages created by Server Administrator and displayed in the Server Administrator Alert log. For information on other messages produced by your system, consult one of the following sources:
• Your system’s Installation and Troubleshooting Guide
• Other system documentation
• Operating system documentation
• Application program documentation
For more information on Array Manager event messages, see the Array Manager documentation.

Understanding Event Messages

This section describes the various types of event messages generated by Server Administrator. When an event occurs on your system, Server Administrator sends information about one of the following event types to the systems management console:
Table 1-1. Understanding Event Messages
Alert Severity: OK/Normal
Component Status: An event that describes the successful operation of a unit. The alert is provided for informational purposes and does not indicate an error condition. For example, the alert may indicate the normal start or stop of an operation, such as power supply or a sensor reading returning to normal.
Alert Severity: Warning/Non-critical
Component Status: An event that is not necessarily significant, but may indicate a possible future problem. For example, a Warning/Non-critical alert may indicate that a component (such as a temperature probe in an enclosure) has crossed a warning threshold.
Alert Severity: Critical/Failure/Error
Component Status: A significant event that indicates actual or imminent loss of data or loss of function. For example, crossing a failure threshold or a hardware failure such as an array disk.
Server Administrator generates events based on status changes in the following sensors:
• Temperature Sensor: Helps protect critical components by alerting the systems management console when temperatures become too high inside a chassis; also monitors a variety of locations in the chassis and in any attached systems.
• Fan Sensor: Monitors fans in various locations in the chassis and in any attached systems.
• Voltage Sensor: Monitors voltages across critical components in various chassis locations and in any attached systems.
• Current Sensor: Monitors the current (or amperage) output from the power supply (or supplies) in the chassis and in any attached systems.
• Chassis Intrusion Sensor: Monitors intrusion into the chassis and any attached systems.
• Redundancy Unit Sensor: Monitors redundant units (critical units such as fans, AC power cords, or power supplies) within the chassis; also monitors the chassis and any attached systems. For example, redundancy allows a second or nth fan to keep the chassis components at a safe temperature when another fan has failed. Redundancy is normal when the intended number of critical components are operating. Redundancy is degraded when a component fails, but others are still operating. Redundancy is lost when there is one less critical redundancy device than required.
• Power Supply Sensor: Monitors power supplies in the chassis and in any attached systems.
• Memory Prefailure Sensor: Monitors memory modules by counting the number of Error Correction Code (ECC) memory corrections.
• Fan Enclosure Sensor: Monitors protective fan enclosures by detecting their removal from and insertion into the system, and by measuring how long a fan enclosure is absent from the chassis. This sensor monitors the chassis and any attached systems.
• AC Power Cord Sensor: Monitors the presence of AC power for an AC power cord.
• Hardware Log Sensor: Monitors the size of a hardware log.
• Processor Sensor: Monitors the processor status in the system.
• Pluggable Device Sensor: Monitors the addition, removal, or configuration errors for some pluggable devices, such as memory cards.
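The three redundancy states described for the Redundancy Unit Sensor can be illustrated with a short sketch. This is only one possible reading of the definitions above, not Server Administrator's actual logic: the function name and its parameters (`operating`, `intended`, `required`) are assumptions introduced for illustration, and "lost" is interpreted here as fewer operating units than the required minimum.

```python
def redundancy_state(operating: int, intended: int, required: int) -> str:
    """Classify a redundancy unit as 'Normal', 'Degraded', or 'Lost'.

    operating: number of units currently working
    intended:  number of units present when everything is healthy
    required:  minimum number of units needed (assumed parameter)
    """
    if operating >= intended:
        # The intended number of critical components are operating.
        return "Normal"
    if operating < required:
        # Fewer working units than required: redundancy is lost.
        return "Lost"
    # A component failed, but enough others are still operating.
    return "Degraded"


if __name__ == "__main__":
    for working in (4, 3, 1):
        print(working, redundancy_state(working, intended=4, required=2))
```

The order of the checks matters: the "Normal" test comes first so that a fully populated unit is never reported as degraded.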
Sample Event Message Text
The following example shows the format of the event messages logged by Server Administrator.
EventID: 1000
Source: Server Administrator
Category: Instrumentation Service
Type: Information
Date and Time: Wed Mar 15 10:38:00 2006
Computer: <computer name>
Description: Server Administrator starting
Data: Bytes in Hex
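Because each line of the sample message is a label followed by a value, a record like this can be read into a dictionary with very little code. The following is a minimal sketch, assuming the message is available as plain text with one `Label: value` pair per line; the function name `parse_event` is hypothetical and is not part of Server Administrator.

```python
SAMPLE = """\
EventID: 1000
Source: Server Administrator
Category: Instrumentation Service
Type: Information
Date and Time: Wed Mar 15 10:38:00 2006
Computer: <computer name>
Description: Server Administrator starting
Data: Bytes in Hex
"""


def parse_event(text: str) -> dict:
    """Split each 'Label: value' line at the first colon-space pair."""
    fields = {}
    for line in text.splitlines():
        label, sep, value = line.partition(": ")
        if sep:  # skip lines that carry no 'Label: value' pair
            fields[label.strip()] = value.strip()
    return fields


if __name__ == "__main__":
    event = parse_event(SAMPLE)
    print(event["EventID"], event["Type"], event["Description"])
```

Splitting on the first colon followed by a space keeps the time stamp intact, since the colons inside `10:38:00` are not followed by a space.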

Viewing Alerts and Event Messages

An event log is used to record information about important events.
Storage Management generates alerts that are added to the Microsoft® Windows® application alert log and to the Server Administrator Alert log. To view these alerts in Server Administrator:
1. Select the System object in the tree view.
2. Select the Logs tab.
3. Select the Alert subtab.
You can also view the event log using your operating system’s event viewer. Each operating system’s event viewer accesses the applicable operating system event log.
The location of the event log file depends on the operating system you are using.
• In the Microsoft Windows 2000 Advanced Server and Windows Server® 2003 operating systems, messages are logged to the system event log and optionally to a Unicode text file, dcsys32.log (viewable using Notepad), that is located in the install_path\omsa\log directory. The default install_path is C:\Program Files\Dell\SysMgt.
• In the Red Hat® Enterprise Linux operating system, messages are logged to the system log file. The default name of the system log file is /var/log/messages. You can view the messages file using a text editor such as vi or emacs.
NOTE: Logging messages to a Unicode text file is optional. By default, the feature is disabled. To enable this feature, modify the Event Manager section of the dcemdy32.ini file as follows:
• In Windows, locate the file at install_path\dataeng\ini and set UnitextLog.enabled=True. The default install_path is C:\Program Files\Dell\SysMgt. Restart the Systems Management Event Manager service.
• In Red Hat Enterprise Linux, locate the file at install_path/dataeng/ini and set UnitextLog.enabled=True. The default install_path is /opt/dell/svradmin. Issue the service dataeng restart command to restart the systems management event manager service. This will also restart the systems management data manager and SNMP services.
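The setting change described in the NOTE can be sketched as a small text transformation. This is an illustration only: the exact layout of dcemdy32.ini is not shown in this guide, so the sketch rewrites an existing UnitextLog.enabled line and adds nothing if the key is missing. The function name and the sample file contents are hypothetical. Back up the real file before editing it, and restart the service as described above.

```python
def enable_unitext_log(ini_text: str) -> str:
    """Return ini_text with any UnitextLog.enabled line set to True.

    Lines that do not define UnitextLog.enabled are passed through unchanged.
    """
    out = []
    for line in ini_text.splitlines():
        key, sep, _ = line.partition("=")
        if sep and key.strip() == "UnitextLog.enabled":
            line = "UnitextLog.enabled=True"
        out.append(line)
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    # Hypothetical file contents, used only to exercise the function.
    before = "[Hypothetical Section]\nUnitextLog.enabled=False\nOther.setting=1\n"
    print(enable_unitext_log(before), end="")
```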
The following subsections explain the procedure to open the Windows 2000 Advanced Server, Windows Server 2003, and Red Hat Enterprise Linux event viewers.
Viewing Events in Windows 2000 and Windows Server 2003
1. Click the Start button, point to Settings, and click Control Panel.
2. Double-click Administrative Tools, and then double-click Event Viewer.
3. In the Event Viewer window, click the Tree tab and then click System Log.
The System Log window displays a list of recently logged events.
4. To view the details of an event, double-click one of the event items.
NOTE: You can also look up the dcsys32.log file, in the install_path\omsa\log directory, to view the separate event log file. The default install_path is C:\Program Files\Dell\SysMgt.
Viewing Events in Red Hat Enterprise Linux
1. Log in as root.
2. Use a text editor such as vi or emacs to view the file named /var/log/messages.
The following example shows the Red Hat Enterprise Linux message log, /var/log/messages. The text in boldface type indicates the message text.
NOTE: These messages are typically displayed as one long line. In the following example, the message is displayed using line breaks to help you see the message text more clearly.
...
Feb 6 14:20:51 server01 Server Administrator: Instrumentation Service EventID: 1000
Server Administrator starting
Feb 6 14:20:51 server01 Server Administrator: Instrumentation Service EventID: 1001
Server Administrator startup complete
Feb 6 14:21:21 server01 Server Administrator: Instrumentation Service EventID: 1254
Chassis intrusion detected Sensor location: Main chassis intrusion Chassis location: Main System Chassis Previous state was: OK (Normal) Chassis intrusion state: Open
Feb 6 14:21:51 server01 Server Administrator: Instrumentation Service EventID: 1252
Chassis intrusion returned to normal Sensor location: Main chassis intrusion Chassis location: Main System Chassis Previous state was: Critical (Failed) Chassis intrusion state: Closed
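Log lines in this shape can be picked out of /var/log/messages with a regular expression. The sketch below assumes each entry is one long line laid out like the example above (time stamp, host name, "Server Administrator:", category, "EventID:", number, message text). The pattern and the function name `parse_line` are illustrative assumptions, not a documented format specification.

```python
import re

# Time stamp, host, source, category, event ID, then the message text.
EVENT_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s+\d\d:\d\d:\d\d)\s+(?P<host>\S+)\s+"
    r"Server Administrator:\s+(?P<category>.+?)\s+EventID:\s+(?P<event_id>\d+)"
    r"\s*(?P<text>.*)$"
)


def parse_line(line: str):
    """Return a dict of fields for a Server Administrator entry, else None."""
    match = EVENT_RE.match(line)
    if match is None:
        return None
    fields = match.groupdict()
    fields["event_id"] = int(fields["event_id"])
    return fields


if __name__ == "__main__":
    line = ("Feb 6 14:20:51 server01 Server Administrator: "
            "Instrumentation Service EventID: 1000 Server Administrator starting")
    print(parse_line(line))
```

Lines written by other programs do not match the pattern and return None, so the function can be applied to every line of the file.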
Viewing the Event Information
The event log for each operating system contains some or all of the following information:
• Date: The date the event occurred.
• Time: The local time the event occurred.
• Type: A classification of the event severity: Information, Warning, or Error.
• User: The name of the user on whose behalf the event occurred.
• Computer: The name of the system where the event occurred.
• Source: The software that logged the event.
• Category: The classification of the event by the event source.
• Event ID: The number identifying the particular event type.
• Description: A description of the event. The format and contents of the event description vary, depending on the event type.
Understanding the Event Description
Table 1-2 lists in alphabetical order each line item that may appear in the event description.
Table 1-2. Event Description Reference
Line item: Action performed was: <Action>
Explanation: Specifies the action that was performed, for example:
    Action performed was: Power cycle
Line item: Action requested was: <Action>
Explanation: Specifies the action that was requested, for example:
    Action requested was: Reboot, shutdown OS first
Line item: Additional Details: <Additional details for the event>
Explanation: Specifies additional details available for the hot plug event, for example:
    Memory device: DIMM1_A Serial number: FFFF30B1
Line item: <Additional power supply status information>
Explanation: Specifies information pertaining to the event, for example:
    Power supply input AC is off, Power supply POK (power OK) signal is not normal, Power supply is turned off
Line item: Chassis intrusion state: <Intrusion state>
Explanation: Specifies the chassis intrusion state (open or closed), for example:
    Chassis intrusion state: Open
Line item: Chassis location: <Name of chassis>
Explanation: Specifies the name of the chassis that generated the message, for example:
    Chassis location: Main System Chassis
Line item: Configuration error type: <type of configuration error>
Explanation: Specifies the type of configuration error that occurred, for example:
    Configuration error type: Revision mismatch
Line item: Current sensor value (in Amps): <Reading>
Explanation: Specifies the current sensor value in amps, for example:
    Current sensor value (in Amps): 7.853
Line item: Date and time of action: <Date and time>
Explanation: Specifies the date and time the action was performed, for example:
    Date and time of action: Tue Mar 21 16:20:33 2006
Line item: Device location: <Location in chassis>
Explanation: Specifies the location of the device in the specified chassis, for example:
    Device location: Memory Card A
Line item: Discrete current state: <State>
Explanation: Specifies the state of the current sensor, for example:
    Discrete current state: Good
Line item: Discrete temperature state: <State>
Explanation: Specifies the state of the temperature sensor, for example:
    Discrete temperature state: Good
Line item: Discrete voltage state: <State>
Explanation: Specifies the state of the voltage sensor, for example:
    Discrete voltage state: Good
Line item: Fan sensor value: <Reading>
Explanation: Specifies the fan speed in revolutions per minute (RPM) or On/Off, for example:
    Fan sensor value (in RPM): 2600
    Fan sensor value: Off
Line item: Log type: <Log type>
Explanation: Specifies the type of hardware log, for example:
    Log type: ESM
Line item: Memory device bank location: <Bank name in chassis>
Explanation: Specifies the name of the memory bank in the system that generated the message, for example:
    Memory device bank location: Bank_1
Line item: Memory device location: <Device name in chassis>
Explanation: Specifies the location of the memory module in the chassis, for example:
    Memory device location: DIMM_A
Line item: Number of devices required for full redundancy: <Number>
Explanation: Specifies the number of power supply or cooling devices required to achieve full redundancy, for example:
    Number of devices required for full redundancy: 4
Line item: Possible memory module event cause: <list of causes>
Explanation: Specifies a list of possible causes for the memory module event, for example:
    Possible memory module event cause: Single bit warning error rate exceeded
    Single bit error logging disabled
Line item: Power Supply type: <type of power supply>
Explanation: Specifies the type of power supply, for example:
    Power Supply type: VRM
Line item: Previous redundancy state was: <State>
Explanation: Specifies the status of the previous redundancy message, for example:
    Previous redundancy state was: Lost
Line item: Previous state was: <State>
Explanation: Specifies the previous state of the sensor, for example:
    Previous state was: OK (Normal)
Line item: Processor sensor status: <status>
Explanation: Specifies the status of the processor sensor, for example:
    Processor sensor status: Configuration error
Line item: Redundancy unit: <Redundancy location in chassis>
Explanation: Specifies the location of the redundant power supply or cooling unit in the chassis, for example:
    Redundancy unit: Fan Enclosure
Line item: Sensor location: <Location in chassis>
Explanation: Specifies the location of the sensor in the specified chassis, for example:
    Sensor location: CPU1
Line item: Temperature sensor value: <Reading>
Explanation: Specifies the temperature in degrees Celsius, for example:
    Temperature sensor value (in degrees Celsius): 30
Line item: Voltage sensor value (in Volts): <Reading>
Explanation: Specifies the voltage sensor value in volts, for example:
    Voltage sensor value (in Volts): 1.693
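Since an event description is a summary followed by labeled line items, a one-line description can be split into its parts by searching for the labels in Table 1-2. The following is a minimal sketch under that assumption. The `LABELS` list covers only some of the line items, the function name `split_description` is hypothetical, and a value that happens to contain label text would be split incorrectly.

```python
import re

# A subset of the line-item labels from Table 1-2 (without the trailing colon).
LABELS = [
    "Action performed was", "Action requested was", "Chassis intrusion state",
    "Chassis location", "Current sensor value (in Amps)", "Date and time of action",
    "Device location", "Previous redundancy state was", "Previous state was",
    "Redundancy unit", "Sensor location", "Voltage sensor value (in Volts)",
]

# Longest labels first, so a longer label is never shadowed by a shorter one.
_LABEL_RE = re.compile(
    "(" + "|".join(re.escape(label + ":")
                   for label in sorted(LABELS, key=len, reverse=True)) + ")"
)


def split_description(description: str):
    """Return (summary, fields) for a one-line event description."""
    parts = _LABEL_RE.split(description)
    summary = parts[0].strip()
    fields = {}
    # After the summary, parts alternate between a label and its value.
    for label, value in zip(parts[1::2], parts[2::2]):
        fields[label.rstrip(":")] = value.strip()
    return summary, fields


if __name__ == "__main__":
    text = ("Chassis intrusion detected Sensor location: Main chassis intrusion "
            "Chassis location: Main System Chassis Previous state was: OK (Normal) "
            "Chassis intrusion state: Open")
    print(split_description(text))
```

Matching is case-sensitive, which is why the lowercase words "chassis intrusion" inside the sensor location value are not mistaken for the "Chassis intrusion state" label.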

Event Message Reference

The following tables list in numerical order each event ID and its corresponding description, along with its severity and cause.
NOTE: For corrective actions, see the appropriate documentation.

Miscellaneous Messages

Miscellaneous messages in Table 2-1 indicate that certain alert systems are up and working.
Table 2-1. Miscellaneous Messages
Event ID: 0000
Description: Log was cleared
Severity: Information
Cause: User cleared the log from Server Administrator.
Event ID: 0001
Description: Log backup created
Severity: Information
Cause: The log was full, copied to backup, and cleared.
Event ID: 1000
Description: Server Administrator starting
Severity: Information
Cause: Server Administrator is beginning to initialize.
Event ID: 1001
Description: Server Administrator startup complete
Severity: Information
Cause: Server Administrator completed its initialization.
Event ID: 1002
Description: A system BIOS update has been scheduled for the next reboot
Severity: Information
Cause: The user has chosen to update the flash basic input/output system (BIOS).
Event ID: 1003
Description: A previously scheduled system BIOS update has been canceled
Severity: Information
Cause: The user has decided to cancel the flash BIOS update, or an error has occurred during the flash.
Event ID: 1004
Description: Thermal shutdown protection has been initiated
Severity: Error
Cause: This message is generated when a system is configured for thermal shutdown due to an error event. If a temperature sensor reading exceeds the error threshold for which the system is configured, the operating system shuts down and the system powers off. This event may also be initiated on certain systems when a fan enclosure is removed from the system for an extended period of time.
Event ID: 1005
Description: SMBIOS data is absent
Severity: Warning
Cause: The system management BIOS does not contain the required systems management BIOS version 2.2 or higher, or the BIOS is corrupted.
Event ID: 1006
Description: Automatic System Recovery (ASR) action was performed
    Action performed was: <Action>
    Date and time of action: <Date and time>
Severity: Error
Cause: This message is generated when an automatic system recovery action is performed due to a non-responsive operating system. The action performed and the time of action are provided.
Event ID: 1007
Description: User initiated host system control action
    Action requested was: <Action>
Severity: Information
Cause: User requested a host system control action to reboot, power off, or power cycle the system. Alternatively, the user had indicated protective measures to be initiated in the event of a thermal shutdown.
Event ID: 1008
Description: Systems Management Data Manager Started
Severity: Information
Cause: Systems Management Data Manager services were started.
Event ID: 1009
Description: Systems Management Data Manager Stopped
Severity: Information
Cause: Systems Management Data Manager services were stopped.

Temperature Sensor Messages

Temperature sensors listed in Table 2-2 help protect critical components by alerting the systems management console when temperatures become too high inside a chassis. The temperature sensor messages use additional variables: sensor location, chassis location, previous state, and temperature sensor value or state.
Table 2-2. Temperature Sensor Messages
Event ID: 1050
Description: Temperature sensor has failed
    Sensor location: <Location in chassis>
    Chassis location: <Name of chassis>
    Previous state was: <State>
    If sensor type is not discrete:
    Temperature sensor value (in degrees Celsius): <Reading>
    If sensor type is discrete:
    Discrete temperature state: <State>
Severity: Information
Cause: A temperature sensor on the backplane board, system board, or the carrier in the specified system failed. The sensor location, chassis location, previous state, and temperature sensor value are provided.
Event ID: 1051
Description: Temperature sensor value unknown
    Sensor location: <Location in chassis>
    Chassis location: <Name of chassis>
    If sensor type is not discrete:
    Temperature sensor value (in degrees Celsius): <Reading>
    If sensor type is discrete:
    Discrete temperature state: <State>
Severity: Information
Cause: A temperature sensor on the backplane board, system board, or drive carrier in the specified system could not obtain a reading. The sensor location, chassis location, previous state, and a nominal temperature sensor value are provided.
Event ID: 1052
Description: Temperature sensor returned to a normal value
    Sensor location: <Location in chassis>
    Chassis location: <Name of chassis>
    Previous state was: <State>
    If sensor type is not discrete:
    Temperature sensor value (in degrees Celsius): <Reading>
    If sensor type is discrete:
    Discrete temperature state: <State>
Severity: Information
Cause: A temperature sensor on the backplane board, system board, or drive carrier in the specified system returned to a valid range after crossing a failure threshold. The sensor location, chassis location, previous state, and temperature sensor value are provided.
Event ID: 1053
Description: Temperature sensor detected a warning value
    Sensor location: <Location in chassis>
    Chassis location: <Name of chassis>
    Previous state was: <State>
    If sensor type is not discrete:
    Temperature sensor value (in degrees Celsius): <Reading>
    If sensor type is discrete:
    Discrete temperature state: <State>
Severity: Warning
Cause: A temperature sensor on the backplane board, system board, or drive carrier in the specified system exceeded its warning threshold. The sensor location, chassis location, previous state, and temperature sensor value are provided.
Event ID: 1054
Description: Temperature sensor detected a failure value
    Sensor location: <Location in chassis>
    Chassis location: <Name of chassis>
    Previous state was: <State>
    If sensor type is not discrete:
    Temperature sensor value (in degrees Celsius): <Reading>
    If sensor type is discrete:
    Discrete temperature state: <State>
Severity: Error
Cause: A temperature sensor on the backplane board, system board, or drive carrier in the specified system exceeded its failure threshold. The sensor location, chassis location, previous state, and temperature sensor value are provided.
Event ID: 1055
Description: Temperature sensor detected a non-recoverable value
    Sensor location: <Location in chassis>
    Chassis location: <Name of chassis>
    Previous state was: <State>
    If sensor type is not discrete:
    Temperature sensor value (in degrees Celsius): <Reading>
    If sensor type is discrete:
    Discrete temperature state: <State>
Severity: Error
Cause: A temperature sensor on the backplane board, system board, or drive carrier in the specified system detected an error from which it cannot recover. The sensor location, chassis location, previous state, and temperature sensor value are provided.
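The relationship between thresholds and event IDs 1052 through 1054 can be pictured with a small sketch. It is a simplified illustration, not Server Administrator's implementation. It considers only upper thresholds, the threshold values in the usage example are invented, and the function name `temperature_event` and the state names are assumptions. An event is reported only when the state changes, which matches the guide's statement that events are created in response to status changes.

```python
# Event ID and severity for each state, as listed in Table 2-2.
EVENTS = {
    "Normal": (1052, "Information"),
    "Warning": (1053, "Warning"),
    "Failure": (1054, "Error"),
}


def temperature_event(reading, warning, failure, previous_state):
    """Return (event_id, severity) if the state changed, otherwise None.

    reading, warning, failure: values in degrees Celsius (upper thresholds only)
    previous_state: 'Normal', 'Warning', or 'Failure'
    """
    if reading > failure:
        state = "Failure"
    elif reading > warning:
        state = "Warning"
    else:
        state = "Normal"
    if state == previous_state:
        return None  # no status change, so no event
    return EVENTS[state]


if __name__ == "__main__":
    # Hypothetical thresholds: warning at 40, failure at 50 degrees Celsius.
    print(temperature_event(45, 40, 50, "Normal"))
    print(temperature_event(30, 40, 50, "Failure"))
```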

Cooling Device Messages

Cooling device sensors listed in Table 2-3 monitor how well a fan is functioning. Cooling device messages provide status and warning information for fans in a particular chassis.
Table 2-3. Cooling Device Messages
Event ID: 1100
Description: Fan sensor has failed
    Sensor location: <Location in chassis>
    Chassis location: <Name of chassis>
    Previous state was: <State>
    Fan sensor value: <Reading>
Severity: Information
Cause: A fan sensor in the specified system is not functioning. The sensor location, chassis location, previous state, and fan sensor value are provided.
Event ID: 1101
Description: Fan sensor value unknown
    Sensor location: <Location in chassis>
    Chassis location: <Name of chassis>
    Previous state was: <State>
    Fan sensor value: <Reading>
Severity: Information
Cause: A fan sensor in the specified system could not obtain a reading. The sensor location, chassis location, previous state, and a nominal fan sensor value are provided.
Event ID: 1102
Description: Fan sensor returned to a normal value
    Sensor location: <Location in chassis>
    Chassis location: <Name of chassis>
    Previous state was: <State>
    Fan sensor value: <Reading>
Severity: Information
Cause: A fan sensor reading on the specified system returned to a valid range after crossing a warning threshold. The sensor location, chassis location, previous state, and fan sensor value are provided.
Event ID: 1103
Description: Fan sensor detected a warning value
    Sensor location: <Location in chassis>
    Chassis location: <Name of chassis>
    Previous state was: <State>
    Fan sensor value: <Reading>
Severity: Warning
Cause: A fan sensor reading in the specified system exceeded a warning threshold. The sensor location, chassis location, previous state, and fan sensor value are provided.
Event ID: 1104
Description: Fan sensor detected a failure value
    Sensor location: <Location in chassis>
    Chassis location: <Name of chassis>
    Previous state was: <State>
    Fan sensor value: <Reading>
Severity: Error
Cause: A fan sensor in the specified system detected the failure of one or more fans. The sensor location, chassis location, previous state, and fan sensor value are provided.
Event ID: 1105
Description: Fan sensor detected a non-recoverable value
    Sensor location: <Location in chassis>
    Chassis location: <Name of chassis>
    Previous state was: <State>
    Fan sensor value: <Reading>
Severity: Error
Cause: A fan sensor detected an error from which it cannot recover. The sensor location, chassis location, previous state, and fan sensor value are provided.

Voltage Sensor Messages

Voltage sensors listed in Table 2-4 monitor the number of volts across critical components. Voltage sensor messages provide status and warning information for voltage sensors in a particular chassis.
Table 2-4. Voltage Sensor Messages
Event ID: 1150
Description: Voltage sensor has failed
    Sensor location: <Location in chassis>
    Chassis location: <Name of chassis>
    Previous state was: <State>
    If sensor type is not discrete:
    Voltage sensor value (in Volts): <Reading>
    If sensor type is discrete:
    Discrete voltage state: <State>
Severity: Information
Cause: A voltage sensor in the specified system failed. The sensor location, chassis location, previous state, and voltage sensor value are provided.
Event ID: 1151
Description: Voltage sensor value unknown
    Sensor location: <Location in chassis>
    Chassis location: <Name of chassis>
    Previous state was: <State>
    If sensor type is not discrete:
    Voltage sensor value (in Volts): <Reading>
    If sensor type is discrete:
    Discrete voltage state: <State>
Severity: Information
Cause: A voltage sensor in the specified system could not obtain a reading. The sensor location, chassis location, previous state, and a nominal voltage sensor value are provided.
Event ID: 1152
Description: Voltage sensor returned to a normal value
    Sensor location: <Location in chassis>
    Chassis location: <Name of chassis>
    Previous state was: <State>
    If sensor type is not discrete:
    Voltage sensor value (in Volts): <Reading>
    If sensor type is discrete:
    Discrete voltage state: <State>
Severity: Information
Cause: A voltage sensor in the specified system returned to a valid range after crossing a failure threshold. The sensor location, chassis location, previous state, and voltage sensor value are provided.
Event ID: 1153
Description: Voltage sensor detected a warning value
    Sensor location: <Location in chassis>
    Chassis location: <Name of chassis>
    Previous state was: <State>
    If sensor type is not discrete:
    Voltage sensor value (in Volts): <Reading>
    If sensor type is discrete:
    Discrete voltage state: <State>
Severity: Warning
Cause: A voltage sensor in the specified system exceeded its warning threshold. The sensor location, chassis location, previous state, and voltage sensor value are provided.
Event ID: 1154
Description: Voltage sensor detected a failure value
    Sensor location: <Location in chassis>
    Chassis location: <Name of chassis>
    Previous state was: <State>
    If sensor type is not discrete:
    Voltage sensor value (in Volts): <Reading>
    If sensor type is discrete:
    Discrete voltage state: <State>
Severity: Error
Cause: A voltage sensor in the specified system exceeded its failure threshold. The sensor location, chassis location, previous state, and voltage sensor value are provided.
Event ID: 1155
Description: Voltage sensor detected a non-recoverable value
    Sensor location: <Location in chassis>
    Chassis location: <Name of chassis>
    Previous state was: <State>
    If sensor type is not discrete:
    Voltage sensor value (in Volts): <Reading>
    If sensor type is discrete:
    Discrete voltage state: <State>
Severity: Error
Cause: A voltage sensor in the specified system detected an error from which it cannot recover. The sensor location, chassis location, previous state, and voltage sensor value are provided.
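The temperature, fan, and voltage tables follow the same numbering pattern: each sensor family starts at a multiple of 50, and the last digit selects the condition and severity. The sketch below encodes only what Tables 2-2 through 2-4 list. The names `FAMILIES`, `CONDITIONS`, and `describe` are illustrative assumptions, and event IDs outside these three tables return None rather than a guess.

```python
# Base event ID for each sensor family in Tables 2-2 through 2-4.
FAMILIES = {1050: "Temperature sensor", 1100: "Fan sensor", 1150: "Voltage sensor"}

# Offset from the base ID -> (description suffix, severity).
CONDITIONS = {
    0: ("has failed", "Information"),
    1: ("value unknown", "Information"),
    2: ("returned to a normal value", "Information"),
    3: ("detected a warning value", "Warning"),
    4: ("detected a failure value", "Error"),
    5: ("detected a non-recoverable value", "Error"),
}


def describe(event_id: int):
    """Return (description, severity) for a listed event ID, else None."""
    offset = event_id % 50
    base = event_id - offset
    if base not in FAMILIES or offset not in CONDITIONS:
        return None
    text, severity = CONDITIONS[offset]
    return f"{FAMILIES[base]} {text}", severity


if __name__ == "__main__":
    for event_id in (1053, 1104, 1155, 1000):
        print(event_id, describe(event_id))
```

A lookup like this is convenient for filtering a log by severity, for example to list only Warning and Error events for a given sensor family.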

Current Sensor Messages

Current sensors listed in Table 2-5 measure the amount of current (in amperes) that is traversing critical components. Current sensor messages provide status and warning information for current sensors in a particular chassis.
Table 2-5. Current Sensor Messages
Event ID: 1200
Description: Current sensor has failed
    Sensor location: <Location in chassis>
    Chassis location: <Name of chassis>
    Previous state was: <State>
    If sensor type is not discrete:
    Current sensor value (in Amps): <Reading>
    If sensor type is discrete:
    Discrete current state: <State>
Severity: Information
Cause: A current sensor on the power supply for the specified system failed. The sensor location, chassis location, previous state, and current sensor value are provided.
Event ID: 1201
Description: Current sensor value unknown
    Sensor location: <Location in chassis>
    Chassis location: <Name of chassis>
    Previous state was: <State>
    If sensor type is not discrete:
    Current sensor value (in Amps): <Reading>
    If sensor type is discrete:
    Discrete current state: <State>
Severity: Information
Cause: A current sensor on the power supply for the specified system could not obtain a reading. The sensor location, chassis location, previous state, and a nominal current sensor value are provided.
Event ID: 1202
Description: Current sensor returned to a normal value
    Sensor location: <Location in chassis>
    Chassis location: <Name of chassis>
    Previous state was: <State>
    If sensor type is not discrete:
    Current sensor value (in Amps): <Reading>
    If sensor type is discrete:
    Discrete current state: <State>
Severity: Information
Cause: A current sensor on the power supply for the specified system returned to a valid range after crossing a failure threshold. The sensor location, chassis location, previous state, and current sensor value are provided.
Event ID: 1203
Description: Current sensor detected a warning value
    Sensor location: <Location in chassis>
    Chassis location: <Name of chassis>
    Previous state was: <State>
    If sensor type is not discrete:
    Current sensor value (in Amps): <Reading>
    If sensor type is discrete:
    Discrete current state: <State>
Severity: Warning
Cause: A current sensor on the power supply for the specified system exceeded its warning threshold. The sensor location, chassis location, previous state, and current sensor value are provided.
Table 2-5. Current Sensor Messages (continued)
Event ID Description Severity Cause
1204 Current sensor detected a failure
value
Sensor location:
Chassis location:
Previous state was:
If sensor type is not discrete:
Current sensor value (in Amps):
<Location in chassis>
<Name of chassis>
<State>
Error A current sensor on the power
supply for the specified system exceeded its failure threshold. The sensor location, chassis location, previous state, and current sensor value are provided.
<Reading>
If sensor type is discrete:
Discrete current state:
1205 Current sensor detected a
non-recoverable value
Sensor location:
Chassis location:
Previous state was:
If sensor type is not discrete:
Current sensor value (in Amps):
<Location in chassis>
<Name of chassis>
<State>
<State>
Error A current sensor in the specified
system detected an error from which it cannot recover. The sensor location, chassis location, previous state, and current sensor value are provided.
<Reading>
If sensor type is discrete:
Discrete current state:
<State>
24 Event Message Reference
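
The event IDs, descriptions, and severities in Table 2-5 can be held in a small lookup table when filtering alerts. The following is only a sketch: the names `CURRENT_SENSOR_EVENTS`, `describe_current_sensor_event`, and `needs_attention` are hypothetical and are not part of Server Administrator.

```python
# Event IDs, descriptions, and severities copied from Table 2-5.
CURRENT_SENSOR_EVENTS = {
    1200: ("Current sensor has failed", "Information"),
    1201: ("Current sensor value unknown", "Information"),
    1202: ("Current sensor returned to a normal value", "Information"),
    1203: ("Current sensor detected a warning value", "Warning"),
    1204: ("Current sensor detected a failure value", "Error"),
    1205: ("Current sensor detected a non-recoverable value", "Error"),
}


def describe_current_sensor_event(event_id):
    """Return (description, severity) for a Table 2-5 event ID.

    Hypothetical helper: raises KeyError for IDs outside 1200-1205.
    """
    return CURRENT_SENSOR_EVENTS[event_id]


def needs_attention(event_id):
    """Return True when the severity is Warning or Error."""
    _, severity = describe_current_sensor_event(event_id)
    return severity in ("Warning", "Error")
```

A monitoring script might call `needs_attention(1204)` to decide whether to raise a ticket, while treating 1200 through 1202 as informational.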

Chassis Intrusion Messages

Chassis intrusion messages listed in Table 2-6 are a security measure. Chassis intrusion means that someone is opening the cover to a system’s chassis. Alerts are sent to prevent unauthorized removal of parts from a chassis.
Table 2-6. Chassis Intrusion Messages

Event ID: 1250
Description: Chassis intrusion sensor has failed
    Sensor location: <Location in chassis>
    Chassis location: <Name of chassis>
    Previous state was: <State>
    Chassis intrusion state: <Intrusion state>
Severity: Information
Cause: A chassis intrusion sensor in the specified system failed. The sensor location, chassis location, previous state, and chassis intrusion state are provided.

Event ID: 1251
Description: Chassis intrusion sensor value unknown
    Sensor location: <Location in chassis>
    Chassis location: <Name of chassis>
    Previous state was: <State>
    Chassis intrusion state: <Intrusion state>
Severity: Information
Cause: A chassis intrusion sensor in the specified system could not obtain a reading. The sensor location, chassis location, previous state, and chassis intrusion state are provided.

Event ID: 1252
Description: Chassis intrusion returned to normal
    Sensor location: <Location in chassis>
    Chassis location: <Name of chassis>
    Previous state was: <State>
    Chassis intrusion state: <Intrusion state>
Severity: Information
Cause: A chassis intrusion sensor in the specified system detected that a cover was opened while the system was operating but has since been replaced. The sensor location, chassis location, previous state, and chassis intrusion state are provided.

Event ID: 1253
Description: Chassis intrusion in progress
    Sensor location: <Location in chassis>
    Chassis location: <Name of chassis>
    Previous state was: <State>
    Chassis intrusion state: <Intrusion state>
Severity: Warning
Cause: A chassis intrusion sensor in the specified system detected that a system cover is currently being opened and the system is operating. The sensor location, chassis location, previous state, and chassis intrusion state are provided.

Event ID: 1254
Description: Chassis intrusion detected
    Sensor location: <Location in chassis>
    Chassis location: <Name of chassis>
    Previous state was: <State>
    Chassis intrusion state: <Intrusion state>
Severity: Error
Cause: A chassis intrusion sensor in the specified system detected that the system cover was opened while the system was operating. The sensor location, chassis location, previous state, and chassis intrusion state are provided.

Event ID: 1255
Description: Chassis intrusion sensor detected a non-recoverable value
    Sensor location: <Location in chassis>
    Chassis location: <Name of chassis>
    Previous state was: <State>
    Chassis intrusion state: <Intrusion state>
Severity: Error
Cause: A chassis intrusion sensor in the specified system detected an error from which it cannot recover. The sensor location, chassis location, previous state, and chassis intrusion state are provided.

Redundancy Unit Messages

Redundancy means that a system chassis has more than one of certain critical components. Fans and power supplies, for example, are so important for preventing damage or disruption of a system that a chassis may have “extra” fans or power supplies installed. Redundancy allows a second or nth fan to keep the chassis components at a safe temperature when the primary fan has failed. Redundancy is normal when the intended number of critical components are operating. Redundancy is degraded when a component fails but others are still operating. Redundancy is lost when the number of components functioning falls below the redundancy threshold.
The number of devices required for full redundancy is provided as part of the message when applicable for the redundancy unit and the platform. For details on redundancy computation, see the respective platform documentation.
Table 2-7 lists the redundancy unit messages.
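
The normal, degraded, and lost states described above can be sketched as a simple classification. This is an illustration only: the function name and its parameters (`functioning`, `intended`, `required`) are assumptions, and the actual redundancy computation is platform specific.

```python
def redundancy_state(functioning, intended, required):
    """Classify a redundancy unit as normal, degraded, or lost.

    functioning: number of components currently operating
    intended:    number of components expected for full redundancy
    required:    assumed name for the redundancy threshold, the count
                 below which redundancy is considered lost
    """
    if functioning >= intended:
        return "normal"      # the intended number of components are operating
    if functioning >= required:
        return "degraded"    # a component failed but others still operate
    return "lost"            # fewer components than the threshold
```

For example, with three fans installed and a threshold of two, losing one fan yields `degraded` and losing two yields `lost`.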
Table 2-7. Redundancy Unit Messages

Event ID: 1300
Description: Redundancy sensor has failed
    Redundancy unit: <Redundancy location in chassis>
    Chassis location: <Name of chassis>
    Previous redundancy state was: <State>
Severity: Information
Cause: A redundancy sensor in the specified system failed. The redundancy unit location, chassis location, previous redundancy state, and the number of devices required for full redundancy are provided.

Event ID: 1301
Description: Redundancy sensor value unknown
    Redundancy unit: <Redundancy location in chassis>
    Chassis location: <Name of chassis>
    Previous redundancy state was: <State>
Severity: Information
Cause: A redundancy sensor in the specified system could not obtain a reading. The redundancy unit location, chassis location, previous redundancy state, and the number of devices required for full redundancy are provided.

Event ID: 1302
Description: Redundancy not applicable
    Redundancy unit: <Redundancy location in chassis>
    Chassis location: <Name of chassis>
    Previous redundancy state was: <State>
Severity: Information
Cause: A redundancy sensor in the specified system detected that a unit was not redundant. The redundancy location, chassis location, previous redundancy state, and the number of devices required for full redundancy are provided.

Event ID: 1303
Description: Redundancy is offline
    Redundancy unit: <Redundancy location in chassis>
    Chassis location: <Name of chassis>
    Previous redundancy state was: <State>
Severity: Information
Cause: A redundancy sensor in the specified system detected that a redundant unit is offline. The redundancy unit location, chassis location, previous redundancy state, and the number of devices required for full redundancy are provided.

Event ID: 1304
Description: Redundancy regained
    Redundancy unit: <Redundancy location in chassis>
    Chassis location: <Name of chassis>
    Previous redundancy state was: <State>
Severity: Information
Cause: A redundancy sensor in the specified system detected that a “lost” redundancy device has been reconnected or replaced; full redundancy is in effect. The redundancy unit location, chassis location, previous redundancy state, and the number of devices required for full redundancy are provided.

Event ID: 1305
Description: Redundancy degraded
    Redundancy unit: <Redundancy location in chassis>
    Chassis location: <Name of chassis>
    Previous redundancy state was: <State>
Severity: Warning
Cause: A redundancy sensor in the specified system detected that one of the components of the redundancy unit has failed but the unit is still redundant. The redundancy unit location, chassis location, previous redundancy state, and the number of devices required for full redundancy are provided.

Event ID: 1306
Description: Redundancy lost
    Redundancy unit: <Redundancy location in chassis>
    Chassis location: <Name of chassis>
    Previous redundancy state was: <State>
Severity: Warning or Error (depending on the number of units that are functional)
Cause: A redundancy sensor in the specified system detected that one of the components in the redundant unit has been disconnected, has failed, or is not present. The redundancy unit location, chassis location, previous redundancy state, and the number of devices required for full redundancy are provided.

Power Supply Messages

Power supply sensors monitor how well a power supply is functioning. Power supply messages listed in Table 2-8 provide status and warning information for power supplies present in a particular chassis.
Table 2-8. Power Supply Messages

Event ID: 1350
Description: Power supply sensor has failed
    Sensor location: <Location in chassis>
    Chassis location: <Name of chassis>
    Previous state was: <State>
    Power Supply type: <type of power supply>
    <Additional power supply status information>
    If in configuration error state:
    Configuration error type: <type of configuration error>
Severity: Information
Cause: A power supply sensor in the specified system failed. The sensor location, chassis location, previous state, and additional power supply status information are provided.

Event ID: 1351
Description: Power supply sensor value unknown
    Sensor location: <Location in chassis>
    Chassis location: <Name of chassis>
    Previous state was: <State>
    Power Supply type: <type of power supply>
    <Additional power supply status information>
    If in configuration error state:
    Configuration error type: <type of configuration error>
Severity: Information
Cause: A power supply sensor in the specified system could not obtain a reading. The sensor location, chassis location, previous state, and additional power supply status information are provided.

Event ID: 1352
Description: Power supply returned to normal
    Sensor location: <Location in chassis>
    Chassis location: <Name of chassis>
    Previous state was: <State>
    Power Supply type: <type of power supply>
    <Additional power supply status information>
    If in configuration error state:
    Configuration error type: <type of configuration error>
Severity: Information
Cause: A power supply has been reconnected or replaced. The sensor location, chassis location, previous state, and additional power supply status information are provided.

Event ID: 1353
Description: Power supply detected a warning
    Sensor location: <Location in chassis>
    Chassis location: <Name of chassis>
    Previous state was: <State>
    Power Supply type: <type of power supply>
    <Additional power supply status information>
    If in configuration error state:
    Configuration error type: <type of configuration error>
Severity: Warning
Cause: A power supply sensor reading in the specified system exceeded a user-definable warning threshold. The sensor location, chassis location, previous state, and additional power supply status information are provided.

Event ID: 1354
Description: Power supply detected a failure
    Sensor location: <Location in chassis>
    Chassis location: <Name of chassis>
    Previous state was: <State>
    Power Supply type: <type of power supply>
    <Additional power supply status information>
    If in configuration error state:
    Configuration error type: <type of configuration error>
Severity: Error
Cause: A power supply has been disconnected or has failed. The sensor location, chassis location, previous state, and additional power supply status information are provided.

Event ID: 1355
Description: Power supply sensor detected a non-recoverable value
    Sensor location: <Location in chassis>
    Chassis location: <Name of chassis>
    Previous state was: <State>
    Power Supply type: <type of power supply>
    <Additional power supply status information>
    If in configuration error state:
    Configuration error type: <type of configuration error>
Severity: Error
Cause: A power supply sensor in the specified system detected an error from which it cannot recover. The sensor location, chassis location, previous state, and additional power supply status information are provided.

Memory Device Messages

Memory device messages listed in Table 2-9 provide status and warning information for memory modules present in a particular system. Memory devices determine health status by monitoring the ECC memory correction rate and the type of memory events that have occurred.
NOTE: A critical status does not always indicate a system failure or loss of data. In some instances, the system has exceeded the ECC correction rate. Although the system continues to function, you should perform system maintenance as described in Table 2-9.
NOTE: In Table 2-9, <status> can be either critical or non-critical.
Table 2-9. Memory Device Messages

Event ID: 1403
Description: Memory device status is <status>
    Memory device location: <location in chassis>
    Possible memory module event cause: <list of causes>
Severity: Warning
Cause: A memory device correction rate exceeded an acceptable value. The memory device status and location are provided.

Event ID: 1404
Description: Memory device status is <status>
    Memory device location: <location in chassis>
    Possible memory module event cause: <list of causes>
Severity: Error
Cause: A memory device correction rate exceeded an acceptable value, a memory spare bank was activated, or a multibit ECC error occurred. The system continues to function normally (except for a multibit error). Replace the memory module identified in the message during the system’s next scheduled maintenance. Clear the memory error on multibit ECC error. The memory device status and location are provided.

Fan Enclosure Messages

Some systems are equipped with a protective enclosure for fans. Fan enclosure messages listed in Table 2-10 monitor whether foreign objects are present in an enclosure and how long a fan enclosure is missing from a chassis.
Table 2-10. Fan Enclosure Messages

Event ID: 1450
Description: Fan enclosure sensor has failed
    Sensor location: <Location in chassis>
    Chassis location: <Name of chassis>
Severity: Information
Cause: The fan enclosure sensor in the specified system failed. The sensor location and chassis location are provided.

Event ID: 1451
Description: Fan enclosure sensor value unknown
    Sensor location: <Location in chassis>
    Chassis location: <Name of chassis>
Severity: Information
Cause: The fan enclosure sensor in the specified system could not obtain a reading. The sensor location and chassis location are provided.

Event ID: 1452
Description: Fan enclosure inserted into system
    Sensor location: <Location in chassis>
    Chassis location: <Name of chassis>
Severity: Information
Cause: A fan enclosure has been inserted into the specified system. The sensor location and chassis location are provided.

Event ID: 1453
Description: Fan enclosure removed from system
    Sensor location: <Location in chassis>
    Chassis location: <Name of chassis>
Severity: Warning
Cause: A fan enclosure has been removed from the specified system. The sensor location and chassis location are provided.

Event ID: 1454
Description: Fan enclosure removed from system for an extended amount of time
    Sensor location: <Location in chassis>
    Chassis location: <Name of chassis>
Severity: Error
Cause: A fan enclosure has been removed from the specified system for a user-definable length of time. The sensor location and chassis location are provided.

Event ID: 1455
Description: Fan enclosure sensor detected a non-recoverable value
    Sensor location: <Location in chassis>
    Chassis location: <Name of chassis>
Severity: Error
Cause: A fan enclosure sensor in the specified system detected an error from which it cannot recover. The sensor location and chassis location are provided.

AC Power Cord Messages

AC power cord messages listed in Table 2-11 provide status and warning information for power cords that are part of an AC power switch, if your system supports AC switching.
Table 2-11. AC Power Cord Messages

Event ID: 1500
Description: AC power cord sensor has failed
    Sensor location: <Location in chassis>
    Chassis location: <Name of chassis>
Severity: Information
Cause: An AC power cord sensor in the specified system failed. The AC power cord status cannot be monitored. The sensor location and chassis location information are provided.

Event ID: 1501
Description: AC power cord is not being monitored
    Sensor location: <Location in chassis>
    Chassis location: <Name of chassis>
Severity: Information
Cause: The AC power cord status is not being monitored. This occurs when a system’s expected AC power configuration is set to nonredundant. The sensor location and chassis location information are provided.

Event ID: 1502
Description: AC power has been restored
    Sensor location: <Location in chassis>
    Chassis location: <Name of chassis>
Severity: Information
Cause: An AC power cord that did not have AC power has had the power restored. The sensor location and chassis location information are provided.

Event ID: 1503
Description: AC power has been lost
    Sensor location: <Location in chassis>
    Chassis location: <Name of chassis>
Severity: Warning
Cause: An AC power cord has lost its power, but there is sufficient redundancy to classify this as a warning. The sensor location and chassis location information are provided.

Event ID: 1504
Description: AC power has been lost
    Sensor location: <Location in chassis>
    Chassis location: <Name of chassis>
Severity: Error
Cause: An AC power cord has lost its power, and lack of redundancy requires this to be classified as an error. The sensor location and chassis location information are provided.

Event ID: 1505
Description: AC power has been lost
    Sensor location: <Location in chassis>
    Chassis location: <Name of chassis>
Severity: Error
Cause: An AC power cord sensor in the specified system failed. The AC power cord status cannot be monitored. The sensor location and chassis location information are provided.

Hardware Log Sensor Messages

Hardware logs provide hardware status messages to systems management software. On certain systems, the hardware log is implemented as a circular queue. When the log becomes full, the oldest status messages are overwritten when new status messages are logged. On some systems, the log is not circular. On these systems, when the log becomes full, subsequent hardware status messages are lost. Hardware log sensor messages listed in Table 2-12 provide status and warning information about the noncircular logs that may fill up, resulting in lost status messages.
Table 2-12. Hardware Log Sensor Messages

Event ID: 1550
Description: Log monitoring has been disabled
    Log type: <Log type>
Severity: Information
Cause: A hardware log sensor in the specified system is disabled. The log type information is provided.

Event ID: 1551
Description: Log status is unknown
    Log type: <Log type>
Severity: Information
Cause: A hardware log sensor in the specified system could not obtain a reading. The log type information is provided.

Event ID: 1552
Description: Log size is no longer near or at capacity
    Log type: <Log type>
Severity: Information
Cause: The hardware log on the specified system is no longer near or at its capacity, usually as the result of clearing the log. The log type information is provided.

Event ID: 1553
Description: Log size is near or at capacity
    Log type: <Log type>
Severity: Warning
Cause: The size of a hardware log on the specified system is near or at the capacity of the hardware log. The log type information is provided.

Event ID: 1554
Description: Log size is full
    Log type: <Log type>
Severity: Error
Cause: The size of a hardware log on the specified system is full. The log type information is provided.

Event ID: 1555
Description: Log sensor has failed
    Log type: <Log type>
Severity: Error
Cause: A hardware log sensor in the specified system failed. The hardware log status cannot be monitored. The log type information is provided.
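
The difference between circular and noncircular hardware logs described at the start of this section can be illustrated with two small classes. This is a conceptual sketch only: the class names `CircularLog` and `NonCircularLog` are hypothetical, and the sketch does not reflect how any particular system stores its log.

```python
from collections import deque


class CircularLog:
    """Oldest entries are overwritten once capacity is reached."""

    def __init__(self, capacity):
        self.entries = deque(maxlen=capacity)

    def append(self, message):
        self.entries.append(message)  # deque drops the oldest item when full
        return True


class NonCircularLog:
    """New entries are lost once capacity is reached."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = []

    def append(self, message):
        if self.is_full():
            return False  # log is full; the new message is dropped
        self.entries.append(message)
        return True

    def is_full(self):
        return len(self.entries) >= self.capacity
```

Appending three messages to each log with a capacity of two leaves the circular log holding the two newest messages and the noncircular log holding the two oldest, which is why only the noncircular case needs the capacity warnings in Table 2-12.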

Processor Sensor Messages

Processor sensors monitor how well a processor is functioning. Processor messages listed in Table 2-13 provide status and warning information for processors in a particular chassis.
Table 2-13. Processor Sensor Messages

Event ID: 1600
Description: Processor sensor has failed
    Sensor location: <Location in chassis>
    Chassis location: <Name of chassis>
    Previous state was: <State>
    Processor sensor status: <status>
Severity: Information
Cause: A processor sensor in the specified system is not functioning. The sensor location, chassis location, previous state and processor sensor status are provided.

Event ID: 1601
Description: Processor sensor value unknown
    Sensor location: <Location in chassis>
    Chassis location: <Name of chassis>
    Previous state was: <State>
    Processor sensor status: <status>
Severity: Information
Cause: A processor sensor in the specified system could not obtain a reading. The sensor location, chassis location, previous state and processor sensor status are provided.

Event ID: 1602
Description: Processor sensor returned to a normal value
    Sensor location: <Location in chassis>
    Chassis location: <Name of chassis>
    Previous state was: <State>
    Processor sensor status: <status>
Severity: Information
Cause: A processor sensor in the specified system transitioned back to a normal state. The sensor location, chassis location, previous state and processor sensor status are provided.

Event ID: 1603
Description: Processor sensor detected a warning value
    Sensor location: <Location in chassis>
    Chassis location: <Name of chassis>
    Previous state was: <State>
    Processor sensor status: <status>
Severity: Warning
Cause: A processor sensor in the specified system is in a throttled state. The sensor location, chassis location, previous state and processor sensor status are provided.

Event ID: 1604
Description: Processor sensor detected a failure value
    Sensor location: <Location in chassis>
    Chassis location: <Name of chassis>
    Previous state was: <State>
    Processor sensor status: <status>
Severity: Error
Cause: A processor sensor in the specified system is disabled, has a configuration error, or experienced a thermal trip. The sensor location, chassis location, previous state and processor sensor status are provided.

Event ID: 1605
Description: Processor sensor detected a non-recoverable value
    Sensor location: <Location in chassis>
    Chassis location: <Name of chassis>
    Previous state was: <State>
    Processor sensor status: <status>
Severity: Error
Cause: A processor sensor in the specified system has failed. The sensor location, chassis location, previous state and processor sensor status are provided.

Pluggable Device Messages

The pluggable device messages listed in Table 2-14 provide status and error information when some devices, such as memory cards, are added or removed.
Table 2-14. Pluggable Device Messages

Event ID: 1650
Description: <Device plug event type unknown>
    Device location: <Location in chassis, if available>
    Chassis location: <Name of chassis, if available>
    Additional details: <Additional details for the events, if available>
Severity: Information
Cause: A pluggable device event message of unknown type was received. The device location, chassis location, and additional event details, if available, are provided.

Event ID: 1651
Description: Device added to system
    Device location: <Location in chassis>
    Chassis location: <Name of chassis>
    Additional details: <Additional details for the events>
Severity: Information
Cause: A device was added in the specified system. The device location, chassis location, and additional event details, if available, are provided.

Event ID: 1652
Description: Device removed from system
    Device location: <Location in chassis>
    Chassis location: <Name of chassis>
    Additional details: <Additional details for the events>
Severity: Information
Cause: A device was removed from the specified system. The device location, chassis location, and additional event details, if available, are provided.

Event ID: 1653
Description: Device configuration error detected
    Device location: <Location in chassis>
    Chassis location: <Name of chassis>
    Additional details: <Additional details for the events>
Severity: Error
Cause: A configuration error was detected for a pluggable device in the specified system. The device may have been added to the system incorrectly.

System Event Log Messages for IPMI Systems

The following tables list the system event log (SEL) messages, their severity, and cause.
NOTE: For corrective actions, see the appropriate documentation.

Temperature Sensor Events

The temperature sensor event messages help protect critical components by alerting the systems management console when the temperature rises inside the chassis. These event messages use additional variables, such as sensor location, chassis location, previous state, and temperature sensor value or state.
Table 3-1. Temperature Sensor Events

Event Message: <Sensor Name/Location> temperature sensor detected a failure <Reading>
    where <Sensor Name/Location> is the entity that this sensor is monitoring. For example, "PROC Temp" or "Planar Temp."
    Reading is specified in degrees Celsius. For example, 100 C.
Severity: Critical
Cause: Temperature of the backplane board, system board, or the carrier in the specified system <Sensor Name/Location> exceeded the critical threshold.

Event Message: <Sensor Name/Location> temperature sensor detected a warning <Reading>.
Severity: Warning
Cause: Temperature of the backplane board, system board, or the carrier in the specified system <Sensor Name/Location> exceeded the non-critical threshold.

Event Message: <Sensor Name/Location> temperature sensor returned to warning state <Reading>.
Severity: Warning
Cause: Temperature of the backplane board, system board, or the carrier in the specified system <Sensor Name/Location> returned from critical state to non-critical state.

Event Message: <Sensor Name/Location> temperature sensor returned to normal state <Reading>.
Severity: Information
Cause: Temperature of the backplane board, system board, or the carrier in the specified system <Sensor Name/Location> returned to normal operating range.
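
A script that post-processes SEL text could split a Table 3-1 message into its sensor name, severity, and reading. The following is a rough sketch under stated assumptions: the reading format (a number followed by " C") is inferred from the "100 C" example, and the names `TEMPERATURE_EVENT`, `SEVERITY`, and `parse_temperature_event` are hypothetical.

```python
import re

# Matches the four message shapes in Table 3-1. The "<number> C" reading
# format is an assumption based on the "100 C" example in the table.
TEMPERATURE_EVENT = re.compile(
    r"^(?P<sensor>.+?) temperature sensor "
    r"(?P<event>detected a failure|detected a warning|"
    r"returned to warning state|returned to normal state) "
    r"(?P<reading>-?\d+(?:\.\d+)?) C\.?$"
)

# Severities copied from Table 3-1.
SEVERITY = {
    "detected a failure": "Critical",
    "detected a warning": "Warning",
    "returned to warning state": "Warning",
    "returned to normal state": "Information",
}


def parse_temperature_event(message):
    """Return (sensor, severity, degrees_celsius), or None if no match."""
    match = TEMPERATURE_EVENT.match(message.strip())
    if match is None:
        return None
    return (
        match.group("sensor"),
        SEVERITY[match.group("event")],
        float(match.group("reading")),
    )
```

Messages that do not follow the assumed shape return `None`, so a caller can fall back to logging the raw text.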

Voltage Sensor Events

The voltage sensor event messages monitor the number of volts across critical components. These messages provide status and warning information for voltage sensors for a particular chassis.
Table 3-2. Voltage Sensor Events

Event Message: <Sensor Name/Location> voltage sensor detected a failure <Reading>
    where <Sensor Name/Location> is the entity that this sensor is monitoring. For example, "CMOS Battery."
    Reading is specified in volts. For example, 3.860 V.
Severity: Critical
Cause: The voltage of the monitored device is out of critical threshold.

Event Message: <Sensor Name/Location> voltage sensor state asserted.
Severity: Critical
Cause: The voltage specified by <Sensor Name/Location> is in critical state.

Event Message: <Sensor Name/Location> voltage sensor state de-asserted.
Severity: Information
Cause: The voltage of a previously reported <Sensor Name/Location> is returned to normal state.

Event Message: <Sensor Name/Location> voltage sensor detected a warning <Reading>.
Severity: Warning
Cause: Voltage of the monitored entity <Sensor Name/Location> exceeded the warning threshold.

Event Message: <Sensor Name/Location> voltage sensor returned to normal <Reading>.
Severity: Information
Cause: The voltage of a previously reported <Sensor Name/Location> is returned to normal state.

Fan Sensor Events

The cooling device sensors monitor how well a fan is functioning. These messages provide status, warning, and failure information for fans in a particular chassis.
Table 3-3. Fan Sensor Events

Event Message: <Sensor Name/Location> Fan sensor detected a failure <Reading>
    where <Sensor Name/Location> is the entity that this sensor is monitoring. For example, "BMC Back Fan" or "BMC Front Fan."
    Reading is specified in RPM. For example, 100 RPM.
Severity: Critical
Cause: The speed of the specified <Sensor Name/Location> fan is not sufficient to provide enough cooling to the system.

Event Message: <Sensor Name/Location> Fan sensor returned to normal state <Reading>.
Severity: Information
Cause: The fan specified by <Sensor Name/Location> has returned to its normal operating speed.

Event Message: <Sensor Name/Location> Fan sensor detected a warning <Reading>.
Severity: Warning
Cause: The speed of the specified <Sensor Name/Location> fan may not be sufficient to provide enough cooling to the system.

Event Message: <Sensor Name/Location> Fan Redundancy sensor redundancy degraded.
Severity: Information
Cause: The fan specified by <Sensor Name/Location> may have failed and hence, the redundancy has been degraded.

Event Message: <Sensor Name/Location> Fan Redundancy sensor redundancy lost.
Severity: Critical
Cause: The fan specified by <Sensor Name/Location> may have failed and hence, the redundancy that was degraded previously has been lost.

Event Message: <Sensor Name/Location> Fan Redundancy sensor redundancy regained.
Severity: Information
Cause: The fan specified by <Sensor Name/Location> may have started functioning again and hence, the redundancy has been regained.

Processor Status Events

The processor status messages monitor the functionality of the processors in a system. These messages provide processor health and warning information of a system.
Table 3-4. Processor Status Events

Event Message: <Processor Entity> status processor sensor IERR
    where <Processor Entity> is the processor that generated the event. For example, PROC for a single-processor system and PROC # for a multiprocessor system.
Severity: Critical
Cause: IERR internal error generated by the <Processor Entity>.

Event Message: <Processor Entity> status processor sensor Thermal Trip.
Severity: Critical
Cause: The processor generates this event before it shuts down because of excessive heat caused by lack of cooling or heat synchronization.

Event Message: <Processor Entity> status processor sensor recovered from IERR.
Severity: Information
Cause: This event is generated when a processor recovers from the internal error.

Event Message: <Processor Entity> status processor sensor disabled.
Severity: Warning
Cause: This event is generated for all processors that are disabled.

Event Message: <Processor Entity> status processor sensor terminator not present.
Severity: Information
Cause: This event is generated if the terminator is missing on an empty processor slot.

Power Supply Events

The power supply sensors monitor the functionality of the power supplies. These messages provide status and warning information for power supplies for a particular system.
Table 3-5. Power Supply Events

Event Message: <Power Supply Sensor Name> power supply sensor removed.
Severity: Critical
Cause: This event is generated when the power supply sensor is removed.

Event Message: <Power Supply Sensor Name> power supply sensor AC recovered.
Severity: Information
Cause: This event is generated when the power supply has been replaced.

Event Message: <Power Supply Sensor Name> power supply sensor returned to normal state.
Severity: Information
Cause: This event is generated when the power supply that failed or was removed has been replaced and the state has returned to normal.

Event Message: <Entity Name> PS Redundancy sensor redundancy degraded.
Severity: Information
Cause: Power supply redundancy is degraded if one of the power supply sources is removed or has failed.

Event Message: <Entity Name> PS Redundancy sensor redundancy lost.
Severity: Critical
Cause: Power supply redundancy is lost if only one power supply is functional.

Event Message: <Entity Name> PS Redundancy sensor redundancy regained.
Severity: Information
Cause: This event is generated if the power supply has been reconnected or replaced.

Memory ECC Events

The memory ECC event messages monitor the memory modules in a system. These messages monitor the ECC memory correction rate and the type of memory events that occurred.
Table 3-6. Memory ECC Events
Event Message | Severity | Cause
ECC error correction detected on Bank # DIMM [A/B]. | Information | This event is generated when there is a memory error correction on a particular Dual Inline Memory Module (DIMM).
ECC uncorrectable error detected on Bank # [DIMM]. | Critical | This event is generated when the chipset is unable to correct the memory errors. Usually, a bank number is provided, and the DIMM may or may not be identifiable, depending on the error.
Correctable memory error logging disabled. | Critical | This event is generated when the ECC error correction rate exceeds a predefined limit.

BMC Watchdog Events

The BMC watchdog operations are performed when the system hangs or crashes. These messages monitor the status and occurrence of these events in a system.
Table 3-7. BMC Watchdog Events
Event Message | Severity | Cause
BMC OS Watchdog timer expired. | Information | This event is generated when the BMC watchdog timer expires and no action is set.
BMC OS Watchdog performed system reboot. | Critical | This event is generated when the BMC watchdog detects that the system has crashed (timer expired because no response was received from Host) and the action is set to reboot.
BMC OS Watchdog performed system power off. | Critical | This event is generated when the BMC watchdog detects that the system has crashed (timer expired because no response was received from Host) and the action is set to power off.
BMC OS Watchdog performed system power cycle. | Critical | This event is generated when the BMC watchdog detects that the system has crashed (timer expired because no response was received from Host) and the action is set to power cycle.
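
The relationship in Table 3-7 between the configured watchdog action and the logged message can be sketched as a simple lookup. This is a minimal Python illustration only: the action keys ("none", "reboot", and so on) and the function name `watchdog_event` are hypothetical labels chosen for the example, not identifiers used by the BMC or Server Administrator.

```python
# Configured watchdog action -> (SEL message, severity), per Table 3-7.
# The action keys are illustrative labels, not BMC configuration values.
WATCHDOG_EVENTS = {
    "none": ("BMC OS Watchdog timer expired.", "Information"),
    "reboot": ("BMC OS Watchdog performed system reboot.", "Critical"),
    "power off": ("BMC OS Watchdog performed system power off.", "Critical"),
    "power cycle": ("BMC OS Watchdog performed system power cycle.", "Critical"),
}


def watchdog_event(action):
    """Return the (message, severity) pair listed for a configured action."""
    key = action.strip().lower()
    if key not in WATCHDOG_EVENTS:
        raise ValueError(f"unknown watchdog action: {action!r}")
    return WATCHDOG_EVENTS[key]
```

For example, `watchdog_event("Reboot")` returns the reboot message with a severity of Critical, while `watchdog_event("none")` returns the timer-expired message with a severity of Information.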

Memory Events

The memory modules can be configured in different ways in particular systems. These messages monitor the status, warning, and configuration information about the memory modules in the system.
Table 3-8. Memory Events
Event Message | Severity | Cause
Memory RAID redundancy degraded. | Information | This event is generated when there is a memory failure in a RAID-configured memory configuration.
Memory RAID redundancy lost. | Critical | This event is generated when redundancy is lost in a RAID-configured memory configuration.
Memory RAID redundancy regained. | Information | This event is generated when the redundancy lost or degraded earlier is regained in a RAID-configured memory configuration.
Memory Mirrored redundancy degraded. | Information | This event is generated when there is a memory failure in a mirrored memory configuration.
Memory Mirrored redundancy lost. | Critical | This event is generated when redundancy is lost in a mirrored memory configuration.
Memory Mirrored redundancy regained. | Information | This event is generated when the redundancy lost or degraded earlier is regained in a mirrored memory configuration.
Memory Spared redundancy degraded. | Information | This event is generated when there is a memory failure in a spared memory configuration.
Memory Spared redundancy lost. | Critical | This event is generated when redundancy is lost in a spared memory configuration.
Memory Spared redundancy regained. | Information | This event is generated when the redundancy lost or degraded earlier is regained in a spared memory configuration.
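
The messages in Table 3-8 all follow the pattern "Memory <mode> redundancy <state>", and the severity depends only on the state. The following Python sketch parses that pattern. It is an illustration under stated assumptions: the function name `parse_memory_event` and the tuple it returns are hypothetical and are not part of Server Administrator.

```python
import re

# Severity by redundancy state, as listed in Table 3-8.
SEVERITY_BY_STATE = {
    "degraded": "Information",
    "lost": "Critical",
    "regained": "Information",
}

# Matches messages such as "Memory RAID redundancy lost."
# The trailing period is optional.
_PATTERN = re.compile(
    r"^Memory (RAID|Mirrored|Spared) redundancy (degraded|lost|regained)\.?$"
)


def parse_memory_event(message):
    """Return (mode, state, severity) for a Table 3-8 message, or None."""
    match = _PATTERN.match(message.strip())
    if match is None:
        return None
    mode, state = match.groups()
    return mode, state, SEVERITY_BY_STATE[state]
```

A message that does not match the pattern, such as one from another table, yields `None` rather than a guess.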

Hardware Log Sensor Events

The hardware logs provide hardware status messages to the system management software. On particular systems, the subsequent hardware messages are not displayed when the log is full. These messages provide status and warning messages when the logs are full.
Table 3-9. Hardware Log Sensor Events
Event Message | Severity | Cause
Log full detected. | Critical | This event is generated when the SEL device detects that only one entry can be added to the SEL before it is full.
Log cleared. | Information | This event is generated when the SEL is cleared.
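
The "Log full detected" condition in Table 3-9 (only one entry can still be added) can be expressed as a one-line check. This Python sketch assumes the caller already knows how many entries the SEL holds and how many it can hold; the function name and both parameters are hypothetical, and how a real SEL device reports these values is outside the scope of this guide.

```python
def is_log_full_event(entries_used, capacity):
    """Return True when exactly one free SEL entry remains.

    Both arguments are assumed inputs; this guide does not describe
    how the SEL device exposes them.
    """
    if capacity <= 0 or not 0 <= entries_used <= capacity:
        raise ValueError("entries_used must be between 0 and capacity")
    return capacity - entries_used == 1
```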

Drive Events

The drive event messages monitor the health of the drives in a system. These events are generated when there is a fault in the drives indicated.
Table 3-10. Drive Events
Event Message | Severity | Cause
Drive <Drive #> asserted fault state. | Critical | This event is generated when the specified drive in the array is faulty.
Drive <Drive #> de-asserted fault state. | Information | This event is generated when the specified drive recovers from a faulty condition.

Intrusion Events

The chassis intrusion messages are a security measure. Chassis intrusion alerts are generated when the system's chassis is opened. Alerts are sent to prevent unauthorized removal of parts from the chassis.
Table 3-11. Intrusion Events
Event Message | Severity | Cause
<Intrusion sensor Name> sensor detected an intrusion. | Critical | This event is generated when the intrusion sensor detects an intrusion.
<Intrusion sensor Name> sensor returned to normal state. | Information | This event is generated when the earlier intrusion has been corrected.

BIOS Generated System Events

The BIOS generated messages monitor the health and functionality of the chipsets, I/O channels, and other BIOS-related functions. These system events are generated by the BIOS.
Table 3-12. BIOS Generated System Events
Event Message | Severity | Cause
System Event I/O channel chk. | Critical | This event is generated when a critical interrupt is generated in the I/O Channel.
System Event PCI Parity Err. | Critical | This event is generated when a parity error is detected on the PCI bus.
System Event Chipset Err. | Critical | This event is generated when a chip error is detected.
System Event PCI System Err. | Critical | This event indicates historical data, and is generated when the system has crashed and recovered.
System Event PCI Fatal Err. | Critical | This error is generated when a fatal error is detected on the PCI bus.
System Event PCIE Fatal Err. | Critical | This error is generated when a fatal error is detected on the PCIE bus.

Storage Management Message Reference

Storage Management’s alert or event management features let you monitor the health of storage resources such as controllers, connectors, array disks, and virtual disks.

Alert Monitoring and Logging

The Storage Management Service performs alert monitoring and logging. By default, the Storage Management Service starts when the managed system starts up. If you stop the Storage Management Service, alert monitoring and logging stop. Alert monitoring does the following:
Updates the status of the storage object that generated the alert.
Propagates the storage object’s status to all the related higher objects in the storage hierarchy. For example, the status of a lower-level object will be propagated up to the status displayed on the Health tab for the top-level storage object.
Logs an alert into the Alert log and the Windows application log.
Sends an SNMP trap if the operating system’s SNMP service is installed and enabled.
NOTE: Storage Management does not log alerts regarding the data I/O path. These alerts are logged by the respective RAID drivers in the system alert log.
For updated information, see the Storage Management online help and the Dell OpenManage Server Administrator Storage Management User’s Guide.
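
The status propagation described above can be pictured as a rollup in which a parent object shows the worst status found among itself and its descendants. The Python sketch below illustrates that idea only. The "worst status wins" rule, the `StorageObject` class, and the Ok < Warning < Critical ordering are assumptions made for this example; the guide states only that a lower-level status is propagated upward.

```python
from dataclasses import dataclass, field

# Assumed ordering: a higher number is a worse status.
RANK = {"Ok": 0, "Warning": 1, "Critical": 2}


@dataclass
class StorageObject:
    """Hypothetical node in the storage hierarchy."""
    name: str
    status: str = "Ok"  # status reported by the object itself
    children: list = field(default_factory=list)

    def rolled_up_status(self):
        """Return the worst status among this object and its descendants."""
        worst = self.status
        for child in self.children:
            child_status = child.rolled_up_status()
            if RANK[child_status] > RANK[worst]:
                worst = child_status
        return worst


# A failed array disk makes the top-level storage object report Critical.
disk = StorageObject("Array Disk 0:1", "Critical")
controller = StorageObject("Controller 0", children=[disk])
storage = StorageObject("Storage", children=[controller])
```

With these objects, `storage.rolled_up_status()` evaluates to "Critical" even though the top-level object and the controller are themselves Ok.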

Alert Descriptions and Corrective Actions

The following sections describe alerts generated by the RAID or SCSI controllers supported by Storage Management. The alerts are displayed in the Server Administrator Alert subtab or through the Windows Event Viewer. These alerts can also be forwarded as SNMP traps to other applications.
SNMP traps are generated for the alerts listed in the following sections. These traps are included in the Storage Management management information base (MIB). The SNMP traps for these alerts use all of the SNMP trap variables. For more information on SNMP support and the MIB, see the SNMP Reference Guide.
To locate an alert, scroll through the following table to find the alert number displayed on the Server Administrator Alert tab, or search this file for the alert message text or number. See "Understanding Event Messages" for more information on severity levels.
NOTE: If you have an Array Manager installation, the Array Manager console reports the status of storage components through error icons and graphical displays. When there is a change in status, Array Manager sends events to the Array Manager event log, which can be viewed from the Array Manager console. For more information, see the Dell OpenManage™ Array Manager User's Guide.
For more information regarding alert descriptions and the appropriate corrective actions, see the online help.
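
Locating an alert by its number amounts to a keyed lookup. The Python sketch below shows the idea with four rows copied from Table 4-1. The dictionary `ALERTS` and the function `describe_alert` are hypothetical names for this illustration, and the dictionary is deliberately incomplete.

```python
# A few rows copied from Table 4-1: event ID -> (description, severity).
# This excerpt is incomplete by design; extend it from the table as needed.
ALERTS = {
    2048: ("Device failed", "Critical / Failure / Error"),
    2049: ("Array disk removed", "Warning / Non-critical"),
    2052: ("Array disk inserted", "Ok / Normal"),
    2056: ("Virtual disk failed", "Critical / Failure / Error"),
}


def describe_alert(event_id):
    """Return a one-line summary for an alert number in the excerpt."""
    entry = ALERTS.get(event_id)
    if entry is None:
        return f"{event_id}: not in this excerpt"
    description, severity = entry
    return f"{event_id}: {description} ({severity})"
```

For an ID that is missing from the excerpt, the function says so instead of guessing a severity.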
Table 4-1. Storage Management Messages

Event ID: 2048
Description: Device failed
Severity: Critical / Failure / Error
Cause: A physical disk in the array failed. The failed disk may have been identified by the controller or channel. Performing a consistency check can also identify a failed disk.
Action: Replace the failed array disk. You can identify which disk has failed by locating the disk that has a red “X” for its status. Perform a rescan after replacing the disk.
SNMP Trap Numbers: 754, 804, 854, 904, 954, 1004, 1054, 1104, 1154, 1204
Array Manager Event Number: 500

Event ID: 2049
Description: Array disk removed
Severity: Warning / Non-critical
Cause: A physical disk has been removed from the array. A user may have also executed the "Prepare to Remove" task. This alert can also be caused by loose or defective cables or by problems with the enclosure.
Action: If a physical disk was removed from the array, either replace the disk or restore the original disk. You can identify which disk has been removed by locating the disk that has a red “X” for its status. Perform a rescan after replacing or restoring the disk. If a disk has not been removed from the array, then check for problems with the cables. See the online help for more information on checking the cables. Make sure that the enclosure is powered on. If the problem persists, check the enclosure documentation for further diagnostic information.
SNMP Trap Numbers: 903
Array Manager Event Number: 501

Event ID: 2050
Description: Array disk offline
Severity: Warning / Non-critical
Cause: A physical disk in the array is offline. A disk can be taken offline during a "Prepare to Remove" operation or because a user manually put the disk offline.
Action: Perform a rescan. You can also select the offline disk and perform a Make Online operation.
SNMP Trap Numbers: 903
Array Manager Event Number: 502

Event ID: 2051
Description: Array disk degraded
Severity: Warning / Non-critical
Cause: An array disk has reported an error condition and may be degraded. The array disk may have reported the error condition in response to a consistency check or other operation.
Action: Replace the degraded array disk. You can identify which disk is degraded by locating the disk that has a red "X" for its status. Perform a rescan after replacing the disk.
SNMP Trap Numbers: 903
Array Manager Event Number: 503

Event ID: 2052
Description: Array disk inserted
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None
SNMP Trap Numbers: 901
Array Manager Event Number: 504

Event ID: 2053
Description: Virtual disk created
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None
SNMP Trap Numbers: 1201
Array Manager Event Number: 505

Event ID: 2054
Description: Virtual disk deleted
Severity: Warning / Non-critical
Cause: A virtual disk has been deleted. Performing a "Reset Configuration" operation may detect that a virtual disk has been deleted and generate this alert.
Action: None
SNMP Trap Numbers: 1203
Array Manager Event Number: 506

Event ID: 2055
Description: Virtual disk configuration changed
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None
SNMP Trap Numbers: 1201
Array Manager Event Number: 507

Event ID: 2056
Description: Virtual disk failed
Severity: Critical / Failure / Error
Cause: One or more physical disks included in the virtual disk have failed. If the virtual disk is non-redundant (does not use mirrored or parity data), then the failure of a single physical disk can cause the virtual disk to fail. If the virtual disk is redundant, then more physical disks have failed than can be rebuilt using mirrored or parity information.
Action: Create a new virtual disk and restore from a backup.
SNMP Trap Numbers: 1204
Array Manager Event Number: 508

Event ID: 2057
Description: Virtual disk degraded
Severity: Warning / Non-critical
Cause 1: This alert message occurs when a physical disk included in a redundant virtual disk fails. Because the virtual disk is redundant (uses mirrored or parity information) and only one physical disk has failed, the virtual disk can be rebuilt.
Action 1: Configure a hot spare for the virtual disk if one is not already configured. Rebuild the virtual disk. When using a PowerEdge Expandable RAID Controller (PERC) 2/SC, 3/SC, 2/DC, 3/DCL, 3/DC, 3/QC, 4/SC, 4/DC, 4e/DC, 4/Di, or CERC ATA100/4ch controller, rebuild the virtual disk by first configuring a hot spare for the disk, and then initiating a write operation to the disk. The write operation will initiate a rebuild of the disk.
Cause 2: A physical disk in the array has been removed.
Action 2: If a physical disk was removed from the array, either replace the disk or restore the original disk. You can identify which disk has been removed by locating the disk that has a red “X” for its status. Perform a rescan after replacing the disk.
SNMP Trap Numbers: 1203
Array Manager Event Number: 509

Event ID: 2058
Description: Virtual disk check consistency started
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None
SNMP Trap Numbers: 1201
Array Manager Event Number: 520

Event ID: 2059
Description: Virtual disk format started
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None
SNMP Trap Numbers: 1201
Array Manager Event Number: 521

Event ID: 2061
Description: Virtual disk initialization started
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None
SNMP Trap Numbers: 1201
Array Manager Event Number: 523

Event ID: 2063
Description: Virtual disk reconfiguration started
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None
SNMP Trap Numbers: 1201
Array Manager Event Number: 525

Event ID: 2064
Description: Virtual disk rebuild started
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None
SNMP Trap Numbers: 1201
Array Manager Event Number: 526

Event ID: 2065
Description: Array disk rebuild started
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None
SNMP Trap Numbers: 901
Array Manager Event Number: 527

Event ID: 2067
Description: Virtual disk check consistency cancelled
Severity: Ok / Normal
Cause: The check consistency operation was cancelled because a physical disk in the array has failed or because a user cancelled the check consistency operation.
Action: If the physical disk failed, then replace the physical disk. You can identify which disk failed by locating the disk that has a red “X” for its status. Perform a rescan after replacing the disk. When performing a consistency check, be aware that the consistency check can take a long time. The time it takes depends on the size of the physical disk or the virtual disk.
SNMP Trap Numbers: 1201
Array Manager Event Number: 529

Event ID: 2070
Description: Virtual disk initialization cancelled
Severity: Ok / Normal
Cause: The virtual disk initialization was cancelled because a physical disk included in the virtual disk has failed or because a user cancelled the virtual disk initialization.
Action: If a physical disk failed, then replace the physical disk. You can identify which disk has failed by locating the disk that has a red “X” for its status. Perform a rescan after replacing the disk. Restart the format array disk operation. Restart the virtual disk initialization.
SNMP Trap Numbers: 1201
Array Manager Event Number: 532

Event ID: 2074
Description: Array disk rebuild cancelled
Severity: Ok / Normal
Cause: A user has cancelled the rebuild operation.
Action: Restart the rebuild operation.
SNMP Trap Numbers: 901
Array Manager Event Number: 536

Event ID: 2076
Description: Virtual disk check consistency failed
Severity: Critical / Failure / Error
Cause: An array disk included in the virtual disk failed or there is an error in the parity information. A failed array disk can cause errors in parity information.
Action: Replace the failed array disk. You can identify which disk has failed by locating the disk that has a red “X” for its status. Rebuild the array disk. When finished, restart the check consistency operation.
SNMP Trap Numbers: 1204
Array Manager Event Number: 538

Event ID: 2077
Description: Virtual disk format failed.
Severity: Critical / Failure / Error
Cause: An array disk included in the virtual disk failed.
Action: Replace the failed array disk. You can identify which array disk has failed by locating the disk that has a red "X" for its status. Rebuild the array disk. When finished, restart the virtual disk format operation.
SNMP Trap Numbers: 1204
Array Manager Event Number: 539

Event ID: 2079
Description: Virtual disk initialization failed
Severity: Critical / Failure / Error
Cause: An array disk included in the virtual disk has failed or a user has cancelled the initialization.
Action: If an array disk has failed, then replace the array disk.
SNMP Trap Numbers: 1204
Array Manager Event Number: 541

Event ID: 2080
Description: Array disk initialize failed
Severity: Critical / Failure / Error
Cause: The array disk has failed or is corrupt.
Action: Replace the failed or corrupt disk. You can identify a disk that has failed by locating the disk that has a red “X” for its status. Restart the initialization.
SNMP Trap Numbers: 904
Array Manager Event Number: 542

Event ID: 2081
Description: Virtual disk reconfiguration failed
Severity: Critical / Failure / Error
Cause: An array disk included in the virtual disk has failed or is corrupt. A user may also have cancelled the reconfiguration.
Action: Replace the failed or corrupt disk. You can identify a disk that has failed by locating the disk that has a red “X” for its status. If the array disk is part of a redundant array, then rebuild the array disk. When finished, restart the reconfiguration.
SNMP Trap Numbers: 1204
Array Manager Event Number: 543

Event ID: 2082
Description: Virtual disk rebuild failed
Severity: Critical / Failure / Error
Cause: An array disk included in the virtual disk has failed or is corrupt. A user may also have cancelled the rebuild.
Action: Replace the failed or corrupt disk. You can identify a disk that has failed by locating the disk that has a red “X” for its status. Restart the virtual disk rebuild.
SNMP Trap Numbers: 1204
Array Manager Event Number: 544

Event ID: 2083
Description: Array disk rebuild failed
Severity: Critical / Failure / Error
Cause: An array disk included in the virtual disk has failed or is corrupt. A user may also have cancelled the rebuild.
Action: Replace the failed or corrupt disk. You can identify a disk that has failed by locating the disk that has a red “X” for its status. Restart the array disk rebuild.
SNMP Trap Numbers: 904
Array Manager Event Number: 545

Event ID: 2085
Description: Virtual disk check consistency completed
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None
SNMP Trap Numbers: 1201
Array Manager Event Number: 547

Event ID: 2086
Description: Virtual disk format completed
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None
SNMP Trap Numbers: 1201
Array Manager Event Number: 548

Event ID: 2088
Description: Virtual disk initialization completed
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None
SNMP Trap Numbers: 1201
Array Manager Event Number: 550

Event ID: 2089
Description: Array disk initialize completed
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None
SNMP Trap Numbers: 901
Array Manager Event Number: 551

Event ID: 2090
Description: Virtual disk reconfiguration completed
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None
SNMP Trap Numbers: 1201
Array Manager Event Number: 552

Event ID: 2091
Description: Virtual disk rebuild completed
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None
SNMP Trap Numbers: 1201
Array Manager Event Number: 553

Event ID: 2092
Description: Array disk rebuild completed
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None
SNMP Trap Numbers: 901
Array Manager Event Number: 554

Event ID: 2094
Description: Predictive Failure reported. If this disk is part of a redundant virtual disk, select the "Offline" option and then replace the disk. Then configure a hot spare and it will start the rebuild automatically. If this disk is a hot spare, select the "Prepare to Remove" option and then replace the disk. If this disk is part of a non-redundant disk, you should back up your data immediately. If the disk fails, you will not be able to recover the data.
Severity: Warning / Non-critical
Cause: The array disk is predicted to fail. Many array disks contain Self-Monitoring, Analysis and Reporting Technology (SMART). When enabled, SMART monitors the health of the disk based on indications such as the number of write operations that have been performed on the disk.
Action: Replace the array disk. Even though the disk may not have failed yet, it is strongly recommended that you replace the disk. Review the message text for additional information.
SNMP Trap Numbers: 903
Array Manager Event Number: 570

Event ID: 2095
Description: SCSI sense data. If this disk is part of a redundant virtual disk, select the "Offline" option and then replace the disk. Then configure a hot spare and it will start the rebuild automatically. If this disk is a hot spare, select the "Prepare to Remove" option and then replace the disk. If this disk is part of a non-redundant disk, you should back up your data immediately. If the disk fails, you will not be able to recover the data.
Severity: Warning / Non-critical
Cause: An array disk has failed, is corrupt, or is otherwise experiencing a problem.
Action: Replace the array disk. Even though the disk may not have failed yet, it is strongly recommended that you replace the disk. Review the message text for additional information.
SNMP Trap Numbers: 903
Array Manager Event Number: 571

Event ID: 2098
Description: Global hot spare assigned
Severity: Ok / Normal
Cause: A user has assigned an array disk as a global hot spare. This alert is provided for informational purposes.
Action: None
SNMP Trap Numbers: 901
Array Manager Event Number: 574

Event ID: 2099
Description: Global hot spare unassigned
Severity: Ok / Normal
Cause: A user has unassigned an array disk as a global hot spare. This alert is provided for informational purposes.
Action: None
SNMP Trap Numbers: 901
Array Manager Event Number: 575

Event ID: 2100
Description: Temperature exceeded the maximum warning threshold
Severity: Warning / Non-critical
Cause: The array disk enclosure is too hot. A variety of factors can cause the excessive temperature. For example, a fan may have failed, the thermostat may be set too high, or the room temperature may be too hot.
Action: Check for factors that may cause overheating. For example, verify that the enclosure fan is working. You should also check the thermostat settings and examine whether the enclosure is located near a heat source. Make sure the enclosure has enough ventilation and that the room temperature is not too hot. See the enclosure documentation for more diagnostic information.
SNMP Trap Numbers: 1053
Array Manager Event Number: 591

Event ID: 2101
Description: Temperature dropped below the minimum warning threshold
Severity: Warning / Non-critical
Cause: The array disk enclosure is too cool.
Action: Check whether the thermostat setting is too low and whether the room temperature is too cool.
SNMP Trap Numbers: 1053
Array Manager Event Number: 592

Event ID: 2102
Description: Temperature exceeded the maximum failure threshold
Severity: Critical / Failure / Error
Cause: The array disk enclosure is too hot. A variety of factors can cause the excessive temperature. For example, a fan may have failed, the thermostat may be set too high, or the room temperature may be too hot.
Action: Check for factors that may cause overheating. For example, verify that the enclosure fan is working. You should also check the thermostat settings and examine whether the enclosure is located near a heat source. Make sure the enclosure has enough ventilation and that the room temperature is not too hot. See the enclosure documentation for more diagnostic information.
SNMP Trap Numbers: 1054
Array Manager Event Number: 593

Event ID: 2103
Description: Temperature dropped below the minimum failure threshold
Severity: Critical / Failure / Error
Cause: The array disk enclosure is too cool.
Action: Check whether the thermostat setting is too low and whether the room temperature is too cool.
SNMP Trap Numbers: 1054
Array Manager Event Number: 594

Event ID: 2104
Description: Controller battery is reconditioning
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None
SNMP Trap Numbers: 1151
Array Manager Event Number: 581

Event ID: 2105
Description: Controller battery recondition is completed
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None
SNMP Trap Numbers: 1151
Array Manager Event Number: 582

Event ID: 2106
Description: Smart FPT exceeded
Severity: Warning / Non-critical
Cause: A disk on the specified controller has received a SMART alert (predictive failure) indicating that the disk is likely to fail in the near future.
Action: Replace the disk that has received the SMART alert. If the array disk is a member of a non-redundant virtual disk, then back up the data before replacing the disk. Removing an array disk that is included in a non-redundant virtual disk will cause the virtual disk to fail and may cause data loss.
SNMP Trap Numbers: 903
Array Manager Event Number: 585

Event ID: 2107
Description: Smart configuration change
Severity: Critical / Failure / Error
Cause: A disk has received a SMART alert (predictive failure) after a configuration change. The disk is likely to fail in the near future.
Action: Replace the disk that has received the SMART alert. If the array disk is a member of a non-redundant virtual disk, then back up the data before replacing the disk. Removing an array disk that is included in a non-redundant virtual disk will cause the virtual disk to fail and may cause data loss.
SNMP Trap Numbers: 904
Array Manager Event Number: 586

Event ID: 2108
Description: Smart warning
Severity: Warning / Non-critical
Cause: A disk has received a SMART alert (predictive failure). The disk is likely to fail in the near future.
Action: Replace the disk that has received the SMART alert. If the array disk is a member of a non-redundant virtual disk, then back up the data before replacing the disk. Removing an array disk that is included in a non-redundant virtual disk will cause the virtual disk to fail and may cause data loss.
SNMP Trap Numbers: 903
Array Manager Event Number: 587

Event ID: 2109
Description: Smart warning temperature
Severity: Warning / Non-critical
Cause: A disk has reached an unacceptable temperature and received a SMART alert (predictive failure). The disk is likely to fail in the near future.
First Action: Determine why the array disk has reached an unacceptable temperature. A variety of factors can cause the excessive temperature. For example, a fan may have failed, the thermostat may be set too high, or the room temperature may be too hot or cold. Verify that the fans in the server or enclosure are working. If the array disk is in an enclosure, you should check the thermostat settings and examine whether the enclosure is located near a heat source. Make sure the enclosure has enough ventilation and that the room temperature is not too hot. See the enclosure documentation for more diagnostic information.
Second Action: If you cannot identify why the disk has reached an unacceptable temperature, then replace the disk. If the array disk is a member of a non-redundant virtual disk, then back up the data before replacing the disk. Removing an array disk that is included in a non-redundant virtual disk will cause the virtual disk to fail and may cause data loss.
SNMP Trap Numbers: 903
Array Manager Event Number: 588

Event ID: 2110
Description: Smart warning degraded
Severity: Warning / Non-critical
Cause: A disk is degraded and has received a SMART alert (predictive failure). The disk is likely to fail in the near future.
Action: Replace the disk that has received the SMART alert. If the array disk is a member of a non-redundant virtual disk, then back up the data before replacing the disk. Removing an array disk that is included in a non-redundant virtual disk will cause the virtual disk to fail and may cause data loss.
SNMP Trap Numbers: 903
Array Manager Event Number: 589

Event ID: 2111
Description: Failure prediction threshold exceeded due to test - No action needed
Severity: Warning / Non-critical
Cause: A disk has received a SMART alert (predictive failure) due to test conditions.
Action: None
SNMP Trap Numbers: 903
Array Manager Event Number: 590

Event ID: 2112
Description: Enclosure was shut down
Severity: Critical / Failure / Error
Cause: The array disk enclosure is either hotter or cooler than the maximum or minimum allowable temperature range.
Action: Check for factors that may cause overheating or excessive cooling. For example, verify that the enclosure fan is working. You should also check the thermostat settings and examine whether the enclosure is located near a heat source. Make sure the enclosure has enough ventilation and that the room temperature is not too hot or too cold. See the enclosure documentation for more diagnostic information.
SNMP Trap Numbers: 854
Array Manager Event Number: 602

Event ID: 2114
Description: A consistency check on a virtual disk has been paused (suspended)
Severity: Ok / Normal
Cause: The check consistency operation on a virtual disk was paused by a user.
Action: To resume the check consistency operation, right-click the virtual disk in the Storage Management tree view and select Resume Check Consistency.
SNMP Trap Numbers: 1201
Array Manager Event Number: 604

Event ID: 2115
Description: A consistency check on a virtual disk has been resumed
Severity: Ok / Normal
Cause: The check consistency operation on a virtual disk has resumed processing after being paused by a user.
Action: This alert is provided for informational purposes.
SNMP Trap Numbers: 1201
Array Manager Event Number: 605

Event ID: 2116
Description: A virtual disk and its mirror have been split
Severity: Ok / Normal
Cause: A user has caused a mirrored virtual disk to be split. When a virtual disk is mirrored, its data is copied to another virtual disk in order to maintain redundancy. After being split, both virtual disks retain a copy of the data, although because the mirror is no longer intact, updates to the data are no longer copied to the mirror.
Action: This alert is provided for informational purposes.
SNMP Trap Numbers: 1201
Array Manager Event Number: 606

Event ID: 2117
Description: A mirrored virtual disk has been unmirrored
Severity: Ok / Normal
Cause: A user has caused a mirrored virtual disk to be unmirrored. When a virtual disk is mirrored, its data is copied to another virtual disk in order to maintain redundancy. After being unmirrored, the disk formerly used as the mirror returns to being an array disk and becomes available for inclusion in another virtual disk.
Action: This alert is provided for informational purposes.
SNMP Trap Numbers: 1201
Array Manager Event Number: 607

Event ID: 2118
Description: Change write policy
Severity: Ok / Normal
Cause: A user has changed the write policy for a virtual disk.
Action: This alert is provided for informational purposes.
SNMP Trap Numbers: 1201
Array Manager Event Number: 601

Event ID: 2120
Description: Enclosure firmware mismatch
Severity: Warning / Non-critical
Cause: The firmware on the enclosure management modules (EMMs) is not the same version. Both modules must have the same version of the firmware. This alert may be caused when a user attempts to insert an EMM that has a different firmware version than an existing module.
Action: Download the same version of the firmware to both EMMs.
SNMP Trap Numbers: 853
Array Manager Event Number: 672
Table 4-1. Storage Management Messages (continued)
Event ID Description Severity Cause and Action SNMP Trap
Numbers
2121 Device returned to
normal
2122 Redundancy
degraded
Ok / Normal
Warning / Non-critical
Cause: A device that was previously in an error state has returned to a normal state. For example, if an enclosure became too hot and subsequently cooled down, then you may receive this alert.
Action: This alert is provided for informational purposes.
Cause: One or more of the enclosure components has failed. For example, a fan or power supply may have failed. Although the enclosure is currently operational, the failure of additional components could cause the enclosure to fail.
Action: Identify and replace the failed component. To identify the failed component, select the enclosure in the tree view and click the Health subtab. Any failed component will be identified with a red X on the enclosure’s Health subtab. Alternatively, you can select the Storage object and click the Health subtab. The controller status displayed on the Health subtab indicates whether a controller has a failed or degraded component. See the enclosure documentation for information on replacing enclosure components and for other diagnostic information.
752, 802, 852, 902, 952, 1002, 1052, 1102, 1152, 1202 None
1305 None
2123 Redundancy lost Warning /
Non-critical
2124 Redundancy normal Ok /
Normal
Cause: A virtual disk or an enclosure has lost data redundancy. In the case of a virtual disk, one or more array disks included in the virtual disk have failed. Due to the failed array disk or disks, the virtual disk is no longer maintaining redundant (mirrored or parity) data. The failure of an additional array disk will result in lost data. In the case of an enclosure, more than one enclosure component has failed. For example, the enclosure may have suffered the loss of all fans or all power supplies.
Action: Identify and replace the failed components. To identify the failed component, select the Storage object and click the Health subtab. The controller status displayed on the Health subtab indicates whether a controller has a failed or degraded component. Click the controller that displays a Warning or Failed status. This action displays the controller Health subtab, which displays the status of the individual controller components. Continue clicking the components with a Warning or Failed status until you identify the failed component. See the online help for more information. See the enclosure documentation for information on replacing enclosure components and for other diagnostic information.
Cause: Data redundancy has been restored to a virtual disk or an enclosure that previously suffered a loss of redundancy.
Action: This alert is provided for informational purposes.
1306 None
1304 None
2126 SCSI sense sector
reassign
2127 Background
initialization (BGI) started
2128 BGI cancelled Ok /
2129 BGI failed Critical /
2130 BGI completed Ok /
2131 Firmware version
mismatch
Warning / Non-critical
Ok / Normal
Normal
Failure / Error
Normal
Warning / Non-critical
Cause: A sector of the disk is corrupted and data cannot be maintained on this portion of the disk.
Action: If the disk is part of a non-redundant virtual disk, then replace the disk. Any data residing on the corrupt portion of the disk may be lost and you may need to restore from backup. If the disk is part of a redundant virtual disk, then any data residing on the corrupt portion of the disk will be reallocated elsewhere in the virtual disk.
Cause: BGI of a virtual disk has started. This alert is provided for informational purposes.
Action: None
Cause: BGI of a virtual disk has been cancelled. A user or the firmware may have stopped BGI.
Action: None
Cause: BGI of a virtual disk has failed.
Action: None
Cause: BGI of a virtual disk has completed. This alert is provided for informational purposes.
Action: None
Cause: The firmware on the controller is not a supported version.
Action: Install a supported version of the firmware. A supported version can be downloaded from the Dell™ support website at support.dell.com. If you cannot obtain a supported version there, check with your support provider for information on how to obtain the most current firmware.
903 None
1201 683
1201 684
1204 685
1201 686
753 None
2132 Driver version
mismatch
2135 Array Manager is
installed on the system
2136 Virtual disk
initialization
Warning / Non-critical
Warning / Non-critical
Ok / Normal
Cause: The controller driver is not a supported version.
Action: Install a supported version of the driver. A supported version can be downloaded from the Dell support website at support.dell.com. If you cannot obtain a supported version there, check with your support provider for information on how to obtain the most current driver.
Cause: Storage Management has been installed on a system that has an Array Manager installation.
Action: Installing Storage Management and Array Manager on the same system is not a supported configuration. Uninstall either Storage Management or Array Manager.
Cause: Virtual disk initialization is in progress. This alert is provided for informational purposes.
Action: None
753 None
103 None
1201 None
2137 Communication
timeout
2138 Enclosure alarm
enabled
2139 Enclosure alarm
disabled
2140 Dead disk segments
restored
Warning / Non-critical
Ok / Normal
Ok / Normal
Ok / Normal
Cause: The controller is unable to communicate with an enclosure. There are several reasons why communication may be lost. For example, there may be a bad or loose cable. An unusual amount of I/O may also interrupt communication with the enclosure. In addition, communication loss may be caused by software, hardware, or firmware problems, bad or failed power supplies, and enclosure shutdown.
When viewed in the Alert Log, the description for this event displays several variables. These variables are: Controller and enclosure names, type of communication problem, return code, and SCSI status.
Action: Check for problems with the cables. See the online help for more information on checking the cables. You should also check to see if the enclosure has degraded or failed components. To do so, select the enclosure object in the tree view and click the Health subtab. The Health subtab displays the status of the enclosure components. Verify that the controller has supported driver and firmware versions installed and that the EMMs are each running the same version of supported firmware.
Cause: A user has enabled the enclosure alarm. This alert is provided for informational purposes.
Action: None
Cause: A user has disabled the enclosure alarm.
Action: None
Cause: Disk space that was formerly “dead” or inaccessible to a redundant virtual disk has been restored. This alert is provided for informational purposes.
Action: None
853 688, 610, 611
851 676
851 677
1201 None
2141 Array disk dead
segments recovered
2142 Controller rebuild
rate has changed
2143 Controller alarm
enabled
2144 Controller alarm
disabled
2145 Controller battery low Warning /
2146 Bad block
replacement error
2147 Bad block sense error Warning /
2148 Bad block medium
error
2149 Bad block extended
sense error
Ok / Normal
Ok / Normal
Ok / Normal
Ok / Normal
Non-critical
Warning / Non-critical
Non-critical
Warning / Non-critical
Warning / Non-critical
Cause: Portions of the array disk that were formerly inaccessible have been recovered. This alert is provided for informational purposes.
Action: None
Cause: A user has changed the controller rebuild rate. This alert is provided for informational purposes.
Action: None
Cause: A user has enabled the controller alarm. This alert is provided for informational purposes.
Action: None
Cause: A user has disabled the controller alarm. This alert is provided for informational purposes.
Action: None
Cause: The controller battery charge is low.
Action: Recondition the battery. See the online help for more information.
Cause: A portion of an array disk is damaged.
Action: See the Storage Management online help or the Dell OpenManage Server Administrator Storage Management User's Guide for more information.
Cause: A portion of an array disk is damaged.
Action: See the online help for more information.
Cause: A portion of an array disk is damaged.
Action: See the online help for more information.
Cause: A portion of an array disk is damaged.
Action: See the online help for more information.
901 None
751 680
751 678
751 679
1153 580
753 691
753 691
753 691
753 691
2150 Bad block extended
medium error
2151 Asset tag changed Ok /
2152 Asset name changed Ok /
2153 Service tag changed Warning /
2154 Maximum
temperature probe warning threshold value changed
2155 Minimum
temperature probe warning threshold value changed
2156 Controller alarm has
been tested
Warning / Non-critical
Normal
Normal
Non-critical
Ok / Normal
Ok / Normal
Ok / Normal
Cause: A portion of an array disk is damaged.
Action: See the online help for more information.
Cause: A user has changed the enclosure asset tag. This alert is provided for informational purposes.
Action: None
Cause: A user has changed the enclosure asset name. This alert is provided for informational purposes.
Action: None
Cause: An enclosure service tag was changed. In most circumstances, this service tag should only be changed by Dell support or your service provider.
Action: Ensure that the tag was changed under authorized circumstances.
Cause: A user has changed the value for the maximum temperature probe warning threshold. This alert is provided for informational purposes.
Action: None
Cause: A user has changed the value for the minimum temperature probe warning threshold. This alert is provided for informational purposes.
Action: None
Cause: The controller alarm test has run successfully. This alert is provided for informational purposes.
Action: None
753 691
851 None
851 None
753 None
1051 None
1051 None
751 None
2157 Controller
configuration has been reset
2158 Array disk online Ok /
2159 Virtual disk renamed Ok /
Ok / Normal
Normal
Normal
Cause: A user has reset the controller configuration. See the online help for more information. This alert is provided for informational purposes.
Action: None
Cause: An offline array disk has been made online. This alert is provided for informational purposes.
Action: None
Cause: A user has renamed a virtual disk. This alert is provided for informational purposes.
751 None
901 None
1201 608
NOTE: When renaming a virtual disk on a PERC 2, 2/Si, 3/Si, 3/Di, CERC SATA 1.5/6ch, or CERC SATA 1.5/2s controller, this alert displays the new virtual disk name. On the PERC 2/SC, 2/DC, 3/SC, 3/DCL, 3/DC, 3/QC, 4/SC, 4/DC, 4e/DC, 4/Di, 4/IM, 4e/Si, 4e/Di, and CERC ATA 100/4ch controllers, this alert displays the original virtual disk name.
Action: None
2160 Dedicated hotspare
assigned
2161 Dedicated hotspare
unassigned
2162 Communication
regained
Ok / Normal
Ok / Normal
Ok / Normal
Cause: A user has assigned an array disk as a dedicated hot spare to a virtual disk. See the online help for more information. This alert is provided for informational purposes.
Action: None
Cause: A user has unassigned an array disk as a dedicated hot spare to a virtual disk. See the online help for more information. This alert is provided for informational purposes.
Action: None
Cause: Communication with an enclosure has been restored. This alert is provided for informational purposes.
Action: None
901 574
901 575
851 None
2163 Rebuild completed
with errors
2164 See the Readme file
for a list of validated controller driver versions
2165 The RAID controller
firmware and driver validation was not performed. The configuration file cannot be opened.
2166 The RAID controller
firmware and driver validation was not performed. The configuration file is out of date or corrupted.
Ok / Normal
Ok / Normal
Warning / Non-critical
Warning / Non-critical
See the online help for more information. 904 690
Cause: Storage Management is unable to determine whether the system has the minimum required versions of the RAID controller drivers.
Action: This alert is generated for informational purposes. See the Readme file for driver and firmware requirements. In particular, if Storage Management experiences performance problems, you should verify that you have the minimum supported versions of the drivers and firmware installed.
Cause: Storage Management is unable to determine whether the system has the minimum required versions of the RAID controller firmware and drivers. This situation may occur for a variety of reasons. For example, the installation directory path to the configuration file may not be correct. The configuration file may also have been removed or renamed.
Action: Reinstall Storage Management.
Cause: Storage Management is unable to determine whether the system has the minimum required versions of the RAID controller firmware and drivers. This situation has occurred because a configuration file is unreadable or missing data. The configuration file may be corrupted.
Action: Reinstall Storage Management.
101 None
753 None
753 None
2167 The current kernel
version and the non-RAID SCSI driver version are older than the minimum required levels.
See the Readme file for a list of validated kernel and driver versions.
2168 The non-RAID SCSI
driver version is older than the minimum required level.
See the Readme file for the validated driver version.
2169 The controller battery
needs to be replaced.
2170 The controller battery
charge level is normal.
Warning / Non-critical
Warning / Non-critical
Critical / Failure / Error
Ok / Normal
Cause: The version of the kernel and the driver do not meet the minimum requirements. Storage Management may not be able to display the storage or perform storage management functions until you have updated the system to meet the minimum requirements.
Action: See the Readme file for kernel and driver requirements. Update the system to meet the minimum requirements and then reinstall Storage Management.
Cause: The version of the driver does not meet the minimum requirements. Storage Management may not be able to display the storage or perform storage management functions until you have updated the system to meet the minimum requirements.
Action: See the Readme file for the driver requirements. Update the system to meet the minimum requirements and then reinstall Storage Management.
Cause: The controller battery cannot recharge. The battery may be old or it may have been already recharged the maximum number of times. In addition, the battery charger may not be working.
Action: Replace the battery pack.
Cause: This alert is provided for informational purposes.
Action: None
103 None
103 None
1154 None
1151 None
2171 The controller battery
temperature is above normal.
2172 The controller battery
temperature is normal.
2174 The controller battery
has been removed.
2175 The controller battery
has been replaced.
2176 The controller battery
Learn cycle has started.
2177 The controller battery
Learn cycle has completed.
Warning / Non-critical
Ok / Normal
Warning / Non-critical
Ok / Normal
Ok / Normal
Ok / Normal
Cause: The battery may be recharging, the room temperature may be too hot, or the fan in the system may be degraded or failed.
Action: If this alert was generated due to a battery recharge, the situation will correct when the recharge is complete. You should also check if the room temperature is normal and that the system components are functioning properly.
Cause: This alert is provided for informational purposes.
Action: None
Cause: The controller cannot communicate with the battery, the battery may be removed, or the contact point between the controller and the battery may be burnt or corroded.
Action: If the battery has been removed, reinstall it. If the contact point between the battery and the controller is burnt or corroded, replace the battery, the controller, or both. See the hardware documentation for information on how to safely access, remove, and replace the battery.
Cause: This alert is provided for informational purposes.
Action: None
Cause: This alert is provided for informational purposes.
Action: None
Cause: This alert is provided for informational purposes.
Action: None
1153 None
1151 None
1153 None
1151 None
1151 None
1151 None
2178 The controller battery
Learn cycle has timed out.
2179 The controller battery
Learn cycle has been postponed.
2180 The controller battery
Learn cycle will start in %1 days.
Warning / Non-critical
Ok / Normal
Ok / Normal
Cause: The controller battery must be fully charged before the Learn cycle can begin. The battery may be unable to maintain a full charge, causing the Learn cycle to time out. Additionally, the battery must be able to maintain cached data for a specified period of time in the event of a power loss. For example, some batteries maintain cached data for 24 hours. If the battery is unable to maintain cached data for the required period of time, then the Learn cycle will time out.
Action: Replace the battery pack as the battery is unable to maintain a full charge.
Cause: This alert is provided for informational purposes.
Action: None
Cause: This alert is provided for informational purposes.
Action: None
1153 None
1151 None
1151 None
NOTE: %1 is a variable that is replaced with the number of days remaining before the Learn cycle starts. You can configure how long to wait before the Learn cycle starts.
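The placeholder substitution described in the note can be illustrated with a minimal Python sketch. This is not how Storage Management itself fills in the value. The function name fill_template is hypothetical, and the assumption is only that a placeholder is written as % followed by a position number, as in the %1 shown in this table.

```python
import re

def fill_template(template, *values):
    """Replace %1, %2, ... in a message template with positional values.

    Hypothetical helper for illustration only; placeholders that have no
    matching value are left untouched.
    """
    def substitute(match):
        index = int(match.group(1)) - 1  # %1 maps to values[0]
        if 0 <= index < len(values):
            return str(values[index])
        return match.group(0)
    return re.sub(r"%(\d+)", substitute, template)

# Example: event 2180 with a Learn cycle that starts in 7 days.
message = fill_template(
    "The controller battery Learn cycle will start in %1 days.", 7
)
print(message)
```

Leaving unmatched placeholders in place, rather than raising an error, keeps a partially filled message readable in a log.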
2181 The controller battery
Learn cycle will start in %1 hours.
Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None
1151 None
NOTE: %1 is a variable that is replaced with the number of hours remaining before the Learn cycle starts. You can configure how long to wait before the Learn cycle starts.
2182 An invalid SAS
configuration has been detected.
2186 The controller cache
has been discarded.
2187 Single-bit ECC error
limit exceeded.
Critical / Failure / Error
Warning / Non-critical
Warning / Non-critical
Cause: The controller and attached enclosures are not cabled correctly.
Action: See the hardware documentation for information on correct cabling configurations.
Cause: The controller has flushed the cache and any data in the cache has been lost. This may happen if the system has memory or battery problems that cause the controller to distrust the cache. Although user data may have been lost, this alert does not always indicate that relevant or user data has been lost.
Action: Verify that the battery and memory are functioning properly.
Cause: The system memory is malfunctioning.
Action: Replace the battery pack.
754 None
753 None
753 None
2188 The controller write
policy has been changed to "Write Through."
2189 The controller write
policy has been changed to "Write Back."
2191 Multiple enclosures
are attached to the controller. This is an unsupported configuration.
Warning / Non-critical
Ok / Normal
Critical / Failure / Error
Cause: The controller battery is unable to maintain cached data for the required period of time. For example, if the required period of time is 24 hours, the battery is unable to maintain cached data for 24 hours. It is normal to receive this alert during the battery Learn cycle as the Learn cycle discharges the battery before recharging it. When discharged, the battery cannot maintain cached data.
Action: Check the health of the battery. If the battery is weak, replace the battery pack.
Cause: This alert is provided for informational purposes.
Action: None
Cause: Too many enclosures are attached to the controller port. When the enclosure limit is exceeded, the controller loses contact with all enclosures attached to the port.
Action: Remove the enclosure that was added last and that caused the enclosure limit to be exceeded.
1153 None
1151 None
854 None
2192 The virtual disk
"Check Consistency" has made corrections and completed.
2193 The virtual disk
reconfigure has resumed.
2194 The virtual disk read
policy has changed.
2199 The virtual disk cache
policy has changed.
2201 A global hot spare
failed.
Ok / Normal
Ok / Normal
Ok / Normal
Ok / Normal
Warning / Non-critical
Cause: The virtual disk "Check Consistency" has identified errors and made corrections. For example, the "Check Consistency" may have encountered a bad disk block and remapped the disk block to restore data consistency. This alert is provided for informational purposes.
Action: Monitor the battery and cache health to make sure they are functioning properly. Monitor the Alert Log for events related to the battery and write policy changes. You should also monitor the Alert Log for events related to disk errors. If you suspect that the battery or a disk has a problem, replace the battery pack or the disk.
Cause: This alert is provided for informational purposes.
Action: None
Cause: This alert is provided for informational purposes.
Action: None
Cause: This alert is provided for informational purposes.
Action: None
Cause: The controller is unable to communicate with a disk that is assigned as a global hot spare. The disk may have failed or been removed. There may also be a bad or loose cable.
Action: Check that the disk is healthy and has not been removed. Check the cables.
If necessary, replace the disk and reassign the hot spare.
1203 None
1201 None
1201 None
1201 None
903 None
2202 A global hot spare has
been removed.
2203 A dedicated hot spare
failed.
2204 A dedicated hot spare
has been removed.
2205 A dedicated hot spare
has been automatically unassigned.
Warning / Non-critical
Warning / Non-critical
Warning / Non-critical
Warning / Non-critical
Cause: The controller is unable to communicate with a disk that is assigned as a global hot spare. The disk may have been removed. There may also be a bad or loose cable.
Action: Check that the disk is healthy and has not been removed. Check the cables.
If necessary, replace the disk and reassign the hot spare.
Cause: The controller is unable to communicate with a disk that is assigned as a dedicated hot spare. The disk may have failed or been removed. There may also be a bad or loose cable.
Action: Check that the disk is healthy and has not been removed. Check the cables.
If necessary, replace the disk and reassign the hot spare.
Cause: The controller is unable to communicate with a disk that is assigned as a dedicated hot spare. The disk may have been removed. There may also be a bad or loose cable.
Action: Check that the disk is healthy and has not been removed. Check the cables.
If necessary, replace the disk and reassign the hot spare.
Cause: The hot spare is no longer required because the virtual disk it was assigned to has been deleted.
Action: None.
903 None
903 None
903 None
903 None
2206 The only hot spare
available is a SATA disk. SATA disks cannot replace SAS disks.
Warning / Non-critical
Cause: The only array disk available to be assigned as a hot spare is using SATA technology. The array disks in the virtual disk are using SAS technology. Due to this difference in technology, the hot spare cannot rebuild data if one of the array disks in the virtual disk fails.
Action: Add a SAS disk that is large enough to be used as the hot spare and assign the new disk as a hot spare.
903 None
2207 The only hot spare
available is a SAS disk. SAS disks cannot replace SATA disks.
Warning / Non-critical
Cause: The only array disk available to be assigned as a hot spare is using SAS technology. The array disks in the virtual disk are using SATA technology. Due to this difference in technology, the hot spare cannot rebuild data if one of the array disks in the virtual disk fails.
Action: Add a SATA disk that is large enough to be used as the hot spare and assign the new disk as a hot spare.
903 None
2211 The physical disk is
not supported.
Warning / Non-critical
Cause: The physical disk may not have a supported version of the firmware or the disk may not be supported by Dell.
Action: If the disk is supported by Dell, update the firmware to a supported version. If the disk is not supported by Dell, replace the disk with one that is supported.
903 None
2232 The controller alarm
is silenced.
Ok / Normal
Cause: This alert is provided for informational purposes.
751 None
Action: None
2233 The BGI rate has
changed.
Ok / Normal
Cause: This alert is provided for informational purposes.
751 None
Action: None
2234 The "Patrol Read" rate
has changed.
Ok / Normal
Cause: This alert is provided for informational purposes.
751 None
Action: None
2235 The Check
Consistency rate has changed.
2237 A controller rescan
has been initiated.
2238 The controller debug
log file has been exported.
2239 A foreign
configuration has been cleared.
2240 A foreign
configuration has been imported.
2241 The "Patrol Read"
mode has changed.
2242 The "Patrol Read" has
started.
2243 The "Patrol Read" has
stopped.
2244 A virtual disk blink
has been initiated.
2245 A virtual disk blink
has ceased.
Ok / Normal
Ok / Normal
Ok / Normal
Ok / Normal
Ok / Normal
Ok / Normal
Ok / Normal
Ok / Normal
Ok / Normal
Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None
Cause: This alert is provided for informational purposes.
Action: None
Cause: This alert is provided for informational purposes.
Action: None
Cause: This alert is provided for informational purposes.
Action: None
Cause: This alert is provided for informational purposes.
Action: None
Cause: This alert is provided for informational purposes.
Action: None
Cause: This alert is provided for informational purposes.
Action: None
Cause: This alert is provided for informational purposes.
Action: None
Cause: This alert is provided for informational purposes.
Action: None
Cause: This alert is provided for informational purposes.
Action: None
751 None
751 None
751 None
751 None
751 None
751 None
751 None
751 None
1201 None
1201 None
2246 The controller battery
is degraded.
Warning / Non-critical
Cause: The controller battery charge is weak.
Action: As the charge weakens, the charger should automatically recharge the battery. If the battery has reached its recharge limit, replace the battery pack. Monitor the battery to make sure that it recharges successfully. If the battery does not recharge, replace the battery pack.
1153 None
2247 The controller battery
is charging.
Ok / Normal
Cause: This alert is provided for informational purposes.
1151 None
Action: None
2248 The controller battery
is executing a Learn cycle.
2249 The array disk "Clear"
operation has started.
Ok / Normal
Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None
Cause: This alert is provided for informational purposes.
1151 None
901 None
Action: None
2251 The array disk blink
has initiated.
Ok / Normal
Cause: This alert is provided for informational purposes.
901 None
Action: None
2252 The array disk blink
has ceased.
Ok / Normal
Cause: This alert is provided for informational purposes.
901 None
Action: None
2254 The "Clear" operation
has cancelled.
Ok / Normal
Cause: This alert is provided for informational purposes.
901 None
Action: None
2255 The array disk has
started.
Ok / Normal
Cause: This alert is provided for informational purposes.
901 None
Action: None
2259 An enclosure blink
operation has initiated.
Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None
851 None
2260 An enclosure blink
has ceased.
2261 A global rescan has
initiated.
2262 Smart thermal
shutdown is enabled.
2263 Smart thermal
shutdown is disabled.
2264 A device is missing. Warning /
2265 A device is in an
unknown state.
Ok / Normal
Ok / Normal
Ok / Normal
Ok / Normal
Non-critical
Warning / Non-critical
Cause: This alert is provided for informational purposes.
Action: None
Cause: This alert is provided for informational purposes.
Action: None
Cause: This alert is provided for informational purposes.
Action: None
Cause: This alert is provided for informational purposes.
Action: None
Cause: The controller cannot communicate with a device. The device may be removed. There may also be a bad or loose cable.
Action: Check that the device is installed and connected. If it is installed, check the cables.
Also check the connection to the controller battery and the battery health. A battery with a weak or depleted charge may cause this alert.
Cause: The controller cannot communicate with a device. The state of the device cannot be determined. There may be a bad or loose cable. The system may also be experiencing problems with the application programming interface (API). There could also be a problem with the driver or firmware.
Action: Check the cables.
Check if the controller has a supported version of the driver and firmware. You can download the most current version of the driver and firmware from support.dell.com. Rebooting the system may also resolve this problem.
851 None
101 None
101 None
101 None
753, 803, 853, 903, 953, 1003, 1053, 1103, 1153, 1203 None
753, 803, 853, 903, 953, 1003, 1053, 1103, 1153, 1203 None
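The SNMP trap numbers listed in this table appear to follow a pattern: the number falls into a block of 50 that corresponds to a type of storage object, and the remainder tracks the severity. The Python sketch below encodes that reading as a heuristic. Both lookup tables are assumptions inferred only from rows in this section (for example, 751 with controller alarm events, 1153 with battery warnings), not an official decoding, and the function name describe_trap is hypothetical. Some rows do not fit the pattern (event 2192 is listed as Ok / Normal with trap 1203), so the result should be treated as a hint rather than a definitive severity.

```python
# Assumed mapping from the 50-number block to an object type,
# inferred from rows in this section of the table.
OBJECT_BY_BLOCK = {
    100: "storage management (general)",
    750: "controller",
    850: "enclosure",
    900: "array disk",
    1050: "temperature probe",
    1150: "battery",
    1200: "virtual disk",
}

# Assumed mapping from the remainder to a severity, inferred the same way.
SEVERITY_BY_CODE = {
    1: "Ok / Normal",
    3: "Warning / Non-critical",
    4: "Critical / Failure / Error",
}

def describe_trap(trap_number):
    """Return an (object type, severity) guess for an SNMP trap number."""
    block_index, code = divmod(trap_number, 50)
    block = block_index * 50
    return (
        OBJECT_BY_BLOCK.get(block, "unknown"),
        SEVERITY_BY_CODE.get(code, "unknown"),
    )

for trap in (753, 1154, 901):
    print(trap, describe_trap(trap))
```

Unknown blocks and remainders return "unknown" instead of raising an error, because this section shows only part of the numbering scheme.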
2266 Controller log file
entry: %1
%1 is a substitution variable that will appear in the alert description for specific details about the alert.
2267 The controller
reconstruct rate has changed.
2268 %1, Storage
Management has lost communication with this RAID controller and attached storage. An immediate reboot is strongly recommended to avoid further problems. If the reboot does not restore communication, there may be a hardware failure.
Ok / Normal
Ok / Normal
Critical / Failure / Error
Cause: This alert is provided for informational purposes.
Action: None
Cause: This alert is provided for informational purposes.
Action: None
Cause: Storage Management has lost communication with a device. There may be faulty hardware or loose or defective cables.
Action: Reboot the system. If the problem is not resolved, check for hardware failures. Any failed component must be replaced. Make sure the cables are attached securely.
See the hardware documentation for more diagnostics information.
751 None
751 None
104 None
NOTE: %1 is a
substitution variable that will appear in the alert description for specific details about the alert.
2269 The array disk "Clear"
operation has completed.
Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None
901 None
Array Manager Event Number

Event ID: 2270
Description: The array disk "Clear" operation failed.
Severity: Critical / Failure / Error
Cause: A "Clear" operation was being performed on an array disk, but it was interrupted and did not complete successfully. The controller may have lost communication with the disk. The disk may have been removed or the cables may be loose or defective.
Action: Check that the disk is installed and is not in a failed state. Make sure the cables are attached securely. Restart the "Clear" operation.
SNMP Trap Numbers: 904
Array Manager Event Number: None

Event ID: 2271
Description: The "Patrol Read" corrected a media error.
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None
SNMP Trap Numbers: 901
Array Manager Event Number: None

Event ID: 2272
Description: "Patrol Read" found an uncorrectable media error.
Severity: Critical / Failure / Error
Cause: The "Patrol Read" task has encountered an error that cannot be corrected. There may be a bad disk block that cannot be remapped.
Action: Replace the array disk to avoid future data loss.
SNMP Trap Numbers: 903
Array Manager Event Number: None

Event ID: 2273
Description: Bad media.
Severity: Critical / Failure / Error
Cause: A source (array) disk in a redundant virtual disk has a bad disk block. The algorithm that maintains redundant data has created a similar bad block on the target redundant disk to maintain consistency in disk block addressing. Data has been lost.
Action: Restore from backup.
SNMP Trap Numbers: 904
Array Manager Event Number: None

Event ID: 2274
Description: The array disk rebuild has resumed.
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None
SNMP Trap Numbers: 901
Array Manager Event Number: None

Event ID: 2276
Description: The dedicated hot spare is too small.
Severity: Warning / Non-critical
Cause: The dedicated hot spare is not large enough to protect all virtual disks that reside on the disk group.
Action: Assign a larger disk as the dedicated hot spare.
SNMP Trap Numbers: 903
Array Manager Event Number: None

Event ID: 2277
Description: The global hot spare is too small.
Severity: Warning / Non-critical
Cause: The global hot spare is not large enough to protect all virtual disks that reside on the controller.
Action: Assign a larger disk as the global hot spare.
SNMP Trap Numbers: 903
Array Manager Event Number: None

Event ID: 2278
Description: The controller battery charge level is below a normal threshold.
Severity: Critical / Failure / Error
Cause: The battery is discharging. A battery discharge is a normal activity during the battery Learn cycle. Before completing, the battery Learn cycle recharges the battery. You should receive alert 2179 when the recharge occurs.
Action: Check if the battery Learn cycle is in progress. Alert 2176 indicates that the battery Learn cycle has initiated. The battery also displays the Learn state while the Learn cycle is in progress. If a Learn cycle is not in progress, replace the battery pack.
SNMP Trap Numbers: 1154
Array Manager Event Number: None

Event ID: 2279
Description: The controller battery charge level is above a normal threshold.
Severity: Ok / Normal
Cause: This alert is provided for informational purposes. This alert indicates that the battery is recharging during the battery Learn cycle.
Action: None
SNMP Trap Numbers: 1151
Array Manager Event Number: None

Event ID: 2280
Description: A disk media error has been corrected.
Severity: Ok / Normal
Cause: A disk media error was detected while the controller was completing a background task. A bad disk block was identified. The disk block has been remapped.
Action: Consider replacing the disk. If you receive this alert frequently, be sure to replace the disk. You should also routinely back up your data.
SNMP Trap Numbers: 1201
Array Manager Event Number: None

Event ID: 2281
Description: Virtual disk has inconsistent data.
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None
SNMP Trap Numbers: 1201
Array Manager Event Number: None

Event ID: 2282
Description: Hot spare SMART polling failed.
Severity: Critical / Failure / Error
Cause: The controller firmware attempted to do SMART polling on the hot spare but was unable to complete it. The controller has lost communication with the hot spare.
Action: Check the health of the disk assigned as a hot spare. You may need to replace the disk and reassign the hot spare. Make sure the cables are attached securely.
SNMP Trap Numbers: 904
Array Manager Event Number: None

Event ID: 2283
Description: A redundant path is broken.
Severity: Warning / Non-critical
Cause: The controller has two connectors that are connected to the same enclosure. The communication path on one connector has lost connection with the enclosure. The communication path on the other connector is reporting this loss.
Action: Make sure the cables are attached securely. Make sure both EMMs are healthy.
SNMP Trap Numbers: 903
Array Manager Event Number: None

Event ID: 2284
Description: A redundant path has been restored.
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None
SNMP Trap Numbers: 901
Array Manager Event Number: None

Event ID: 2285
Description: A disk media error was corrected during recovery.
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None
SNMP Trap Numbers: 901
Array Manager Event Number: None

Event ID: 2286
Description: A Learn cycle start is pending while the battery charges.
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None
SNMP Trap Numbers: 1151
Array Manager Event Number: None

Event ID: 2287
Description: The "Patrol Read" is paused.
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None
SNMP Trap Numbers: 751
Array Manager Event Number: None

Event ID: 2288
Description: The "Patrol Read" has resumed.
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None
SNMP Trap Numbers: 751
Array Manager Event Number: None

Event ID: 2289
Description: Multi-bit ECC error.
Severity: Critical / Failure / Error
Cause: An error involving multiple bits has been encountered during a read or write operation. The error correction algorithm recalculates parity data during read and write operations. If an error involves only a single bit, it may be possible for the error correction algorithm to correct the error and maintain parity data. An error involving multiple bits, however, usually indicates data loss. In some cases, if the multi-bit error occurs during a read operation, the data on the disk may be intact. If the multi-bit error occurs during a write operation, data loss has occurred.
Action: Replace the dual in-line memory module (DIMM). The DIMM is a part of the controller battery pack. See your hardware documentation for information on replacing the DIMM. You may need to restore data from backup.
SNMP Trap Numbers: 754
Array Manager Event Number: None

Event ID: 2290
Description: Single-bit ECC error.
Severity: Warning / Non-critical
Cause: An error involving a single bit has been encountered during a read or write operation. The error correction algorithm has corrected this error.
Action: None
SNMP Trap Numbers: 753
Array Manager Event Number: None

Event ID: 2291
Description: An EMM has been discovered.
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None
SNMP Trap Numbers: 851
Array Manager Event Number: None

Event ID: 2292
Description: Communication with the enclosure has been lost.
Severity: Critical / Failure / Error
Cause: The controller has lost communication with an EMM. The cables may be loose or defective.
Action: Make sure the cables are attached securely. Reboot the system.
SNMP Trap Numbers: 854
Array Manager Event Number: None

Event ID: 2293
Description: The EMM has failed.
Severity: Critical / Failure / Error
Cause: The failure may be caused by a loss of power to the EMM. The EMM self test may also have identified a failure. There could also be a firmware problem or a multi-bit error.
Action: Replace the EMM. See the hardware documentation for information on replacing the EMM.
SNMP Trap Numbers: 854
Array Manager Event Number: None

Event ID: 2294
Description: A device has been inserted.
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None
SNMP Trap Numbers: 752, 802, 852, 902, 952, 1002, 1052, 1102, 1152, 1202
Array Manager Event Number: None

Event ID: 2295
Description: A device has been removed.
Severity: Critical / Failure / Error
Cause: A device has been removed and the system is no longer functioning in optimal condition.
Action: Replace the device.
SNMP Trap Numbers: 754, 804, 854, 904, 954, 1004, 1054, 1104, 1154, 1204
Array Manager Event Number: None

Event ID: 2296
Description: An EMM has been inserted.
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None
SNMP Trap Numbers: 851
Array Manager Event Number: None

Event ID: 2297
Description: An EMM has been removed.
Severity: Critical / Failure / Error
Cause: An EMM has been removed.
Action: Replace the EMM. See the hardware documentation for information on replacing the EMM.
SNMP Trap Numbers: 854
Array Manager Event Number: None

Event ID: 2298
Description: There is a bad sensor on an enclosure.
Severity: Warning / Non-critical
Cause: The enclosure has a bad sensor. The enclosure sensors monitor the fan speeds, temperature probes, etc.
Action: See the hardware documentation for more information.
SNMP Trap Numbers: 853
Array Manager Event Number: None
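
The SNMP trap numbers in this table appear to follow a pattern: the last digit tracks the severity, and the rest of the number tracks the kind of storage object. The following Python sketch decodes a trap number on that basis. The function name decode_trap and both lookup tables are assumptions inferred only from the rows in this section, not a documented Dell interface, and a few rows in the table may not follow the pattern.

```python
# Object families inferred from rows in this section (assumption, not an official list).
OBJECT_BY_BASE = {
    100: "Storage Management",
    750: "controller",
    850: "enclosure",
    900: "array disk",
    1000: "power supply",
    1150: "battery",
    1200: "virtual disk",
}

# Severity inferred from the last digit of the trap number (assumption).
SEVERITY_BY_DIGIT = {
    1: "Ok / Normal",
    2: "Ok / Normal",
    3: "Warning / Non-critical",
    4: "Critical / Failure / Error",
}


def decode_trap(trap):
    """Return (object family, severity) guessed from an SNMP trap number."""
    digit = trap % 50          # offset within a block of 50, e.g. 754 -> 4
    base = trap - digit        # start of the block, e.g. 754 -> 750
    family = OBJECT_BY_BASE.get(base, "unknown object")
    severity = SEVERITY_BY_DIGIT.get(digit, "unknown severity")
    return family, severity


if __name__ == "__main__":
    for number in (751, 854, 1003, 104):
        print(number, decode_trap(number))
```

Trap blocks that do not appear with an identifiable object in this section (for example 80x or 95x) are deliberately reported as unknown rather than guessed.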

Event ID: 2299
Description: Bad PHY %1
NOTE: %1 is a substitution variable. It is replaced in the alert description with specific details about the alert.
Severity: Critical / Failure / Error
Cause: There is a problem with a physical connection or PHY.
Action: Replace the EMM that contains the bad PHY. See the hardware documentation for information on replacing the EMM. Attach the storage to a different connector, if available. Make sure the cables are attached securely.
SNMP Trap Numbers: 854
Array Manager Event Number: None

Event ID: 2300
Description: The enclosure is unstable.
Severity: Critical / Failure / Error
Cause: The controller is not receiving a consistent response from the enclosure. There could be a firmware problem or an invalid cabling configuration. If the cables are too long, they will degrade the signal.
Action: Power down all enclosures attached to the system and reboot the system. If the problem persists, upgrade the firmware to the latest supported version. You can download the most current version of the driver and firmware from support.dell.com. Make sure the cable configuration is valid. See the hardware documentation for valid cabling configurations.
SNMP Trap Numbers: 854
Array Manager Event Number: None

Event ID: 2301
Description: The enclosure has a hardware error.
Severity: Critical / Failure / Error
Cause: The enclosure or an enclosure component is in a Failed or Degraded state.
Action: Check the health of the enclosure and its components. Replace any hardware that is in a Failed state. See the hardware documentation for more information.
SNMP Trap Numbers: 854
Array Manager Event Number: None

Event ID: 2302
Description: The enclosure is not responding.
Severity: Critical / Failure / Error
Cause: The enclosure or an enclosure component is in a Failed or Degraded state.
Action: Check the health of the enclosure and its components. Replace any hardware that is in a Failed state. See the hardware documentation for more information.
SNMP Trap Numbers: 854
Array Manager Event Number: None

Event ID: 2303
Description: The enclosure cannot support both SAS and SATA array disks. Array disks may be disabled.
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None
SNMP Trap Numbers: 851
Array Manager Event Number: None

Event ID: 2304
Description: An attempt to hot plug an EMM has been detected. This type of hot plug is not supported.
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None
SNMP Trap Numbers: 751
Array Manager Event Number: None

Event ID: 2305
Description: The array disk is too small to be used for a rebuild.
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None
SNMP Trap Numbers: 901
Array Manager Event Number: None

Event ID: 2306
Description: Bad block table is 80% full.
Severity: Warning / Non-critical
Cause: The bad block table is used for remapping bad disk blocks. This table fills as bad disk blocks are remapped. When the table is full, bad disk blocks can no longer be remapped, and disk errors can no longer be corrected. At this point, data loss can occur. The bad block table is now 80% full.
Action: Back up your data. Replace the disk generating this alert and restore from backup.
SNMP Trap Numbers: 903
Array Manager Event Number: None

Event ID: 2307
Description: Bad block table is full. Unable to log block %1
NOTE: %1 is a substitution variable. It is replaced in the alert description with specific details about the alert.
Severity: Critical / Failure / Error
Cause: The bad block table is used for remapping bad disk blocks. This table fills as bad disk blocks are remapped. When the table is full, bad disk blocks can no longer be remapped and disk errors can no longer be corrected. At this point, data loss can occur.
Action: Replace the disk generating this alert and restore from backup. You may have lost data.
SNMP Trap Numbers: 904
Array Manager Event Number: None
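
The relationship between alerts 2306 and 2307 can be pictured as two thresholds on the fill level of the bad block table. The Python sketch below is illustrative only: the function name and the idea of passing entry counts are assumptions, and the guide does not describe how the controller firmware actually tracks or reports the table.

```python
def bad_block_table_alert(used_entries, capacity):
    """Return the alert ID suggested by the table fill level, or None.

    Assumption: the 80% warning (2306) and the full condition (2307)
    are simple thresholds on used_entries / capacity.
    """
    if capacity <= 0:
        raise ValueError("capacity must be positive")
    if used_entries >= capacity:
        return 2307  # table full: bad blocks can no longer be remapped
    # Integer comparison avoids floating-point rounding at exactly 80%.
    if used_entries * 100 >= capacity * 80:
        return 2306  # table 80% full: back up data and plan a disk replacement
    return None


if __name__ == "__main__":
    for used in (10, 80, 100):
        print(used, bad_block_table_alert(used, 100))
```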

Event ID: 2309
Description: An array disk is incompatible.
Severity: Warning / Non-critical
Cause: You have attempted to replace a disk with another disk that is using an incompatible technology. For example, you may have replaced one side of a mirror with a SAS disk when the other side of the mirror is using SATA technology.
Action: See the hardware documentation for information on replacing disks.
SNMP Trap Numbers: 903
Array Manager Event Number: None

Event ID: 2310
Description: A virtual disk is permanently degraded.
Severity: Critical / Failure / Error
Cause: A redundant virtual disk has lost redundancy. This may occur when the virtual disk suffers the failure of multiple array disks. In this case, both the source array disk and the target disk with redundant data have failed. A rebuild is not possible because there is no longer redundancy.
Action: Replace the failed disks and restore from backup.
SNMP Trap Numbers: 1204
Array Manager Event Number: None

Event ID: 2311
Description: The firmware on the EMMs is not the same version. EMM0 %1 EMM1 %2
NOTE: %1 and %2 are substitution variables. They are replaced in the alert description with specific details about the alert.
Severity: Warning / Non-critical
Cause: The firmware on the EMMs is not the same version. Both modules must have the same firmware version. This alert may be generated if you attempt to insert an EMM that has a different firmware version than an existing module.
Action: Upgrade to the same version of the firmware on both EMMs.
SNMP Trap Numbers: 853
Array Manager Event Number: None

Event ID: 2312
Description: A power supply in the enclosure has an AC failure.
Severity: Warning / Non-critical
Cause: The power supply has an AC failure.
Action: Replace the power supply.
SNMP Trap Numbers: 1003
Array Manager Event Number: None

Event ID: 2313
Description: A power supply in the enclosure has a DC failure.
Severity: Warning / Non-critical
Cause: The power supply has a DC failure.
Action: Replace the power supply.
SNMP Trap Numbers: 1003
Array Manager Event Number: None
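
Several descriptions in this table contain substitution variables such as %1 and %2. The Python sketch below shows one way such placeholders could be filled in when an alert description is displayed. The function render_alert and the sample firmware versions are hypothetical; the guide does not document how Storage Management performs the substitution.

```python
import re


def render_alert(template, values):
    """Replace %1, %2, ... in template with the matching entry of values.

    Placeholders with no matching value are left unchanged.
    """
    def substitute(match):
        index = int(match.group(1))
        if 1 <= index <= len(values):
            return str(values[index - 1])
        return match.group(0)

    return re.sub(r"%(\d+)", substitute, template)


if __name__ == "__main__":
    description = ("The firmware on the EMMs is not the same version. "
                   "EMM0 %1 EMM1 %2")
    # The version strings below are made-up sample values.
    print(render_alert(description, ["1.01", "1.02"]))
```

A percent sign that is not followed by a digit, as in "Bad block table is 80% full.", is not treated as a placeholder.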

Event ID: 2314
Description: The initialization sequence of SAS components failed during system startup. SAS management and monitoring is not possible.
Severity: Critical / Failure / Error
Cause: Storage Management is unable to monitor or manage SAS devices.
Action: Reboot the system. If the problem persists, make sure you have supported versions of the drivers and firmware. You may also need to reinstall Storage Management or Server Administrator if installation components are missing.
SNMP Trap Numbers: 104
Array Manager Event Number: None

Event ID: 2315
Description: Diagnostic message %1
NOTE: %1 is a substitution variable. It is replaced in the alert description with specific details about the alert.
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None
SNMP Trap Numbers: 751
Array Manager Event Number: None

Event ID: 2316
Description: Diagnostic message %1
NOTE: %1 is a substitution variable. It is replaced in the alert description with specific details about the alert.
Severity: Critical / Failure / Error
Cause: A diagnostics test failed. The text for this alert is generated by the utility that ran the diagnostics.
Action: See the documentation for the utility that ran the diagnostics for more information.
SNMP Trap Numbers: 754
Array Manager Event Number: None

Event ID: 2317
Description: BGI terminated due to loss of ownership in a cluster configuration.
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None
SNMP Trap Numbers: 1201
Array Manager Event Number: None

Event ID: 2318
Description: Problems with the battery or the battery charger have been detected. The battery health is poor.
Severity: Critical / Failure / Error
Cause: The battery or the battery charger is not functioning properly.
Action: Replace the battery pack.
SNMP Trap Numbers: 1154
Array Manager Event Number: None

Event ID: 2319
Description: Single-bit ECC error. The DIMM is degrading.
Severity: Warning / Non-critical
Cause: The DIMM is beginning to malfunction.
Action: Replace the DIMM to avoid data loss or data corruption. The DIMM is a part of the controller battery pack. See your hardware documentation for information on replacing the DIMM.
SNMP Trap Numbers: 753
Array Manager Event Number: None

Event ID: 2320
Description: Single-bit ECC error. The DIMM is critically degraded.
Severity: Critical / Failure / Error
Cause: The DIMM is malfunctioning. Data loss or data corruption may be imminent.
Action: Replace the DIMM immediately to avoid data loss or data corruption. The DIMM is a part of the controller battery pack. See your hardware documentation for information on replacing the DIMM.
SNMP Trap Numbers: 754
Array Manager Event Number: None

Event ID: 2321
Description: Single-bit ECC error. The DIMM is critically degraded. There will be no further reporting.
Severity: Critical / Failure / Error
Cause: The DIMM is malfunctioning. Data loss or data corruption is imminent. The DIMM must be replaced immediately. No further alerts will be generated.
Action: Replace the DIMM immediately. The DIMM is a part of the controller battery pack. See your hardware documentation for information on replacing the DIMM.
SNMP Trap Numbers: 754
Array Manager Event Number: None

Event ID: 2322
Description: The DC power supply is switched off.
Severity: Critical / Failure / Error
Cause: The power supply unit is switched off. Either a user switched off the power supply unit or it is defective.
Action: Check if the power switch is turned off. If it is turned off, turn it on. If the problem persists, check if the power cord is attached and functional. If the problem is still not corrected or if the power switch is already turned on, replace the power supply unit.
SNMP Trap Numbers: 1004
Array Manager Event Number: None

Event ID: 2323
Description: The power supply is switched on.
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None
SNMP Trap Numbers: 1001
Array Manager Event Number: None

Event ID: 2324
Description: The AC power supply cable has been removed.
Severity: Critical / Failure / Error
Cause: The power cable may be pulled out or removed. The power cable may also have overheated and become warped and nonfunctional.
Action: Replace the power cable.
SNMP Trap Numbers: 1004
Array Manager Event Number: None

Event ID: 2325
Description: The power supply cable has been inserted.
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None
SNMP Trap Numbers: 1001
Array Manager Event Number: None

Event ID: 2326
Description: A foreign configuration has been detected.
Severity: Ok / Normal
Cause: This alert is provided for informational purposes. The controller has array disks that were moved from another controller. These array disks contain virtual disks that were created on the other controller. See Import Foreign Configuration and Clear Foreign Configuration for more information.
Action: None
SNMP Trap Numbers: 751
Array Manager Event Number: None

Event ID: 2327
Description: The NVRAM has corrupted data. The controller is reinitializing the NVRAM.
Severity: Warning / Non-critical
Cause: The NVRAM has corrupted data. This may occur after a power surge, a battery failure, or for other reasons. The controller is reinitializing the NVRAM.
Action: None. The controller is taking the required corrective action. If this alert is generated often (such as during each reboot), replace the controller.
SNMP Trap Numbers: 753
Array Manager Event Number: None

Event ID: 2328
Description: The NVRAM has corrupt data.
Severity: Warning / Non-critical
Cause: The NVRAM has corrupt data. The controller is unable to correct the situation.
Action: Replace the controller.
SNMP Trap Numbers: 753
Array Manager Event Number: None

Event ID: 2329
Description: SAS port report: %1
NOTE: %1 is a substitution variable. It is replaced in the alert description with specific details about the alert.
Severity: Warning / Non-critical
Cause: The text for this alert is generated by the controller and can vary depending on the situation.
Action: Make sure the cables are attached securely. If the problem persists, replace the cable with a valid cable according to SAS specifications. If the problem still persists, you may need to replace some devices such as the controller or EMM. See the hardware documentation for more information.
SNMP Trap Numbers: 753
Array Manager Event Number: None

Event ID: 2330
Description: SAS port report: %1
NOTE: %1 is a substitution variable. It is replaced in the alert description with specific details about the alert.
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None
SNMP Trap Numbers: 751
Array Manager Event Number: None

Event ID: 2331
Description: A bad disk block has been reassigned.
Severity: Warning / Non-critical
Cause: The disk has a bad block. Data has been readdressed to another disk block and no data loss has occurred.
Action: Monitor the disk for other alerts or indications of poor health. For example, you may receive alert 2306. Replace the disk if you suspect there is a problem.
SNMP Trap Numbers: 903
Array Manager Event Number: None

Event ID: 2332
Description: A controller hot plug has been detected.
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None
SNMP Trap Numbers: 751
Array Manager Event Number: None

Event ID: 2333
Description: An enclosure temperature sensor differential has been detected.
Severity: Warning / Non-critical
Cause: The firmware has detected a temperature sensor differential in the enclosure.
Action: Monitor the enclosure for other alerts related to the temperature. For example, you may receive alerts related to the fan or temperature probes. Check the health of the enclosure and its components. Replace any component that has failed.
SNMP Trap Numbers: 853
Array Manager Event Number: None

Event ID: 2334
Description: Controller event log: %1
NOTE: %1 is a substitution variable. It is replaced in the alert description with specific details about the alert.
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None
SNMP Trap Numbers: 751
Array Manager Event Number: None

Event ID: 2335
Description: Controller event log: %1
NOTE: %1 is a substitution variable. It is replaced in the alert description with specific details about the alert.
Severity: Warning / Non-critical
Cause: The text for this alert is generated by the controller and can vary depending on the situation. This text is from events in the controller event log that were generated while Storage Management was not running.
Action: If there is a problem, review the controller event log and the Server Administrator Alert Log for significant events or alerts that may assist in diagnosing the problem. Check the health of the storage components. See the hardware documentation for more information.
SNMP Trap Numbers: 753
Array Manager Event Number: None

Event ID: 2336
Description: Controller event log: %1
NOTE: %1 is a substitution variable. It is replaced in the alert description with specific details about the alert.
Severity: Critical / Failure / Error
Cause: The text for this alert is generated by the controller and can vary depending on the situation. This text is from events in the controller event log that were generated while Storage Management was not running.
Action: See the hardware documentation for more information.
SNMP Trap Numbers: 754
Array Manager Event Number: None

Event ID: 2337
Description: The controller is unable to recover cached data from the battery backup unit (BBU).
Severity: Critical / Failure / Error
Cause: The controller was unable to recover data from the cache.
Action: Check if the battery is charged and in good health. When the battery charge is unacceptably low, it cannot maintain cached data. Check if the battery has reached its recharge limit. The battery may need to be recharged or replaced.
SNMP Trap Numbers: 1154
Array Manager Event Number: None

Event ID: 2338
Description: The controller has recovered cached data from the BBU.
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None
SNMP Trap Numbers: 1151
Array Manager Event Number: None

Event ID: 2339
Description: The factory default settings have been restored.
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None
SNMP Trap Numbers: 751
Array Manager Event Number: None

Event ID: 2340
Description: The BGI completed with uncorrectable errors.
Severity: Critical / Failure / Error
Cause: The BGI task encountered errors that cannot be corrected. The virtual disk contains array disks that have unusable disk space or disk errors that cannot be corrected.
Action: Replace the array disk that contains the disk errors. Review other alert messages to identify the array disk that has errors. If the virtual disk is redundant, you can replace the array disk and continue using the virtual disk. If the virtual disk is non-redundant, you may need to recreate the virtual disk after replacing the array disk. After replacing the array disk, run a "Check Consistency" task to check the data.
SNMP Trap Numbers: 1204
Array Manager Event Number: None

Event ID: 2341
Description: The "Check Consistency" operation made corrections and completed.
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None
SNMP Trap Numbers: 1201
Array Manager Event Number: None

Event ID: 2342
Description: The "Check Consistency" task found inconsistent parity data. Data redundancy may be lost.
Severity: Warning / Non-critical
Cause: The data on a source disk and the redundant data on a target disk are inconsistent.
Action: Restart the "Check Consistency" task. If you receive this alert again, check the health of the array disks included in the virtual disk. Review the alert messages for significant alerts related to the array disks. If you suspect that an array disk has a problem, replace it and restore from backup.
SNMP Trap Numbers: 1203
Array Manager Event Number: None

Event ID: 2343
Description: The "Check Consistency" logging of inconsistent parity data is disabled.
Severity: Warning / Non-critical
Cause: The "Check Consistency" operation can no longer report errors in the parity data.
Action: See the hardware documentation for more information.
SNMP Trap Numbers: 1203
Array Manager Event Number: None

Event ID: 2344
Description: The virtual disk initialization terminated.
Severity: Warning / Non-critical
Cause: A user has cancelled the virtual disk initialization.
Action: Restart the initialization.
SNMP Trap Numbers: 1203
Array Manager Event Number: None

Event ID: 2345
Description: The virtual disk initialization failed.
Severity: Critical / Failure / Error
Cause: The controller cannot communicate with the attached devices. A disk may be removed or contain errors. The cables may also be loose or defective.
Action: Check the health of attached devices. Review the Alert Log for significant events and make sure the cables are attached securely.
SNMP Trap Numbers: 1204
Array Manager Event Number: None

Event ID: 2346
Description: Error occurred: %1
NOTE: %1 is a substitution variable. It is replaced in the alert description with specific details about the alert.
Severity: Warning / Non-critical
Cause: The text for this alert is generated by the firmware and can vary depending on the situation.
Action: Check the health of attached devices. Review the Alert Log for significant events. You may need to replace faulty hardware. Make sure the cables are attached securely. See the hardware documentation for more information.
SNMP Trap Numbers: 903
Array Manager Event Number: None

Event ID: 2347
Description: The rebuild failed due to errors on the source physical disk.
Severity: Critical / Failure / Error
Cause: You are attempting to rebuild data that resides on a defective disk.
Action: Replace the source disk and restore from backup.
SNMP Trap Numbers: 904
Array Manager Event Number: None

Event ID: 2348
Description: The rebuild failed due to errors on the target physical disk.
Severity: Critical / Failure / Error
Cause: You are attempting to rebuild data on a disk that is defective.
Action: Replace the target disk. If a rebuild does not automatically start after replacing the disk, initiate the "Rebuild" task. You may need to assign the new disk as a hot spare to initiate the rebuild.
SNMP Trap Numbers: 904
Array Manager Event Number: None

Event ID: 2349
Description: A bad disk block could not be reassigned during a write operation.
Severity: Critical / Failure / Error
Cause: A write operation could not complete because the disk contains bad disk blocks that could not be reassigned. Data loss may have occurred and data redundancy may also be lost.
Action: Replace the disk.
SNMP Trap Numbers: 904
Array Manager Event Number: None

Event ID: 2350
Description: There was an unrecoverable disk media error during the rebuild.
Severity: Critical / Failure / Error
Cause: The rebuild encountered an unrecoverable disk media error.
Action: Replace the disk.
SNMP Trap Numbers: 904
Array Manager Event Number: None

Event ID: 2351
Description: A physical disk is marked as missing.
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None.
SNMP Trap Numbers: 901
Array Manager Event Number: None

Event ID: 2352
Description: A physical disk that was marked as missing has been replaced.
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None.
SNMP Trap Numbers: 901
Array Manager Event Number: None

Event ID: 2353
Description: The enclosure temperature has returned to normal.
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None.
SNMP Trap Numbers: 851
Array Manager Event Number: None

Event ID: 2354
Description: Enclosure firmware download in progress.
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None.
SNMP Trap Numbers: 851
Array Manager Event Number: None

Event ID: 2355
Description: Enclosure firmware download failed.
Severity: Warning / Non-critical
Cause: The system was unable to download firmware to the enclosure. The controller may have lost communication with the enclosure. There may have been problems with the data transfer or the download media may be corrupt.
Action: Attempt to download the enclosure firmware again. If problems continue, check if the controller can communicate with the enclosure. Make sure that the enclosure is powered on. Check the cables and the health of the enclosure and its components. To check the health of the enclosure, select the enclosure object in the tree view. The Health subtab displays a red X or yellow exclamation point for enclosure components that are failed or degraded.
SNMP Trap Numbers: 853
Array Manager Event Number: None

Event ID: 2356
Description: SAS SMP communications error %1.
NOTE: %1 is a substitution variable. It is replaced in the alert description with specific details about the alert.
Severity: Critical / Failure / Error
Cause: The text for this alert is generated by the firmware and can vary depending on the situation. SMP in this text refers to the Serial Management Protocol used by SAS.
Action: There may be a SAS topology error. See the hardware documentation for information on correct SAS topology configurations. There may be problems with the cables such as a loose connection or an invalid cabling configuration. See the hardware documentation for information on correct cabling configurations. Check if the firmware is a supported version.
SNMP Trap Numbers: 754
Array Manager Event Number: None

Event ID: 2357
Description: SAS expander error: %1
NOTE: %1 is a substitution variable. It is replaced in the alert description with specific details about the alert.
Severity: Critical / Failure / Error
Cause: The text for this alert is generated by the firmware and can vary depending on the situation.
Action: There may be a problem with the enclosure. Check the health of the enclosure and its components by selecting the enclosure object in the tree view. The Health subtab displays a red X or yellow exclamation point for enclosure components that are failed or degraded. See the enclosure documentation for more information.
SNMP Trap Numbers: 754
Array Manager Event Number: None

Event ID: 2358
Description: The battery charge cycle is complete.
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None.
SNMP Trap Numbers: 1151
Array Manager Event Number: None

Event ID: 2359
Description: The physical disk is not certified.
Severity: Warning / Non-critical
Cause: The physical disk does not comply with the standards set by Dell and is not supported.
Action: Replace the physical disk with a physical disk that is supported.
SNMP Trap Numbers: 903
Array Manager Event Number: None

Event ID: 2360
Description: A user has discarded data from the controller cache.
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None.
SNMP Trap Numbers: 751
Array Manager Event Number: None

Event ID: 2361
Description: Array disk(s) that are part of a virtual disk have been removed while the system was shut down. This removal was discovered during system start-up.
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None.
SNMP Trap Numbers: 751
Array Manager Event Number: None

Event ID: 2362
Description: Array disk(s) have been removed from a virtual disk. The virtual disk will be in Failed state during the next system reboot.
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None.
SNMP Trap Numbers: 751
Array Manager Event Number: None
Table 4-1. Storage Management Messages (continued)

Event ID: 2363
Description: A virtual disk and all of its member array disks have been removed while the system was shut down. This removal was discovered during system start-up.
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None.
SNMP Trap Numbers: 751
Array Manager Event Number: None

Event ID: 2364
Description: All virtual disks are missing from the controller. This situation was discovered during system start-up.
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None.
SNMP Trap Numbers: 751
Array Manager Event Number: None

Event ID: 2365
Description: The speed of the enclosure fan has changed.
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None.
SNMP Trap Numbers: 851
Array Manager Event Number: None

Event ID: 2366
Description: Dedicated spare imported as global due to missing arrays
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None.
SNMP Trap Numbers: 901
Array Manager Event Number: None

Event ID: 2367
Description: Rebuild not possible as SAS/SATA is not supported in the same virtual disk.
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None.
SNMP Trap Numbers: 901
Array Manager Event Number: None

Event ID: 2368
Description: The SEP has been rebooted as part of the firmware download operation and will be unavailable until the operation completes.
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None.
SNMP Trap Numbers: 851
Array Manager Event Number: None
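
The entries in Table 4-1 can be loaded into a small lookup structure when alerts are processed by a script. The following Python sketch is illustrative only and is not part of Server Administrator. The names StorageEvent, EVENTS, render, and needs_action are hypothetical, the table holds only a subset of the events listed above, and the detail text passed for %1 is an invented example because the actual text is generated by the firmware and can vary.

```python
from typing import NamedTuple


class StorageEvent(NamedTuple):
    """One row of Table 4-1 (hypothetical container, not a Dell API)."""
    description: str  # may contain the %1 substitution variable
    severity: str     # severity string as printed in the table
    snmp_trap: int    # SNMP trap number as printed in the table


# Subset of Table 4-1, transcribed from the entries above.
EVENTS = {
    2357: StorageEvent("SAS expander error: %1",
                       "Critical / Failure / Error", 754),
    2358: StorageEvent("The battery charge cycle is complete.",
                       "Ok / Normal", 1151),
    2359: StorageEvent("The physical disk is not certified.",
                       "Warning / Non-critical", 903),
    2365: StorageEvent("The speed of the enclosure fan has changed.",
                       "Ok / Normal", 851),
}


def render(event_id: int, detail: str = "") -> str:
    """Return the description with %1 replaced by firmware-supplied detail."""
    return EVENTS[event_id].description.replace("%1", detail)


def needs_action(event_id: int) -> bool:
    """Treat every severity other than Ok / Normal as requiring attention."""
    return not EVENTS[event_id].severity.startswith("Ok")


if __name__ == "__main__":
    # "link down" is an invented placeholder for the firmware text.
    print(render(2357, "link down"))
    print(needs_action(2359))
```

Keying the table by Event ID mirrors how the guide is organized, so a script can move from a logged ID to the severity and SNMP trap number without parsing the description text.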

Index

Numerics
0000, 15
0001, 15
1000, 15
1001, 15
1002, 15
1003, 15
1004, 15
1005, 16
1006, 16
1007, 16
1008, 16
1009, 16
1050, 17
1051, 17
1052, 17
1053, 18
1054, 18
1055, 18
1100, 19
1101, 19
1102, 19
1103, 19
1104, 19
1105, 20
1150, 20
1151, 20
1152, 21
1153, 21
1154, 21
1155, 22
1200, 22
1201, 23
1202, 23
1203, 23
1204, 24
1205, 24
1250, 25
1251, 25
1252, 25
1253, 25
1254, 26
1255, 26
1300, 26
1301, 27
1302, 27
1303, 27
1304, 27
1305, 28
1306, 28
1350, 28
1351, 29
1352, 29
1353, 29
1354, 30
1355, 30
1403, 31
1404, 31
1450, 31
1451, 31
1452, 31
1453, 32
1454, 32
1455, 32
1500, 32
1501, 32
1502, 32
1503, 33
1504, 33
1505, 33
1550, 33
1551, 33
1552, 34
1553, 34
1554, 34
1555, 34
1600, 34
1601, 34
1602, 35
1603, 35
1604, 35
1605, 35
2048, 46
2049, 46
2050, 47
2051, 47
2052, 47
2053, 47
2054, 47
2055, 47
2056, 48
2057, 48
2058, 48
2059, 49
2061, 49
2063, 49
2064, 49
2065, 49
2067, 49
2070, 50
2074, 50
2076, 50
2077, 50
2079, 50
2080, 51
2081, 51
2082, 51
2083, 51
2085, 51
2086, 51
2088, 52
2089, 52
2090, 52
2091, 52
2092, 52
2094, 52
2095, 53
2098, 53
2099, 53
2100, 54
2101, 54
2102, 54
2103, 54
2104, 55
2105, 55
2106, 55
2107, 55
2108, 55
2109, 56
2110, 56
2111, 57
2112, 57
2114, 57
2115, 57
2116, 58
2117, 58
2118, 58
2120, 58
2121, 59
2122, 59
2123, 60
2124, 60
2126, 61
2127, 61
2128, 61
2129, 61
2130, 61
2131, 61
2132, 62
2135, 62
2136, 62
2137, 63
2138, 63
2139, 63
2140, 63
2141, 64
2142, 64
2143, 64
2144, 64
2145, 64
2146, 64
2147, 64
2148, 64
2149, 64
2150, 65
2151, 65
2152, 65
2153, 65
2154, 65
2155, 65
2156, 65
2157, 66
2158, 66
2159, 66
2160, 66
2161, 66
2162, 66
2163, 67
2164, 67
2165, 67
2166, 67
2167, 68
2168, 68
2169, 68
2170, 68
2171, 69
2174, 69
2175, 69
2176, 69
2177, 69
2178, 70
2179, 70
2180, 70
2181, 70
2182, 71
2186, 71
2187, 71
2188, 71
2189, 71
2191, 72
2192, 72
2193, 72
2194, 72
2199, 72
2201, 73
2202, 73
2203, 73
2204, 73
2205, 74
2206, 74
2207, 74
2211, 74
2232, 74
2233, 75
2234, 75
2235, 75
2237, 75
2238, 75
2239, 75
2240, 75
2241, 75
2242, 75
2243, 75
2244, 76
2245, 76
2246, 76
2247, 76
2248, 76
2249, 76
2251, 76
2252, 76
2254, 76
2255, 77
2259, 77
2260, 77
2261, 77
2262, 77
2263, 77
2264, 77
2265, 78
2266, 78
2267, 78
2268, 79
2269, 79
2270, 79
2271, 79
2272, 80
2273, 80
2274, 80
2276, 80
2277, 80
2278, 81
2279, 81
2280, 81
2281, 81
2282, 81
2283, 82
2284, 82
2285, 82
2286, 82
2287, 82
2288, 82
2289, 83
2290, 83
2291, 83
2292, 83
2293, 84
2294, 84
2295, 84
2296, 84
2297, 84
2298, 84
2299, 85
2300, 85
2301, 85
2302, 85
2303, 85-86
2304, 86
2305, 86
2306, 86
2307, 86
2309, 87
2310, 87
2311, 87
2312, 87
2313, 87
2314, 88
2315, 88
2316, 88
2317, 88
2318, 88
2319, 89
2320, 89
2321, 89
2322, 89
2323, 89
2324, 90
2325, 90
2326, 90
2327, 90
2328, 90
2329, 91
2330, 91
2331, 91
2332, 91
2333, 91
2334, 92
2335, 92
2336, 92
2337, 92
2338, 93
2339, 93
2340, 93
2341, 93
2342, 93
2343, 94
2344, 94
2345, 94
2346, 94
2347, 94
2348, 94
2349, 95
2350, 95
2351, 95
2352, 95
2353, 95
2354, 95
2355, 96
2356, 96
2357, 97
2358, 97
2359, 97
2360, 97
2361, 97
2362, 97
2363, 98
2364, 98
2365, 98
2366, 98
2367, 98
2368, 98