Reproduction in any manner whatsoever without the written permission of Dell Inc. is strictly forbidden.
Trademarks used in this text: The DELL logo and Dell OpenManage are trademarks of Dell Inc.; Microsoft and Windows are registered
trademarks of Microsoft Corporation; Novell and NetWare are registered trademarks of Novell, Inc.; Red Hat is a registered trademark of
Red Hat, Inc.
Other trademarks and trade names may be used in this document to refer to either the entities claiming the marks and names or their products.
Dell Inc. disclaims any proprietary interest in trademarks and trade names other than its own.
Dell OpenManage™ Server Administrator produces event messages stored primarily in the operating
system or Server Administrator event logs and sometimes in SNMP traps. This document describes
the event messages created by Server Administrator version 2.0 or later and displayed in the Server
Administrator Alert log.
Server Administrator creates events in response to sensor status changes and other monitored
parameters. The Server Administrator event monitor uses these status change events to add
descriptive messages to the operating system event log or the Server Administrator Alert log.
Each event message that Server Administrator adds to the alert log consists of a unique identifier
called the event ID for a specific event source category and a descriptive message. The event
message includes the severity, cause of the event, and other relevant information, such as the event
location and the monitored item’s previous state.
Tables provided in this guide list all Server Administrator event IDs in numeric order. Each entry
includes the event ID’s corresponding description, severity level, and cause. Message text in angle
brackets (for example, <State>) describes the event-specific information provided by the Server
Administrator.
What’s New in this Release
The following changes in Server Administrator are documented in this guide:
• Support for additional Storage Management messages
• Removed support for Novell® NetWare®
Messages Not Described in This Guide
This guide describes only event messages created by Server Administrator and displayed in the
Server Administrator Alert log. For information on other messages produced by your system, consult
one of the following sources:
• Your system’s Installation and Troubleshooting Guide
• Other system documentation
• Operating system documentation
• Application program documentation
For more information on Array Manager event messages, see the Array Manager documentation.
Understanding Event Messages
This section describes the various types of event messages generated by the Server Administrator.
When an event occurs on your system, the Server Administrator sends information about one of the
following event types to the systems management console:
Table 1-1. Understanding Event Messages

Alert Severity: OK/Normal
Component Status: An event that describes the successful operation of a unit. The alert is provided
for informational purposes and does not indicate an error condition. For example, the alert may
indicate the normal start or stop of an operation, such as power supply or a sensor reading
returning to normal.

Alert Severity: Warning/Non-critical
Component Status: An event that is not necessarily significant, but may indicate a possible future
problem. For example, a Warning/Non-critical alert may indicate that a component (such as a
temperature probe in an enclosure) has crossed a warning threshold.

Alert Severity: Critical/Failure/Error
Component Status: A significant event that indicates actual or imminent loss of data or loss of
function. For example, crossing a failure threshold or a hardware failure such as an array disk.

Server Administrator generates events based on status changes in the following sensors:
• Temperature Sensor: Helps protect critical components by alerting the systems management
console when temperatures become too high inside a chassis; also monitors a variety of locations in the
chassis and in any attached systems.
• Fan Sensor: Monitors fans in various locations in the chassis and in any attached systems.
• Voltage Sensor: Monitors voltages across critical components in various chassis locations and in any
attached systems.
• Current Sensor: Monitors the current (or amperage) output from the power supply (or supplies) in
the chassis and in any attached systems.
• Chassis Intrusion Sensor: Monitors intrusion into the chassis and any attached systems.
• Redundancy Unit Sensor: Monitors redundant units (critical units such as fans, AC power cords, or
power supplies) within the chassis; also monitors the chassis and any attached systems. For example,
redundancy allows a second or nth fan to keep the chassis components at a safe temperature when
another fan has failed. Redundancy is normal when the intended number of critical components are
operating. Redundancy is degraded when a component fails, but others are still operating. Redundancy
is lost when there is one less critical redundancy device than required.
• Power Supply Sensor: Monitors power supplies in the chassis and in any attached systems.
• Memory Prefailure Sensor: Monitors memory modules by counting the number of Error Correction
Code (ECC) memory corrections.
• Fan Enclosure Sensor: Monitors protective fan enclosures by detecting their removal from and
insertion into the system, and by measuring how long a fan enclosure is absent from the chassis.
This sensor monitors the chassis and any attached systems.
• AC Power Cord Sensor: Monitors the presence of AC power for an AC power cord.
• Hardware Log Sensor: Monitors the size of a hardware log.
• Processor Sensor: Monitors the processor status in the system.
• Pluggable Device Sensor: Monitors the addition, removal, or configuration errors for some
pluggable devices, such as memory cards.
Sample Event Message Text
The following example shows the format of the event messages logged by Server Administrator.
EventID: 1000
Source: Server Administrator
Category: Instrumentation Service
Type: Information
Date and Time: Wed Mar 15 10:38:00 2006
Computer: <computer name>
Description: Server Administrator starting
Data: Bytes in Hex
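As an illustration of the message format above, the following Python sketch splits an event message into its labeled fields. It is not part of Server Administrator; the function name parse_event_message, the KNOWN_KEYS set, and the computer name server01 are assumptions made for this example, and the field labels are simply those shown in the sample.

```python
# Field labels copied from the sample event message in this guide.
KNOWN_KEYS = {
    "EventID", "Source", "Category", "Type",
    "Date and Time", "Computer", "Description", "Data",
}

# Hypothetical sample; "server01" stands in for <computer name>.
SAMPLE = """EventID: 1000
Source: Server Administrator
Category: Instrumentation Service
Type: Information
Date and Time: Wed Mar 15 10:38:00 2006
Computer: server01
Description:
Server Administrator starting
Data: Bytes in Hex
"""


def parse_event_message(text):
    """Return a dict mapping each known field label to its value.

    Lines that do not start with a known label are treated as a
    continuation of the previous field (typically Description).
    """
    fields = {}
    last_key = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        # Split on the first colon only, so time stamps stay intact.
        key, sep, value = line.partition(":")
        if sep and key in KNOWN_KEYS:
            fields[key] = value.strip()
            last_key = key
        elif last_key is not None:
            fields[last_key] = (fields[last_key] + " " + line).strip()
    return fields


if __name__ == "__main__":
    for name, value in parse_event_message(SAMPLE).items():
        print(f"{name} = {value}")
```

Splitting on the first colon only keeps values such as the date and time in one piece, and the continuation rule lets a description that wraps onto its own line stay attached to the Description field.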
Viewing Alerts and Event Messages
An event log is used to record information about important events.
Storage Management generates alerts that are added to the Microsoft® Windows® application alert log
and to the Server Administrator Alert log. To view these alerts in Server Administrator:
1 Select the System object in the tree view.
2 Select the Logs tab.
3 Select the Alert subtab.
You can also view the event log using your operating system’s event viewer. Each operating system’s event
viewer accesses the applicable operating system event log.
The location of the event log file depends on the operating system you are using.
• In the Microsoft Windows 2000 Advanced Server and Windows Server® 2003 operating systems,
messages are logged to the system event log and optionally to a unicode text file, dcsys32.log
(viewable using Notepad), that is located in the install_path\omsa\log directory. The default
install_path is C:\Program Files\Dell\SysMgt.
• In the Red Hat® Enterprise Linux operating system, messages are logged to the system log file.
The default name of the system log file is /var/log/messages. You can view the messages file using a text
editor such as vi or emacs.
NOTE: Logging messages to a unicode text file is optional. By default, the feature is disabled. To enable this
feature, modify the Event Manager section of the dcemdy32.ini file as follows:
• In Windows, locate the file at install_path\dataeng\ini and set UnitextLog.enabled=True. The default
install_path is C:\Program Files\Dell\SysMgt. Restart the Systems Management Event Manager service.
• In Red Hat Enterprise Linux, locate the file at install_path/dataeng/ini and set
UnitextLog.enabled=True. The default install_path is /opt/dell/svradmin. Issue the service dataeng
restart command to restart the systems management event manager service. This will also restart the systems
management data manager and SNMP services.
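The setting described in the NOTE above can be sketched in Python as follows. This is only an illustration working on INI text held in a string. It assumes that the Event Manager section appears in dcemdy32.ini under the literal header [Event Manager] and that the file follows ordinary INI syntax; neither is confirmed by this guide. The function name enable_unitext_log is hypothetical. Back up the real file before changing it, and note that Python's configparser drops comments when it rewrites a file.

```python
import configparser
import io


def enable_unitext_log(ini_text, section="Event Manager"):
    """Return ini_text with UnitextLog.enabled set to True.

    The section name is an assumption; adjust it to match the
    actual header used in dcemdy32.ini.
    """
    parser = configparser.ConfigParser(interpolation=None)
    # Preserve the case of option names such as "UnitextLog.enabled".
    parser.optionxform = str
    parser.read_string(ini_text)
    if not parser.has_section(section):
        parser.add_section(section)
    parser.set(section, "UnitextLog.enabled", "True")
    out = io.StringIO()
    # Write "key=value" without spaces, matching the form shown in the NOTE.
    parser.write(out, space_around_delimiters=False)
    return out.getvalue()


if __name__ == "__main__":
    before = "[Event Manager]\nUnitextLog.enabled=False\n"
    print(enable_unitext_log(before))
```

After changing the file, the Event Manager service still has to be restarted as described in the NOTE for the setting to take effect.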
The following subsections explain the procedure to open the Windows 2000 Advanced Server,
Windows Server 2003, and Red Hat Enterprise Linux event viewers.
Viewing Events in Windows 2000 and Windows Server 2003
1 Click the Start button, point to Settings, and click Control Panel.
2 Double-click Administrative Tools, and then double-click Event Viewer.
3 In the Event Viewer window, click the Tree tab and then click System Log.
The System Log window displays a list of recently logged events.
4 To view the details of an event, double-click one of the event items.
NOTE: You can also look up the dcsys32.log file, in the install_path\omsa\log directory, to view the separate
event log file. The default install_path is C:\Program Files\Dell\SysMgt.
Viewing Events in Red Hat Enterprise Linux
1 Log in as root.
2 Use a text editor such as vi or emacs to view the file named /var/log/messages.
The following example shows the Red Hat Enterprise Linux message log, /var/log/messages. The text in
boldface type indicates the message text.
NOTE: These messages are typically displayed as one long line. In the following example, the message is
displayed using line breaks to help you see the message text more clearly.
...
Feb 6 14:20:51 server01 Server Administrator: Instrumentation Service
EventID: 1000
Server Administrator starting
Feb 6 14:20:51 server01 Server Administrator: Instrumentation Service
EventID: 1001
Server Administrator startup complete
Feb 6 14:21:21 server01 Server Administrator: Instrumentation Service
EventID: 1254 Chassis intrusion detected Sensor location: Main chassis
intrusion Chassis location: Main System Chassis Previous state was: OK
(Normal) Chassis intrusion state: Open
Feb 6 14:21:51 server01 Server Administrator: Instrumentation Service
EventID: 1252 Chassis intrusion returned to normal Sensor location: Main
chassis intrusion Chassis location: Main System Chassis Previous state
was: Critical (Failed) Chassis intrusion state: Closed
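The log entries above can be picked apart with a regular expression. The following Python sketch assumes each entry sits on one line, as the NOTE says is typical, and that it follows the layout shown in the example (time stamp, host name, "Server Administrator:", category, "EventID:", message text). The names LINE_RE and parse_log_line are hypothetical, and real syslog configurations may format lines differently.

```python
import re

# Layout inferred from the example entries in this guide.
LINE_RE = re.compile(
    r"^(?P<timestamp>\w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2}) "
    r"(?P<host>\S+) Server Administrator: "
    r"(?P<category>.+?)\s+EventID: (?P<event_id>\d+)\s+"
    r"(?P<text>.*)$"
)


def parse_log_line(line):
    """Return a dict of fields for a Server Administrator entry, else None."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    fields = match.groupdict()
    fields["event_id"] = int(fields["event_id"])
    return fields


if __name__ == "__main__":
    lines = [
        "Feb 6 14:20:51 server01 Server Administrator: "
        "Instrumentation Service EventID: 1000 Server Administrator starting",
        "Feb 6 14:20:52 server01 kernel: unrelated message",
    ]
    for entry in lines:
        print(parse_log_line(entry))
```

Lines written by other programs do not match the pattern and yield None, so the function can be used to filter a full messages file down to Server Administrator events.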
Viewing the Event Information
The event log for each operating system contains some or all of the following information:
• Date: The date the event occurred.
• Time: The local time the event occurred.
• Type: A classification of the event severity: Information, Warning, or Error.
• User: The name of the user on whose behalf the event occurred.
• Computer: The name of the system where the event occurred.
• Source: The software that logged the event.
• Category: The classification of the event by the event source.
• Event ID: The number identifying the particular event type.
• Description: A description of the event. The format and contents of the event description vary,
depending on the event type.
Understanding the Event Description
Table 1-2 lists in alphabetical order each line item that may appear in the event description.
Table 1-2. Event Description Reference

Description Line Item: Action performed was: <Action>
Explanation: Specifies the action that was performed, for example:
    Action performed was: Power cycle

Description Line Item: Action requested was: <Action>
Explanation: Specifies the action that was requested, for example:
    Action requested was: Reboot, shutdown OS first

Description Line Item: Additional Details: <Additional details for the event>
Explanation: Specifies additional details available for the hot plug event, for example:
    Memory device: DIMM1_A Serial number: FFFF30B1

Description Line Item: <Additional power supply status information>
Explanation: Specifies information pertaining to the event, for example:
    Power supply input AC is off, Power supply POK (power OK) signal is not normal, Power supply
    is turned off

Description Line Item: Chassis intrusion state: <Intrusion state>
Explanation: Specifies the chassis intrusion state (open or closed), for example:
    Chassis intrusion state: Open

Description Line Item: Chassis location: <Name of chassis>
Explanation: Specifies name of the chassis that generated the message, for example:
    Chassis location: Main System Chassis

Description Line Item: Configuration error type: <type of configuration error>
Explanation: Specifies the type of configuration error that occurred, for example:
    Configuration error type: Revision mismatch

Description Line Item: Current sensor value (in Amps): <Reading>
Explanation: Specifies the current sensor value in amps, for example:
    Current sensor value (in Amps): 7.853

Description Line Item: Date and time of action: <Date and time>
Explanation: Specifies the date and time the action was performed, for example:
    Date and time of action: Tue Mar 21 16:20:33 2006

Description Line Item: Device location: <Location in chassis>
Explanation: Specifies the location of the device in the specified chassis, for example:
    Device location: Memory Card A

Description Line Item: Discrete current state: <State>
Explanation: Specifies the state of the current sensor, for example:
    Discrete current state: Good

Description Line Item: Discrete temperature state: <State>
Explanation: Specifies the state of the temperature sensor.

Description Line Item: Redundancy unit: <Redundancy location in chassis>
Explanation: Specifies the location of the redundant power supply or cooling unit in the chassis,
for example:
    Redundancy unit: Fan Enclosure

Description Line Item: Sensor location: <Location in chassis>
Explanation: Specifies the location of the sensor in the specified chassis, for example:
    Sensor location: CPU1

Description Line Item: Temperature sensor value (in degrees Celsius): <Reading>
Explanation: Specifies the temperature in degrees Celsius, for example:
    Temperature sensor value (in degrees Celsius): 30

Description Line Item: Voltage sensor value (in Volts): <Reading>
Explanation: Specifies the voltage sensor value in volts, for example:
    Voltage sensor value (in Volts): 1.693
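Because an event description is often logged as one long line, it can help to split it back into the line items described in Table 1-2. The Python sketch below does this by searching for known labels. The LABELS list is assembled from labels that appear in this guide and is not exhaustive; the function name split_description is hypothetical. A value that happens to contain the exact text of another label would be split incorrectly, so treat this as a rough aid rather than a complete parser.

```python
import re

# Labels taken from this guide; extend the list as needed.
LABELS = [
    "Action performed was",
    "Action requested was",
    "Chassis intrusion state",
    "Chassis location",
    "Configuration error type",
    "Current sensor value (in Amps)",
    "Date and time of action",
    "Device location",
    "Discrete current state",
    "Discrete temperature state",
    "Previous state was",
    "Redundancy unit",
    "Sensor location",
    "Temperature sensor value (in degrees Celsius)",
    "Voltage sensor value (in Volts)",
]
LABEL_RE = re.compile("(" + "|".join(re.escape(label) for label in LABELS) + "):")


def split_description(description):
    """Return (summary, items) where items maps each label to its value."""
    matches = list(LABEL_RE.finditer(description))
    if not matches:
        return description.strip(), {}
    # Text before the first label is the event's summary message.
    summary = description[: matches[0].start()].strip()
    items = {}
    for index, match in enumerate(matches):
        if index + 1 < len(matches):
            end = matches[index + 1].start()
        else:
            end = len(description)
        items[match.group(1)] = description[match.end():end].strip()
    return summary, items


if __name__ == "__main__":
    text = (
        "Chassis intrusion detected Sensor location: Main chassis intrusion "
        "Chassis location: Main System Chassis Previous state was: OK (Normal) "
        "Chassis intrusion state: Open"
    )
    summary, items = split_description(text)
    print(summary)
    for label, value in items.items():
        print(f"  {label}: {value}")
```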
Event Message Reference
The following tables list in numerical order each event ID and its corresponding description, along
with its severity and cause.
NOTE: For corrective actions, see the appropriate documentation.
Miscellaneous Messages
Miscellaneous messages in Table 2-1 indicate that certain alert systems are up and working.
Table 2-1. Miscellaneous Messages

Event ID: 0000
Description: Log was cleared
Severity: Information
Cause: User cleared the log from Server Administrator.

Event ID: 0001
Description: Log backup created
Severity: Information
Cause: The log was full, copied to backup, and cleared.

Event ID: 1000
Description: Server Administrator starting
Severity: Information
Cause: Server Administrator is beginning to initialize.

Event ID: 1001
Description: Server Administrator startup complete
Severity: Information
Cause: Server Administrator completed its initialization.

Event ID: 1002
Description: A system BIOS update has been scheduled for the next reboot
Severity: Information
Cause: The user has chosen to update the flash basic input/output system (BIOS).

Event ID: 1003
Description: A previously scheduled system BIOS update has been canceled
Severity: Information
Cause: The user has decided to cancel the flash BIOS update, or an error has occurred during the flash.

Event ID: 1004
Description: Thermal shutdown protection has been initiated
Severity: Error
Cause: This message is generated when a system is configured for thermal shutdown due to an error
event. If a temperature sensor reading exceeds the error threshold for which the system is
configured, the operating system shuts down and the system powers off. This event may also be
initiated on certain systems when a fan enclosure is removed from the system for an extended
period of time.

Event ID: 1005
Description: SMBIOS data is absent
Severity: Warning
Cause: The system management BIOS does not contain the required systems management BIOS
version 2.2 or higher, or the BIOS is corrupted.

Event ID: 1006
Description: Automatic System Recovery (ASR) action was performed
    Action performed was: <Action>
    Date and time of action: <Date and time>
Severity: Error
Cause: This message is generated when an automatic system recovery action is performed due to a
non-responsive operating system. The action performed and the time of action are provided.

Event ID: 1007
Description: User initiated host system control action
    Action requested was: <Action>
Severity: Information
Cause: User requested a host system control action to reboot, power off, or power cycle the system.
Alternatively, the user had indicated protective measures to be initiated in the event of a thermal
shutdown.

Event ID: 1008
Description: Systems Management Data Manager Started
Severity: Information
Cause: Systems Management Data Manager services were started.

Event ID: 1009
Description: Systems Management Data Manager Stopped
Severity: Information
Cause: Systems Management Data Manager services were stopped.
Temperature Sensor Messages
Temperature sensors listed in Table 2-2 help protect critical components by alerting the systems
management console when temperatures become too high inside a chassis. The temperature sensor
messages use additional variables: sensor location, chassis location, previous state, and temperature
sensor value or state.
Table 2-2. Temperature Sensor Messages

Event ID: 1050
Description: Temperature sensor has failed
    Sensor location: <Location in chassis>
    Chassis location: <Name of chassis>
    Previous state was: <State>
    If sensor type is not discrete:
    Temperature sensor value (in degrees Celsius): <Reading>
    If sensor type is discrete:
    Discrete temperature state: <State>
Severity: Information
Cause: A temperature sensor on the backplane board, system board, or the carrier in the specified
system failed. The sensor location, chassis location, previous state, and temperature sensor value
are provided.

Event ID: 1051
Description: Temperature sensor value unknown
    Sensor location: <Location in chassis>
    Chassis location: <Name of chassis>
    If sensor type is not discrete:
    Temperature sensor value (in degrees Celsius): <Reading>
    If sensor type is discrete:
    Discrete temperature state: <State>
Severity: Information
Cause: A temperature sensor on the backplane board, system board, or drive carrier in the specified
system could not obtain a reading. The sensor location, chassis location, previous state, and a
nominal temperature sensor value are provided.

Event ID: 1052
Description: Temperature sensor returned to a normal value
    Sensor location: <Location in chassis>
    Chassis location: <Name of chassis>
    Previous state was: <State>
    If sensor type is not discrete:
    Temperature sensor value (in degrees Celsius): <Reading>
    If sensor type is discrete:
    Discrete temperature state: <State>
Severity: Information
Cause: A temperature sensor on the backplane board, system board, or drive carrier in the specified
system returned to a valid range after crossing a failure threshold. The sensor location, chassis
location, previous state, and temperature sensor value are provided.

Event ID: 1053
Description: Temperature sensor detected a warning value
    Sensor location: <Location in chassis>
    Chassis location: <Name of chassis>
    Previous state was: <State>
    If sensor type is not discrete:
    Temperature sensor value (in degrees Celsius): <Reading>
    If sensor type is discrete:
    Discrete temperature state: <State>
Severity: Warning
Cause: A temperature sensor on the backplane board, system board, or drive carrier in the specified
system exceeded its warning threshold. The sensor location, chassis location, previous state, and
temperature sensor value are provided.

Event ID: 1054
Description: Temperature sensor detected a failure value
    Sensor location: <Location in chassis>
    Chassis location: <Name of chassis>
    Previous state was: <State>
    If sensor type is not discrete:
    Temperature sensor value (in degrees Celsius): <Reading>
    If sensor type is discrete:
    Discrete temperature state: <State>
Severity: Error
Cause: A temperature sensor on the backplane board, system board, or drive carrier in the specified
system exceeded its failure threshold. The sensor location, chassis location, previous state, and
temperature sensor value are provided.

Event ID: 1055
Description: Temperature sensor detected a non-recoverable value
    Sensor location: <Location in chassis>
    Chassis location: <Name of chassis>
    Previous state was: <State>
    If sensor type is not discrete:
    Temperature sensor value (in degrees Celsius): <Reading>
    If sensor type is discrete:
    Discrete temperature state: <State>
Severity: Error
Cause: A temperature sensor on the backplane board, system board, or drive carrier in the specified
system detected an error from which it cannot recover. The sensor location, chassis location,
previous state, and temperature sensor value are provided.
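The relationship between thresholds and the temperature event IDs in Table 2-2 can be pictured with a small Python sketch. It is an illustration only: this guide does not describe how Server Administrator evaluates thresholds internally. The sketch assumes a non-discrete sensor with upper thresholds only, and the function names, parameter names, and threshold values are all hypothetical. It follows the guide in raising an event only when the sensor status changes.

```python
# Event IDs from Table 2-2, keyed by a simplified sensor status.
EVENT_ID_FOR_STATUS = {
    "unknown": 1051,   # Temperature sensor value unknown
    "normal": 1052,    # Temperature sensor returned to a normal value
    "warning": 1053,   # Temperature sensor detected a warning value
    "failure": 1054,   # Temperature sensor detected a failure value
}


def status_for_reading(reading, warning_threshold, failure_threshold):
    """Classify a reading in degrees Celsius; None means no reading."""
    if reading is None:
        return "unknown"
    if reading > failure_threshold:
        return "failure"
    if reading > warning_threshold:
        return "warning"
    return "normal"


def event_for_change(previous_status, reading, warning_threshold, failure_threshold):
    """Return the event ID for a status change, or None if nothing changed."""
    status = status_for_reading(reading, warning_threshold, failure_threshold)
    if status == previous_status:
        return None
    return EVENT_ID_FOR_STATUS[status]


if __name__ == "__main__":
    # Hypothetical thresholds: warning above 60, failure above 75.
    previous = "normal"
    for value in (30, 65, 80, 30):
        event_id = event_for_change(previous, value, 60, 75)
        print(value, event_id)
        previous = status_for_reading(value, 60, 75)
```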
Cooling Device Messages
Cooling device sensors listed in Table 2-3 monitor how well a fan is functioning. Cooling device messages
provide status and warning information for fans in a particular chassis.
Table 2-3. Cooling Device Messages

Event ID: 1100
Description: Fan sensor has failed
    Sensor location: <Location in chassis>
    Chassis location: <Name of chassis>
    Previous state was: <State>
    Fan sensor value: <Reading>
Severity: Information
Cause: A fan sensor in the specified system is not functioning. The sensor location, chassis location,
previous state, and fan sensor value are provided.

Event ID: 1101
Description: Fan sensor value unknown
    Sensor location: <Location in chassis>
    Chassis location: <Name of chassis>
    Previous state was: <State>
    Fan sensor value: <Reading>
Severity: Information
Cause: A fan sensor in the specified system could not obtain a reading. The sensor location,
chassis location, previous state, and a nominal fan sensor value are provided.

Event ID: 1102
Description: Fan sensor returned to a normal value
    Sensor location: <Location in chassis>
    Chassis location: <Name of chassis>
    Previous state was: <State>
    Fan sensor value: <Reading>
Severity: Information
Cause: A fan sensor reading on the specified system returned to a valid range after crossing a
warning threshold. The sensor location, chassis location, previous state, and fan sensor value
are provided.

Event ID: 1103
Description: Fan sensor detected a warning value
    Sensor location: <Location in chassis>
    Chassis location: <Name of chassis>
    Previous state was: <State>
    Fan sensor value: <Reading>
Severity: Warning
Cause: A fan sensor reading in the specified system exceeded a warning threshold. The sensor
location, chassis location, previous state, and fan sensor value are provided.

Event ID: 1104
Description: Fan sensor detected a failure value
    Sensor location: <Location in chassis>
    Chassis location: <Name of chassis>
    Previous state was: <State>
    Fan sensor value: <Reading>
Severity: Error
Cause: A fan sensor in the specified system detected the failure of one or more fans. The sensor
location, chassis location, previous state, and fan sensor value are provided.

Event ID: 1105
Description: Fan sensor detected a non-recoverable value
    Sensor location: <Location in chassis>
    Chassis location: <Name of chassis>
    Previous state was: <State>
    Fan sensor value: <Reading>
Severity: Error
Cause: A fan sensor detected an error from which it cannot recover. The sensor location, chassis
location, previous state, and fan sensor value are provided.
Voltage Sensor Messages
Voltage sensors listed in Table 2-4 monitor the number of volts across critical components. Voltage sensor
messages provide status and warning information for voltage sensors in a particular chassis.
Table 2-4. Voltage Sensor Messages

Event ID: 1150
Description: Voltage sensor has failed
    Sensor location: <Location in chassis>
    Chassis location: <Name of chassis>
    Previous state was: <State>
    If sensor type is not discrete:
    Voltage sensor value (in Volts): <Reading>
    If sensor type is discrete:
    Discrete voltage state: <State>
Severity: Information
Cause: A voltage sensor in the specified system failed. The sensor location, chassis location,
previous state, and voltage sensor value are provided.

Event ID: 1151
Description: Voltage sensor value unknown
    Sensor location: <Location in chassis>
    Chassis location: <Name of chassis>
    Previous state was: <State>
    If sensor type is not discrete:
    Voltage sensor value (in Volts): <Reading>
    If sensor type is discrete:
    Discrete voltage state: <State>
Severity: Information
Cause: A voltage sensor in the specified system could not obtain a reading. The sensor location,
chassis location, previous state, and a nominal voltage sensor value are provided.

Event ID: 1152
Description: Voltage sensor returned to a normal value
    Sensor location: <Location in chassis>
    Chassis location: <Name of chassis>
    Previous state was: <State>
    If sensor type is not discrete:
    Voltage sensor value (in Volts): <Reading>
    If sensor type is discrete:
    Discrete voltage state: <State>
Severity: Information
Cause: A voltage sensor in the specified system returned to a valid range after crossing a failure
threshold. The sensor location, chassis location, previous state, and voltage sensor value are provided.

Event ID: 1153
Description: Voltage sensor detected a warning value
    Sensor location: <Location in chassis>
    Chassis location: <Name of chassis>
    Previous state was: <State>
    If sensor type is not discrete:
    Voltage sensor value (in Volts): <Reading>
    If sensor type is discrete:
    Discrete voltage state: <State>
Severity: Warning
Cause: A voltage sensor in the specified system exceeded its warning threshold. The sensor location,
chassis location, previous state, and voltage sensor value are provided.

Event ID: 1154
Description: Voltage sensor detected a failure value
    Sensor location: <Location in chassis>
    Chassis location: <Name of chassis>
    Previous state was: <State>
    If sensor type is not discrete:
    Voltage sensor value (in Volts): <Reading>
    If sensor type is discrete:
    Discrete voltage state: <State>
Severity: Error
Cause: A voltage sensor in the specified system exceeded its failure threshold. The sensor location,
chassis location, previous state, and voltage sensor value are provided.

Event ID: 1155
Description: Voltage sensor detected a non-recoverable value
    Sensor location: <Location in chassis>
    Chassis location: <Name of chassis>
    Previous state was: <State>
    If sensor type is not discrete:
    Voltage sensor value (in Volts): <Reading>
    If sensor type is discrete:
    Discrete voltage state: <State>
Severity: Error
Cause: A voltage sensor in the specified system detected an error from which it cannot recover. The
sensor location, chassis location, previous state, and voltage sensor value are provided.
Current Sensor Messages
Current sensors listed in Table 2-5 measure the amount of current (in amperes) that is traversing critical
components. Current sensor messages provide status and warning information for current sensors in a
particular chassis.
Table 2-5. Current Sensor Messages

Event ID: 1200
Description: Current sensor has failed
    Sensor location: <Location in chassis>
    Chassis location: <Name of chassis>
    Previous state was: <State>
    If sensor type is not discrete:
    Current sensor value (in Amps): <Reading>
    If sensor type is discrete:
    Discrete current state: <State>
Severity: Information
Cause: A current sensor on the power supply for the specified system failed. The sensor location,
chassis location, previous state, and current sensor value are provided.

Event ID: 1201
Description: Current sensor value unknown
    Sensor location: <Location in chassis>
    Chassis location: <Name of chassis>
    Previous state was: <State>
    If sensor type is not discrete:
    Current sensor value (in Amps): <Reading>
    If sensor type is discrete:
    Discrete current state: <State>
Severity: Information
Cause: A current sensor on the power supply for the specified system could not obtain a reading. The
sensor location, chassis location, previous state, and a nominal current sensor value are provided.

Event ID: 1202
Description: Current sensor returned to a normal value
    Sensor location: <Location in chassis>
    Chassis location: <Name of chassis>
    Previous state was: <State>
    If sensor type is not discrete:
    Current sensor value (in Amps): <Reading>
    If sensor type is discrete:
    Discrete current state: <State>
Severity: Information
Cause: A current sensor on the power supply for the specified system returned to a valid range after
crossing a failure threshold. The sensor location, chassis location, previous state, and current
sensor value are provided.

Event ID: 1203
Description: Current sensor detected a warning value
    Sensor location: <Location in chassis>
    Chassis location: <Name of chassis>
    Previous state was: <State>
    If sensor type is not discrete:
    Current sensor value (in Amps): <Reading>
    If sensor type is discrete:
    Discrete current state: <State>
Severity: Warning
Cause: A current sensor on the power supply for the specified system exceeded its warning threshold.
The sensor location, chassis location, previous state, and current sensor value are provided.

Event ID: 1204
Description: Current sensor detected a failure value
    Sensor location: <Location in chassis>
    Chassis location: <Name of chassis>
    Previous state was: <State>
    If sensor type is not discrete:
    Current sensor value (in Amps): <Reading>
    If sensor type is discrete:
    Discrete current state: <State>
Severity: Error
Cause: A current sensor on the power supply for the specified system exceeded its failure threshold.
The sensor location, chassis location, previous state, and current sensor value are provided.

Event ID: 1205
Description: Current sensor detected a non-recoverable value
    Sensor location: <Location in chassis>
    Chassis location: <Name of chassis>
    Previous state was: <State>
    If sensor type is not discrete:
    Current sensor value (in Amps): <Reading>
    If sensor type is discrete:
    Discrete current state: <State>
Severity: Error
Cause: A current sensor in the specified system detected an error from which it cannot recover. The
sensor location, chassis location, previous state, and current sensor value are provided.
Chassis Intrusion Messages
Chassis intrusion messages listed in Table 2-6 are a security measure. Chassis intrusion means that
someone is opening the cover to a system’s chassis. Alerts are sent to prevent unauthorized removal of
parts from a chassis.
Table 2-6. Chassis Intrusion Messages

Event ID: 1250
Description: Chassis intrusion sensor has failed
    Sensor location: <Location in chassis>
    Chassis location: <Name of chassis>
    Previous state was: <State>
    Chassis intrusion state: <Intrusion state>
Severity: Information
Cause: A chassis intrusion sensor in the specified system failed. The sensor location, chassis location,
previous state, and chassis intrusion state are provided.

Event ID: 1251
Description: Chassis intrusion sensor value unknown
    Sensor location: <Location in chassis>
    Chassis location: <Name of chassis>
    Previous state was: <State>
    Chassis intrusion state: <Intrusion state>
Severity: Information
Cause: A chassis intrusion sensor in the specified system could not obtain a reading. The sensor
location, chassis location, previous state, and chassis intrusion state are provided.

Event ID: 1252
Description: Chassis intrusion returned to normal
    Sensor location: <Location in chassis>
    Chassis location: <Name of chassis>
    Previous state was: <State>
    Chassis intrusion state: <Intrusion state>
Severity: Information
Cause: A chassis intrusion sensor in the specified system detected that a cover was opened while the
system was operating but has since been replaced. The sensor location, chassis location,
previous state, and chassis intrusion state are provided.

Event ID: 1253
Description: Chassis intrusion in progress
    Sensor location: <Location in chassis>
    Chassis location: <Name of chassis>
    Previous state was: <State>
    Chassis intrusion state: <Intrusion state>
Severity: Warning
Cause: A chassis intrusion sensor in the specified system detected that a system cover is currently
being opened and the system is operating. The sensor location, chassis location, previous state,
and chassis intrusion state are provided.

Event ID: 1254
Description: Chassis intrusion detected
    Sensor location: <Location in chassis>
    Chassis location: <Name of chassis>
    Previous state was: <State>
    Chassis intrusion state: <Intrusion state>
Severity: Error
Cause: A chassis intrusion sensor in the specified system detected that the system cover was opened
while the system was operating. The sensor location, chassis location, previous state, and
chassis intrusion state are provided.

Event ID: 1255
Description: Chassis intrusion sensor detected a non-recoverable value
    Sensor location: <Location in chassis>
    Chassis location: <Name of chassis>
    Previous state was: <State>
    Chassis intrusion state: <Intrusion state>
Severity: Error
Cause: A chassis intrusion sensor in the specified system detected an error from which it cannot
recover. The sensor location, chassis location, previous state, and chassis intrusion state
are provided.
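The sensor tables above (Tables 2-2 through 2-6) share a numbering pattern: each category starts at a multiple of 50, and the last digit selects the condition, with the same severity at each position in every table. The following Python sketch encodes that pattern as a lookup. The pattern is an observation drawn from the tables in this guide, not a documented rule, and the function name describe_sensor_event is hypothetical. Event IDs outside these five tables return None.

```python
# First event ID of each sensor table in this guide.
SENSOR_CATEGORIES = {
    1050: "Temperature Sensor",
    1100: "Cooling Device",
    1150: "Voltage Sensor",
    1200: "Current Sensor",
    1250: "Chassis Intrusion",
}

# Severity listed in the tables for offsets 0 through 5 from the base ID:
# failed, unknown, returned to normal, warning, failure, non-recoverable.
SEVERITY_BY_OFFSET = [
    "Information",
    "Information",
    "Information",
    "Warning",
    "Error",
    "Error",
]


def describe_sensor_event(event_id):
    """Return (category, severity) for a sensor event ID, or None."""
    offset = event_id % 50
    base = event_id - offset
    if base not in SENSOR_CATEGORIES or offset >= len(SEVERITY_BY_OFFSET):
        return None
    return SENSOR_CATEGORIES[base], SEVERITY_BY_OFFSET[offset]


if __name__ == "__main__":
    for event_id in (1053, 1104, 1254, 1000):
        print(event_id, describe_sensor_event(event_id))
```

A monitoring script could use such a lookup to route Error events to an operator while only recording Information events.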
Redundancy Unit Messages
Redundancy means that a system chassis has more than one of certain critical components. Fans and
power supplies, for example, are so important for preventing damage or disruption of a system that a
chassis may have “extra” fans or power supplies installed. Redundancy allows a second or nth fan to keep
the chassis components at a safe temperature when the primary fan has failed. Redundancy is normal
when the intended number of critical components are operating. Redundancy is degraded when a
component fails but others are still operating. Redundancy is lost when the number of components
functioning falls below the redundancy threshold.
The number of devices required for full redundancy is provided as part of the message when applicable
for the redundancy unit and the platform. For details on redundancy computation, see the respective
platform documentation.
Table 2-7 lists the redundancy unit messages.
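The three redundancy states described above can be sketched as a small classification function. This is only an illustration: the function name, the parameter names, and the idea of passing the redundancy threshold as a plain number are assumptions, not part of Server Administrator, and the real computation is platform-specific.

```python
def redundancy_state(functioning: int, intended: int, threshold: int) -> str:
    """Classify a redundancy unit (hypothetical helper, not a Server Administrator API).

    functioning: number of components currently operating
    intended:    number of components expected for full redundancy
    threshold:   minimum number of operating components needed to remain
                 redundant (assumed to be supplied by the platform)
    """
    if functioning >= intended:
        return "normal"    # the intended number of components are operating
    if functioning >= threshold:
        return "degraded"  # a component failed but others are still operating
    return "lost"          # functioning count fell below the redundancy threshold


# Example: a chassis intended to run three fans, redundant with at least two.
for count in (3, 2, 1):
    print(count, redundancy_state(count, intended=3, threshold=2))
```

Under these assumptions, the states line up with events 1304 (regained), 1305 (degraded), and 1306 (lost) in Table 2-7.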
Table 2-7. Redundancy Unit Messages

Event ID: 1300
Description: Redundancy sensor has failed
    Redundancy unit: <Redundancy location in chassis>
    Chassis location: <Name of chassis>
    Previous redundancy state was: <State>
Severity: Information
Cause: A redundancy sensor in the specified system failed. The redundancy unit location, chassis location, previous redundancy state, and the number of devices required for full redundancy are provided.

Event ID: 1301
Description: Redundancy sensor value unknown
    Redundancy unit: <Redundancy location in chassis>
    Chassis location: <Name of chassis>
    Previous redundancy state was: <State>
Severity: Information
Cause: A redundancy sensor in the specified system could not obtain a reading. The redundancy unit location, chassis location, previous redundancy state, and the number of devices required for full redundancy are provided.

Event ID: 1302
Description: Redundancy not applicable
    Redundancy unit: <Redundancy location in chassis>
    Chassis location: <Name of chassis>
    Previous redundancy state was: <State>
Severity: Information
Cause: A redundancy sensor in the specified system detected that a unit was not redundant. The redundancy location, chassis location, previous redundancy state, and the number of devices required for full redundancy are provided.

Event ID: 1303
Description: Redundancy is offline
    Redundancy unit: <Redundancy location in chassis>
    Chassis location: <Name of chassis>
    Previous redundancy state was: <State>
Severity: Information
Cause: A redundancy sensor in the specified system detected that a redundant unit is offline. The redundancy unit location, chassis location, previous redundancy state, and the number of devices required for full redundancy are provided.

Event ID: 1304
Description: Redundancy regained
    Redundancy unit: <Redundancy location in chassis>
    Chassis location: <Name of chassis>
    Previous redundancy state was: <State>
Severity: Information
Cause: A redundancy sensor in the specified system detected that a “lost” redundancy device has been reconnected or replaced; full redundancy is in effect. The redundancy unit location, chassis location, previous redundancy state, and the number of devices required for full redundancy are provided.

Event ID: 1305
Description: Redundancy degraded
    Redundancy unit: <Redundancy location in chassis>
    Chassis location: <Name of chassis>
    Previous redundancy state was: <State>
Severity: Warning
Cause: A redundancy sensor in the specified system detected that one of the components of the redundancy unit has failed but the unit is still redundant. The redundancy unit location, chassis location, previous redundancy state, and the number of devices required for full redundancy are provided.

Event ID: 1306
Description: Redundancy lost
    Redundancy unit: <Redundancy location in chassis>
    Chassis location: <Name of chassis>
    Previous redundancy state was: <State>
Severity: Warning or Error (depending on the number of units that are functional)
Cause: A redundancy sensor in the specified system detected that one of the components in the redundant unit has been disconnected, has failed, or is not present. The redundancy unit location, chassis location, previous redundancy state, and the number of devices required for full redundancy are provided.
Power Supply Messages
Power supply sensors monitor how well a power supply is functioning. Power supply messages listed in
Table 2-8 provide status and warning information for power supplies present in a particular chassis.
Table 2-8. Power Supply Messages

Event ID: 1350
Description: Power supply sensor has failed
    Sensor location: <Location in chassis>
    Chassis location: <Name of chassis>
    Previous state was: <State>
    Power Supply type: <type of power supply>
    <Additional power supply status information>
    If in configuration error state:
    Configuration error type: <type of configuration error>
Severity: Information
Cause: A power supply sensor in the specified system failed. The sensor location, chassis location, previous state, and additional power supply status information are provided.

Event ID: 1351
Description: Power supply sensor value unknown
    Sensor location: <Location in chassis>
    Chassis location: <Name of chassis>
    Previous state was: <State>
    Power Supply type: <type of power supply>
    <Additional power supply status information>
    If in configuration error state:
    Configuration error type: <type of configuration error>
Severity: Information
Cause: A power supply sensor in the specified system could not obtain a reading. The sensor location, chassis location, previous state, and additional power supply status information are provided.

Event ID: 1352
Description: Power supply returned to normal
    Sensor location: <Location in chassis>
    Chassis location: <Name of chassis>
    Previous state was: <State>
    Power Supply type: <type of power supply>
    <Additional power supply status information>
    If in configuration error state:
    Configuration error type: <type of configuration error>
Severity: Information
Cause: A power supply has been reconnected or replaced. The sensor location, chassis location, previous state, and additional power supply status information are provided.

Event ID: 1353
Description: Power supply detected a warning
    Sensor location: <Location in chassis>
    Chassis location: <Name of chassis>
    Previous state was: <State>
    Power Supply type: <type of power supply>
    <Additional power supply status information>
    If in configuration error state:
    Configuration error type: <type of configuration error>
Severity: Warning
Cause: A power supply sensor reading in the specified system exceeded a user-definable warning threshold. The sensor location, chassis location, previous state, and additional power supply status information are provided.

Event ID: 1354
Description: Power supply detected a failure
    Sensor location: <Location in chassis>
    Chassis location: <Name of chassis>
    Previous state was: <State>
    Power Supply type: <type of power supply>
    <Additional power supply status information>
    If in configuration error state:
    Configuration error type: <type of configuration error>
Severity: Error
Cause: A power supply has been disconnected or has failed. The sensor location, chassis location, previous state, and additional power supply status information are provided.

Event ID: 1355
Description: Power supply sensor detected a non-recoverable value
    Sensor location: <Location in chassis>
    Chassis location: <Name of chassis>
    Previous state was: <State>
    Power Supply type: <type of power supply>
    <Additional power supply status information>
    If in configuration error state:
    Configuration error type: <type of configuration error>
Severity: Error
Cause: A power supply sensor in the specified system detected an error from which it cannot recover. The sensor location, chassis location, previous state, and additional power supply status information are provided.
Memory Device Messages
Memory device messages listed in Table 2-9 provide status and warning information for memory
modules present in a particular system. Memory devices determine health status by monitoring the
ECC memory correction rate and the type of memory events that have occurred.
NOTE: A critical status does not always indicate a system failure or loss of data. In some instances, the system has
exceeded the ECC correction rate. Although the system continues to function, you should perform system
maintenance as described in Table 2-9.
NOTE: In Table 2-9, <status> can be either critical or non-critical.
Table 2-9. Memory Device Messages

Event ID: 1403
Description: Memory device status is <status>
    Memory device location: <location in chassis>
    Possible memory module event cause: <list of causes>
Severity: Warning
Cause: A memory device correction rate exceeded an acceptable value. The memory device status and location are provided.

Event ID: 1404
Description: Memory device status is <status>
    Memory device location: <location in chassis>
    Possible memory module event cause: <list of causes>
Severity: Error
Cause: A memory device correction rate exceeded an acceptable value, a memory spare bank was activated, or a multibit ECC error occurred. The system continues to function normally (except for a multibit error). Replace the memory module identified in the message during the system’s next scheduled maintenance. Clear the memory error on multibit ECC error. The memory device status and location are provided.
Fan Enclosure Messages
Some systems are equipped with a protective enclosure for fans. Fan enclosure messages listed in
Table 2-10 monitor whether foreign objects are present in an enclosure and how long a fan enclosure is
missing from a chassis.
Table 2-10. Fan Enclosure Messages

Event ID: 1450
Description: Fan enclosure sensor has failed
    Sensor location: <Location in chassis>
    Chassis location: <Name of chassis>
Severity: Information
Cause: The fan enclosure sensor in the specified system failed. The sensor location and chassis location are provided.

Event ID: 1451
Description: Fan enclosure sensor value unknown
    Sensor location: <Location in chassis>
    Chassis location: <Name of chassis>
Severity: Information
Cause: The fan enclosure sensor in the specified system could not obtain a reading. The sensor location and chassis location are provided.

Event ID: 1452
Description: Fan enclosure inserted into system
    Sensor location: <Location in chassis>
    Chassis location: <Name of chassis>
Severity: Information
Cause: A fan enclosure has been inserted into the specified system. The sensor location and chassis location are provided.

Event ID: 1453
Description: Fan enclosure removed from system
    Sensor location: <Location in chassis>
    Chassis location: <Name of chassis>
Severity: Warning
Cause: A fan enclosure has been removed from the specified system. The sensor location and chassis location are provided.

Event ID: 1454
Description: Fan enclosure removed from system for an extended amount of time
    Sensor location: <Location in chassis>
    Chassis location: <Name of chassis>
Severity: Error
Cause: A fan enclosure has been removed from the specified system for a user-definable length of time. The sensor location and chassis location are provided.

Event ID: 1455
Description: Fan enclosure sensor detected a non-recoverable value
    Sensor location: <Location in chassis>
    Chassis location: <Name of chassis>
Severity: Error
Cause: A fan enclosure sensor in the specified system detected an error from which it cannot recover. The sensor location and chassis location are provided.
AC Power Cord Messages
AC power cord messages listed in Table 2-11 provide status and warning information for power cords that
are part of an AC power switch, if your system supports AC switching.
Table 2-11. AC Power Cord Messages

Event ID: 1500
Description: AC power cord sensor has failed
    Sensor location: <Location in chassis>
    Chassis location: <Name of chassis>
Severity: Information
Cause: An AC power cord sensor in the specified system failed. The AC power cord status cannot be monitored. The sensor location and chassis location information are provided.

Event ID: 1501
Description: AC power cord is not being monitored
    Sensor location: <Location in chassis>
    Chassis location: <Name of chassis>
Severity: Information
Cause: The AC power cord status is not being monitored. This occurs when a system’s expected AC power configuration is set to nonredundant. The sensor location and chassis location information are provided.

Event ID: 1502
Description: AC power has been restored
    Sensor location: <Location in chassis>
    Chassis location: <Name of chassis>
Severity: Information
Cause: An AC power cord that did not have AC power has had the power restored. The sensor location and chassis location information are provided.

Event ID: 1503
Description: AC power has been lost
    Sensor location: <Location in chassis>
    Chassis location: <Name of chassis>
Severity: Warning
Cause: An AC power cord has lost its power, but there is sufficient redundancy to classify this as a warning. The sensor location and chassis location information are provided.

Event ID: 1504
Description: AC power has been lost
    Sensor location: <Location in chassis>
    Chassis location: <Name of chassis>
Severity: Error
Cause: An AC power cord has lost its power, and lack of redundancy requires this to be classified as an error. The sensor location and chassis location information are provided.

Event ID: 1505
Description: AC power has been lost
    Sensor location: <Location in chassis>
    Chassis location: <Name of chassis>
Severity: Error
Cause: An AC power cord sensor in the specified system failed. The AC power cord status cannot be monitored. The sensor location and chassis location information are provided.
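Events 1503 and 1504 carry the same description and differ only in whether enough redundancy remains. A minimal sketch of that distinction follows; the function name and the boolean input are hypothetical, and how Server Administrator actually decides that redundancy is sufficient is not described in this guide.

```python
def ac_power_lost_event(redundancy_sufficient: bool) -> tuple:
    """Return (event ID, severity) for a lost AC power cord.

    Hypothetical helper: the caller is assumed to already know whether
    the remaining power cords provide sufficient redundancy.
    """
    if redundancy_sufficient:
        return (1503, "Warning")  # power lost, but redundancy still covers it
    return (1504, "Error")        # power lost with no redundancy to fall back on


print(ac_power_lost_event(True))
print(ac_power_lost_event(False))
```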
Hardware Log Sensor Messages
Hardware logs provide hardware status messages to systems management software. On certain systems,
the hardware log is implemented as a circular queue. When the log becomes full, the oldest status
messages are overwritten when new status messages are logged. On some systems, the log is not circular.
On these systems, when the log becomes full, subsequent hardware status messages are lost. Hardware
log sensor messages listed in Table 2-12 provide status and warning information about the noncircular
logs that may fill up, resulting in lost status messages.
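The difference between the two log behaviors described above can be illustrated with a short simulation. This is a sketch only: the function names and the capacity value are made up for the example, and real hardware logs are managed by firmware, not by code like this.

```python
from collections import deque


def fill_circular_log(capacity, messages):
    """Circular queue: once full, each new message overwrites the oldest one."""
    log = deque(maxlen=capacity)  # deque drops the oldest entry automatically
    for message in messages:
        log.append(message)
    return list(log)


def fill_noncircular_log(capacity, messages):
    """Noncircular log: once full, subsequent messages are lost."""
    log, lost = [], []
    for message in messages:
        if len(log) < capacity:
            log.append(message)
        else:
            lost.append(message)
    return log, lost


events = ["e1", "e2", "e3", "e4"]
print(fill_circular_log(3, events))     # keeps the three newest messages
print(fill_noncircular_log(3, events))  # keeps the three oldest, loses the rest
```

The noncircular case is the one the hardware log sensor messages warn about, because the newest status messages are the ones that go missing.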
Table 2-12. Hardware Log Sensor Messages

Event ID: 1550
Description: Log monitoring has been disabled
    Log type: <Log type>
Severity: Information
Cause: A hardware log sensor in the specified system is disabled. The log type information is provided.

Event ID: 1551
Description: Log status is unknown
    Log type: <Log type>
Severity: Information
Cause: A hardware log sensor in the specified system could not obtain a reading. The log type information is provided.
system is no longer near or at its
capacity, usually as the result of
clearing the log. The log type
information is provided.
Severity: Warning
Cause: The size of a hardware log on the specified system is near or at the capacity of the hardware log. The log type information is provided.

Severity: Error
Cause: The size of a hardware log on the specified system is full. The log type information is provided.

Severity: Error
Cause: A hardware log sensor in the specified system failed. The hardware log status cannot be monitored. The log type information is provided.
Processor Sensor Messages
Processor sensors monitor how well a processor is functioning. Processor messages listed in Table 2-13
provide status and warning information for processors in a particular chassis.
Table 2-13. Processor Sensor Messages

Event ID: 1600
Description: Processor sensor has failed
    Sensor location: <Location in chassis>
    Chassis location: <Name of chassis>
    Previous state was: <State>
    Processor sensor status: <status>
Severity: Information
Cause: A processor sensor in the specified system is not functioning. The sensor location, chassis location, previous state and processor sensor status are provided.

Event ID: 1601
Description: Processor sensor value unknown
    Sensor location: <Location in chassis>
    Chassis location: <Name of chassis>
    Previous state was: <State>
    Processor sensor status: <status>
Severity: Information
Cause: A processor sensor in the specified system could not obtain a reading. The sensor location, chassis location, previous state and processor sensor status are provided.

Event ID: 1602
Description: Processor sensor returned to a normal value
    Sensor location: <Location in chassis>
    Chassis location: <Name of chassis>
    Previous state was: <State>
    Processor sensor status: <status>
Severity: Information
Cause: A processor sensor in the specified system transitioned back to a normal state. The sensor location, chassis location, previous state and processor sensor status are provided.

Event ID: 1603
Description: Processor sensor detected a warning value
    Sensor location: <Location in chassis>
    Chassis location: <Name of chassis>
    Previous state was: <State>
    Processor sensor status: <status>
Severity: Warning
Cause: A processor sensor in the specified system is in a throttled state. The sensor location, chassis location, previous state and processor sensor status are provided.

Event ID: 1604
Description: Processor sensor detected a failure value
    Sensor location: <Location in chassis>
    Chassis location: <Name of chassis>
    Previous state was: <State>
    Processor sensor status: <status>
Severity: Error
Cause: A processor sensor in the specified system is disabled, has a configuration error, or experienced a thermal trip. The sensor location, chassis location, previous state and processor sensor status are provided.

Event ID: 1605
Description: Processor sensor detected a non-recoverable value
    Sensor location: <Location in chassis>
    Chassis location: <Name of chassis>
    Previous state was: <State>
    Processor sensor status: <status>
Severity: Error
Cause: A processor sensor in the specified system has failed. The sensor location, chassis location, previous state and processor sensor status are provided.
Pluggable Device Messages
The pluggable device messages listed in Table 2-14 provide status and error information when some
devices, such as memory cards, are added or removed.
Table 2-14. Pluggable Device Messages

Event ID: 1650
Description: <Device plug event type unknown>
    Device location: <Location in chassis, if available>
    Chassis location: <Name of chassis, if available>
    Additional details: <Additional details for the events, if available>
Severity: Information
Cause: A pluggable device event message of unknown type was received. The device location, chassis location, and additional event details, if available, are provided.

Event ID: 1651
Description: Device added to system
    Device location: <Location in chassis>
    Chassis location: <Name of chassis>
    Additional details: <Additional details for the events>
Severity: Information
Cause: A device was added in the specified system. The device location, chassis location, and additional event details, if available, are provided.

Event ID: 1652
Description: Device removed from system
    Device location: <Location in chassis>
    Chassis location: <Name of chassis>
    Additional details: <Additional details for the events>
Severity: Information
Cause: A device was removed from the specified system. The device location, chassis location, and additional event details, if available, are provided.

Event ID: 1653
Description: Device configuration error detected
    Device location: <Location in chassis>
    Chassis location: <Name of chassis>
    Additional details: <Additional details for the events>
Severity: Error
Cause: A configuration error was detected for a pluggable device in the specified system. The device may have been added to the system incorrectly.
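The event IDs in Tables 2-6 through 2-14 fall into consecutive blocks, one per message category. The lookup below is a sketch built on that observation. The block size of 50 and the 1250 starting point for chassis intrusion are inferred from the IDs listed in these tables and are not stated by this guide, and the function name is hypothetical.

```python
# Category names follow the table titles; block boundaries are inferred.
CATEGORY_BY_BLOCK = {
    1250: "Chassis Intrusion",
    1300: "Redundancy Unit",
    1350: "Power Supply",
    1400: "Memory Device",
    1450: "Fan Enclosure",
    1500: "AC Power Cord",
    1550: "Hardware Log Sensor",
    1600: "Processor Sensor",
    1650: "Pluggable Device",
}


def event_category(event_id: int) -> str:
    """Map an event ID to its message category (illustrative only)."""
    block = event_id - (event_id % 50)  # round down to the start of the block
    return CATEGORY_BY_BLOCK.get(block, "Unknown")


print(event_category(1306))  # Redundancy Unit
print(event_category(1653))  # Pluggable Device
```

IDs outside the blocks covered here return "Unknown", because this sketch only includes the categories documented in these tables.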
System Event Log Messages for IPMI Systems
The following tables list the system event log (SEL) messages, their severity, and cause.
NOTE: For corrective actions, see the appropriate documentation.
Temperature Sensor Events
The temperature sensor event messages help protect critical components by alerting the systems
management console when the temperature rises inside the chassis. These event messages use
additional variables, such as sensor location, chassis location, previous state, and temperature sensor
value or state.
Table 3-1. Temperature Sensor Events

Event Message: <Sensor Name/Location> temperature sensor detected a failure <Reading>
    where <Sensor Name/Location> is the entity that this sensor is monitoring. For example, "PROC Temp" or "Planar Temp."
    Reading is specified in degrees Celsius. For example, 100 C.
Severity: Critical
Cause: Temperature of the backplane board, system board, or the carrier in the specified system <Sensor Name/Location> exceeded the critical threshold.

Event Message: <Sensor Name/Location> temperature sensor detected a warning <Reading>.
Severity: Warning
Cause: Temperature of the backplane board, system board, or the carrier in the specified system <Sensor Name/Location> exceeded the non-critical threshold.

Event Message: <Sensor Name/Location> temperature sensor returned to warning state <Reading>.
Severity: Warning
Cause: Temperature of the backplane board, system board, or the carrier in the specified system <Sensor Name/Location> returned from critical state to non-critical state.

Event Message: <Sensor Name/Location> temperature sensor returned to normal state <Reading>.
Severity: Information
Cause: Temperature of the backplane board, system board, or the carrier in the specified system <Sensor Name/Location> returned to normal operating range.
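A script that reads these messages could split them into sensor name, event, and reading. The sketch below assumes the message is rendered exactly as the templates in Table 3-1 suggest, with the reading written as a number followed by "C". That rendering, the function name, and the dictionary keys are assumptions for illustration; actual SEL text may differ.

```python
import re

# Severity for each event phrase, taken from Table 3-1.
SEVERITY = {
    "detected a failure": "Critical",
    "detected a warning": "Warning",
    "returned to warning state": "Warning",
    "returned to normal state": "Information",
}

# Assumed layout: "<sensor> temperature sensor <event phrase> <number> C"
PATTERN = re.compile(
    r"^(?P<sensor>.+?) temperature sensor "
    r"(?P<event>" + "|".join(re.escape(e) for e in SEVERITY) + r")"
    r"\s+(?P<reading>-?\d+(?:\.\d+)?)\s*C\.?$"
)


def parse_temperature_event(message):
    """Return a dict describing the event, or None if the text does not match."""
    match = PATTERN.match(message.strip())
    if match is None:
        return None
    return {
        "sensor": match.group("sensor"),
        "event": match.group("event"),
        "reading_c": float(match.group("reading")),
        "severity": SEVERITY[match.group("event")],
    }


print(parse_temperature_event("PROC Temp temperature sensor detected a failure 100 C"))
```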
Voltage Sensor Events
The voltage sensor event messages monitor the number of volts across critical components. These
messages provide status and warning information for voltage sensors for a particular chassis.
Table 3-2. Voltage Sensor Events

Event Message: <Sensor Name/Location> voltage sensor detected a failure <Reading>
    where <Sensor Name/Location> is the entity that this sensor is monitoring. For example, "CMOS Battery."
    Reading is specified in volts. For example, 3.860 V.
Severity: Critical
Cause: The voltage of the monitored device is out of critical threshold.

Event Message: <Sensor Name/Location> voltage sensor state asserted.
Severity: Critical
Cause: The voltage specified by <Sensor Name/Location> is in critical state.

Event Message: <Sensor Name/Location> voltage sensor state de-asserted.
Severity: Information
Cause: The voltage of a previously reported <Sensor Name/Location> is returned to normal state.

Event Message: <Sensor Name/Location> voltage sensor detected a warning <Reading>.
Severity: Warning
Cause: Voltage of the monitored entity <Sensor Name/Location> exceeded the warning threshold.

Event Message: <Sensor Name/Location> voltage sensor returned to normal <Reading>.
Severity: Information
Cause: The voltage of a previously reported <Sensor Name/Location> is returned to normal state.
Fan Sensor Events
The cooling device sensors monitor how well a fan is functioning. These messages provide status, warning,
and failure information for fans in a particular chassis.
Table 3-3. Fan Sensor Events

Event Message: <Sensor Name/Location> Fan sensor detected a failure <Reading>
    where <Sensor Name/Location> is the entity that this sensor is monitoring. For example, "BMC Back Fan" or "BMC Front Fan."
    Reading is specified in RPM. For example, 100 RPM.
Severity: Critical
Cause: The speed of the specified <Sensor Name/Location> fan is not sufficient to provide enough cooling to the system.

Event Message: <Sensor Name/Location> Fan sensor returned to normal state <Reading>.
Severity: Information
Cause: The fan specified by <Sensor Name/Location> has returned to its normal operating speed.

Event Message: <Sensor Name/Location> Fan sensor detected a warning <Reading>.
Severity: Warning
Cause: The speed of the specified <Sensor Name/Location> fan may not be sufficient to provide enough cooling to the system.

Event Message: <Sensor Name/Location> Fan Redundancy sensor redundancy degraded.
Severity: Information
Cause: The fan specified by <Sensor Name/Location> may have failed and hence, the redundancy has been degraded.

Event Message: <Sensor Name/Location> Fan Redundancy sensor redundancy lost.
Severity: Critical
Cause: The fan specified by <Sensor Name/Location> may have failed and hence, the redundancy that was degraded previously has been lost.

Event Message: <Sensor Name/Location> Fan Redundancy sensor redundancy regained.
Severity: Information
Cause: The fan specified by <Sensor Name/Location> may have started functioning again and hence, the redundancy has been regained.
Processor Status Events
The processor status messages monitor the functionality of the processors in a system. These messages
provide processor health and warning information for a system.
Table 3-4. Processor Status Events

Event Message: <Processor Entity> status processor sensor IERR
    where <Processor Entity> is the processor that generated the event. For example, PROC for a single processor system and PROC # for a multiprocessor system.
Severity: Critical
Cause: IERR internal error generated by the <Processor Entity>.

Event Message: <Processor Entity> status processor sensor Thermal Trip.
Severity: Critical
Cause: The processor generates this event before it shuts down because of excessive heat caused by lack of cooling or heat synchronization.

Event Message: <Processor Entity> status processor sensor recovered from IERR.
Severity: Information
Cause: This event is generated when a processor recovers from the internal error.

Event Message: <Processor Entity> status processor sensor disabled.
Severity: Warning
Cause: This event is generated for all processors that are disabled.

Event Message: <Processor Entity> status processor sensor terminator not present.
Severity: Information
Cause: This event is generated if the terminator is missing on an empty processor slot.
Power Supply Events
The power supply sensors monitor the functionality of the power supplies. These messages provide status
and warning information for power supplies for a particular system.
Table 3-5. Power Supply Events

Event Message: <Power Supply Sensor Name> power supply sensor removed.
Severity: Critical
Cause: This event is generated when the power supply sensor is removed.

Event Message: <Power Supply Sensor Name> power supply sensor AC recovered.
Severity: Information
Cause: This event is generated when the power supply has been replaced.

Event Message: <Power Supply Sensor Name> power supply sensor returned to normal state.
Severity: Information
Cause: This event is generated when the power supply that failed or was removed has been replaced and the state has returned to normal.

Event Message: <Entity Name> PS Redundancy sensor redundancy degraded.
Severity: Information
Cause: Power supply redundancy is degraded if one of the power supply sources is removed or has failed.

Event Message: <Entity Name> PS Redundancy sensor redundancy lost.
Severity: Critical
Cause: Power supply redundancy is lost if only one power supply is functional.

Event Message: <Entity Name> PS Redundancy sensor redundancy regained.
Severity: Information
Cause: This event is generated if the power supply has been reconnected or replaced.
Memory ECC Events
The memory ECC event messages monitor the memory modules in a system. These messages monitor
the ECC memory correction rate and the type of memory events that occurred.
Table 3-6. Memory ECC Events

Event Message: ECC error correction detected on Bank # DIMM [A/B].
Severity: Information
Cause: This event is generated when there is a memory error correction on a particular Dual Inline Memory Module (DIMM).

Event Message: ECC uncorrectable error detected on Bank # [DIMM].
Severity: Critical
Cause: This event is generated when the chipset is unable to correct the memory errors. Usually, a bank number is provided and DIMM may or may not be identifiable, depending on the error.

Event Message: Correctable memory error logging disabled.
Severity: Critical
Cause: This event is generated when the ECC error correction rate in the chipset exceeds a predefined limit.
BMC Watchdog Events
The BMC watchdog operations are performed when the system hangs or crashes. These messages
monitor the status and occurrence of these events in a system.
Table 3-7. BMC Watchdog Events

Event Message: BMC OS Watchdog timer expired.
Severity: Information
Cause: This event is generated when the BMC watchdog timer expires and no action is set.

Event Message: BMC OS Watchdog performed system reboot.
Severity: Critical
Cause: This event is generated when the BMC watchdog detects that the system has crashed (timer expired because no response was received from Host) and the action is set to reboot.

Event Message: BMC OS Watchdog performed system power off.
Severity: Critical
Cause: This event is generated when the BMC watchdog detects that the system has crashed (timer expired because no response was received from Host) and the action is set to power off.

Event Message: BMC OS Watchdog performed system power cycle.
Severity: Critical
Cause: This event is generated when the BMC watchdog detects that the system has crashed (timer expired because no response was received from Host) and the action is set to power cycle.
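The watchdog table reduces to a mapping from the configured action to the logged message and its severity. The sketch below expresses that mapping. The dictionary keys and the function name are hypothetical labels chosen for this example, not BMC configuration values.

```python
# Keys are illustrative labels for the configured watchdog action;
# None stands for "no action is set". Messages and severities are from Table 3-7.
WATCHDOG_EVENTS = {
    None: ("BMC OS Watchdog timer expired.", "Information"),
    "reboot": ("BMC OS Watchdog performed system reboot.", "Critical"),
    "power off": ("BMC OS Watchdog performed system power off.", "Critical"),
    "power cycle": ("BMC OS Watchdog performed system power cycle.", "Critical"),
}


def watchdog_event(action=None):
    """Return (message, severity) logged when the watchdog timer expires."""
    return WATCHDOG_EVENTS[action]


print(watchdog_event())
print(watchdog_event("power cycle"))
```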
Memory Events
The memory modules can be configured in different ways in particular systems. These messages monitor the
status, warning, and configuration information about the memory modules in the system.
Table 3-8. Memory Events

Event Message: Memory RAID redundancy degraded.
Severity: Information
Cause: This event is generated when there is a memory failure in a RAID-configured memory configuration.

Event Message: Memory RAID redundancy lost.
Severity: Critical
Cause: This event is generated when redundancy is lost in a RAID-configured memory configuration.

Event Message: Memory RAID redundancy regained.
Severity: Information
Cause: This event is generated when the redundancy lost or degraded earlier is regained in a RAID-configured memory configuration.

Event Message: Memory Mirrored redundancy degraded.
Severity: Information
Cause: This event is generated when there is a memory failure in a mirrored memory configuration.

Event Message: Memory Mirrored redundancy lost.
Severity: Critical
Cause: This event is generated when redundancy is lost in a mirrored memory configuration.

Event Message: Memory Mirrored redundancy regained.
Severity: Information
Cause: This event is generated when the redundancy lost or degraded earlier is regained in a mirrored memory configuration.

Event Message: Memory Spared redundancy degraded.
Severity: Information
Cause: This event is generated when there is a memory failure in a spared memory configuration.

Event Message: Memory Spared redundancy lost.
Severity: Critical
Cause: This event is generated when redundancy is lost in a spared memory configuration.

Event Message: Memory Spared redundancy regained.
Severity: Information
Cause: This event is generated when the redundancy lost or degraded earlier is regained in a spared memory configuration.
Hardware Log Sensor Events
The hardware logs provide hardware status messages to the system management software. On particular
systems, the subsequent hardware messages are not displayed when the log is full. These messages
provide status and warning messages when the logs are full.
Table 3-9. Hardware Log Sensor Events

Event Message: Log full detected.
Severity: Critical
Cause: This event is generated when the SEL device detects that only one entry can be added to the SEL before it is full.

Event Message: Log cleared.
Severity: Information
Cause: This event is generated when the SEL is cleared.
Drive Events
The drive event messages monitor the health of the drives in a system. These events are generated when
there is a fault in the drives indicated.
Table 3-10. Drive Events

Event Message: Drive <Drive #> asserted fault state.
Severity: Critical
Cause: This event is generated when the specified drive in the array is faulty.

Event Message: Drive <Drive #> de-asserted fault state.
Severity: Information
Cause: This event is generated when the specified drive recovers from a faulty condition.
Intrusion Events
The chassis intrusion messages are a security measure. Chassis intrusion alerts are generated when the
system's chassis is opened. Alerts are sent to prevent unauthorized removal of parts from the chassis.
Table 3-11. Intrusion Events

Event Message: <Intrusion sensor Name> sensor detected an intrusion.
Severity: Critical
Cause: This event is generated when the intrusion sensor detects an intrusion.

Event Message: <Intrusion sensor Name> sensor returned to normal state.
Severity: Information
Cause: This event is generated when the earlier intrusion has been corrected.
BIOS Generated System Events
The BIOS generated messages monitor the health and functionality of the chipsets, I/O channels, and
other BIOS-related functions. These system events are generated by the BIOS.
Table 3-12. BIOS Generated System Events

Event Message: System Event I/O channel chk.
Severity: Critical
Cause: This event is generated when a critical interrupt is generated in the I/O Channel.

Event Message: System Event PCI Parity Err.
Severity: Critical
Cause: This event is generated when a parity error is detected on the PCI bus.

Event Message: System Event Chipset Err.
Severity: Critical
Cause: This event is generated when a chip error is detected.

Event Message: System Event PCI System Err.
Severity: Critical
Cause: This event indicates historical data, and is generated when the system has crashed and recovered.

Event Message: System Event PCI Fatal Err.
Severity: Critical
Cause: This error is generated when a fatal error is detected on the PCI bus.

Event Message: System Event PCIE Fatal Err.
Severity: Critical
Cause: This error is generated when a fatal error is detected on the PCIE bus.
Storage Management Message Reference
Storage Management’s alert or event management features let you monitor the health of storage
resources such as controllers, connectors, array disks, and virtual disks.
Alert Monitoring and Logging
The Storage Management Service performs alert monitoring and logging. By default, the Storage
Management Service starts when the managed system starts up. If you stop the Storage
Management Service, alert monitoring and logging stop. Alert monitoring does the following:
•Updates the status of the storage object that generated the alert.
•Propagates the storage object’s status to all the related higher objects in the storage hierarchy. For
example, the status of a lower-level object will be propagated up to the status displayed on the
Health tab for the top-level storage object.
•Logs an alert into the Alert log and the Windows application log.
•Sends an SNMP trap if the operating system’s SNMP service is installed and enabled.
NOTE: Storage Management does not log alerts regarding the data I/O path. These alerts are logged by the
respective RAID drivers in the system alert log.
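The status propagation described above can be pictured with a short sketch. The following Python example is illustrative only: the StorageObject class, the Status ranking, the object names, and the rule that the worst lower-level status wins are assumptions made for this sketch and are not taken from Storage Management itself.

```python
from dataclasses import dataclass, field
from enum import IntEnum
from typing import List, Tuple


class Status(IntEnum):
    """Status ranks; a higher value is worse (this ordering is an assumption)."""
    OK = 0
    WARNING = 1
    CRITICAL = 2


@dataclass
class StorageObject:
    """Hypothetical node in the storage hierarchy, such as a controller or disk."""
    name: str
    status: Status = Status.OK
    children: List["StorageObject"] = field(default_factory=list)

    def add_child(self, child: "StorageObject") -> "StorageObject":
        self.children.append(child)
        return child

    def rolled_up_status(self) -> Status:
        """Return the worst status found on this object or any lower-level object."""
        return max([self.status] + [c.rolled_up_status() for c in self.children])


def record_alert(obj: StorageObject, status: Status,
                 log: List[Tuple[str, str]]) -> None:
    """Update the object that generated the alert and append a log entry."""
    obj.status = status
    log.append((obj.name, status.name))


# Example hierarchy (names are made up): storage -> controller -> array disk.
storage = StorageObject("Storage")
controller = storage.add_child(StorageObject("Controller 0"))
disk = controller.add_child(StorageObject("Array Disk 0:1"))

alert_log: List[Tuple[str, str]] = []
record_alert(disk, Status.CRITICAL, alert_log)
```

In this sketch only the array disk's own status changes, while the controller and the top-level storage object report the Critical status through the roll-up, which mirrors how a lower-level status surfaces on the Health tab of the top-level object.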
For updated information, see the Storage Management online help and the Dell OpenManage™
Server Administrator Storage Management User’s Guide.
Alert Descriptions and Corrective Actions
The following sections describe alerts generated by the RAID or SCSI controllers supported by
Storage Management. The alerts are displayed in the Server Administrator Alert subtab or through
the Windows Event Viewer. These alerts can also be forwarded as SNMP traps to other applications.
SNMP traps are generated for the alerts listed in the following sections. These traps are included in
the Storage Management management information base (MIB). The SNMP traps for these alerts
use all of the SNMP trap variables. For more information on SNMP support and the MIB, see the
SNMP Reference Guide.
To locate an alert, scroll through the following table to find the alert number displayed on the Server
Administrator Alert tab or search this file for the alert message text or number. See "Understanding
Event Messages" for more information on severity levels.
NOTE: If you have an Array Manager installation, the Array Manager console reports the status of storage
components through error icons and graphical displays. When there is a change in status, Array Manager sends
events to the Array Manager event log, which can be viewed from the Array Manager console. For more
information, see the Dell OpenManage™ Array Manager User's Guide.
For more information regarding alert descriptions and the appropriate corrective actions, see the online help.
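Locating an alert by number or by message text can also be scripted. The following Python sketch is a hypothetical helper, not a Dell tool: the ALERTS dictionary holds only a small excerpt of the event IDs, descriptions, and severities from Table 4-1, and the find_alert function name is an assumption made for illustration.

```python
from typing import Dict, List, Tuple, Union

# Event ID -> (description, severity). A small excerpt copied from Table 4-1.
ALERTS: Dict[int, Tuple[str, str]] = {
    2048: ("Device failed", "Critical / Failure / Error"),
    2049: ("Array disk removed", "Warning / Non-critical"),
    2052: ("Array disk inserted", "Ok / Normal"),
    2056: ("Virtual disk failed", "Critical / Failure / Error"),
    2057: ("Virtual disk degraded", "Warning / Non-critical"),
}


def find_alert(query: Union[int, str]) -> List[Tuple[int, str, str]]:
    """Find alerts by exact number or by case-insensitive description text."""
    if isinstance(query, int):
        entry = ALERTS.get(query)
        return [(query, *entry)] if entry else []
    text = query.lower()
    return [(event_id, desc, sev)
            for event_id, (desc, sev) in sorted(ALERTS.items())
            if text in desc.lower()]
```

Calling find_alert(2049) returns the single matching entry, while find_alert("virtual disk") returns every excerpted alert whose description contains that text.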
Table 4-1. Storage Management Messages
Event ID    Description    Severity    Cause and Action    SNMP Trap Numbers

Event ID: 2048
Description: Device failed
Severity: Critical / Failure / Error
Cause: A physical disk in the array failed. The failed disk may have been identified by the controller or channel. Performing a consistency check can also identify a failed disk.
Action: Replace the failed array disk. You can identify which disk has failed by locating the disk that has a red “X” for its status. Perform a rescan after replacing the disk.

Event ID: 2049
Description: Array disk removed
Severity: Warning / Non-critical
Cause: A physical disk has been removed from the array. A user may have also executed the "Prepare to Remove" task. This alert can also be caused by loose or defective cables or by problems with the enclosure.
Action: If a physical disk was removed from the array, either replace the disk or restore the original disk. You can identify which disk has been removed by locating the disk that has a red “X” for its status. Perform a rescan after replacing or restoring the disk. If a disk has not been removed from the array, then check for problems with the cables. See the online help for more information on checking the cables. Make sure that the enclosure is powered on. If the problem persists, check the enclosure documentation for further diagnostic information.
Event ID: 2050
Description: Array disk offline
Severity: Warning / Non-critical
Cause: A physical disk in the array is offline. A disk can be made offline during a "Prepare to Remove" operation or because a user manually put the disk offline.
Action: Perform a rescan. You can also select the offline disk and perform a Make Online operation.

Event ID: 2051
Description: Array disk degraded
Severity: Warning / Non-critical
Cause: An array disk has reported an error condition and may be degraded. The array disk may have reported the error condition in response to a consistency check or other operation.
Action: Replace the degraded array disk. You can identify which disk is degraded by locating the disk that has a red "X" for its status. Perform a rescan after replacing the disk.

Event ID: 2052
Description: Array disk inserted
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None

Event ID: 2053
Description: Virtual disk created
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None

Event ID: 2054
Description: Virtual disk deleted
Severity: Warning / Non-critical
Cause: A virtual disk has been deleted. Performing a "Reset Configuration" operation may detect that a virtual disk has been deleted and generate this alert.
Action: None

Event ID: 2055
Description: Virtual disk configuration changed
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
Event ID: 2056
Description: Virtual disk failed
Severity: Critical / Failure / Error
Cause: One or more physical disks included in the virtual disk have failed. If the virtual disk is non-redundant (does not use mirrored or parity data), then the failure of a single physical disk can cause the virtual disk to fail. If the virtual disk is redundant, then more physical disks have failed than can be rebuilt using mirrored or parity information.
Action: Create a new virtual disk and restore from a backup.

Event ID: 2057
Description: Virtual disk degraded
Severity: Warning / Non-critical
Cause 1: This alert message occurs when a physical disk included in a redundant virtual disk fails. Because the virtual disk is redundant (uses mirrored or parity information) and only one physical disk has failed, the virtual disk can be rebuilt.
Action 1: Configure a hot spare for the virtual disk if one is not already configured. Rebuild the virtual disk. When using a PowerEdge Expandable RAID Controller (PERC) 2/SC, 3/SC, 2/DC, 3/DCL, 3/DC, 3/QC, 4/SC, 4/DC, 4e/DC, 4/Di, or CERC ATA 100/4ch controller, rebuild the virtual disk by first configuring a hot spare for the disk, and then initiating a write operation to the disk. The write operation will initiate a rebuild of the disk.
Cause 2: A physical disk in the array has been removed.
Action 2: If a physical disk was removed from the array, either replace the disk or restore the original disk. You can identify which disk has been removed by locating the disk that has a red “X” for its status. Perform a rescan after replacing the disk.

Event ID: 2058
Description: Virtual disk check consistency started
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
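
The difference between alert 2056 (virtual disk failed) and alert 2057 (virtual disk degraded) described above can be summarized in a short sketch. The Python function below is a simplified model, not the controller's actual logic: the function name and the tolerated_failures parameter (how many failed physical disks a redundant virtual disk can still rebuild from) are assumptions introduced for illustration.

```python
from typing import Optional


def virtual_disk_alert(redundant: bool, failed_disks: int,
                       tolerated_failures: int = 1) -> Optional[int]:
    """Return the event ID suggested by the alert descriptions, or None."""
    if failed_disks <= 0:
        return None  # No physical disk has failed; no alert in this model.
    if not redundant:
        # Without mirrored or parity data, a single disk failure can fail the disk.
        return 2056
    if failed_disks <= tolerated_failures:
        # Redundant data still allows a rebuild, so the disk is only degraded.
        return 2057
    # More disks have failed than mirrored or parity information can rebuild.
    return 2056
```

For example, one failed disk in a redundant virtual disk maps to 2057 in this model, while the same failure in a non-redundant virtual disk maps to 2056.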
Event ID: 2059
Description: Virtual disk format started
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None

Event ID: 2061
Description: Virtual disk initialization started
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None

Event ID: 2063
Description: Virtual disk reconfiguration started
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None

Event ID: 2064
Description: Virtual disk rebuild started
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None

Event ID: 2065
Description: Array disk rebuild started
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None

Event ID: 2067
Description: Virtual disk check consistency cancelled
Severity: Ok / Normal
Cause: The check consistency operation was cancelled because a physical disk in the array has failed or because a user cancelled the check consistency operation.
Action: If the physical disk failed, then replace the physical disk. You can identify which disk failed by locating the disk that has a red “X” for its status. Perform a rescan after replacing the disk. When performing a consistency check, be aware that the consistency check can take a long time. The time it takes depends on the size of the physical disk or the virtual disk.
Event ID: 2070
Description: Virtual disk initialization cancelled
Severity: Ok / Normal
Cause: The virtual disk initialization was cancelled because a physical disk included in the virtual disk has failed or because a user cancelled the virtual disk initialization.
Action: If a physical disk failed, then replace the physical disk. You can identify which disk has failed by locating the disk that has a red “X” for its status. Perform a rescan after replacing the disk. Restart the format array disk operation. Restart the virtual disk initialization.

Event ID: 2074
Description: Array disk rebuild cancelled
Severity: Ok / Normal
Cause: A user has cancelled the rebuild operation.
Action: Restart the rebuild operation.

Event ID: 2076
Description: Virtual disk check consistency failed
Severity: Critical / Failure / Error
Cause: An array disk included in the virtual disk failed or there is an error in the parity information. A failed array disk can cause errors in parity information.
Action: Replace the failed array disk. You can identify which disk has failed by locating the disk that has a red “X” for its status. Rebuild the array disk. When finished, restart the check consistency operation.

Event ID: 2077
Description: Virtual disk format failed
Severity: Critical / Failure / Error
Cause: An array disk included in the virtual disk failed.
Action: Replace the failed array disk. You can identify which array disk has failed by locating the disk that has a red "X" for its status. Rebuild the array disk. When finished, restart the virtual disk format operation.

Event ID: 2079
Description: Virtual disk initialization failed
Severity: Critical / Failure / Error
Cause: An array disk included in the virtual disk has failed or a user has cancelled the initialization.
Action: If an array disk has failed, then replace the array disk.
Event ID: 2080
Description: Array disk initialize failed
Severity: Critical / Failure / Error
Cause: The array disk has failed or is corrupt.
Action: Replace the failed or corrupt disk. You can identify a disk that has failed by locating the disk that has a red “X” for its status. Restart the initialization.

Event ID: 2081
Description: Virtual disk reconfiguration failed
Severity: Critical / Failure / Error
Cause: An array disk included in the virtual disk has failed or is corrupt. A user may also have cancelled the reconfiguration.
Action: Replace the failed or corrupt disk. You can identify a disk that has failed by locating the disk that has a red “X” for its status. If the array disk is part of a redundant array, then rebuild the array disk. When finished, restart the reconfiguration.

Event ID: 2082
Description: Virtual disk rebuild failed
Severity: Critical / Failure / Error
Cause: An array disk included in the virtual disk has failed or is corrupt. A user may also have cancelled the rebuild.
Action: Replace the failed or corrupt disk. You can identify a disk that has failed by locating the disk that has a red “X” for its status. Restart the virtual disk rebuild.

Event ID: 2083
Description: Array disk rebuild failed
Severity: Critical / Failure / Error
Cause: An array disk included in the virtual disk has failed or is corrupt. A user may also have cancelled the rebuild.
Action: Replace the failed or corrupt disk. You can identify a disk that has failed by locating the disk that has a red “X” for its status. Restart the virtual disk rebuild.

Event ID: 2085
Description: Virtual disk check consistency completed
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None

Event ID: 2086
Description: Virtual disk format completed
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
Event ID: 2088
Description: Virtual disk initialization completed
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None

Event ID: 2089
Description: Array disk initialize completed
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None

Event ID: 2090
Description: Virtual disk reconfiguration completed
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None

Event ID: 2091
Description: Virtual disk rebuild completed
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None

Event ID: 2092
Description: Array disk rebuild completed
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None

Event ID: 2094
Description: Predictive Failure reported. If this disk is part of a redundant virtual disk, select the "Offline" option and then replace the disk. Then configure a hot spare and it will start the rebuild automatically. If this disk is a hot spare, select the "Prepare to Remove" option and then replace the disk. If this disk is part of a non-redundant disk, you should back up your data immediately. If the disk fails, you will not be able to recover the data.
Severity: Warning / Non-critical
Cause: The array disk is predicted to fail. Many array disks contain Self-Monitoring, Analysis and Reporting Technology (SMART). When enabled, SMART monitors the health of the disk based on indications such as the number of write operations that have been performed on the disk.
Action: Replace the array disk. Even though the disk may not have failed yet, it is strongly recommended that you replace the disk. Review the message text for additional information.
Event ID: 2095
Description: SCSI sense data. If this disk is part of a redundant virtual disk, select the "Offline" option and then replace the disk. Then configure a hot spare and it will start the rebuild automatically. If this disk is a hot spare, select the "Prepare to Remove" option and then replace the disk. If this disk is part of a non-redundant disk, you should back up your data immediately. If the disk fails, you will not be able to recover the data.
Severity: Warning / Non-critical
Cause: An array disk has failed, is corrupt, or is otherwise experiencing a problem.
Action: Replace the array disk. Even though the disk may not have failed yet, it is strongly recommended that you replace the disk. Review the message text for additional information.

Event ID: 2098
Description: Global hot spare assigned
Severity: Ok / Normal
Cause: A user has assigned an array disk as a global hot spare. This alert is provided for informational purposes.
Action: None

Event ID: 2099
Description: Global hot spare unassigned
Severity: Ok / Normal
Cause: A user has unassigned an array disk as a global hot spare. This alert is provided for informational purposes.
Event ID: 2100
Description: Temperature exceeded the maximum warning threshold
Severity: Warning / Non-critical
Cause: The array disk enclosure is too hot. A variety of factors can cause the excessive temperature. For example, a fan may have failed, the thermostat may be set too high, or the room temperature may be too hot.
Action: Check for factors that may cause overheating. For example, verify that the enclosure fan is working. You should also check the thermostat settings and examine whether the enclosure is located near a heat source. Make sure the enclosure has enough ventilation and that the room temperature is not too hot. See the enclosure documentation for more diagnostic information.

Event ID: 2101
Description: Temperature dropped below the minimum warning threshold
Severity: Warning / Non-critical
Cause: The array disk enclosure is too cool.
Action: Check whether the thermostat setting is too low and whether the room temperature is too cool.

Event ID: 2102
Description: Temperature exceeded the maximum failure threshold
Severity: Critical / Failure / Error
Cause: The array disk enclosure is too hot. A variety of factors can cause the excessive temperature. For example, a fan may have failed, the thermostat may be set too high, or the room temperature may be too hot.
Action: Check for factors that may cause overheating. For example, verify that the enclosure fan is working. You should also check the thermostat settings and examine whether the enclosure is located near a heat source. Make sure the enclosure has enough ventilation and that the room temperature is not too hot. See the enclosure documentation for more diagnostic information.

Event ID: 2103
Description: Temperature dropped below the minimum failure threshold
Severity: Critical / Failure / Error
Cause: The array disk enclosure is too cool.
Action: Check whether the thermostat setting is too low and whether the room temperature is too cool.
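
The relationship between the four temperature alerts above (2100 through 2103) can be illustrated with a small sketch. The Python function below is a hypothetical model: the function name, the parameter names, and the sample threshold values are assumptions, and the actual thresholds are set on the enclosure rather than in code like this.

```python
from typing import Optional


def temperature_alert(reading: float, min_failure: float, min_warning: float,
                      max_warning: float, max_failure: float) -> Optional[int]:
    """Map an enclosure temperature reading to a temperature event ID, or None."""
    # Check the failure thresholds first so that the most severe alert wins.
    if reading > max_failure:
        return 2102  # Exceeded the maximum failure threshold
    if reading < min_failure:
        return 2103  # Dropped below the minimum failure threshold
    if reading > max_warning:
        return 2100  # Exceeded the maximum warning threshold
    if reading < min_warning:
        return 2101  # Dropped below the minimum warning threshold
    return None      # Within the normal range


# Sample thresholds in degrees Celsius; these values are made up for the example.
THRESHOLDS = dict(min_failure=5, min_warning=10, max_warning=40, max_failure=50)
```

With these sample thresholds, a reading of 45 maps to alert 2100 and a reading of 55 maps to alert 2102.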
Event ID: 2104
Description: Controller battery is reconditioning
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None

Event ID: 2105
Description: Controller battery recondition is completed
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None

Event ID: 2106
Description: Smart FPT exceeded
Severity: Warning / Non-critical
Cause: A disk on the specified controller has received a SMART alert (predictive failure) indicating that the disk is likely to fail in the near future.
Action: Replace the disk that has received the SMART alert. If the array disk is a member of a non-redundant virtual disk, then back up the data before replacing the disk. Removing an array disk that is included in a non-redundant virtual disk will cause the virtual disk to fail and may cause data loss.

Event ID: 2107
Description: Smart configuration change
Severity: Critical / Failure / Error
Cause: A disk has received a SMART alert (predictive failure) after a configuration change. The disk is likely to fail in the near future.
Action: Replace the disk that has received the SMART alert. If the array disk is a member of a non-redundant virtual disk, then back up the data before replacing the disk. Removing an array disk that is included in a non-redundant virtual disk will cause the virtual disk to fail and may cause data loss.

Event ID: 2108
Description: Smart warning
Severity: Warning / Non-critical
Cause: A disk has received a SMART alert (predictive failure). The disk is likely to fail in the near future.
Action: Replace the disk that has received the SMART alert. If the array disk is a member of a non-redundant virtual disk, then back up the data before replacing the disk. Removing an array disk that is included in a non-redundant virtual disk will cause the virtual disk to fail and may cause data loss.
Event ID: 2109
Description: Smart warning temperature
Severity: Warning / Non-critical
Cause: A disk has reached an unacceptable temperature and received a SMART alert (predictive failure). The disk is likely to fail in the near future.
First Action: Determine why the array disk has reached an unacceptable temperature. A variety of factors can cause the excessive temperature. For example, a fan may have failed, the thermostat may be set too high, or the room temperature may be too hot or cold. Verify that the fans in the server or enclosure are working. If the array disk is in an enclosure, you should check the thermostat settings and examine whether the enclosure is located near a heat source. Make sure the enclosure has enough ventilation and that the room temperature is not too hot. See the enclosure documentation for more diagnostic information.
Second Action: If you cannot identify why the disk has reached an unacceptable temperature, then replace the disk. If the array disk is a member of a non-redundant virtual disk, then back up the data before replacing the disk. Removing an array disk that is included in a non-redundant virtual disk will cause the virtual disk to fail and may cause data loss.

Event ID: 2110
Description: Smart warning degraded
Severity: Warning / Non-critical
Cause: A disk is degraded and has received a SMART alert (predictive failure). The disk is likely to fail in the near future.
Action: Replace the disk that has received the SMART alert. If the array disk is a member of a non-redundant virtual disk, then back up the data before replacing the disk. Removing an array disk that is included in a non-redundant virtual disk will cause the virtual disk to fail and may cause data loss.
Event ID: 2111
Description: Failure prediction threshold exceeded due to test - No action needed
Severity: Warning / Non-critical
Cause: A disk has received a SMART alert (predictive failure) due to test conditions.
Action: None

Event ID: 2112
Description: Enclosure was shut down
Severity: Critical / Failure / Error
Cause: The array disk enclosure is either hotter or cooler than the maximum or minimum allowable temperature range.
Action: Check for factors that may cause overheating or excessive cooling. For example, verify that the enclosure fan is working. You should also check the thermostat settings and examine whether the enclosure is located near a heat source. Make sure the enclosure has enough ventilation and that the room temperature is not too hot or too cold. See the enclosure documentation for more diagnostic information.

Event ID: 2114
Description: A consistency check on a virtual disk has been paused (suspended)
Severity: Ok / Normal
Cause: The check consistency operation on a virtual disk was paused by a user.
Action: To resume the check consistency operation, right-click the virtual disk in the Storage Management tree view and select Resume Check Consistency.

Event ID: 2115
Description: A consistency check on a virtual disk has been resumed
Severity: Ok / Normal
Cause: The check consistency operation on a virtual disk has resumed processing after being paused by a user.
Action: This alert is provided for informational purposes.
Event ID: 2116
Description: A virtual disk and its mirror have been split
Severity: Ok / Normal
Cause: A user has caused a mirrored virtual disk to be split. When a virtual disk is mirrored, its data is copied to another virtual disk in order to maintain redundancy. After being split, both virtual disks retain a copy of the data, although because the mirror is no longer intact, updates to the data are no longer copied to the mirror.
Action: This alert is provided for informational purposes.

Event ID: 2117
Description: A mirrored virtual disk has been unmirrored
Severity: Ok / Normal
Cause: A user has caused a mirrored virtual disk to be unmirrored. When a virtual disk is mirrored, its data is copied to another virtual disk in order to maintain redundancy. After being unmirrored, the disk formerly used as the mirror returns to being an array disk and becomes available for inclusion in another virtual disk.
Action: This alert is provided for informational purposes.

Event ID: 2118
Description: Change write policy
Severity: Ok / Normal
Cause: A user has changed the write policy for a virtual disk.
Action: This alert is provided for informational purposes.

Event ID: 2120
Description: Enclosure firmware mismatch
Severity: Warning / Non-critical
Cause: The firmware on the enclosure management modules (EMM) is not the same version. It is required that both modules have the same version of the firmware. This alert may be caused when a user attempts to insert an EMM module that has a different firmware version than an existing module.
Action: Download the same version of the firmware to both EMM modules.
Event ID: 2121
Description: Device returned to normal
Severity: Ok / Normal
Cause: A device that was previously in an error state has returned to a normal state. For example, if an enclosure became too hot and subsequently cooled down, then you may receive this alert.
Action: This alert is provided for informational purposes.

Event ID: 2122
Description: Redundancy degraded
Severity: Warning / Non-critical
Cause: One or more of the enclosure components has failed. For example, a fan or power supply may have failed. Although the enclosure is currently operational, the failure of additional components could cause the enclosure to fail.
Action: Identify and replace the failed component. To identify the failed component, select the enclosure in the tree view and click the Health subtab. Any failed component will be identified with a red X on the enclosure’s Health subtab. Alternatively, you can select the Storage object and click the Health subtab. The controller status displayed on the Health subtab indicates whether a controller has a failed or degraded component. See the enclosure documentation for information on replacing enclosure components and for other diagnostic information.
Event ID: 2123
Description: Redundancy lost
Severity: Warning / Non-critical
Cause: A virtual disk or an enclosure has lost data redundancy. In the case of a virtual disk, one or more array disks included in the virtual disk have failed. Due to the failed array disk or disks, the virtual disk is no longer maintaining redundant (mirrored or parity) data. The failure of an additional array disk will result in lost data. In the case of an enclosure, more than one enclosure component has failed. For example, the enclosure may have suffered the loss of all fans or all power supplies.
Action: Identify and replace the failed components. To identify the failed component, select the Storage object and click the Health subtab. The controller status displayed on the Health subtab indicates whether a controller has a failed or degraded component. Click the controller that displays a Warning or Failed status. This action displays the controller Health subtab, which displays the status of the individual controller components. Continue clicking the components with a Warning or Failed status until you identify the failed component. See the online help for more information. See the enclosure documentation for information on replacing enclosure components and for other diagnostic information.

Event ID: 2124
Description: Redundancy normal
Severity: Ok / Normal
Cause: Data redundancy has been restored to a virtual disk or an enclosure that previously suffered a loss of redundancy.
Action: This alert is provided for informational purposes.
Event ID: 2126
Description: SCSI sense sector reassign
Severity: Warning / Non-critical
Cause: A sector of the disk is corrupted and data cannot be maintained on this portion of the disk.
Action: If the disk is part of a non-redundant virtual disk, then replace the disk. Any data residing on the corrupt portion of the disk may be lost and you may need to restore from backup. If the disk is part of a redundant virtual disk, then any data residing on the corrupt portion of the disk will be reallocated elsewhere in the virtual disk.

Event ID: 2127
Description: Background initialization (BGI) started
Severity: Ok / Normal
Cause: BGI of a virtual disk has started. This alert is provided for informational purposes.
Action: None

Event ID: 2128
Description: BGI cancelled
Severity: Ok / Normal
Cause: BGI of a virtual disk has been cancelled. A user or the firmware may have stopped BGI.
Action: None

Event ID: 2129
Description: BGI failed
Severity: Critical / Failure / Error
Cause: BGI of a virtual disk has failed.
Action: None

Event ID: 2130
Description: BGI completed
Severity: Ok / Normal
Cause: BGI of a virtual disk has completed. This alert is provided for informational purposes.
Action: None

Event ID: 2131
Description: Firmware version mismatch
Severity: Warning / Non-critical
Cause: The firmware on the controller is not a supported version.
Action: Install a supported version of the firmware. If you do not have a supported version of the firmware available, you can download it from the Dell™ support website at support.dell.com or check with your support provider for information on how to obtain the most current firmware.
Event ID: 2132
Description: Driver version mismatch
Severity: Warning / Non-critical
Cause: The controller driver is not a supported version.
Action: Install a supported version of the driver. If you do not have a supported driver version available, you can download it from the Dell support site at support.dell.com or check with your support provider for information on how to obtain the most current driver.

Event ID: 2135
Description: Array Manager is installed on the system
Severity: Warning / Non-critical
Cause: Storage Management has been installed on a system that has an Array Manager installation.
Action: Installing Storage Management and Array Manager on the same system is not a supported configuration. Uninstall either Storage Management or Array Manager.

Event ID: 2136
Description: Virtual disk initialization
Severity: Ok / Normal
Cause: Virtual disk initialization is in progress. This alert is provided for informational purposes.
Event ID: 2137
Description: Communication timeout
Severity: Warning / Non-critical
Cause: The controller is unable to communicate with an enclosure. There are several reasons why communication may be lost. For example, there may be a bad or loose cable. An unusual amount of I/O may also interrupt communication with the enclosure. In addition, communication loss may be caused by software, hardware, or firmware problems, bad or failed power supplies, and enclosure shutdown. When viewed in the Alert Log, the description for this event displays several variables. These variables are the controller and enclosure names, type of communication problem, return code, and SCSI status.
Action: Check for problems with the cables. See the online help for more information on checking the cables. You should also check to see if the enclosure has degraded or failed components. To do so, select the enclosure object in the tree view and click the Health subtab. The Health subtab displays the status of the enclosure components. Verify that the controller has supported driver and firmware versions installed and that the EMMs are each running the same version of supported firmware.

Event ID: 2138
Description: Enclosure alarm enabled
Severity: Ok / Normal
Cause: A user has enabled the enclosure alarm. This alert is provided for informational purposes.
Action: None

Event ID: 2139
Description: Enclosure alarm disabled
Severity: Ok / Normal
Cause: A user has disabled the enclosure alarm.
Action: None

Event ID: 2140
Description: Dead disk segments restored
Severity: Ok / Normal
Cause: Disk space that was formerly “dead” or inaccessible to a redundant virtual disk has been restored. This alert is provided for informational purposes.
Event ID: 2157
Description: Controller configuration has been reset
Severity: Ok / Normal
Cause: A user has reset the controller configuration. See the online help for more information. This alert is provided for informational purposes.
Action: None

Event ID: 2158
Description: Array disk online
Severity: Ok / Normal
Cause: An offline array disk has been made online. This alert is provided for informational purposes.
Action: None

Event ID: 2159
Description: Virtual disk renamed
Severity: Ok / Normal
Cause: A user has renamed a virtual disk. This alert is provided for informational purposes.
751None
901None
1201608
NOTE: When renaming a virtual disk on a PERC 2, 2/Si, 3/Si, 3/Di, CERC SATA 1.5/6ch, or CERC SATA 1.5/2s controller, this alert displays the new virtual disk name. On the PERC 2/SC, 2/DC, 3/SC, 3/DCL, 3/DC, 3/QC, 4/SC, 4/DC, 4e/DC, 4/Di, 4/IM, 4e/Si, 4e/Di, and CERC ATA 100/4ch controllers, this alert displays the original virtual disk name.
Action: None

Event ID: 2160
Description: Dedicated hotspare assigned
Severity: Ok / Normal
Cause: A user has assigned an array disk as a dedicated hot spare to a virtual disk. See the online help for more information. This alert is provided for informational purposes.
Action: None

Event ID: 2161
Description: Dedicated hotspare unassigned
Severity: Ok / Normal
Cause: A user has unassigned an array disk as a dedicated hot spare to a virtual disk. See the online help for more information. This alert is provided for informational purposes.
Action: None

Event ID: 2162
Description: Communication regained
Severity: Ok / Normal
Cause: Communication with an enclosure has been restored. This alert is provided for informational purposes.
Event ID: 2163
Description: Rebuild completed with errors
Severity: Ok / Normal
Cause and Action: See the online help for more information. 904690

Event ID: 2164
Description: See the Readme file for a list of validated controller driver versions
Severity: Ok / Normal
Cause: Storage Management is unable to determine whether the system has the minimum required versions of the RAID controller drivers.
Action: This alert is generated for informational purposes. See the Readme file for driver and firmware requirements. In particular, if Storage Management experiences performance problems, you should verify that you have the minimum supported versions of the drivers and firmware installed.

Event ID: 2165
Description: The RAID controller firmware and driver validation was not performed. The configuration file cannot be opened.
Severity: Warning / Non-critical
Cause: Storage Management is unable to determine whether the system has the minimum required versions of the RAID controller firmware and drivers. This situation may occur for a variety of reasons. For example, the installation directory path to the configuration file may not be correct. The configuration file may also have been removed or renamed.
Action: Reinstall Storage Management.

Event ID: 2166
Description: The RAID controller firmware and driver validation was not performed. The configuration file is out of date or corrupted.
Severity: Warning / Non-critical
Cause: Storage Management is unable to determine whether the system has the minimum required versions of the RAID controller firmware and drivers. This situation has occurred because a configuration file is unreadable or missing data. The configuration file may be corrupted.
2167 The current kernel version and the non-RAID SCSI driver version are older than the minimum required levels. See the Readme file for a list of validated kernel and driver versions.
Warning / Non-critical
Cause: The kernel and driver versions do not meet the minimum requirements. Storage Management may not be able to display the storage or perform storage management functions until you have updated the system to meet the minimum requirements.
Action: See the Readme file for kernel and driver requirements. Update the system to meet the minimum requirements and then reinstall Storage Management.

2168 The non-RAID SCSI driver version is older than the minimum required level. See the Readme file for the validated driver version.
Warning / Non-critical
Cause: The version of the driver does not meet the minimum requirements. Storage Management may not be able to display the storage or perform storage management functions until you have updated the system to meet the minimum requirements.
Action: See the Readme file for the driver requirements. Update the system to meet the minimum requirements and then reinstall Storage Management.

2169 The controller battery needs to be replaced.
Critical / Failure / Error
Cause: The controller battery cannot recharge. The battery may be old or it may already have been recharged the maximum number of times. In addition, the battery charger may not be working.
Action: Replace the battery pack.

2170 The controller battery charge level is normal.
Ok / Normal
Cause: This alert is provided for informational purposes.
2171 The controller battery temperature is above normal.
Warning / Non-critical
Cause: The battery may be recharging, the room temperature may be too hot, or the fan in the system may be degraded or failed.
Action: If this alert was generated due to a battery recharge, the situation will correct itself when the recharge is complete. You should also check that the room temperature is normal and that the system components are functioning properly.

2172 The controller battery temperature is normal.
Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None

2174 The controller battery has been removed.
Warning / Non-critical
Cause: The controller cannot communicate with the battery, the battery may be removed, or the contact point between the controller and the battery may be burnt or corroded.
Action: Reinstall the battery if it is not in place. If the contact point between the battery and the controller is burnt or corroded, you will need to replace either the battery or the controller, or both. See the hardware documentation for information on how to safely access, remove, and replace the battery.

2175 The controller battery has been replaced.
Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None

2176 The controller battery Learn cycle has started.
Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None

2177 The controller battery Learn cycle has completed.
Ok / Normal
Cause: This alert is provided for informational purposes.
2178 The controller battery Learn cycle has timed out.
Warning / Non-critical
Cause: The controller battery must be fully charged before the Learn cycle can begin. The battery may be unable to maintain a full charge, causing the Learn cycle to time out. Additionally, the battery must be able to maintain cached data for a specified period of time in the event of a power loss. For example, some batteries maintain cached data for 24 hours. If the battery is unable to maintain cached data for the required period of time, then the Learn cycle will time out.
Action: Replace the battery pack as the battery is unable to maintain a full charge.
1153None

2179 The controller battery Learn cycle has been postponed.
Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None
1151None

2180 The controller battery Learn cycle will start in %1 days.
NOTE: %1 is a variable that is filled in with the number of days before the Learn cycle starts. You can set the duration to start the Learn cycle.
Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None
1151None
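The %1 placeholder described in the NOTE above can be illustrated with a minimal sketch. The function name `render_message` and the numbered-placeholder handling are assumptions made for illustration only; they are not part of Server Administrator or Storage Management.

```python
import re


def render_message(template, *values):
    """Replace %1, %2, ... in an alert template with positional values.

    Hypothetical helper for illustration; not a Server Administrator API.
    """
    def substitute(match):
        index = int(match.group(1)) - 1
        # Leave the placeholder untouched when no value was supplied for it.
        if 0 <= index < len(values):
            return str(values[index])
        return match.group(0)

    return re.sub(r"%(\d+)", substitute, template)


template = "The controller battery Learn cycle will start in %1 days."
message = render_message(template, 7)
```

A placeholder with no matching value is left as written, so an incomplete call is easy to spot in the rendered text.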
2181 The controller battery Learn cycle will start in %1 hours.
NOTE: %1 is a variable that is filled in with the number of hours before the Learn cycle starts. You can set the duration to start the Learn cycle.
Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None
1151None

2182 An invalid SAS configuration has been detected.
Critical / Failure / Error
Cause: The controller and attached enclosures are not cabled correctly.
Action: See the hardware documentation for information on correct cabling configurations.

2186 The controller cache has been discarded.
Warning / Non-critical
Cause: The controller has flushed the cache and any data in the cache has been lost. This may happen if the system has memory or battery problems that cause the controller to distrust the cache. Although user data may have been lost, this alert does not always indicate that relevant or user data has been lost.
Action: Verify that the battery and memory are functioning properly.

2187 Single-bit ECC error limit exceeded.
Warning / Non-critical
2188 The controller write policy has been changed to "Write Through."
Warning / Non-critical
Cause: The controller battery is unable to maintain cached data for the required period of time. For example, if the required period of time is 24 hours, the battery is unable to maintain cached data for 24 hours. It is normal to receive this alert during the battery Learn cycle as the Learn cycle discharges the battery before recharging it. When discharged, the battery cannot maintain cached data.
Action: Check the health of the battery. If the battery is weak, replace the battery pack.

2189 The controller write policy has been changed to "Write Back."
Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None

2191 Multiple enclosures are attached to the controller. This is an unsupported configuration.
Critical / Failure / Error
Cause: Too many enclosures are attached to the controller port. When the enclosure limit is exceeded, the controller loses contact with all enclosures attached to the port.
Action: Remove the last enclosure. You must remove the enclosure that was added last and is causing the enclosure limit to be exceeded.
2192 The virtual disk "Check Consistency" has made corrections and completed.
Ok / Normal
Cause: The virtual disk "Check Consistency" has identified errors and made corrections. For example, the "Check Consistency" may have encountered a bad disk block and remapped the disk block to restore data consistency. This alert is provided for informational purposes.
Action: Monitor the battery and cache health to make sure they are functioning properly. Monitor the Alert Log for events related to the battery and write policy changes. You should also monitor the Alert Log for events related to disk errors. If you suspect that the battery or a disk has problems, replace the battery pack or the disk.

2193 The virtual disk reconfigure has resumed.
Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None

2194 The virtual disk read policy has changed.
Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None

2199 The virtual disk cache policy has changed.
Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None

2201 A global hot spare failed.
Warning / Non-critical
Cause: The controller is unable to communicate with a disk that is assigned as a global hot spare. The disk may have failed or been removed. There may also be a bad or loose cable.
Action: Check that the disk is healthy and that it has not been removed. Check the cables. If necessary, replace the disk and reassign the hot spare.
2202 A global hot spare has been removed.
Warning / Non-critical
Cause: The controller is unable to communicate with a disk that is assigned as a global hot spare. The disk may have been removed. There may also be a bad or loose cable.
Action: Check that the disk is healthy and that it has not been removed. Check the cables. If necessary, replace the disk and reassign the hot spare.

2203 A dedicated hot spare failed.
Warning / Non-critical
Cause: The controller is unable to communicate with a disk that is assigned as a dedicated hot spare. The disk may have failed or been removed. There may also be a bad or loose cable.
Action: Check that the disk is healthy and that it has not been removed. Check the cables. If necessary, replace the disk and reassign the hot spare.

2204 A dedicated hot spare has been removed.
Warning / Non-critical
Cause: The controller is unable to communicate with a disk that is assigned as a dedicated hot spare. The disk may have been removed. There may also be a bad or loose cable.
Action: Check that the disk is healthy and that it has not been removed. Check the cables. If necessary, replace the disk and reassign the hot spare.

2205 A dedicated hot spare has been automatically unassigned.
Warning / Non-critical
Cause: The hot spare is no longer required because the virtual disk it was assigned to has been deleted.
2206 The only hot spare available is a SATA disk. SATA disks cannot replace SAS disks.
Warning / Non-critical
Cause: The only array disk available to be assigned as a hot spare is using SATA technology. The array disks in the virtual disk are using SAS technology. Due to this difference in technology, the hot spare cannot rebuild data if one of the array disks in the virtual disk fails.
Action: Add a SAS disk that is large enough to be used as the hot spare and assign the new disk as a hot spare.
903None

2207 The only hot spare available is a SAS disk. SAS disks cannot replace SATA disks.
Warning / Non-critical
Cause: The only array disk available to be assigned as a hot spare is using SAS technology. The array disks in the virtual disk are using SATA technology. Due to this difference in technology, the hot spare cannot rebuild data if one of the array disks in the virtual disk fails.
Action: Add a SATA disk that is large enough to be used as the hot spare and assign the new disk as a hot spare.
903None
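The technology rule behind alerts 2206 and 2207 can be expressed as a small check. This is only a sketch under stated assumptions: the function name, the "SAS"/"SATA" strings, and the idea of passing a list of spare technologies are all hypothetical and do not reflect how Storage Management actually evaluates hot spares.

```python
def hot_spare_technology_alert(virtual_disk_technology, spare_technologies):
    """Return 2206, 2207, or None for a virtual disk and its candidate spares.

    Hypothetical helper for illustration. Technologies are the strings
    "SAS" or "SATA"; an empty spare list raises no alert in this sketch.
    """
    if not spare_technologies:
        return None
    # A spare of the same technology can rebuild data, so no alert applies.
    if virtual_disk_technology in spare_technologies:
        return None
    if virtual_disk_technology == "SAS" and all(t == "SATA" for t in spare_technologies):
        return 2206  # only SATA spares available for a SAS virtual disk
    if virtual_disk_technology == "SATA" and all(t == "SAS" for t in spare_technologies):
        return 2207  # only SAS spares available for a SATA virtual disk
    return None
```

For example, a SAS virtual disk whose only candidate spare is a SATA disk maps to alert 2206, and adding one SAS spare clears the condition.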
2211 The physical disk is not supported.
Warning / Non-critical
Cause: The physical disk may not have a supported version of the firmware or the disk may not be supported by Dell.
Action: If the disk is supported by Dell, update the firmware to a supported version. If the disk is not supported by Dell, replace the disk with one that is supported.
903None

2232 The controller alarm is silenced.
Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None
751None

2233 The BGI rate has changed.
Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None
751None

2234 The "Patrol Read" rate has changed.
Ok / Normal
Cause: This alert is provided for informational purposes.
2246 The controller battery is degraded.
Warning / Non-critical
Cause: The controller battery charge is weak.
Action: As the charge weakens, the charger should automatically recharge the battery. If the battery has reached its recharge limit, replace the battery pack. Monitor the battery to make sure that it recharges successfully. If the battery does not recharge, replace the battery pack.
1153None

2247 The controller battery is charging.
Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None
1151None

2248 The controller battery is executing a Learn cycle.
Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None
1151None

2249 The array disk "Clear" operation has started.
Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None
901None

2251 The array disk blink has initiated.
Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None
901None

2252 The array disk blink has ceased.
Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None
901None

2254 The "Clear" operation has cancelled.
Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None
901None

2255 The array disk has started.
Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None
901None

2259 An enclosure blink operation has initiated.
Ok / Normal
Cause: This alert is provided for informational purposes.
2260 An enclosure blink has ceased.
Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None

2261 A global rescan has initiated.
Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None

2262 Smart thermal shutdown is enabled.
Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None

2263 Smart thermal shutdown is disabled.
Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None

2264 A device is missing.
Warning / Non-critical
Cause: The controller cannot communicate with a device. The device may be removed. There may also be a bad or loose cable.
Action: Check that the device is installed and connected. If it is installed, check the cables. Also check the connection to the controller battery and the battery health. A battery with a weak or depleted charge may cause this alert.

2265 A device is in an unknown state.
Warning / Non-critical
Cause: The controller cannot communicate with a device. The state of the device cannot be determined. There may be a bad or loose cable. The system may also be experiencing problems with the application programming interface (API). There could also be a problem with the driver or firmware.
Action: Check the cables. Check if the controller has a supported version of the driver and firmware. You can download the most current version of the driver and firmware from support.dell.com. Rebooting the system may also resolve this problem.
2266 Controller log file entry: %1
NOTE: %1 is a substitution variable that is replaced in the alert description with specific details about the alert.
Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None
751None

2267 The controller reconstruct rate has changed.
Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None
751None

2268 %1, Storage Management has lost communication with this RAID controller and attached storage. An immediate reboot is strongly recommended to avoid further problems. If the reboot does not restore communication, there may be a hardware failure.
NOTE: %1 is a substitution variable that is replaced in the alert description with specific details about the alert.
Critical / Failure / Error
Cause: Storage Management has lost communication with a device. There may be faulty hardware or loose or defective cables.
Action: Reboot the system. If the problem is not resolved, check for hardware failures. Any failed component must be replaced. Make sure the cables are attached securely. See the hardware documentation for more diagnostics information.
104None

2269 The array disk "Clear" operation has completed.
Ok / Normal
Cause: This alert is provided for informational purposes.
2270 The array disk "Clear" operation failed.
Critical / Failure / Error
Cause: A "Clear" operation was being performed on an array disk, but it was interrupted and did not complete successfully. The controller may have lost communication with the disk. The disk may have been removed or the cables may be loose or defective.
Action: Check that the disk is installed and not in a failed state. Make sure the cables are attached securely. Restart the "Clear" operation.

2271 The "Patrol Read" corrected a media error.
Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None

2272 "Patrol Read" found an uncorrectable media error.
Critical / Failure / Error
Cause: The "Patrol Read" task has encountered an error that cannot be corrected. There may be a bad disk block that cannot be remapped.
Action: Replace the array disk to avoid future data loss.

2273 Bad media.
Critical / Failure / Error
Cause: A source (array) disk in a redundant virtual disk has a bad disk block. The algorithm that maintains redundant data has created a similar bad block on the target redundant disk to maintain consistency in disk block addressing. Data has been lost.
Action: Restore from backup.

2274 The array disk rebuild has resumed.
Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None

2276 The dedicated hot spare is too small.
Warning / Non-critical
Cause: The dedicated hot spare is not large enough to protect all virtual disks that reside on the disk group.
Action: Assign a larger disk as the dedicated hot spare.
2277 The global hot spare is too small.
Warning / Non-critical
Cause: The global hot spare is not large enough to protect all virtual disks that reside on the controller.
Action: Assign a larger disk as the global hot spare.

2278 The controller battery charge level is below a normal threshold.
Critical / Failure / Error
Cause: The battery is discharging. A battery discharge is a normal activity during the battery Learn cycle. Before completing, the battery Learn cycle recharges the battery. You should receive alert 2279 when the recharge occurs.
Action: Check if the battery Learn cycle is in progress. Alert 2176 indicates that the battery Learn cycle has initiated. The battery also displays the Learn state while the Learn cycle is in progress. If a Learn cycle is not in progress, replace the battery pack.

2279 The controller battery charge level is above a normal threshold.
Ok / Normal
Cause: This alert is provided for informational purposes. This alert indicates that the battery is recharging during the battery Learn cycle.
Action: None

2280 A disk media error has been corrected.
Ok / Normal
Cause: A disk media error was detected while the controller was completing a background task. A bad disk block was identified. The disk block has been remapped.
Action: Consider replacing the disk. If you receive this alert frequently, be sure to replace the disk. You should also routinely back up your data.

2281 Virtual disk has inconsistent data.
Ok / Normal
Cause: This alert is provided for informational purposes.
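The action for alert 2278 above depends on whether a battery Learn cycle is in progress. The following sketch shows one way to apply that rule to a history of alert IDs. It assumes, for illustration only, that a Learn cycle counts as in progress after alert 2176 until alert 2177 (completed) or 2178 (timed out) is seen; the function names and the returned strings are hypothetical.

```python
LEARN_STARTED = 2176
LEARN_COMPLETED = 2177
LEARN_TIMED_OUT = 2178


def learn_cycle_in_progress(alert_history):
    """Return True if the most recent Learn cycle event is a start.

    alert_history is a list of alert IDs ordered oldest first.
    """
    in_progress = False
    for alert_id in alert_history:
        if alert_id == LEARN_STARTED:
            in_progress = True
        elif alert_id in (LEARN_COMPLETED, LEARN_TIMED_OUT):
            in_progress = False
    return in_progress


def advise_on_low_charge(alert_history):
    """Suggest a response to alert 2278 based on earlier alerts."""
    if learn_cycle_in_progress(alert_history):
        return "expected during Learn cycle"
    return "replace the battery pack"
```

The battery state shown by Storage Management remains the authoritative check; the alert history is only a convenient approximation.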
2282 Hot spare SMART polling failed.
Critical / Failure / Error
Cause: The controller firmware attempted to do SMART polling on the hot spare but was unable to complete it. The controller has lost communication with the hot spare.
Action: Check the health of the disk assigned as a hot spare. You may need to replace the disk and reassign the hot spare. Make sure the cables are attached securely.

2283 A redundant path is broken.
Warning / Non-critical
Cause: The controller has two connectors that are connected to the same enclosure. The communication path on one connector has lost connection with the enclosure. The communication path on the other connector is reporting this loss.
Action: Make sure the cables are attached securely. Make sure both EMMs are healthy.

2284 A redundant path has been restored.
Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None

2285 A disk media error was corrected during recovery.
Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None

2286 A Learn cycle start is pending while the battery charges.
Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None

2287 The "Patrol Read" is paused.
Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None

2288 The "Patrol Read" has resumed.
Ok / Normal
Cause: This alert is provided for informational purposes.
2289 Multi-bit ECC error.
Critical / Failure / Error
Cause: An error involving multiple bits has been encountered during a read or write operation. The error correction algorithm recalculates parity data during read and write operations. If an error involves only a single bit, it may be possible for the error correction algorithm to correct the error and maintain parity data. An error involving multiple bits, however, usually indicates data loss. In some cases, if the multi-bit error occurs during a read operation, the data on the disk may be all right. If the multi-bit error occurs during a write operation, data loss has occurred.
Action: Replace the dual in-line memory module (DIMM). The DIMM is a part of the controller battery pack. See your hardware documentation for information on replacing the DIMM. You may need to restore data from backup.

2290 Single-bit ECC error.
Warning / Non-critical
Cause: An error involving a single bit has been encountered during a read or write operation. The error correction algorithm has corrected this error.
Action: None

2291 An EMM has been discovered.
Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None

2292 Communication with the enclosure has been lost.
Critical / Failure / Error
Cause: The controller has lost communication with an EMM. The cables may be loose or defective.
Action: Make sure the cables are attached securely.
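The difference between the single-bit and multi-bit ECC errors described for alerts 2289 and 2290 can be illustrated with a textbook Hamming(7,4) code. This is a generic teaching example, not the error correction algorithm used by any Dell controller or DIMM; the function names are hypothetical.

```python
def encode(data_bits):
    """Encode 4 data bits into a 7-bit Hamming codeword (positions 1..7)."""
    d1, d2, d3, d4 = data_bits
    p1 = d1 ^ d2 ^ d4  # parity over positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # parity over positions 2, 3, 6, 7
    p4 = d2 ^ d3 ^ d4  # parity over positions 4, 5, 6, 7
    return [p1, p2, d1, p4, d2, d3, d4]


def decode(codeword):
    """Return (data_bits, syndrome); a nonzero syndrome names the flipped position."""
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s4 = c[3] ^ c[4] ^ c[5] ^ c[6]
    syndrome = s1 + 2 * s2 + 4 * s4
    if syndrome:
        c[syndrome - 1] ^= 1  # assume a single-bit error and flip it back
    return [c[2], c[4], c[5], c[6]], syndrome


data = [1, 0, 1, 1]
single = encode(data)
single[4] ^= 1            # one flipped bit: corrected by decode
double = encode(data)
double[0] ^= 1
double[1] ^= 1            # two flipped bits: decode "corrects" the wrong bit
```

With one flipped bit the decoder recovers the original data. With two flipped bits the syndrome points at a third, innocent position, so the decoded data is wrong, which mirrors why a multi-bit error usually indicates data loss.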
2293 The EMM has failed.
Critical / Failure / Error
Cause: The failure may be caused by a loss of power to the EMM. The EMM self test may also have identified a failure. There could also be a firmware problem or a multi-bit error.
Action: Replace the EMM. See the hardware documentation for information on replacing the EMM.

2294 A device has been inserted.
Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None

2295 A device has been removed.
Critical / Failure / Error
Cause: A device has been removed and the system is no longer functioning in optimal condition.
Action: Replace the device.

2296 An EMM has been inserted.
Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None

2297 An EMM has been removed.
Critical / Failure / Error
Cause: An EMM has been removed.
Action: Replace the EMM. See the hardware documentation for information on replacing the EMM.

2298 There is a bad sensor on an enclosure.
Warning / Non-critical
Cause: The enclosure has a bad sensor. The enclosure sensors monitor the fan speeds, temperature probes, etc.
Action: See the hardware documentation for more information.
2299 Bad PHY %1
NOTE: %1 is a substitution variable that is replaced in the alert description with specific details about the alert.
Critical / Failure / Error
Cause: There is a problem with a physical connection or PHY.
Action: Replace the EMM that contains the bad PHY. See the hardware documentation for information on replacing the EMM. Attach the storage to a different connector, if available. Make sure the cables are attached securely.

2300 The enclosure is unstable.
Critical / Failure / Error
Cause: The controller is not receiving a consistent response from the enclosure. There could be a firmware problem or an invalid cabling configuration. If the cables are too long, they will degrade the signal.
Action: Power down all enclosures attached to the system and reboot the system. If the problem persists, upgrade the firmware to the latest supported version. You can download the most current version of the driver and firmware from support.dell.com. Make sure the cable configuration is valid. See the hardware documentation for valid cabling configurations.

2301 The enclosure has a hardware error.
Critical / Failure / Error
Cause: The enclosure or an enclosure component is in a Failed or Degraded state.
Action: Check the health of the enclosure and its components. Replace any hardware that is in a Failed state. See the hardware documentation for more information.

2302 The enclosure is not responding.
Critical / Failure / Error
Cause: The enclosure or an enclosure component is in a Failed or Degraded state.
Action: Check the health of the enclosure and its components. Replace any hardware that is in a Failed state. See the hardware documentation for more information.
2303 The enclosure cannot support both SAS and SATA array disks. Array disks may be disabled.
Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None

2304 An attempt to hot plug an EMM has been detected. This type of hot plug is not supported.
Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None

2305 The array disk is too small to be used for a rebuild.
Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None

2306 Bad block table is 80% full.
Warning / Non-critical
Cause: The bad block table is used for remapping bad disk blocks. This table fills as bad disk blocks are remapped. When the table is full, bad disk blocks can no longer be remapped, and disk errors can no longer be corrected. At this point, data loss can occur. The bad block table is now 80% full.
Action: Back up your data. Replace the disk generating this alert and restore from backup.

2307 Bad block table is full. Unable to log block %1
NOTE: %1 is a substitution variable that is replaced in the alert description with specific details about the alert.
Critical / Failure / Error
Cause: The bad block table is used for remapping bad disk blocks. This table fills as bad disk blocks are remapped. When the table is full, bad disk blocks can no longer be remapped and disk errors can no longer be corrected. At this point, data loss can occur.
Action: Replace the disk generating this alert and restore from backup. You may have lost data.
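The two bad block table alerts above (2306 and 2307) correspond to two fill levels. The sketch below maps a fill level to the matching alert ID. It assumes, for illustration only, that 2306 applies from exactly 80% upward and 2307 when every entry is used; the function name and the entry counts are hypothetical and are not taken from controller firmware.

```python
def bad_block_table_alert(used_entries, capacity):
    """Return 2307 when the table is full, 2306 at 80% or more, else None.

    Hypothetical helper for illustration; thresholds are assumptions.
    """
    if capacity <= 0:
        raise ValueError("capacity must be positive")
    if used_entries >= capacity:
        return 2307
    # Integer arithmetic avoids floating-point rounding at the 80% boundary.
    if used_entries * 100 >= capacity * 80:
        return 2306
    return None
```

A monitoring script could call this after each remapped block and back up data as soon as 2306 is returned, before the table fills completely.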
2309 An array disk is incompatible.
Warning / Non-critical
Cause: You have attempted to replace a disk with another disk that is using an incompatible technology. For example, you may have replaced one side of a mirror with a SAS disk when the other side of the mirror is using SATA technology.
Action: See the hardware documentation for information on replacing disks.

2310 A virtual disk is permanently degraded.
Critical / Failure / Error
Cause: A redundant virtual disk has lost redundancy. This may occur when the virtual disk suffers the failure of multiple array disks. In this case, both the source array disk and the target disk with redundant data have failed. A rebuild is not possible because there is no longer redundancy.
Action: Replace the failed disks and restore from backup.

2311 The firmware on the EMMs is not the same version. EMM0 %1 EMM1 %2
NOTE: %1 and %2 are substitution variables that are replaced in the alert description with specific details about the alert.
Warning / Non-critical
Cause: The firmware on the EMM modules is not the same version. It is required that both modules have the same version of the firmware. This alert may be caused if you attempt to insert an EMM module that has a different firmware version than an existing module.
Action: Upgrade to the same version of the firmware on both EMM modules.

2312 A power supply in the enclosure has an AC failure.
Warning / Non-critical

2313 A power supply in the enclosure has a DC failure.
Warning / Non-critical
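The condition behind alert 2311 above is a simple comparison of the two EMM firmware versions. The sketch below builds the alert text when they differ. The function name and the version strings are hypothetical; how Storage Management actually reads EMM firmware versions is not described in this guide.

```python
def emm_firmware_alert(emm0_version, emm1_version):
    """Return the alert 2311 text when the EMM firmware versions differ.

    Hypothetical helper for illustration; returns None when versions match.
    """
    if emm0_version == emm1_version:
        return None
    return (
        "The firmware on the EMMs is not the same version. "
        f"EMM0 {emm0_version} EMM1 {emm1_version}"
    )
```

The versions are compared as plain strings here, which is enough to detect a mismatch but says nothing about which module is newer.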
2314 The initialization sequence of SAS components failed during system startup. SAS management and monitoring is not possible.
Critical / Failure / Error
Cause: Storage Management is unable to monitor or manage SAS devices.
Action: Reboot the system. If the problem persists, make sure you have supported versions of the drivers and firmware. Also, you may need to reinstall Storage Management or Server Administrator because of some missing installation components.

2315 Diagnostic message %1
NOTE: %1 is a substitution variable that is replaced in the alert description with specific details about the alert.
Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None

2316 Diagnostic message %1
NOTE: %1 is a substitution variable that is replaced in the alert description with specific details about the alert.
Critical / Failure / Error
Cause: A diagnostics test failed. The text for this alert is generated by the utility that ran the diagnostics.
Action: See the documentation for the utility that ran the diagnostics for more information.

2317 BGI terminated due to loss of ownership in a cluster configuration.
Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None

2318 Problems with the battery or the battery charger have been detected. The battery health is poor.
Critical / Failure / Error
Cause: The battery or the battery charger is not functioning properly.
2319 Single-bit ECC error. The DIMM is degrading.
Warning / Non-critical
Cause: The DIMM is beginning to malfunction.
Action: Replace the DIMM to avoid data loss or data corruption. The DIMM is a part of the controller battery pack. See your hardware documentation for information on replacing the DIMM.

2320 Single-bit ECC error. The DIMM is critically degraded.
Critical / Failure / Error
Cause: The DIMM is malfunctioning. Data loss or data corruption may be imminent.
Action: Replace the DIMM immediately to avoid data loss or data corruption. The DIMM is a part of the controller battery pack. See your hardware documentation for information on replacing the DIMM.

2321 Single-bit ECC error. The DIMM is critically degraded. There will be no further reporting.
Critical / Failure / Error
Cause: The DIMM is malfunctioning. Data loss or data corruption is imminent. The DIMM must be replaced immediately. No further alerts will be generated.
Action: Replace the DIMM immediately. The DIMM is a part of the controller battery pack. See your hardware documentation for information on replacing the DIMM.

2322 The DC power supply is switched off.
Critical / Failure / Error
Cause: The power supply unit is switched off. Either a user switched off the power supply unit or it is defective.
Action: Check if the power switch is turned off. If it is turned off, turn it on. If the problem persists, check if the power cord is attached and functional. If the problem is still not corrected or if the power switch is already turned on, replace the power supply unit.

2323 The power supply is switched on.
Ok / Normal
Cause: This alert is provided for informational purposes.
Event ID: 2324
Description: The AC power supply cable has been removed.
Severity: Critical / Failure / Error
Cause: The power cable may be pulled out or removed. The power cable may also have overheated and become warped and nonfunctional.
Action: Replace the power cable.

Event ID: 2325
Description: The power supply cable has been inserted.
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None

Event ID: 2326
Description: A foreign configuration has been detected.
Severity: Ok / Normal
Cause: This alert is provided for informational purposes. The controller has array disks that were moved from another controller. These array disks contain virtual disks that were created on the other controller. See Import Foreign Configuration and Clear Foreign Configuration for more information.
Action: None

Event ID: 2327
Description: The NVRAM has corrupted data. The controller is reinitializing the NVRAM.
Severity: Warning / Non-critical
Cause: The NVRAM has corrupted data. This may occur after a power surge, a battery failure, or for other reasons. The controller is reinitializing the NVRAM.
Action: None. The controller is taking the required corrective action. If this alert is generated often (such as during each reboot), replace the controller.

Event ID: 2328
Description: The NVRAM has corrupt data.
Severity: Warning / Non-critical
Cause: The NVRAM has corrupt data. The controller is unable to correct the situation.
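
The action for alert 2327 recommends replacing the controller if the alert is generated often, such as during each reboot. One way to check for that pattern is sketched below. This is an illustration only: the function name nvram_alert_recurs, the input format (one list of logged event IDs per boot session), and the threshold of three boot sessions are all assumptions of this example and are not defined by Server Administrator.

```python
NVRAM_REINIT_EVENT_ID = 2327  # "The controller is reinitializing the NVRAM."


def nvram_alert_recurs(event_ids_per_boot: list[list[int]], min_boots: int = 3) -> bool:
    """Return True when alert 2327 was logged in at least min_boots boot sessions.

    min_boots is an arbitrary threshold chosen for this sketch. The guide
    itself only says "often (such as during each reboot)".
    """
    boots_with_alert = sum(
        1 for event_ids in event_ids_per_boot if NVRAM_REINIT_EVENT_ID in event_ids
    )
    return boots_with_alert >= min_boots


# Hypothetical history: event IDs seen in three consecutive boot sessions.
history = [[2327], [2327, 2323], [2327]]
print(nvram_alert_recurs(history))
```

The threshold is a parameter so that it can be adjusted to local policy.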
Event ID: 2329
Description: SAS port report: %1
NOTE: %1 is a substitution variable that will appear in the alert description for specific details about the alert.
Severity: Warning / Non-critical
Cause: The text for this alert is generated by the controller and can vary depending on the situation.
Action: Make sure the cables are attached securely. If the problem persists, replace the cable with a valid cable according to SAS specifications. If the problem still persists, you may need to replace some devices such as the controller or EMM. See the hardware documentation for more information.

Event ID: 2330
Description: SAS port report: %1
NOTE: %1 is a substitution variable that will appear in the alert description for specific details about the alert.
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None

Event ID: 2331
Description: A bad disk block has been reassigned.
Severity: Warning / Non-critical
Cause: The disk has a bad block. Data has been readdressed to another disk block and no data loss has occurred.
Action: Monitor the disk for other alerts or indications of poor health. For example, you may receive alert 2306. Replace the disk if you suspect there is a problem.

Event ID: 2332
Description: A controller hot plug has been detected.
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None

Event ID: 2333
Description: An enclosure temperature sensor differential has been detected.
Severity: Warning / Non-critical
Cause: The firmware has detected a temperature sensor differential in the enclosure.
Action: Monitor the enclosure for other alerts related to the temperature. For example, you may receive alerts related to the fan or temperature probes. Check the health of the enclosure and its components. Replace any component that has failed.
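
Several descriptions in these tables, such as "SAS port report: %1" for alerts 2329 and 2330, contain the substitution variable %1. The sketch below illustrates the idea of filling that placeholder with event-specific text. It is not the mechanism Storage Management uses internally: the function name render_alert is hypothetical, and the detail string is a stand-in for the text that the controller supplies.

```python
def render_alert(template: str, detail: str) -> str:
    """Replace the %1 placeholder in an alert description with detail text."""
    return template.replace("%1", detail)


# "SAS port report: %1" is the description listed for alerts 2329 and 2330.
# The detail string is a placeholder made up for this illustration.
message = render_alert("SAS port report: %1", "<detail text supplied by the controller>")
print(message)
```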
Event ID: 2334
Description: Controller event log: %1
NOTE: %1 is a substitution variable that will appear in the alert description for specific details about the alert.
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None

Event ID: 2335
Description: Controller event log: %1
NOTE: %1 is a substitution variable that will appear in the alert description for specific details about the alert.
Severity: Warning / Non-critical
Cause: The text for this alert is generated by the controller and can vary depending on the situation. This text is from events in the controller event log that were generated while Storage Management was not running.
Action: If there is a problem, review the controller event log and the Server Administrator Alert Log for significant events or alerts that may assist in diagnosing the problem. Check the health of the storage components. See the hardware documentation for more information.

Event ID: 2336
Description: Controller event log: %1
NOTE: %1 is a substitution variable that will appear in the alert description for specific details about the alert.
Severity: Critical / Failure / Error
Cause: The text for this alert is generated by the controller and can vary depending on the situation. This text is from events in the controller event log that were generated while Storage Management was not running.
Action: See the hardware documentation for more information.

Event ID: 2337
Description: The controller is unable to recover cached data from the battery backup unit (BBU).
Severity: Critical / Failure / Error
Cause: The controller was unable to recover data from the cache.
Action: Check if the battery is charged and in good health. When the battery charge is unacceptably low, it cannot maintain cached data. Check if the battery has reached its recharge limit. The battery may need to be recharged or replaced.
Event ID: 2338
Description: The controller has recovered cached data from the BBU.
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None

Event ID: 2339
Description: The factory default settings have been restored.
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None

Event ID: 2340
Description: The BGI completed with uncorrectable errors.
Severity: Critical / Failure / Error
Cause: The BGI task encountered errors that cannot be corrected. The virtual disk contains array disks that have unusable disk space or disk errors that cannot be corrected.
Action: Replace the array disk that contains the disk errors. Review other alert messages to identify the array disk that has errors. If the virtual disk is redundant, you can replace the array disk and continue using the virtual disk. If the virtual disk is non-redundant, you may need to recreate the virtual disk after replacing the array disk. After replacing the array disk, run a "Check Consistency" task to check the data.

Event ID: 2341
Description: The "Check Consistency" operation made corrections and completed.
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None

Event ID: 2342
Description: The "Check Consistency" task found inconsistent parity data. Data redundancy may be lost.
Severity: Warning / Non-critical
Cause: The data on a source disk and the redundant data on a target disk are inconsistent.
Action: Restart the "Check Consistency" task. If you receive this alert again, check the health of the array disks included in the virtual disk. Review the alert messages for significant alerts related to the array disks. If you suspect that an array disk has a problem, replace it and restore from backup.
Event ID: 2343
Description: The "Check Consistency" logging of inconsistent parity data is disabled.
Severity: Warning / Non-critical
Cause: The "Check Consistency" operation can no longer report errors in the parity data.
Action: See the hardware documentation for more information.

Event ID: 2344
Description: The virtual disk initialization terminated.
Severity: Warning / Non-critical
Cause: A user has cancelled the virtual disk initialization.
Action: Restart the initialization.

Event ID: 2345
Description: The virtual disk initialization failed.
Severity: Critical / Failure / Error
Cause: The controller cannot communicate with the attached devices. A disk may be removed or contain errors. The cables may also be loose or defective.
Action: Check the health of attached devices. Review the Alert Log for significant events and make sure the cables are attached securely.

Event ID: 2346
Description: Error occurred: %1
NOTE: %1 is a substitution variable that will appear in the alert description for specific details about the alert.
Severity: Warning / Non-critical
Cause: The text for this alert is generated by the firmware and can vary depending on the situation.
Action: Check the health of attached devices. Review the Alert Log for significant events. You may need to replace faulty hardware. Make sure the cables are attached securely. See the hardware documentation for more information.

Event ID: 2347
Description: The rebuild failed due to errors on the source physical disk.
Severity: Critical / Failure / Error
Cause: You are attempting to rebuild data that resides on a defective disk.
Action: Replace the source disk and restore from backup.

Event ID: 2348
Description: The rebuild failed due to errors on the target physical disk.
Severity: Critical / Failure / Error
Cause: You are attempting to rebuild data on a disk that is defective.
Action: Replace the target disk. If a rebuild does not automatically start after replacing the disk, initiate the "Rebuild" task. You may need to assign the new disk as a hot spare to initiate the rebuild.
Event ID: 2349
Description: A bad disk block could not be reassigned during a write operation.
Severity: Critical / Failure / Error
Cause: A write operation could not complete because the disk contains bad disk blocks that could not be reassigned. Data loss may have occurred and data redundancy may also be lost.
Action: Replace the disk.

Event ID: 2350
Description: There was an unrecoverable disk media error during the rebuild.
Severity: Critical / Failure / Error
Cause: The rebuild encountered an unrecoverable disk media error.
Action: Replace the disk.

Event ID: 2351
Description: A physical disk is marked as missing.
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None.

Event ID: 2352
Description: A physical disk that was marked as missing has been replaced.
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None.

Event ID: 2353
Description: The enclosure temperature has returned to normal.
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None.

Event ID: 2354
Description: Enclosure firmware download in progress.
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
Event ID: 2355
Description: Enclosure firmware download failed. The system was unable to download firmware to the enclosure. The controller may have lost communication with the enclosure. There may have been problems with the data transfer or the download media may be corrupt.
Severity: Warning / Non-critical
Cause: The system was unable to download firmware to the enclosure. The controller may have lost communication with the enclosure. There may have been problems with the data transfer or the download media may be corrupt.
Action: Attempt to download the enclosure firmware again. If problems continue, check if the controller can communicate with the enclosure. Make sure that the enclosure is powered on. Check the cables and the health of the enclosure and its components. To check the health of the enclosure, select the enclosure object in the tree view. The Health subtab displays a red X or yellow exclamation point for enclosure components that are failed or degraded.

Event ID: 2356
Description: SAS SMP communications error %1.
NOTE: %1 is a substitution variable that will appear in the alert description for specific details about the alert.
Severity: Critical / Failure / Error
Cause: The text for this alert is generated by the firmware and can vary depending on the situation. The reference to SMP in this text refers to SAS Management Protocol.
Action: There may be a SAS topology error. See the hardware documentation for information on correct SAS topology configurations. There may be problems with the cables such as a loose connection or an invalid cabling configuration. See the hardware documentation for information on correct cabling configurations. Check if the firmware is a supported version.
Event ID: 2357
Description: SAS expander error: %1
NOTE: %1 is a substitution variable that will appear in the alert description for specific details about the alert.
Severity: Critical / Failure / Error
Cause: The text for this alert is generated by the firmware and can vary depending on the situation.
Action: There may be a problem with the enclosure. Check the health of the enclosure and its components by selecting the enclosure object in the tree view. The Health subtab displays a red X or yellow exclamation point for enclosure components that are failed or degraded. See the enclosure documentation for more information.

Event ID: 2358
Description: The battery charge cycle is complete.
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None.

Event ID: 2359
Description: The physical disk is not certified.
Severity: Warning / Non-critical
Cause: The physical disk does not comply with the standards set by Dell and is not supported.
Action: Replace the physical disk with a physical disk that is supported.

Event ID: 2360
Description: A user has discarded data from the controller cache.
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None.

Event ID: 2361
Description: Array disk(s) that are part of a virtual disk have been removed while the system was shut down. This removal was discovered during system start-up.
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None.

Event ID: 2362
Description: Array disk(s) have been removed from a virtual disk. The virtual disk will be in Failed state during the next system reboot.
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.