Reproduction in any manner whatsoever without the written permission of Dell Inc. is strictly forbidden.
Trademarks used in this text: The DELL logo and Dell OpenManage are trademarks of Dell Inc.; Microsoft and Windows are registered
trademarks and Windows Server is a trademark of Microsoft Corporation; Red Hat is a registered trademark of Red Hat, Inc.; SUSE is a
registered trademark of Novell, Inc. in the United States and other countries.
Other trademarks and trade names may be used in this document to refer to either the entities claiming the marks and names or their products.
Dell Inc. disclaims any proprietary interest in trademarks and trade names other than its own.
Dell OpenManage™ Server Administrator produces event messages stored primarily in the operating
system or Server Administrator event logs and sometimes in SNMP traps. This document
describes the event messages created by Server Administrator version 5.0 or later and displayed in
the Server Administrator Alert log.
Server Administrator creates events in response to sensor status changes and other monitored
parameters. The Server Administrator event monitor uses these status change events to add
descriptive messages to the operating system event log or the Server Administrator Alert log.
Each event message that Server Administrator adds to the Alert log consists of a unique identifier
called the event ID for a specific event source category and a descriptive message. The event
message includes the severity, cause of the event, and other relevant information, such as the event
location and the monitored item’s previous state.
Tables provided in this guide list all Server Administrator event IDs in numeric order. Each entry
includes the event ID’s corresponding description, severity level, and cause. Message text in angle
brackets (for example, <State>) describes the event-specific information provided by the Server
Administrator.
What’s New in this Release
•Additional Miscellaneous messages
•Battery Sensor messages
•Additional Storage Management messages
Messages Not Described in This Guide
This guide describes only event messages created by Server Administrator and displayed in the
Server Administrator Alert log. For information on other messages produced by your system, consult
one of the following sources:
•Your system’s Installation and Troubleshooting Guide
•Other system documentation
•Operating system documentation
•Application program documentation
For more information on Array Manager event messages, see the Array Manager documentation.
Introduction5
Understanding Event Messages
This section describes the various types of event messages generated by the Server Administrator.
When an event occurs on your system, the Server Administrator sends information about one of the
following event types to the systems management console:
Table 1-1. Understanding Event Messages
Alert Severity: OK/Normal
Component Status: An event that describes the successful operation of a unit. The alert is provided for
informational purposes and does not indicate an error condition. For example, the
alert may indicate the normal start or stop of an operation, such as power supply or a
sensor reading returning to normal.
Alert Severity: Warning/Non-critical
Component Status: An event that is not necessarily significant, but may indicate a possible future
problem. For example, a Warning/Non-critical alert may indicate that a
component (such as a temperature probe in an enclosure) has crossed a warning
threshold.
Alert Severity: Critical/Failure/Error
Component Status: A significant event that indicates actual or imminent loss of data or loss of function.
For example, crossing a failure threshold or a hardware failure such as
an array disk.
Server Administrator generates events based on status changes in the following sensors:
•Temperature Sensor: Helps protect critical components by alerting the systems management
console when temperatures become too high inside a chassis; also monitors a variety of locations in the
chassis and in any attached systems.
•Fan Sensor: Monitors fans in various locations in the chassis and in any attached systems.
•Voltage Sensor: Monitors voltages across critical components in various chassis locations and in any
attached systems.
•Current Sensor: Monitors the current (or amperage) output from the power supply (or supplies) in
the chassis and in any attached systems.
•Chassis Intrusion Sensor: Monitors intrusion into the chassis and any attached systems.
•Redundancy Unit Sensor: Monitors redundant units (critical units such as fans, AC power cords, or
power supplies) within the chassis; also monitors the chassis and any attached systems. For example,
redundancy allows a second or nth fan to keep the chassis components at a safe temperature when
another fan has failed. Redundancy is normal when the intended number of critical components are
operating. Redundancy is degraded when a component fails, but others are still operating. Redundancy
is lost when there is one less critical redundancy device than required.
•Power Supply Sensor: Monitors power supplies in the chassis and in any attached systems.
•Memory Prefailure Sensor: Monitors memory modules by counting the number of Error Correction
Code (ECC) memory corrections.
•Fan Enclosure Sensor: Monitors protective fan enclosures by detecting their removal from and
insertion into the system, and by measuring how long a fan enclosure is absent from the chassis.
This sensor monitors the chassis and any attached systems.
•AC Power Cord Sensor: Monitors the presence of AC power for an AC power cord.
•Hardware Log Sensor: Monitors the size of a hardware log.
•Processor Sensor: Monitors the processor status in the system.
•Pluggable Device Sensor: Monitors the addition, removal, or configuration errors for some
pluggable devices, such as memory cards.
•Battery Sensor: Monitors the status of one or more batteries in the system.
Sample Event Message Text
The following example shows the format of the event messages logged by Server Administrator.
EventID: 1000
Source: Server Administrator
Category: Instrumentation Service
Type: Information
Date and Time: Mon Oct 21 10:38:00 2002
Computer: <computer name>
Description: Server Administrator starting
Data: Bytes in Hex
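Because each line of the sample message is a label followed by a colon and a value, the text can be split into named fields. The following is a minimal sketch, not part of Server Administrator; the function name parse_event_message is hypothetical, and it assumes that a label with no value on its own line (such as Description:) is followed by its value on the next line.

```python
def parse_event_message(text):
    """Split 'Label: value' lines into a dict (hypothetical helper).

    Assumption: a label with an empty value takes the next non-empty
    line as its value, as can happen with the Description field.
    """
    fields = {}
    pending_label = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue
        if pending_label is not None:
            # The previous line was a bare label; this line is its value.
            fields[pending_label] = line
            pending_label = None
            continue
        # Split on the first colon only, so time stamps keep their colons.
        label, separator, value = line.partition(":")
        if not separator:
            continue
        value = value.strip()
        if value:
            fields[label.strip()] = value
        else:
            pending_label = label.strip()
    return fields


SAMPLE_MESSAGE = """EventID: 1000
Source: Server Administrator
Category: Instrumentation Service
Type: Information
Date and Time: Mon Oct 21 10:38:00 2002
Computer: <computer name>
Description:
Server Administrator starting
Data: Bytes in Hex"""

if __name__ == "__main__":
    for label, value in parse_event_message(SAMPLE_MESSAGE).items():
        print(f"{label} = {value}")
```

The sketch splits on the first colon only, which keeps the time stamp in the Date and Time field intact.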
Viewing Alerts and Event Messages
An event log is used to record information about important events.
Server Administrator generates alerts that are added to the operating system event log and to the
Server Administrator Alert log. To view these alerts in Server Administrator:
1. Select the System object in the tree view.
2. Select the Logs tab.
3. Select the Alert subtab.
You can also view the event log using your operating system’s event viewer. Each operating system’s event
viewer accesses the applicable operating system event log.
The location of the event log file depends on the operating system you are using.
•In the Microsoft® Windows® 2000 Advanced Server and Windows Server™ 2003 operating systems,
messages are logged to the system event log and optionally to a unicode text file, dcsys32.log (viewable
using Notepad), that is located in the install_path\omsa\log directory. The default install_path is
C:\Program Files\Dell\SysMgt.
•In the Red Hat® Enterprise Linux and SUSE® Linux Enterprise Server operating systems, messages are
logged to the system log file. The default name of the system log file is /var/log/messages. You can view
the messages file using a text editor such as vi or emacs.
NOTE: Logging messages to a unicode text file is optional. By default, the feature is disabled. To enable this
feature, modify the Event Manager section of the dcemdy32.ini file as follows:
•In Windows, locate the file at <install_path>\dataeng\ini and set UnitextLog.enabled=True.
The default install_path is C:\Program Files\Dell\SysMgt. Restart the DSM SA Event Manager service.
•In Red Hat Enterprise Linux and SUSE Linux Enterprise Server, locate the file at <install_path>/dataeng/ini and
set UnitextLog.enabled=True. The default install_path is /opt/dell/srvadmin. Issue the
"/etc/init.d/dataeng restart" command to restart the Server Administrator event manager service. This will also
restart the Server Administrator data manager and SNMP services.
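The UnitextLog.enabled setting described in the NOTE can also be changed with a script. The following is a rough sketch only. It assumes that dcemdy32.ini follows ordinary INI syntax that Python's configparser can read, and that the section header is literally "Event Manager"; neither assumption is confirmed by this guide, and the function name enable_unitext_log is hypothetical. Note that configparser discards comments when it rewrites a file, so try the sketch on a copy first and restart the event manager service afterward as described above.

```python
import configparser


def enable_unitext_log(ini_path, section="Event Manager"):
    """Set UnitextLog.enabled=True in the given INI file (hypothetical helper).

    Assumptions: the file is plain INI text, and the section name matches
    the 'section' argument. Comments in the file are not preserved.
    """
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep the original case of key names
    parser.read(ini_path)
    if not parser.has_section(section):
        parser.add_section(section)
    parser.set(section, "UnitextLog.enabled", "True")
    with open(ini_path, "w") as handle:
        # Write key=value without spaces, matching the form shown in the NOTE.
        parser.write(handle, space_around_delimiters=False)


if __name__ == "__main__":
    # Example path built from the default Linux install_path named in this guide.
    enable_unitext_log("/opt/dell/srvadmin/dataeng/ini/dcemdy32.ini")
```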
The following subsections explain how to open the Windows 2000 Advanced Server, Windows Server 2003,
and the Red Hat Enterprise Linux and SUSE Linux Enterprise Server event viewers.
Viewing Events in Windows 2000 and Windows Server 2003
1. Click the Start button, point to Settings, and click Control Panel.
2. Double-click Administrative Tools, and then double-click Event Viewer.
3. In the Event Viewer window, click the Tree tab and then click System Log.
The System Log window displays a list of recently logged events.
4. To view the details of an event, double-click one of the event items.
NOTE: You can also look up the dcsys32.log file, in the install_path\omsa\log directory, to view the separate
event log file. The default install_path is C:\Program Files\Dell\SysMgt.
Viewing Events in Red Hat Enterprise Linux and SUSE Linux Enterprise Server
1. Log in as root.
2. Use a text editor such as vi or emacs to view the file named /var/log/messages.
The following example shows the Red Hat Enterprise Linux (and SUSE Linux Enterprise Server)
message log, /var/log/messages. The text in boldface type indicates the message text.
NOTE: These messages are typically displayed as one long line. In the following example, the message is
displayed using line breaks to help you see the message text more clearly.
...
Feb 6 14:20:51 server01 Server Administrator: Instrumentation Service
EventID: 1000
Server Administrator starting
Feb 6 14:20:51 server01 Server Administrator: Instrumentation Service
EventID: 1001
Server Administrator startup complete
Feb 6 14:21:21 server01 Server Administrator: Instrumentation Service
EventID: 1254 Chassis intrusion detected Sensor location: Main chassis
intrusion Chassis location: Main System Chassis Previous state was: OK
(Normal) Chassis intrusion state: Open
Feb 6 14:21:51 server01 Server Administrator: Instrumentation Service
EventID: 1252 Chassis intrusion returned to normal Sensor location: Main
chassis intrusion Chassis location: Main System Chassis Previous state
was: Critical (Failed) Chassis intrusion state: Closed
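Entries like those in the example can be picked out of the messages file with a regular expression. The following is a minimal sketch, assuming each entry occupies one line in the file (as the NOTE above says is typical) and follows the layout shown in the example: time stamp, host name, "Server Administrator:", category, "EventID:", and message text. The function name parse_syslog_line and the pattern are illustrative and are not part of Server Administrator.

```python
import re

# Pattern derived from the example entries in this guide (an assumption,
# not a documented log format).
LINE_PATTERN = re.compile(
    r"^(?P<timestamp>\w{3}\s+\d{1,2}\s+\d{2}:\d{2}:\d{2})\s+"
    r"(?P<host>\S+)\s+Server Administrator:\s+"
    r"(?P<category>.+?)\s+EventID:\s*(?P<event_id>\d+)\s+"
    r"(?P<message>.*)$"
)


def parse_syslog_line(line):
    """Return a dict of fields for a Server Administrator entry, or None."""
    match = LINE_PATTERN.match(line.strip())
    if match is None:
        return None
    record = match.groupdict()
    record["event_id"] = int(record["event_id"])
    return record


if __name__ == "__main__":
    with open("/var/log/messages") as log_file:
        for log_line in log_file:
            entry = parse_syslog_line(log_line)
            if entry is not None:
                print(entry["timestamp"], entry["event_id"], entry["message"])
```

Lines written by other programs do not match the pattern and are skipped.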
Viewing the Event Information
The event log for each operating system contains some or all of the following information:
•Date: The date the event occurred.
•Time: The local time the event occurred.
•Type: A classification of the event severity: Information, Warning, or Error.
•User: The name of the user on whose behalf the event occurred.
•Computer: The name of the system where the event occurred.
•Source: The software that logged the event.
•Category: The classification of the event by the event source.
•Event ID: The number identifying the particular event type.
•Description: A description of the event. The format and contents of the event description vary,
depending on the event type.
Understanding the Event Description
Table 1-2 lists in alphabetical order each line item that may appear in the event description.
Table 1-2. Event Description Reference

Description Line Item: Action performed was: <Action>
Explanation: Specifies the action that was performed, for example:
Action performed was: Power cycle

Description Line Item: Action requested was: <Action>
Explanation: Specifies the action that was requested, for example:
Action requested was: Reboot, shutdown OS first

Description Line Item: Additional Details: <Additional details for the event>
Explanation: Specifies additional details available for the hot plug event, for example:
Memory device: DIMM1_A Serial number: FFFF30B1

Description Line Item: <Additional power supply status information>
Explanation: Specifies information pertaining to the event, for example:
Power supply input AC is off, Power supply POK (power OK) signal is not normal, Power supply is turned off

Description Line Item: Chassis intrusion state: <Intrusion state>
Explanation: Specifies the chassis intrusion state (open or closed), for example:
Chassis intrusion state: Open

Description Line Item: Chassis location: <Name of chassis>
Explanation: Specifies the name of the chassis that generated the message, for example:
Chassis location: Main System Chassis

Description Line Item: Configuration error type: <type of configuration error>
Explanation: Specifies the type of configuration error that occurred, for example:
Configuration error type: Revision mismatch

Description Line Item: Current sensor value (in Amps): <Reading>
Explanation: Specifies the current sensor value in amps, for example:
Current sensor value (in Amps): 7.853

Description Line Item: Date and time of action: <Date and time>
Explanation: Specifies the date and time the action was performed, for example:
Date and time of action: Sat Jun 12 16:20:33 2004

Description Line Item: Device location: <Location in chassis>
Explanation: Specifies the location of the device in the specified chassis, for example:
Device location: Memory Card A

Description Line Item: Discrete current state: <State>
Explanation: Specifies the state of the current sensor, for example:
Discrete current state: Good

Description Line Item: Discrete temperature state: <State>
Explanation: Specifies the state of the temperature sensor.

Description Line Item: Redundancy unit: <Redundancy location in chassis>
Explanation: Specifies the location of the redundant power supply or cooling unit in the chassis, for example:
Redundancy unit: Fan Enclosure

Description Line Item: Sensor location: <Location in chassis>
Explanation: Specifies the location of the sensor in the specified chassis, for example:
Sensor location: CPU1

Description Line Item: Temperature sensor value (in degrees Celsius): <Reading>
Explanation: Specifies the temperature in degrees Celsius, for example:
Temperature sensor value (in degrees Celsius): 30

Description Line Item: Voltage sensor value (in Volts): <Reading>
Explanation: Specifies the voltage sensor value in volts, for example:
Voltage sensor value (in Volts): 1.693
Event Message Reference
The following tables list in numerical order each event ID and its corresponding description, along
with its severity and cause.
NOTE: For corrective actions, see the appropriate documentation.
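In Tables 2-2 through 2-6 of this guide, each sensor category occupies a block of event IDs that starts at a multiple of 50, and the last digit of the offset (0 through 5) follows the same order of conditions and severities in every block. The following sketch encodes that observation as a lookup. It is an illustration derived from the tables, not an interface provided by Server Administrator; the names CATEGORY_BASES, SEVERITY_BY_OFFSET, and describe_event_id are hypothetical, and event IDs outside these five tables are deliberately not covered.

```python
# First event ID of each block, taken from Tables 2-2 through 2-6.
CATEGORY_BASES = {
    1050: "Temperature Sensor",
    1100: "Cooling Device",
    1150: "Voltage Sensor",
    1200: "Current Sensor",
    1250: "Chassis Intrusion",
}

# Severity for offsets 0-5 within a block, as listed in those tables.
SEVERITY_BY_OFFSET = (
    "Information",  # sensor has failed
    "Information",  # sensor value unknown
    "Information",  # returned to normal
    "Warning",      # warning value
    "Error",        # failure value
    "Error",        # non-recoverable value
)


def describe_event_id(event_id):
    """Return (category, severity) for IDs in Tables 2-2 to 2-6, else None."""
    base = event_id - (event_id % 50)
    offset = event_id - base
    if base not in CATEGORY_BASES or offset >= len(SEVERITY_BY_OFFSET):
        return None
    return CATEGORY_BASES[base], SEVERITY_BY_OFFSET[offset]


if __name__ == "__main__":
    for sample_id in (1053, 1254, 1000):
        print(sample_id, describe_event_id(sample_id))
```

The tables remain the authoritative source for the description and cause of each event.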
Miscellaneous Messages
Miscellaneous messages in Table 2-1 indicate that certain alert systems are up and working.
Table 2-1. Miscellaneous Messages
Event ID: 0000
Description: Log was cleared
Severity: Information
Cause: User cleared the log from Server Administrator.

Event ID: 0001
Description: Log backup created
Severity: Information
Cause: The log was full, copied to backup, and cleared.

Event ID: 1000
Description: Server Administrator starting
Severity: Information
Cause: Server Administrator is beginning to initialize.

Event ID: 1001
Description: Server Administrator startup complete
Severity: Information
Cause: Server Administrator completed its initialization.

Event ID: 1002
Description: A system BIOS update has been scheduled for the next reboot
Severity: Information
Cause: The user has chosen to update the flash basic input/output system (BIOS).

Event ID: 1003
Description: A previously scheduled system BIOS update has been canceled
Severity: Information
Cause: The user decides to cancel the flash BIOS update, or an error occurs during the flash.

Event ID: 1004
Description: Thermal shutdown protection has been initiated
Severity: Error
Cause: This message is generated when a system is configured for thermal shutdown due to an error event. If a temperature sensor reading exceeds the error threshold for which the system is configured, the operating system shuts down and the system powers off. This event may also be initiated on certain systems when a fan enclosure is removed from the system for an extended period of time.
Event ID: 1005
Description: SMBIOS data is absent
Severity: Warning
Cause: The system does not contain the required systems management BIOS version 2.2 or higher, or the BIOS is corrupted.

Event ID: 1006
Description:
Automatic System Recovery (ASR) action was performed
Action performed was: <Action>
Date and time of action: <Date and time>
Severity: Error
Cause: This message is generated when an automatic system recovery action is performed due to a hung operating system. The action performed and the time of action are provided.

Event ID: 1007
Description:
User initiated host system control action
Action requested was: <Action>
Severity: Information
Cause: User requested a host system control action to reboot, power off, or power cycle the system. Alternatively the user had indicated protective measures to be initiated in the event of a thermal shutdown.

Event ID: 1008
Description: Systems Management Data Manager Started
Severity: Information
Cause: Systems Management Data Manager services were started.

Event ID: 1009
Description: Systems Management Data Manager Stopped
Severity: Information
Cause: Systems Management Data Manager services were stopped.

Event ID: 1011
Description: RCI table is corrupt
Severity: Warning
Cause: This message is generated when the BIOS Remote Configuration Interface (RCI) table is corrupted or cannot be read by the systems management software.

Event ID: 1012
Description:
IPMI Status
Interface: <the IPMI interface being used>, <additional information if available and applicable>
Severity: Information
Cause: This message is generated to indicate the Intelligent Platform Management Interface (IPMI) status of the system. Additional information, when available, includes Baseboard Management Controller (BMC) not present, BMC not responding, System Event Log (SEL) not present, and Sensor Data Record (SDR) not present.
Temperature Sensor Messages
Temperature sensors listed in Table 2-2 help protect critical components by alerting the systems
management console when temperatures become too high inside a chassis. The temperature sensor
messages use additional variables: sensor location, chassis location, previous state, and temperature
sensor value or state.
Table 2-2. Temperature Sensor Messages
Event ID: 1050
Description:
Temperature sensor has failed
Sensor location: <Location in chassis>
Chassis location: <Name of chassis>
Previous state was: <State>
If sensor type is not discrete:
Temperature sensor value (in degrees Celsius): <Reading>
If sensor type is discrete:
Discrete temperature state: <State>
Severity: Information
Cause: A temperature sensor on the backplane board, system board, or the carrier in the specified system failed. The sensor location, chassis location, previous state, and temperature sensor value are provided.

Event ID: 1051
Description:
Temperature sensor value unknown
Sensor location: <Location in chassis>
Chassis location: <Name of chassis>
If sensor type is not discrete:
Temperature sensor value (in degrees Celsius): <Reading>
If sensor type is discrete:
Discrete temperature state: <State>
Severity: Information
Cause: A temperature sensor on the backplane board, system board, or drive carrier in the specified system could not obtain a reading. The sensor location, chassis location, previous state, and a nominal temperature sensor value are provided.
Event ID: 1052
Description:
Temperature sensor returned to a normal value
Sensor location: <Location in chassis>
Chassis location: <Name of chassis>
Previous state was: <State>
If sensor type is not discrete:
Temperature sensor value (in degrees Celsius): <Reading>
If sensor type is discrete:
Discrete temperature state: <State>
Severity: Information
Cause: A temperature sensor on the backplane board, system board, or drive carrier in the specified system returned to a valid range after crossing a failure threshold. The sensor location, chassis location, previous state, and temperature sensor value are provided.

Event ID: 1053
Description:
Temperature sensor detected a warning value
Sensor location: <Location in chassis>
Chassis location: <Name of chassis>
Previous state was: <State>
If sensor type is not discrete:
Temperature sensor value (in degrees Celsius): <Reading>
If sensor type is discrete:
Discrete temperature state: <State>
Severity: Warning
Cause: A temperature sensor on the backplane board, system board, or drive carrier in the specified system exceeded its warning threshold. The sensor location, chassis location, previous state, and temperature sensor value are provided.
Event ID: 1054
Description:
Temperature sensor detected a failure value
Sensor location: <Location in chassis>
Chassis location: <Name of chassis>
Previous state was: <State>
If sensor type is not discrete:
Temperature sensor value (in degrees Celsius): <Reading>
If sensor type is discrete:
Discrete temperature state: <State>
Severity: Error
Cause: A temperature sensor on the backplane board, system board, or drive carrier in the specified system exceeded its failure threshold. The sensor location, chassis location, previous state, and temperature sensor value are provided.

Event ID: 1055
Description:
Temperature sensor detected a non-recoverable value
Sensor location: <Location in chassis>
Chassis location: <Name of chassis>
Previous state was: <State>
If sensor type is not discrete:
Temperature sensor value (in degrees Celsius): <Reading>
If sensor type is discrete:
Discrete temperature state: <State>
Severity: Error
Cause: A temperature sensor on the backplane board, system board, or drive carrier in the specified system detected an error from which it cannot recover. The sensor location, chassis location, previous state, and temperature sensor value are provided.
Cooling Device Messages
Cooling device sensors listed in Table 2-3 monitor how well a fan is functioning. Cooling device
messages provide status and warning information for fans in a particular chassis.
Table 2-3. Cooling Device Messages
Event ID: 1100
Description:
Fan sensor has failed
Sensor location: <Location in chassis>
Chassis location: <Name of chassis>
Previous state was: <State>
Fan sensor value: <Reading>
Severity: Information
Cause: A fan sensor in the specified system is not functioning. The sensor location, chassis location, previous state, and fan sensor value are provided.

Event ID: 1101
Description:
Fan sensor value unknown
Sensor location: <Location in chassis>
Chassis location: <Name of chassis>
Previous state was: <State>
Fan sensor value: <Reading>
Severity: Information
Cause: A fan sensor in the specified system could not obtain a reading. The sensor location, chassis location, previous state, and a nominal fan sensor value are provided.

Event ID: 1102
Description:
Fan sensor returned to a normal value
Sensor location: <Location in chassis>
Chassis location: <Name of chassis>
Previous state was: <State>
Fan sensor value: <Reading>
Severity: Information
Cause: A fan sensor reading on the specified system returned to a valid range after crossing a warning threshold. The sensor location, chassis location, previous state, and fan sensor value are provided.

Event ID: 1103
Description:
Fan sensor detected a warning value
Sensor location: <Location in chassis>
Chassis location: <Name of chassis>
Previous state was: <State>
Fan sensor value: <Reading>
Severity: Warning
Cause: A fan sensor reading in the specified system exceeded a warning threshold. The sensor location, chassis location, previous state, and fan sensor value are provided.
Event ID: 1104
Description:
Fan sensor detected a failure value
Sensor location: <Location in chassis>
Chassis location: <Name of chassis>
Previous state was: <State>
Fan sensor value: <Reading>
Severity: Error
Cause: A fan sensor in the specified system detected the failure of one or more fans. The sensor location, chassis location, previous state, and fan sensor value are provided.

Event ID: 1105
Description:
Fan sensor detected a non-recoverable value
Sensor location: <Location in chassis>
Chassis location: <Name of chassis>
Previous state was: <State>
Fan sensor value: <Reading>
Severity: Error
Cause: A fan sensor detected an error from which it cannot recover. The sensor location, chassis location, previous state, and fan sensor value are provided.
Voltage Sensor Messages
Voltage sensors listed in Table 2-4 monitor the number of volts across critical components. Voltage
sensor messages provide status and warning information for voltage sensors in a particular chassis.
Table 2-4. Voltage Sensor Messages
Event ID: 1150
Description:
Voltage sensor has failed
Sensor location: <Location in chassis>
Chassis location: <Name of chassis>
Previous state was: <State>
If sensor type is not discrete:
Voltage sensor value (in Volts): <Reading>
If sensor type is discrete:
Discrete voltage state: <State>
Severity: Information
Cause: A voltage sensor in the specified system failed. The sensor location, chassis location, previous state, and voltage sensor value are provided.
Event ID: 1151
Description:
Voltage sensor value unknown
Sensor location: <Location in chassis>
Chassis location: <Name of chassis>
Previous state was: <State>
If sensor type is not discrete:
Voltage sensor value (in Volts): <Reading>
If sensor type is discrete:
Discrete voltage state: <State>
Severity: Information
Cause: A voltage sensor in the specified system could not obtain a reading. The sensor location, chassis location, previous state, and a nominal voltage sensor value are provided.

Event ID: 1152
Description:
Voltage sensor returned to a normal value
Sensor location: <Location in chassis>
Chassis location: <Name of chassis>
Previous state was: <State>
If sensor type is not discrete:
Voltage sensor value (in Volts): <Reading>
If sensor type is discrete:
Discrete voltage state: <State>
Severity: Information
Cause: A voltage sensor in the specified system returned to a valid range after crossing a failure threshold. The sensor location, chassis location, previous state, and voltage sensor value are provided.

Event ID: 1153
Description:
Voltage sensor detected a warning value
Sensor location: <Location in chassis>
Chassis location: <Name of chassis>
Previous state was: <State>
If sensor type is not discrete:
Voltage sensor value (in Volts): <Reading>
If sensor type is discrete:
Discrete voltage state: <State>
Severity: Warning
Cause: A voltage sensor in the specified system exceeded its warning threshold. The sensor location, chassis location, previous state, and voltage sensor value are provided.
Event ID: 1154
Description:
Voltage sensor detected a failure value
Sensor location: <Location in chassis>
Chassis location: <Name of chassis>
Previous state was: <State>
If sensor type is not discrete:
Voltage sensor value (in Volts): <Reading>
If sensor type is discrete:
Discrete voltage state: <State>
Severity: Error
Cause: A voltage sensor in the specified system exceeded its failure threshold. The sensor location, chassis location, previous state, and voltage sensor value are provided.

Event ID: 1155
Description:
Voltage sensor detected a non-recoverable value
Sensor location: <Location in chassis>
Chassis location: <Name of chassis>
Previous state was: <State>
If sensor type is not discrete:
Voltage sensor value (in Volts): <Reading>
If sensor type is discrete:
Discrete voltage state: <State>
Severity: Error
Cause: A voltage sensor in the specified system detected an error from which it cannot recover. The sensor location, chassis location, previous state, and voltage sensor value are provided.
Current Sensor Messages
Current sensors listed in Table 2-5 measure the amount of current (in amperes) that is traversing critical
components. Current sensor messages provide status and warning information for current sensors in a
particular chassis.
Table 2-5. Current Sensor Messages
Event ID: 1200
Description:
Current sensor has failed
Sensor location: <Location in chassis>
Chassis location: <Name of chassis>
Previous state was: <State>
If sensor type is not discrete:
Current sensor value (in Amps): <Reading>
If sensor type is discrete:
Discrete current state: <State>
Severity: Information
Cause: A current sensor on the power supply for the specified system failed. The sensor location, chassis location, previous state, and current sensor value are provided.

Event ID: 1201
Description:
Current sensor value unknown
Sensor location: <Location in chassis>
Chassis location: <Name of chassis>
Previous state was: <State>
If sensor type is not discrete:
Current sensor value (in Amps): <Reading>
If sensor type is discrete:
Discrete current state: <State>
Severity: Information
Cause: A current sensor on the power supply for the specified system could not obtain a reading. The sensor location, chassis location, previous state, and a nominal current sensor value are provided.
Event ID: 1202
Description:
Current sensor returned to a normal value
Sensor location: <Location in chassis>
Chassis location: <Name of chassis>
Previous state was: <State>
If sensor type is not discrete:
Current sensor value (in Amps): <Reading>
If sensor type is discrete:
Discrete current state: <State>
Severity: Information
Cause: A current sensor on the power supply for the specified system returned to a valid range after crossing a failure threshold. The sensor location, chassis location, previous state, and current sensor value are provided.

Event ID: 1203
Description:
Current sensor detected a warning value
Sensor location: <Location in chassis>
Chassis location: <Name of chassis>
Previous state was: <State>
If sensor type is not discrete:
Current sensor value (in Amps): <Reading>
If sensor type is discrete:
Discrete current state: <State>
Severity: Warning
Cause: A current sensor on the power supply for the specified system exceeded its warning threshold. The sensor location, chassis location, previous state, and current sensor value are provided.
Event ID: 1204
Description:
Current sensor detected a failure value
Sensor location: <Location in chassis>
Chassis location: <Name of chassis>
Previous state was: <State>
If sensor type is not discrete:
Current sensor value (in Amps): <Reading>
If sensor type is discrete:
Discrete current state: <State>
Severity: Error
Cause: A current sensor on the power supply for the specified system exceeded its failure threshold. The sensor location, chassis location, previous state, and current sensor value are provided.

Event ID: 1205
Description:
Current sensor detected a non-recoverable value
Sensor location: <Location in chassis>
Chassis location: <Name of chassis>
Previous state was: <State>
If sensor type is not discrete:
Current sensor value (in Amps): <Reading>
If sensor type is discrete:
Discrete current state: <State>
Severity: Error
Cause: A current sensor in the specified system detected an error from which it cannot recover. The sensor location, chassis location, previous state, and current sensor value are provided.
Chassis Intrusion Messages
Chassis intrusion messages listed in Table 2-6 are a security measure. Chassis intrusion means that
someone is opening the cover to a system’s chassis. Alerts are sent to prevent unauthorized removal of
parts from a chassis.
Table 2-6. Chassis Intrusion Messages
Event ID: 1250
Description:
Chassis intrusion sensor has failed
Sensor location: <Location in chassis>
Chassis location: <Name of chassis>
Previous state was: <State>
Chassis intrusion state: <Intrusion state>
Severity: Information
Cause: A chassis intrusion sensor in the specified system failed. The sensor location, chassis location, previous state, and chassis intrusion state are provided.

Event ID: 1251
Description:
Chassis intrusion sensor value unknown
Sensor location: <Location in chassis>
Chassis location: <Name of chassis>
Previous state was: <State>
Chassis intrusion state: <Intrusion state>
Severity: Information
Cause: A chassis intrusion sensor in the specified system could not obtain a reading. The sensor location, chassis location, previous state, and chassis intrusion state are provided.

Event ID: 1252
Description:
Chassis intrusion returned to normal
Sensor location: <Location in chassis>
Chassis location: <Name of chassis>
Previous state was: <State>
Chassis intrusion state: <Intrusion state>
Severity: Information
Cause: A chassis intrusion sensor in the specified system detected that a cover was opened while the system was operating but has since been replaced. The sensor location, chassis location, previous state, and chassis intrusion state are provided.
Event Message Reference25
Table 2-6. Chassis Intrusion Messages (continued)
Event ID DescriptionSeverityCause
1253Chassis intrusion in
progress
Sensor location: <Location
in chassis>
Chassis location: <Name of
chassis>
Previous state was: <State>
Chassis intrusion state:
<Intrusion state>
1254Chassis intrusion detected
Sensor location: <Location
in chassis>
Chassis location: <Name of
chassis>
Previous state was: <State>
Chassis intrusion state:
<Intrusion state>
1255Chassis intrusion sensor
detected a non-recoverable
value
Sensor location: <Location
in chassis>
Chassis location: <Name of
chassis>
Previous state was: <State>
Chassis intrusion state:
<Intrusion state>
WarningA chassis intrusion sensor in the specified
system detected that a system cover is currently
being opened and the system is operating.
The sensor location, chassis location, previous
state, and chassis intrusion state are provided.
ErrorA chassis intrusion sensor in the specified
system detected that the system cover was
opened while the system was operating.
The sensor location, chassis location, previous
state, and chassis intrusion state are provided.
ErrorA chassis intrusion sensor in the specified
system detected an error from which it cannot
recover. The sensor location, chassis location,
previous state, and chassis intrusion state are
provided.
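The chassis intrusion messages above carry their variable data as "Label: value" lines after a one-line summary. The following Python sketch shows one way such a description could be split into fields. The function name parse_alert_description and the sample values are hypothetical, and the actual layout of an entry in the Alert log may differ.

```python
def parse_alert_description(text):
    """Split an alert description into (summary, fields).

    Lines of the form "Label: value" become dictionary entries;
    any other non-empty line is treated as part of the summary.
    """
    summary_parts = []
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        label, sep, value = line.partition(":")
        if sep and value.strip():
            fields[label.strip()] = value.strip()
        else:
            summary_parts.append(line)
    return " ".join(summary_parts), fields


# Hypothetical sample modeled on event ID 1254; the values are invented.
sample = """Chassis intrusion detected
Sensor location: Main chassis intrusion
Chassis location: Main System Chassis
Previous state was: OK (Normal)
Chassis intrusion state: Open"""

summary, fields = parse_alert_description(sample)
```

This sketch assumes each label and its value sit on the same line; a value that wraps onto a following line would need extra handling.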
Redundancy Unit Messages
Redundancy means that a system chassis has more than one of certain critical components. Fans and
power supplies, for example, are so important for preventing damage or disruption of a computer system
that a chassis may have “extra” fans or power supplies installed. Redundancy allows a second or nth fan
to keep the chassis components at a safe temperature when the primary fan has failed. Redundancy is
normal when the intended number of critical components are operating. Redundancy is degraded when
a component fails but others are still operating. Redundancy is lost when the number of components
functioning falls below the redundancy threshold.
Table 2-7 lists the redundancy unit messages.
The number of devices required for full redundancy is provided as part of the message, when applicable,
for the redundancy unit and the platform. For details on redundancy computation, see the respective
platform documentation.
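The normal, degraded, and lost states described above can be sketched in Python as follows. This is an illustration only: the function redundancy_state and its parameters are hypothetical, and the actual redundancy computation is platform-specific.

```python
def redundancy_state(functional, intended, threshold):
    """Classify a redundancy unit from its component counts.

    functional: number of components currently operating
    intended:   number of components required for full redundancy
    threshold:  minimum number of operating components below which
                redundancy is considered lost (assumed, platform-specific)
    """
    if functional >= intended:
        return "normal"
    if functional >= threshold:
        return "degraded"
    return "lost"


# Example: a unit designed for three fans that stays redundant with two.
states = [redundancy_state(n, intended=3, threshold=2) for n in (3, 2, 1)]
```

With these assumed counts, three operating fans are normal, two are degraded, and one means redundancy is lost.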
Table 2-7. Redundancy Unit Messages

Event ID: 1300
Description: Redundancy sensor has failed
  Redundancy unit: <Redundancy location in chassis>
  Chassis location: <Name of chassis>
  Previous redundancy state was: <State>
Severity: Information
Cause: A redundancy sensor in the specified system failed. The redundancy unit location, chassis
  location, previous redundancy state, and the number of devices required for full
  redundancy are provided.

Event ID: 1301
Description: Redundancy sensor value unknown
  Redundancy unit: <Redundancy location in chassis>
  Chassis location: <Name of chassis>
  Previous redundancy state was: <State>
Severity: Information
Cause: A redundancy sensor in the specified system could not obtain a reading. The redundancy
  unit location, chassis location, previous redundancy state, and the number of devices
  required for full redundancy are provided.

Event ID: 1302
Description: Redundancy not applicable
  Redundancy unit: <Redundancy location in chassis>
  Chassis location: <Name of chassis>
  Previous redundancy state was: <State>
Severity: Information
Cause: A redundancy sensor in the specified system detected that a unit was not redundant.
  The redundancy location, chassis location, previous redundancy state, and the number
  of devices required for full redundancy are provided.

Event ID: 1303
Description: Redundancy is offline
  Redundancy unit: <Redundancy location in chassis>
  Chassis location: <Name of chassis>
  Previous redundancy state was: <State>
Severity: Information
Cause: A redundancy sensor in the specified system detected that a redundant unit is offline.
  The redundancy unit location, chassis location, previous redundancy state, and the
  number of devices required for full redundancy are provided.

Event ID: 1304
Description: Redundancy regained
  Redundancy unit: <Redundancy location in chassis>
  Chassis location: <Name of chassis>
  Previous redundancy state was: <State>
Severity: Information
Cause: A redundancy sensor in the specified system detected that a “lost” redundancy device has
  been reconnected or replaced; full redundancy is in effect. The redundancy unit location,
  chassis location, previous redundancy state, and the number of devices required for full
  redundancy are provided.

Event ID: 1305
Description: Redundancy degraded
  Redundancy unit: <Redundancy location in chassis>
  Chassis location: <Name of chassis>
  Previous redundancy state was: <State>
Severity: Warning
Cause: A redundancy sensor in the specified system detected that one of the components of the
  redundancy unit has failed but the unit is still redundant. The redundancy unit
  location, chassis location, previous redundancy state, and the number of devices required
  for full redundancy are provided.

Event ID: 1306
Description: Redundancy lost
  Redundancy unit: <Redundancy location in chassis>
  Chassis location: <Name of chassis>
  Previous redundancy state was: <State>
Severity: Warning or Error (depending on the number of units that are functional)
Cause: A redundancy sensor in the specified system detected that one of the components in the
  redundant unit has been disconnected, has failed, or is not present. The redundancy
  unit location, chassis location, previous redundancy state, and the number of devices
  required for full redundancy are provided.
Power Supply Messages
Power supply sensors monitor how well a power supply is functioning. Power supply messages listed in
Table 2-8 provide status and warning information for power supplies present in a particular chassis.
Table 2-8. Power Supply Messages

Event ID: 1350
Description: Power supply sensor has failed
  Sensor location: <Location in chassis>
  Chassis location: <Name of chassis>
  Previous state was: <State>
  Power Supply type: <type of power supply>
  <Additional power supply status information>
  If in configuration error state:
  Configuration error type: <type of configuration error>
Severity: Information
Cause: A power supply sensor in the specified system failed. The sensor location, chassis
  location, previous state, and additional power supply status information are provided.

Event ID: 1351
Description: Power supply sensor value unknown
  Sensor location: <Location in chassis>
  Chassis location: <Name of chassis>
  Previous state was: <State>
  Power Supply type: <type of power supply>
  <Additional power supply status information>
  If in configuration error state:
  Configuration error type: <type of configuration error>
Severity: Information
Cause: A power supply sensor in the specified system could not obtain a reading.
  The sensor location, chassis location, previous state, and additional power supply
  status information are provided.

Event ID: 1352
Description: Power supply returned to normal
  Sensor location: <Location in chassis>
  Chassis location: <Name of chassis>
  Previous state was: <State>
  Power Supply type: <type of power supply>
  <Additional power supply status information>
  If in configuration error state:
  Configuration error type: <type of configuration error>
Severity: Information
Cause: A power supply has been reconnected or replaced. The sensor location, chassis
  location, previous state, and additional power supply status information are provided.

Event ID: 1353
Description: Power supply detected a warning
  Sensor location: <Location in chassis>
  Chassis location: <Name of chassis>
  Previous state was: <State>
  Power Supply type: <type of power supply>
  <Additional power supply status information>
  If in configuration error state:
  Configuration error type: <type of configuration error>
Severity: Warning
Cause: A power supply sensor reading in the specified system exceeded a user-definable
  warning threshold. The sensor location, chassis location, previous state, and
  additional power supply status information are provided.

Event ID: 1354
Description: Power supply detected a failure
  Sensor location: <Location in chassis>
  Chassis location: <Name of chassis>
  Previous state was: <State>
  Power Supply type: <type of power supply>
  <Additional power supply status information>
  If in configuration error state:
  Configuration error type: <type of configuration error>
Severity: Error
Cause: A power supply has been disconnected or has failed. The sensor location, chassis
  location, previous state, and additional power supply status information are provided.

Event ID: 1355
Description: Power supply sensor detected a non-recoverable value
  Sensor location: <Location in chassis>
  Chassis location: <Name of chassis>
  Previous state was: <State>
  Power Supply type: <type of power supply>
  <Additional power supply status information>
  If in configuration error state:
  Configuration error type: <type of configuration error>
Severity: Error
Cause: A power supply sensor in the specified system detected an error from which it cannot
  recover. The sensor location, chassis location, previous state, and additional power supply
  status information are provided.
Memory Device Messages
Memory device messages listed in Table 2-9 provide status and warning information for memory
modules present in a particular system. Memory devices determine health status by monitoring the ECC
memory correction rate and the type of memory events that have occurred.
NOTE: A critical status does not always indicate a system failure or loss of data. In some instances, the system has
exceeded the ECC correction rate. Although the system continues to function, you should perform system
maintenance as described in Table 2-9.
NOTE: In Table 2-9, <status> can be either critical or non-critical.

Table 2-9. Memory Device Messages

Event ID: 1403
Description: Memory device status is <status>
  Memory device location: <location in chassis>
  Possible memory module event cause: <list of causes>
Severity: Warning
Cause: A memory device correction rate exceeded an acceptable value.
  The memory device status and location are provided.

Event ID: 1404
Description: Memory device status is <status>
  Memory device location: <location in chassis>
  Possible memory module event cause: <list of causes>
Severity: Error
Cause: A memory device correction rate exceeded an acceptable value, a memory
  spare bank was activated, or a multibit ECC error occurred. The system continues
  to function normally (except for a multibit error). Replace the memory
  module identified in the message during the system’s next scheduled maintenance.
  Clear the memory error on multibit ECC error. The memory device status and
  location are provided.
Fan Enclosure Messages
Some systems are equipped with a protective enclosure for fans. Fan enclosure messages listed in
Table 2-10 monitor whether foreign objects are present in an enclosure and how long a fan enclosure is
missing from a chassis.
Table 2-10. Fan Enclosure Messages

Event ID: 1450
Description: Fan enclosure sensor has failed
  Sensor location: <Location in chassis>
  Chassis location: <Name of chassis>
Severity: Information
Cause: The fan enclosure sensor in the specified system failed. The sensor location and chassis
  location are provided.

Event ID: 1451
Description: Fan enclosure sensor value unknown
  Sensor location: <Location in chassis>
  Chassis location: <Name of chassis>
Severity: Information
Cause: The fan enclosure sensor in the specified system could not obtain a reading. The sensor
  location and chassis location are provided.

Event ID: 1452
Description: Fan enclosure inserted into system
  Sensor location: <Location in chassis>
  Chassis location: <Name of chassis>
Severity: Information
Cause: A fan enclosure has been inserted into the specified system. The sensor location and
  chassis location are provided.

Event ID: 1453
Description: Fan enclosure removed from system
  Sensor location: <Location in chassis>
  Chassis location: <Name of chassis>
Severity: Warning
Cause: A fan enclosure has been removed from the specified system. The sensor location and
  chassis location are provided.

Event ID: 1454
Description: Fan enclosure removed from system for an extended amount of time
  Sensor location: <Location in chassis>
  Chassis location: <Name of chassis>
Severity: Error
Cause: A fan enclosure has been removed from the specified system for a user-definable length of
  time. The sensor location and chassis location are provided.

Event ID: 1455
Description: Fan enclosure sensor detected a non-recoverable value
  Sensor location: <Location in chassis>
  Chassis location: <Name of chassis>
Severity: Error
Cause: A fan enclosure sensor in the specified system detected an error from which it cannot recover.
  The sensor location and chassis location are provided.
AC Power Cord Messages
AC power cord messages listed in Table 2-11 provide status and warning information for power cords
that are part of an AC power switch, if your system supports AC switching.
Table 2-11. AC Power Cord Messages

Event ID: 1500
Description: AC power cord sensor has failed
  Sensor location: <Location in chassis>
  Chassis location: <Name of chassis>
Severity: Information
Cause: An AC power cord sensor in the specified system failed. The AC power cord status
  cannot be monitored. The sensor location and chassis location information are provided.

Event ID: 1501
Description: AC power cord is not being monitored
  Sensor location: <Location in chassis>
  Chassis location: <Name of chassis>
Severity: Information
Cause: The AC power cord status is not being monitored. This occurs when a system’s
  expected AC power configuration is set to nonredundant. The sensor location and
  chassis location information are provided.

Event ID: 1502
Description: AC power has been restored
  Sensor location: <Location in chassis>
  Chassis location: <Name of chassis>
Severity: Information
Cause: An AC power cord that did not have AC power has had the power restored.
  The sensor location and chassis location information are provided.

Event ID: 1503
Description: AC power has been lost
  Sensor location: <Location in chassis>
  Chassis location: <Name of chassis>
Severity: Warning
Cause: An AC power cord has lost its power, but there is sufficient redundancy to classify
  this as a warning. The sensor location and chassis location information are provided.

Event ID: 1504
Description: AC power has been lost
  Sensor location: <Location in chassis>
  Chassis location: <Name of chassis>
Severity: Error
Cause: An AC power cord has lost its power, and lack of redundancy requires this to be
  classified as an error. The sensor location and chassis location information are provided.

Event ID: 1505
Description: AC power has been lost
  Sensor location: <Location in chassis>
  Chassis location: <Name of chassis>
Severity: Error
Cause: An AC power cord sensor in the specified system failed. The AC power cord status
  cannot be monitored. The sensor location and chassis location information are provided.
Hardware Log Sensor Messages
Hardware logs provide hardware status messages to systems management software. On certain systems,
the hardware log is implemented as a circular queue. When the log becomes full, the oldest status
messages are overwritten when new status messages are logged. On some systems, the log is not circular.
On these systems, when the log becomes full, subsequent hardware status messages are lost. Hardware
log sensor messages listed in Table 2-12 provide status and warning information about the noncircular
logs that may fill up, resulting in lost status messages.
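The difference between circular and noncircular hardware logs can be illustrated with a short Python sketch. The class names CircularLog and NonCircularLog, the capacity of three entries, and the message strings are all hypothetical; actual hardware log sizes and formats depend on the system.

```python
from collections import deque


class CircularLog:
    """Log that overwrites its oldest entry once it is full."""

    def __init__(self, capacity):
        # A deque with maxlen discards the oldest item on overflow.
        self.entries = deque(maxlen=capacity)

    def add(self, message):
        self.entries.append(message)


class NonCircularLog:
    """Log that drops new entries once it is full."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = []
        self.lost = 0  # count of messages that could not be stored

    def add(self, message):
        if len(self.entries) >= self.capacity:
            self.lost += 1
            return False
        self.entries.append(message)
        return True


circular = CircularLog(capacity=3)
noncircular = NonCircularLog(capacity=3)
for i in range(1, 6):
    circular.add(f"status {i}")
    noncircular.add(f"status {i}")
```

After five messages, the circular log holds the three newest entries, while the noncircular log holds the three oldest and has lost two. The noncircular case is the one the log sensor messages warn about.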
Table 2-12. Hardware Log Sensor Messages

Event ID: 1550
Description: Log monitoring has been disabled
  Log type: <Log type>
Severity: Information
Cause: A hardware log sensor in the specified system is disabled. The log type information
  is provided.

Event ID: 1551
Description: Log status is unknown
  Log type: <Log type>
Severity: Information
Cause: A hardware log sensor in the specified system could not obtain a reading. The log
  type information is provided.

Event ID: 1552
Description: Log size is no longer near or at capacity
  Log type: <Log type>
Severity: Information
Cause: The hardware log on the specified system is no longer near or at its capacity, usually as
  the result of clearing the log. The log type information is provided.

Event ID: 1553
Description: Log size is near or at capacity
  Log type: <Log type>
Severity: Warning
Cause: The size of a hardware log on the specified system is near or at the capacity of the
  hardware log. The log type information is provided.

Event ID: 1554
Description: Log size is full
  Log type: <Log type>
Severity: Error
Cause: The size of a hardware log on the specified system is full. The log type information is
  provided.

Event ID: 1555
Description: Log sensor has failed
  Log type: <Log type>
Severity: Error
Cause: A hardware log sensor in the specified system failed. The hardware log status
  cannot be monitored. The log type information is provided.
Processor Sensor Messages
Processor sensors monitor how well a processor is functioning. Processor messages listed in Table 2-13
provide status and warning information for processors in a particular chassis.
Table 2-13. Processor Sensor Messages

Event ID: 1600
Description: Processor sensor has failed
  Sensor Location: <Location in chassis>
  Chassis Location: <Name of chassis>
  Previous state was: <State>
  Processor sensor status: <status>
Severity: Information
Cause: A processor sensor in the specified system is not functioning. The sensor location, chassis
  location, previous state and processor sensor status are provided.

Event ID: 1601
Description: Processor sensor value unknown
  Sensor Location: <Location in chassis>
  Chassis Location: <Name of chassis>
  Previous state was: <State>
  Processor sensor status: <status>
Severity: Information
Cause: A processor sensor in the specified system could not obtain a reading. The sensor
  location, chassis location, previous state and processor sensor status are provided.

Event ID: 1602
Description: Processor sensor returned to a normal value
  Sensor Location: <Location in chassis>
  Chassis Location: <Name of chassis>
  Previous state was: <State>
  Processor sensor status: <status>
Severity: Information
Cause: A processor sensor in the specified system transitioned back to a normal state.
  The sensor location, chassis location, previous state and processor sensor status
  are provided.

Event ID: 1603
Description: Processor sensor detected a warning value
  Sensor Location: <Location in chassis>
  Chassis Location: <Name of chassis>
  Previous state was: <State>
  Processor sensor status: <status>
Severity: Warning
Cause: A processor sensor in the specified system is in a throttled state. The sensor location,
  chassis location, previous state and processor sensor status are provided.

Event ID: 1604
Description: Processor sensor detected a failure value
  Sensor Location: <Location in chassis>
  Chassis Location: <Name of chassis>
  Previous state was: <State>
  Processor sensor status: <status>
Severity: Error
Cause: A processor sensor in the specified system is disabled, has a configuration error, or
  experienced a thermal trip. The sensor location, chassis location, previous state and
  processor sensor status are provided.

Event ID: 1605
Description: Processor sensor detected a non-recoverable value
  Sensor Location: <Location in chassis>
  Chassis Location: <Name of chassis>
  Previous state was: <State>
  Processor sensor status: <status>
Severity: Error
Cause: A processor sensor in the specified system has failed. The sensor location, chassis
  location, previous state and processor sensor status are provided.
Pluggable Device Messages
The pluggable device messages listed in Table 2-14 provide status and error information when some
devices, such as memory cards, are added or removed.
Table 2-14. Pluggable Device Messages

Event ID: 1650
Description: <Device plug event type unknown>
  Device location: <Location in chassis, if available>
  Chassis location: <Name of chassis, if available>
  Additional details: <Additional details for the events, if available>
Severity: Information
Cause: A pluggable device event message of unknown type was received. The device location,
  chassis location, and additional event details, if available, are provided.

Event ID: 1651
Description: Device added to system
  Device location: <Location in chassis>
  Chassis location: <Name of chassis>
  Additional details: <Additional details for the events>
Severity: Information
Cause: A device was added in the specified system. The device location, chassis location, and
  additional event details, if available, are provided.

Event ID: 1652
Description: Device removed from system
  Device location: <Location in chassis>
  Chassis location: <Name of chassis>
  Additional details: <Additional details for the events>
Severity: Information
Cause: A device was removed from the specified system. The device location, chassis
  location, and additional event details, if available, are provided.

Event ID: 1653
Description: Device configuration error detected
  Device location: <Location in chassis>
  Chassis location: <Name of chassis>
  Additional details: <Additional details for the events>
Severity: Error
Cause: A configuration error was detected for a pluggable device in the specified
  system. The device may have been added to the system incorrectly.
Battery Sensor Messages
Battery sensors monitor how well a battery is functioning. Battery messages listed in Table 2-15 provide
status and warning information for batteries in a particular chassis.
Table 2-15. Battery Sensor Messages

Event ID: 1700
Description: Battery sensor has failed
  Sensor location: <Location in chassis>
  Chassis location: <Name of chassis>
  Previous state was: <State>
  Battery sensor status: <status>
Severity: Information
Cause: A battery sensor in the specified system is not functioning. The sensor
  location, chassis location, previous state, and battery sensor status are provided.

Event ID: 1701
Description: Battery sensor value unknown
  Sensor Location: <Location in chassis>
  Chassis Location: <Name of chassis>
  Previous state was: <State>
  Battery sensor status: <status>
Severity: Information
Cause: A battery sensor in the specified system could not retrieve a reading. The sensor
  location, chassis location, previous state, and battery sensor status are provided.

Event ID: 1702
Description: Battery sensor returned to a normal value
  Sensor Location: <Location in chassis>
  Chassis Location: <Name of chassis>
  Previous state was: <State>
  Battery sensor status: <status>
Severity: Information
Cause: A battery sensor in the specified system detected that a battery transitioned
  back to a normal state. The sensor location, chassis location, previous state, and
  battery sensor status are provided.

Event ID: 1703
Description: Battery sensor detected a warning value
  Sensor Location: <Location in chassis>
  Chassis Location: <Name of chassis>
  Previous state was: <State>
  Battery sensor status: <status>
Severity: Warning
Cause: A battery sensor in the specified system detected that a battery is in a predictive
  failure state. The sensor location, chassis location, previous state, and battery
  sensor status are provided.

Event ID: 1704
Description: Battery sensor detected a failure value
  Sensor Location: <Location in chassis>
  Chassis Location: <Name of chassis>
  Previous state was: <State>
  Battery sensor status: <status>
Severity: Error
Cause: A battery sensor in the specified system detected that a battery has failed.
  The sensor location, chassis location, previous state, and battery sensor status are
  provided.

Event ID: 1705
Description: Battery sensor detected a non-recoverable value
  Sensor Location: <Location in chassis>
  Chassis Location: <Name of chassis>
  Previous state was: <State>
  Battery sensor status: <status>
Severity: Error
Cause: A battery sensor in the specified system detected that a battery has failed.
  The sensor location, chassis location, previous state, and battery sensor status are
  provided.
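Because each message category above occupies its own block of event IDs, a script that processes alerts can recover the category from the ID alone. The Python sketch below is a hypothetical helper, not part of Server Administrator. Its ranges are taken only from Tables 2-6 through 2-15, so IDs from other tables in this guide return None.

```python
# (first ID, last ID, category), as listed in Tables 2-6 through 2-15.
EVENT_ID_RANGES = [
    (1250, 1255, "Chassis Intrusion"),
    (1300, 1306, "Redundancy Unit"),
    (1350, 1355, "Power Supply"),
    (1403, 1404, "Memory Device"),
    (1450, 1455, "Fan Enclosure"),
    (1500, 1505, "AC Power Cord"),
    (1550, 1555, "Hardware Log Sensor"),
    (1600, 1605, "Processor Sensor"),
    (1650, 1653, "Pluggable Device"),
    (1700, 1705, "Battery Sensor"),
]


def event_category(event_id):
    """Return the message category for an event ID, or None if unlisted."""
    for first, last, name in EVENT_ID_RANGES:
        if first <= event_id <= last:
            return name
    return None


category = event_category(1306)
```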
System Event Log Messages for IPMI Systems
The following tables list the system event log (SEL) messages, their severity, and cause.
NOTE: For corrective actions, see the appropriate documentation.
Temperature Sensor Events
The temperature sensor event messages help protect critical components by alerting the systems
management console when the temperature rises inside the chassis. These event messages use
additional variables, such as sensor location, chassis location, previous state, and temperature
sensor value or state.
Table 3-1. Temperature Sensor Events

Event Message: <Sensor Name/Location> temperature sensor detected a failure <Reading>
  where <Sensor Name/Location> is the entity that this sensor is monitoring.
  For example, "PROC Temp" or "Planar Temp."
  Reading is specified in degrees Celsius. For example, 100 C.
Severity: Critical
Cause: Temperature of the backplane board, system board, or the carrier in the specified system
  <Sensor Name/Location> exceeded the critical threshold.

Event Message: <Sensor Name/Location> temperature sensor detected a warning <Reading>.
Severity: Warning
Cause: Temperature of the backplane board, system board, or the carrier in the specified system
  <Sensor Name/Location> exceeded the non-critical threshold.

Event Message: <Sensor Name/Location> temperature sensor returned to warning state <Reading>.
Severity: Warning
Cause: Temperature of the backplane board, system board, or the carrier in the specified system
  <Sensor Name/Location> returned from critical state to non-critical state.

Event Message: <Sensor Name/Location> temperature sensor returned to normal state <Reading>.
Severity: Information
Cause: Temperature of the backplane board, system board, or the carrier in the specified system
  <Sensor Name/Location> returned to normal operating range.
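The four temperature events above correspond to transitions between normal, warning, and critical states. The Python sketch below shows one way that mapping could be expressed. The function names and threshold values are hypothetical; real thresholds are set per sensor and platform, and the BMC's own logic may differ.

```python
def classify(reading_c, warning_c, critical_c):
    """Map a reading in degrees Celsius to a sensor state."""
    if reading_c > critical_c:
        return "critical"
    if reading_c > warning_c:
        return "warning"
    return "normal"


def temperature_event(previous, current):
    """Return (message text, severity) for a state change, or None."""
    if previous == current:
        return None
    if current == "critical":
        return ("detected a failure", "Critical")
    if current == "warning":
        if previous == "critical":
            return ("returned to warning state", "Warning")
        return ("detected a warning", "Warning")
    return ("returned to normal state", "Information")


# Hypothetical thresholds: warning above 70 C, critical above 85 C.
readings = [60, 75, 100, 80, 65]
events = []
state = "normal"
for reading in readings:
    new_state = classify(reading, warning_c=70, critical_c=85)
    event = temperature_event(state, new_state)
    if event is not None:
        events.append(event)
    state = new_state
```

With these sample readings the sensor passes through all four events in table order: warning, failure, return to warning, and return to normal.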
Voltage Sensor Events
The voltage sensor event messages monitor the number of volts across critical components.
These messages provide status and warning information for voltage sensors for a particular chassis.
Table 3-2. Voltage Sensor Events

Event Message: <Sensor Name/Location> voltage sensor detected a failure <Reading>
  where <Sensor Name/Location> is the entity that this sensor is monitoring.
  Reading is specified in volts. For example, 3.860 V.
Severity: Critical
Cause: The voltage of the monitored device has exceeded the critical threshold.

Event Message: <Sensor Name/Location> voltage sensor state asserted.
Severity: Critical
Cause: The voltage specified by <Sensor Name/Location> is in critical state.

Event Message: <Sensor Name/Location> voltage sensor state de-asserted.
Severity: Information
Cause: The voltage of a previously reported <Sensor Name/Location> is returned to normal state.

Event Message: <Sensor Name/Location> voltage sensor detected a warning <Reading>.
Severity: Warning
Cause: Voltage of the monitored entity <Sensor Name/Location> exceeded the warning threshold.

Event Message: <Sensor Name/Location> voltage sensor returned to normal <Reading>.
Severity: Information
Cause: The voltage of a previously reported <Sensor Name/Location> is returned to normal state.
Fan Sensor Events
The cooling device sensors monitor how well a fan is functioning. These messages provide status, warning,
and failure information for fans in a particular chassis.
Table 3-3. Fan Sensor Events

Event Message: <Sensor Name/Location> Fan sensor detected a failure <Reading>
  where <Sensor Name/Location> is the entity that this sensor is monitoring.
  For example "BMC Back Fan" or "BMC Front Fan."
  Reading is specified in RPM. For example, 100 RPM.
Severity: Critical
Cause: The speed of the specified <Sensor Name/Location> fan is not sufficient to provide enough
  cooling to the system.

Event Message: <Sensor Name/Location> Fan sensor returned to normal state <Reading>.
Severity: Information
Cause: The fan specified by <Sensor Name/Location> has returned to its normal operating speed.

Event Message: <Sensor Name/Location> Fan sensor detected a warning <Reading>.
Severity: Warning
Cause: The speed of the specified <Sensor Name/Location> fan may not be sufficient to provide
  enough cooling to the system.

Event Message: <Sensor Name/Location> Fan Redundancy sensor redundancy degraded.
Severity: Information
Cause: The fan specified by <Sensor Name/Location> may have failed and hence, the redundancy
  has been degraded.

Event Message: <Sensor Name/Location> Fan Redundancy sensor redundancy lost.
Severity: Critical
Cause: The fan specified by <Sensor Name/Location> may have failed and hence, the redundancy
  that was degraded previously has been lost.

Event Message: <Sensor Name/Location> Fan Redundancy sensor redundancy regained
Severity: Information
Cause: The fan specified by <Sensor Name/Location> may have started functioning again and
  hence, the redundancy has been regained.
Processor Status Events
The processor status messages monitor the functionality of the processors in a system. These messages
provide processor health and warning information for a system.
Table 3-4. Processor Status Events

Event Message: <Processor Entity> status processor sensor IERR,
  where <Processor Entity> is the processor that generated the event. For example,
  PROC for a single processor system and PROC # for multiprocessor system.
Severity: Critical
Cause: IERR internal error generated by the <Processor Entity>.

Event Message: <Processor Entity> status processor sensor Thermal Trip.
Severity: Critical
Cause: The processor generates this event before it shuts down because of excessive heat caused
  by lack of cooling or heat synchronization.

Event Message: <Processor Entity> status processor sensor recovered from IERR.
Severity: Information
Cause: This event is generated when a processor recovers from the internal error.

Event Message: <Processor Entity> status processor sensor disabled.
Severity: Warning
Cause: This event is generated for all processors that are disabled.

Event Message: <Processor Entity> status processor sensor terminator not present.
Severity: Information
Cause: This event is generated if the terminator is missing on an empty processor slot.

Event Message: <Processor Entity> presence was deasserted.
Severity: Critical
Cause: This event is generated when the system could not detect the processor.

Event Message: <Processor Entity> presence was asserted.
Severity: Information
Cause: This event is generated when the earlier processor detection error was corrected.

Event Message: <Processor Entity> thermal tripped was deasserted.
Severity: Information
Cause: This event is generated when the processor has recovered from an earlier thermal condition.

Event Message: <Processor Entity> configuration error was asserted.
Severity: Critical
Cause: This event is generated when the processor configuration is incorrect.

Event Message: <Processor Entity> configuration error was deasserted.
Severity: Information
Cause: This event is generated when the earlier processor configuration error was corrected.

Event Message: <Processor Entity> throttled was asserted.
Severity: Warning
Cause: This event is generated when the processor slows down to prevent overheating.

Event Message: <Processor Entity> throttled was deasserted.
Severity: Information
Cause: This event is generated when the earlier processor throttled event was corrected.
Power Supply Events
The power supply sensors monitor the functionality of the power supplies. These messages provide status
and warning information for power supplies for a particular system.
Table 3-5. Power Supply Events

Event Message: <Power Supply Sensor Name> power supply sensor removed.
Severity: Critical
Cause: This event is generated when the power supply sensor is removed.

Event Message: <Power Supply Sensor Name> power supply sensor AC recovered.
Severity: Information
Cause: This event is generated when the power supply has been replaced.

Event Message: <Power Supply Sensor Name> power supply sensor returned to normal state.
Severity: Information
Cause: This event is generated when the power supply that failed or was removed has been
  replaced and the state has returned to normal.

Event Message: <Entity Name> PS Redundancy sensor redundancy degraded.
Severity: Information
Cause: Power supply redundancy is degraded if one of the power supply sources is removed or failed.

Event Message: <Entity Name> PS Redundancy sensor redundancy lost.
Severity: Critical
Cause: Power supply redundancy is lost if only one power supply is functional.

Event Message: <Entity Name> PS Redundancy sensor redundancy regained.
Severity: Information
Cause: This event is generated if the power supply has been reconnected or replaced.

Event Message: <Power Supply Sensor Name> predictive failure was asserted
Severity: Warning
Cause: This event is generated when the power supply is about to fail.

Event Message: <Power Supply Sensor Name> input lost was asserted
Severity: Critical
Cause: This event is generated when the power supply is unplugged.

Event Message: <Power Supply Sensor Name> predictive failure was deasserted
Severity: Information
Cause: This event is generated when the power supply has recovered from an earlier predictive
  failure event.

Event Message: <Power Supply Sensor Name> input lost was deasserted
Severity: Information
Cause: This event is generated when the power supply is plugged in.
Memory ECC Events
The memory ECC event messages monitor the memory modules in a system. These messages monitor
the ECC memory correction rate and the type of memory events that occurred.
Table 3-6. Memory ECC Events

Event Message: ECC error correction detected on Bank # DIMM [A/B].
Severity: Information
Cause: This event is generated when there is a memory error correction on a particular Dual Inline
  Memory Module (DIMM).

Event Message: ECC uncorrectable error detected on Bank # [DIMM].
Severity: Critical
Cause: This event is generated when the chipset is unable to correct the memory errors. Usually,
  a bank number is provided and DIMM may or may not be identifiable, depending on the error.

Event Message: Correctable memory error logging disabled.
Severity: Critical
Cause: This event is generated when the ECC error correction rate in the chipset exceeds a
  predefined limit.
BMC Watchdog Events
The BMC watchdog operations are performed when the system hangs or crashes. These messages
monitor the status and occurrence of these events in a system.
Table 3-7. BMC Watchdog Events
Event Message | Severity | Cause
BMC OS Watchdog timer expired. | Information | This event is generated when the BMC watchdog timer expires and no action is set.
BMC OS Watchdog performed system reboot. | Critical | This event is generated when the BMC watchdog detects that the system has crashed (timer expired because no response was received from Host) and the action is set to reboot.
BMC OS Watchdog performed system power off. | Critical | This event is generated when the BMC watchdog detects that the system has crashed (timer expired because no response was received from Host) and the action is set to power off.
BMC OS Watchdog performed system power cycle. | Critical | This event is generated when the BMC watchdog detects that the system has crashed (timer expired because no response was received from Host) and the action is set to power cycle.
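The four watchdog messages in Table 3-7 differ only in the action the BMC took. The following is a minimal Python sketch of how a log-processing script might map a message to its severity and action. The names WATCHDOG_EVENTS and classify_watchdog are hypothetical and are not part of Server Administrator.

```python
# Hypothetical lookup built from Table 3-7: message text -> (severity, action).
WATCHDOG_EVENTS = {
    "BMC OS Watchdog timer expired.": ("Information", "none"),
    "BMC OS Watchdog performed system reboot.": ("Critical", "reboot"),
    "BMC OS Watchdog performed system power off.": ("Critical", "power off"),
    "BMC OS Watchdog performed system power cycle.": ("Critical", "power cycle"),
}


def classify_watchdog(message):
    """Return (severity, action) for a watchdog message, or None if unknown."""
    # Collapse runs of whitespace so messages wrapped across lines still match.
    normalized = " ".join(message.split())
    return WATCHDOG_EVENTS.get(normalized)
```

A script could use the returned action to separate expirations that only logged an event from those that reset or powered off the system.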
Memory Events
The memory modules can be configured in different ways in particular systems. These messages monitor
the status, warning, and configuration information about the memory modules in the system.
Table 3-8. Memory Events
Event Message | Severity | Cause
Memory RAID redundancy degraded. | Information | This event is generated when there is a memory failure in a RAID-configured memory configuration.
Memory RAID redundancy lost. | Critical | This event is generated when redundancy is lost in a RAID-configured memory configuration.
Memory RAID redundancy regained | Information | This event is generated when redundancy that was lost or degraded earlier is regained in a RAID-configured memory configuration.
Memory Mirrored redundancy degraded. | Information | This event is generated when there is a memory failure in a mirrored memory configuration.
Memory Mirrored redundancy lost. | Critical | This event is generated when redundancy is lost in a mirrored memory configuration.
Memory Mirrored redundancy regained. | Information | This event is generated when redundancy that was lost or degraded earlier is regained in a mirrored memory configuration.
Memory Spared redundancy degraded. | Information | This event is generated when there is a memory failure in a spared memory configuration.
Memory Spared redundancy lost. | Critical | This event is generated when redundancy is lost in a spared memory configuration.
Memory Spared redundancy regained. | Information | This event is generated when redundancy that was lost or degraded earlier is regained in a spared memory configuration.
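The messages in Table 3-8 follow one pattern: a memory configuration (RAID, Mirrored, or Spared) followed by a redundancy state. The following Python sketch shows one way a script might split such a message into its parts. The function name parse_memory_event and the returned dictionary layout are illustrative assumptions; only the message text and severities come from the table.

```python
import re

# Severity for each redundancy state, as listed in Table 3-8.
SEVERITY_BY_STATE = {
    "degraded": "Information",
    "lost": "Critical",
    "regained": "Information",
}

# Matches e.g. "Memory Mirrored redundancy lost." (trailing period optional).
PATTERN = re.compile(
    r"^Memory (RAID|Mirrored|Spared) redundancy (degraded|lost|regained)\.?$"
)


def parse_memory_event(message):
    """Return mode, state, and severity for a Table 3-8 message, else None."""
    normalized = " ".join(message.split())  # tolerate wrapped lines
    match = PATTERN.match(normalized)
    if match is None:
        return None
    mode, state = match.groups()
    return {"mode": mode, "state": state, "severity": SEVERITY_BY_STATE[state]}
```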
Hardware Log Sensor Events
The hardware logs provide hardware status messages to the system management software. On particular
systems, the subsequent hardware messages are not displayed when the log is full. These messages
provide status and warning messages when the logs are full.
Table 3-9. Hardware Log Sensor Events
Event Message | Severity | Cause
Log full detected. | Critical | This event is generated when the SEL device detects that only one entry can be added to the SEL before it is full.
Log cleared. | Information | This event is generated when the SEL is cleared.
Drive Events
The drive event messages monitor the health of the drives in a system. These events are generated when
there is a fault in the drives indicated.
Table 3-10. Drive Events
Event Message | Severity | Cause
Drive <Drive #> asserted fault state. | Critical | This event is generated when the specified drive in the array is faulty.
Drive <Drive #> de-asserted fault state. | Information | This event is generated when the specified drive recovers from a faulty condition.
Drive <Drive #> drive presence was asserted | Informational | This event is generated when the drive is installed.
Drive <Drive #> predictive failure was asserted | Warning | This event is generated when the drive is about to fail.
Drive <Drive #> predictive failure was deasserted | Informational | This event is generated when the earlier predictive failure of the drive is corrected.
Drive <Drive #> hot spare was asserted | Warning | This event is generated when the drive is made a hot spare.
Drive <Drive #> hot spare was deasserted | Informational | This event is generated when the drive is no longer a hot spare.
Drive <Drive #> consistency check in progress was asserted | Warning | This event is generated when a consistency check is started on the drive.
Drive <Drive #> consistency check in progress was deasserted | Informational | This event is generated when the consistency check of the drive is completed.
Drive <Drive #> in critical array was asserted | Critical | This event is generated when the drive is placed in a critical array.
Drive <Drive #> in critical array was deasserted | Informational | This event is generated when the drive is removed from a critical array.
Drive <Drive #> in failed array was asserted | Critical | This event is generated when the drive is placed in a failed array.
Drive <Drive #> in failed array was deasserted | Informational | This event is generated when the drive is removed from a failed array.
Drive <Drive #> rebuild in progress was asserted | Informational | This event is generated when the drive is rebuilding.
Drive <Drive #> rebuild aborted was asserted | Warning | This event is generated when the drive rebuilding process is aborted.
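Most drive messages in Table 3-10 come in pairs: a condition "was asserted" and later "was deasserted". The following Python sketch replays a list of such messages and reports which conditions are still active on each drive. It assumes that <Drive #> appears in the log as a decimal number (for example, "Drive 3"), which this guide does not specify, and the function name active_conditions is hypothetical. The two fault state messages use different wording and are not matched by this sketch.

```python
import re

# Assumption: <Drive #> is rendered as a decimal number, e.g. "Drive 3 ...".
EVENT = re.compile(r"^Drive (\d+) (.+?) was (asserted|deasserted)$")


def active_conditions(messages):
    """Replay messages in order; return {drive number: set of active conditions}."""
    active = {}
    for message in messages:
        match = EVENT.match(" ".join(message.split()))
        if match is None:
            continue  # not an asserted/deasserted drive message
        drive = int(match.group(1))
        condition = match.group(2)
        conditions = active.setdefault(drive, set())
        if match.group(3) == "asserted":
            conditions.add(condition)
        else:
            conditions.discard(condition)
    return active
```

A drive whose set still contains "predictive failure" or "in failed array" after the replay has an unresolved condition that needs attention.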
Intrusion Events
The chassis intrusion messages are a security measure. Chassis intrusion alerts are generated when the
system's chassis is opened. Alerts are sent to prevent unauthorized removal of parts from the chassis.
Table 3-11. Intrusion Events
Event Message | Severity | Cause
<Intrusion sensor Name> sensor detected an intrusion. | Critical | This event is generated when the intrusion sensor detects an intrusion.
<Intrusion sensor Name> sensor returned to normal state. | Information | This event is generated when the earlier intrusion has been corrected.
<Intrusion sensor Name> sensor intrusion was asserted while system was ON | Critical | This event is generated when the intrusion sensor detects an intrusion while the system is on.
<Intrusion sensor Name> sensor intrusion was asserted while system was OFF | Critical | This event is generated when the intrusion sensor detects an intrusion while the system is off.
BIOS Generated System Events
The BIOS generated messages monitor the health and functionality of the chipsets, I/O channels, and
other BIOS-related functions. These system events are generated by the BIOS.
Table 3-12. BIOS Generated System Events
Event Message | Severity | Cause
System Event I/O channel chk. | Critical | This event is generated when a critical interrupt is generated in the I/O Channel.
System Event PCI Parity Err. | Critical | This event is generated when a parity error is detected on the PCI bus.
System Event Chipset Err. | Critical | This event is generated when a chip error is detected.
System Event PCI System Err. | Information | This event indicates historical data, and is generated when the system has crashed and recovered.
System Event PCI Fatal Err. | Critical | This event is generated when a fatal error is detected on the PCI bus.
System Event PCIE Fatal Err. | Critical | This event is generated when a fatal error is detected on the PCIE bus.
POST Err POST fatal error #<number> | Critical | This event is generated when an error occurs during system boot. See the system documentation for more information on the error code.
Memory Spared redundancy lost | Critical | This event is generated when memory spare is no longer redundant.
Memory Mirrored redundancy lost | Critical | This event is generated when memory mirroring is no longer redundant.
Memory RAID redundancy lost | Critical | This event is generated when memory RAID is no longer redundant.
Err Reg Pointer OEM Diagnostic data event was asserted | Information | This event is generated when an OEM event occurs.
System Board PFault Fail Safe state asserted | Critical | This event is generated when the system board voltages are not at normal levels.
System Board PFault Fail Safe state deasserted | Information | This event is generated when the earlier PFault Fail Safe system voltages return to a normal level.
Memory Add (BANK# DIMM#) presence was asserted | Information | This event is generated when memory is added to the system.
Memory Removed (BANK# DIMM#) presence was asserted | Information | This event is generated when memory is removed from the system.
Memory Cfg Err configuration error (BANK# DIMM#) was asserted | Critical | This event is generated when memory configuration is incorrect for the system.
Mem Redun Gain redundancy regained | Information | This event is generated when memory redundancy is regained.
Mem ECC Warning transition to non-critical from OK | Warning | This event is generated when correctable ECC errors have increased from a normal rate.
Mem ECC Warning transition to critical from less severe | Critical | This event is generated when correctable ECC errors reach a critical rate.
Mem CRC Err transition to non-recoverable | Critical | This event is generated when CRC errors enter a non-recoverable state.
Mem Fatal SB CRC uncorrectable ECC was asserted | Critical | This event is generated when CRC errors occur while storing to memory.
Mem Fatal NB CRC uncorrectable ECC was asserted | Critical | This event is generated when CRC errors occur while removing from memory.
Mem Overtemp critical over temperature was asserted | Critical | This event is generated when system memory reaches critical temperature.
USB Over-current transition to non-recoverable | Critical | This event is generated when the USB exceeds a predefined current level.
Hdwr version err hardware incompatibility (BMC Firmware and CPU mismatch) was asserted | Critical | This event is generated when there is a mismatch between the BMC firmware and the processor in use or vice versa.
Hdwr version err hardware incompatibility (BMC Firmware and CPU mismatch) was deasserted | Information | This event is generated when the earlier mismatch between the BMC firmware and the processor is corrected.
Hdwr version err hardware incompatibility (BMC Firmware and other mismatch) was asserted | Critical | This event is generated when there is a mismatch between the BMC firmware and the processor in use or vice versa.
Hdwr version err hardware incompatibility (BMC Firmware and CPU mismatch) was deasserted | Information | This event is generated when an earlier hardware mismatch is corrected.
SBE Log Disabled correctable memory error logging disabled was asserted | Critical | This event is generated when the ECC single bit error rate is exceeded.
CPU Protocol Err transition to non-recoverable | Critical | This event is generated when the processor protocol enters a non-recoverable state.
CPU Bus PERR transition to non-recoverable | Critical | This event is generated when the processor bus PERR enters a non-recoverable state.
CPU Init Err transition to non-recoverable | Critical | This event is generated when the processor initialization enters a non-recoverable state.
CPU Machine Chk transition to non-recoverable | Critical | This event is generated when the processor machine check enters a non-recoverable state.
Logging Disabled all event logging disabled was asserted | Critical | This event is generated when all event logging is disabled.
Unknown system event sensor unknown system hardware failure was asserted | Critical | This event is generated when an unknown hardware failure is detected.
R2 Generated System Events
Table 3-13. R2 Generated Events
Description | Severity | Cause
System Event: OS stop event OS graceful shutdown detected | Information | The OS was shut down or restarted normally.
OEM Event data record (after OS graceful shutdown/restart event) | Information | Comment string accompanying an OS shutdown/restart.
System Event: OS stop event runtime critical stop | Critical | The OS encountered a critical error and was stopped abnormally.
OEM Event data record (after OS bugcheck event) | Information | OS bugcheck code and parameters.
Cable Interconnect Events
The cable interconnect messages are used for detecting errors in the hardware cabling.
Table 3-14. Cable Interconnect Events
Description | Severity | Cause
<Cable sensor Name/Location> Configuration error was asserted. | Critical | This event is generated when the cable is not connected or is incorrectly connected.
<Cable sensor Name/Location> Connection was asserted. | Information | This event is generated when the earlier cable connection error was corrected.
Battery Events
Table 3-15. Battery Events
Description | Severity | Cause
<Battery sensor Name/Location> Failed was asserted | Critical | This event is generated when the sensor detects a failed or missing battery.
<Battery sensor Name/Location> Failed was deasserted | Information | This event is generated when the earlier failed battery was corrected.
<Battery sensor Name/Location> is low was asserted | Warning | This event is generated when the sensor detects a low battery condition.
<Battery sensor Name/Location> is low was deasserted | Information | This event is generated when the earlier low battery condition was corrected.
Entity Presence Events
The entity presence messages are used for detecting different hardware devices.
Table 3-16. Entity Presence Events
Description | Severity | Cause
<Device Name> presence was asserted | Information | This event is generated when the device was detected.
<Device Name> absent was asserted | Critical | This event is generated when the device was not detected.
Storage Management Message Reference
The Dell OpenManage™ Server Administrator Storage Management’s alert or event management
features let you monitor the health of storage resources such as controllers, connectors, array disks,
and virtual disks.
Alert Monitoring and Logging
The Storage Management Service performs alert monitoring and logging. By default, the Storage
Management Service starts when the managed system starts up. If you stop the Storage
Management Service, the alert monitoring and logging stops. Alert monitoring does the following:
• Updates the status of the storage object that generated the alert.
• Propagates the storage object’s status to all the related higher objects in the storage hierarchy. For example, the status of a lower-level object will be propagated up to the status displayed on the Health tab for the top-level storage object.
• Logs an alert in the Alert log and the operating system (OS) application log.
• Sends an SNMP trap if the operating system’s SNMP service is installed and enabled.
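The status propagation described in the list above can be pictured as a roll-up over a tree of storage objects. The following Python sketch is an illustration only: the StorageObject class, the three status names, and the rule that the worst status among an object and its descendants is the one displayed are assumptions made for this example, not a description of how Storage Management is implemented.

```python
# Assumed ordering for illustration: a higher rank is a worse status.
RANK = {"Ok": 0, "Warning": 1, "Critical": 2}


class StorageObject:
    """Hypothetical node in the storage hierarchy (controller, disk, ...)."""

    def __init__(self, name, parent=None):
        self.name = name
        self.children = []
        self.own_status = "Ok"
        if parent is not None:
            parent.children.append(self)

    def rolled_up_status(self):
        """Return the worst status among this object and all its descendants."""
        statuses = [self.own_status]
        statuses.extend(child.rolled_up_status() for child in self.children)
        return max(statuses, key=RANK.__getitem__)
```

With this model, marking a single array disk as Critical changes the rolled-up status of its controller and of the top-level storage object, while sibling controllers with healthy disks remain Ok.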
NOTE: Dell OpenManage Storage Management does not log alerts regarding the data I/O path. These alerts
are logged by the respective RAID drivers in the system alert log.
See the Storage Management Online Help and the Dell OpenManage Server Administrator Storage
Management User’s Guide for updated information.
Alert Descriptions and Corrective Actions
The following sections describe alerts generated by the RAID or SCSI controllers supported by
Storage Management. The alerts are displayed in the Server Administrator Alert subtab or through
Windows Event Viewer. These alerts can also be forwarded as SNMP traps to other applications.
SNMP traps are generated for the alerts listed in the following sections. These traps are included in
the Dell OpenManage Storage Management management information base (MIB). The SNMP
traps for these alerts use all of the SNMP trap variables. For more information on SNMP support and
the MIB, see the SNMP Reference Guide.
To locate an alert, scroll through the following table to find the alert number displayed on the Server Administrator Alert tab or search this file for the alert message text or number. See "Understanding Event Messages" for more information on severity levels.
NOTE: If you have an Array Manager installation, the Array Manager console reports the status of storage
components through error icons and graphical displays. When there is a change in status, Array Manager sends
events to the Array Manager event log, which can be viewed from the Array Manager console. For more
information, see the Array Manager User's Guide.
For more information regarding alert descriptions and the appropriate corrective actions, see the online help.
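Locating an alert by number or by message text, as described earlier in this section, amounts to a simple lookup. The following Python sketch shows the idea with four rows copied from Table 4-1. The ALERTS dictionary and the find_alerts function are hypothetical names used for illustration; a complete index would list every alert in the table.

```python
# A few rows copied from Table 4-1: event ID -> (description, severity).
ALERTS = {
    2048: ("Device failed", "Critical / Failure / Error"),
    2049: ("Array disk removed", "Warning / Non-critical"),
    2052: ("Array disk inserted", "Ok / Normal"),
    2056: ("Virtual disk failed", "Critical / Failure / Error"),
}


def find_alerts(query):
    """Return (event ID, description, severity) rows matching a number or text."""
    text = str(query).strip().lower()
    return [
        (event_id, description, severity)
        for event_id, (description, severity) in sorted(ALERTS.items())
        if text == str(event_id) or text in description.lower()
    ]
```

Searching by number returns at most one row, while searching by text can return several rows that share a word such as "failed".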
Table 4-1. Storage Management Messages
Event ID | Description | Severity | Cause and Action | SNMP Trap Numbers

2048 Device failed
Severity: Critical / Failure / Error
Cause: A physical disk in the array failed. The failed disk may have been identified by the controller or connector. Performing a consistency check can also identify a failed disk.
Action: Replace the failed array disk. You can identify which disk has failed by locating the disk that has a red “X” for its status. Perform a rescan after replacing the disk.

2049 Array disk removed
Severity: Warning / Non-critical
Cause: A physical disk has been removed from the array. A user may have also executed the "Prepare to Remove" task. This alert can also be caused by loose or defective cables or by problems with the enclosure.
Action: If a physical disk was removed from the array, either replace the disk or restore the original disk. You can identify which disk has been removed by locating the disk that has a red “X” for its status. Perform a rescan after replacing or restoring the disk. If a disk has not been removed from the array, then check for problems with the cables. See the for more information on checking the cables. Make sure that the enclosure is powered on. If the problem persists, check the enclosure documentation for further diagnostic information.

2050 Array disk offline
Severity: Warning / Non-critical
Cause: A physical disk in the array is offline. A disk can be made offline during a Prepare to Remove operation or because a user manually put the disk offline.
Action: Perform a rescan. You can also select the offline disk and perform a Make Online operation.

2051 Array disk degraded
Severity: Warning / Non-critical
Cause: An array disk has reported an error condition and may be degraded. The array disk may have reported the error condition in response to a consistency check or other operation.
Action: Replace the degraded array disk. You can identify which disk is degraded by locating the disk that has a red "X" for its status. Perform a rescan after replacing the disk.

2052 Array disk inserted
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None

2053 Virtual disk created
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None

2054 Virtual disk deleted
Severity: Warning / Non-critical
Cause: A virtual disk has been deleted. Performing a "Reset Configuration" may detect that a virtual disk has been deleted and generate this alert.

2055 Virtual disk configuration changed
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.

2056 Virtual disk failed
Severity: Critical / Failure / Error
Cause: One or more physical disks included in the virtual disk have failed. If the virtual disk is non-redundant (does not use mirrored or parity data), then the failure of a single physical disk can cause the virtual disk to fail. If the virtual disk is redundant, then more physical disks have failed than can be rebuilt using mirrored or parity information.
Action: Create a new virtual disk and restore from a backup.

2057 Virtual disk degraded
Severity: Warning / Non-critical
Cause 1: This alert message occurs when a physical disk included in a redundant virtual disk fails. Because the virtual disk is redundant (uses mirrored or parity information) and only one physical disk has failed, the virtual disk can be rebuilt.
Action 1: Configure a hot spare for the virtual disk if one is not already configured. Rebuild the virtual disk. When using an Expandable RAID Controller (PERC) 2/SC, 3/SC, 2/DC, 3/DCL, 3/DC, 3/QC, 4/SC, 4/DC, 4e/DC, 4/Di, or CERC ATA100/4ch controller, rebuild the virtual disk by first configuring a hot spare for the disk, and then initiating a write operation to the disk. The write operation will initiate a rebuild of the disk.
Cause 2: A physical disk in the array has been removed.
Action 2: If a physical disk was removed from the array, either replace the disk or restore the original disk. You can identify which disk has been removed by locating the disk that has a red “X” for its status. Perform a rescan after replacing the disk.

2058 Virtual disk check consistency started
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.

2059 Virtual disk format started
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None

2061 Virtual disk initialization started
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None

2063 Virtual disk reconfiguration started
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None

2064 Virtual disk rebuild started
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None

2065 Array disk rebuild started
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None

2067 Virtual disk check consistency cancelled
Severity: Ok / Normal
Cause: The check consistency operation was cancelled because a physical disk in the array has failed or because a user cancelled the check consistency operation.
Action: If the physical disk failed, then replace the physical disk. You can identify which disk failed by locating the disk that has a red “X” for its status. Perform a rescan after replacing the disk. When performing a consistency check, be aware that the consistency check can take a long time. The time it takes depends on the size of the physical disk or the virtual disk.

2070 Virtual disk initialization cancelled
Severity: Ok / Normal
Cause: The virtual disk initialization was cancelled because a physical disk included in the virtual disk has failed or because a user cancelled the virtual disk initialization.
Action: If a physical disk failed, then replace the physical disk. You can identify which disk has failed by locating the disk that has a red “X” for its status. Perform a rescan after replacing the disk. Restart the format array disk operation. Restart the virtual disk initialization.

2074 Array disk rebuild cancelled
Severity: Ok / Normal
Cause: A user has cancelled the rebuild operation.
Action: Restart the rebuild operation.

2076 Virtual disk check consistency failed
Severity: Critical / Failure / Error
Cause: An array disk included in the virtual disk failed or there is an error in the parity information. A failed array disk can cause errors in parity information.
Action: Replace the failed array disk. You can identify which disk has failed by locating the disk that has a red “X” for its status. Rebuild the array disk. When finished, restart the check consistency operation.

2077 Virtual disk format failed.
Severity: Critical / Failure / Error
Cause: An array disk included in the virtual disk failed.
Action: Replace the failed array disk. You can identify which array disk has failed by locating the disk that has a red "X" for its status. Rebuild the array disk. When finished, restart the virtual disk format operation.

2079 Virtual disk initialization failed
Severity: Critical / Failure / Error
Cause: An array disk included in the virtual disk has failed or a user has cancelled the initialization.
Action: If an array disk has failed, then replace the array disk.

2080 Array disk initialize failed
Severity: Critical / Failure / Error
Cause: The array disk has failed or is corrupt.
Action: Replace the failed or corrupt disk. You can identify a disk that has failed by locating the disk that has a red “X” for its status. Restart the initialization.

2081 Virtual disk reconfiguration failed
Severity: Critical / Failure / Error
Cause: An array disk included in the virtual disk has failed or is corrupt. A user may also have cancelled the reconfiguration.
Action: Replace the failed or corrupt disk. You can identify a disk that has failed by locating the disk that has a red “X” for its status. If the array disk is part of a redundant array, then rebuild the array disk. When finished, restart the reconfiguration.

2082 Virtual disk rebuild failed
Severity: Critical / Failure / Error
Cause: An array disk included in the virtual disk has failed or is corrupt. A user may also have cancelled the rebuild.
Action: Replace the failed or corrupt disk. You can identify a disk that has failed by locating the disk that has a red “X” for its status. Restart the virtual disk rebuild.

2083 Array disk rebuild failed
Severity: Critical / Failure / Error
Cause: An array disk included in the virtual disk has failed or is corrupt. A user may also have cancelled the rebuild.
Action: Replace the failed or corrupt disk. You can identify a disk that has failed by locating the disk that has a red “X” for its status. Restart the virtual disk rebuild.

2085 Virtual disk check consistency completed
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.

2086 Virtual disk format completed
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.

2088 Virtual disk initialization completed
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None

2089 Array disk initialize completed
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None

2090 Virtual disk reconfiguration completed
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None

2091 Virtual disk rebuild completed
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None

2092 Array disk rebuild completed
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None

2094 Predictive Failure reported. If this disk is part of a redundant virtual disk, select the ‘Offline’ option and then replace the disk. Then configure a hot spare and it will start the rebuild automatically. If this disk is a hot spare, select the ‘Prepare to Remove’ option and then replace the disk. If this disk is part of a non-redundant disk, you should back up your data immediately. If the disk fails, you will not be able to recover the data.
Severity: Warning / Non-critical
Cause: The array disk is predicted to fail. Many array disks contain Self Monitoring Analysis and Reporting Technology (SMART). When enabled, SMART monitors the health of the disk based on indications such as the number of write operations that have been performed on the disk.
Action: Replace the array disk. Even though the disk may not have failed yet, it is strongly recommended that you replace the disk. Review the message text for additional information.

2095 SCSI sense data. If this disk is part of a redundant virtual disk, select the ‘Offline’ option and then replace the disk. Then configure a hot spare and it will start the rebuild automatically. If this disk is a hot spare, select the ‘Prepare to Remove’ option and then replace the disk. If this disk is part of a non-redundant disk, you should back up your data immediately. If the disk fails, you will not be able to recover the data.
Severity: Warning / Non-critical
Cause: An array disk has failed, is corrupt, or is otherwise experiencing a problem.
Action: Replace the array disk. Even though the disk may not have failed yet, it is strongly recommended that you replace the disk. Review the message text for additional information.

2098 Global hot spare assigned
Severity: Ok / Normal
Cause: A user has assigned an array disk as a global hot spare. This alert is provided for informational purposes.
Action: None

2099 Global hot spare unassigned
Severity: Ok / Normal
Cause: A user has unassigned an array disk as a global hot spare. This alert is provided for informational purposes.

2100 Temperature exceeded the maximum warning threshold
Severity: Warning / Non-critical
Cause: The array disk enclosure is too hot. A variety of factors can cause the excessive temperature. For example, a fan may have failed, the thermostat may be set too high, or the room temperature may be too hot.
Action: Check for factors that may cause overheating. For example, verify that the enclosure fan is working. You should also check the thermostat settings and examine whether the enclosure is located near a heat source. Make sure the enclosure has enough ventilation and that the room temperature is not too hot. See the array disk enclosure documentation for more diagnostic information.

2101 Temperature dropped below the minimum warning threshold
Severity: Warning / Non-critical
Cause: The array disk enclosure is too cool.
Action: Check whether the thermostat setting is too low and whether the room temperature is too cool.

2102 Temperature exceeded the maximum failure threshold
Severity: Critical / Failure / Error
Cause: The array disk enclosure is too hot. A variety of factors can cause the excessive temperature. For example, a fan may have failed, the thermostat may be set too high, or the room temperature may be too hot.
Action: Check for factors that may cause overheating. For example, verify that the enclosure fan is working. You should also check the thermostat settings and examine whether the enclosure is located near a heat source. Make sure the enclosure has enough ventilation and that the room temperature is not too hot. See the array disk enclosure documentation for more diagnostic information.

2103 Temperature dropped below the minimum failure threshold
Severity: Critical / Failure / Error
Cause: The array disk enclosure is too cool.
Action: Check whether the thermostat setting is too low and whether the room temperature is too cool.

2104 Controller battery is reconditioning
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None

2105 Controller battery recondition is completed
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None

2106 Smart FPT exceeded
Severity: Warning / Non-critical
Cause: A disk on the specified controller has received a SMART alert (predictive failure) indicating that the disk is likely to fail in the near future.
Action: Replace the disk that has received the SMART alert. If the array disk is a member of a non-redundant virtual disk, then back up the data before replacing the disk. Removing an array disk that is included in a non-redundant virtual disk will cause the virtual disk to fail and may cause data loss.

2107 Smart configuration change
Severity: Critical / Failure / Error
Cause: A disk has received a SMART alert (predictive failure) after a configuration change. The disk is likely to fail in the near future.
Action: Replace the disk that has received the SMART alert. If the array disk is a member of a non-redundant virtual disk, then back up the data before replacing the disk. Removing an array disk that is included in a non-redundant virtual disk will cause the virtual disk to fail and may cause data loss.
Event ID DescriptionSeverityCause and ActionSNMP Trap
Numbers
2108 Smart warning
Severity: Warning / Non-critical
Cause: A disk has received a SMART alert (predictive failure). The disk
is likely to fail in the near future.
Action: Replace the disk that has received the SMART alert. If the array
disk is a member of a non-redundant virtual disk, then back up the data
before replacing the disk. Removing an array disk that is included in a
non-redundant virtual disk will cause the virtual disk to fail and may
cause data loss.

2109 SMART warning temperature
Severity: Warning / Non-critical
Cause: A disk has reached an unacceptable temperature and received a
SMART alert (predictive failure). The disk is likely to fail in the near
future.
First Action: Determine why the array disk has reached an unacceptable
temperature. A variety of factors can cause the excessive temperature.
For example, a fan may have failed, the thermostat may be set too high,
or the room temperature may be too hot or cold. Verify that the fans in
the server or enclosure are working. If the array disk is in an
enclosure, you should check the thermostat settings and examine whether
the enclosure is located near a heat source. Make sure the enclosure has
enough ventilation and that the room temperature is not too hot. See the
array disk enclosure documentation for more diagnostic information.
Second Action: If you cannot identify why the disk has reached an
unacceptable temperature, then replace the disk. If the array disk is a
member of a non-redundant virtual disk, then back up the data before
replacing the disk. Removing an array disk that is included in a
non-redundant virtual disk will cause the virtual disk to fail and may
cause data loss.
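The replacement actions for the SMART alerts above share one ordering rule: back up first when the disk belongs to a non-redundant virtual disk, then replace the disk. The following Python sketch only illustrates that ordering. The function name `smart_alert_steps` and its parameter are hypothetical and are not part of Server Administrator.

```python
from typing import List


def smart_alert_steps(in_non_redundant_virtual_disk: bool) -> List[str]:
    """Return the ordered steps for a disk that received a SMART alert.

    Hypothetical helper: it encodes only the ordering given in the
    Action text and does not communicate with Server Administrator.
    """
    steps = []
    if in_non_redundant_virtual_disk:
        # Removing the disk fails the virtual disk, so back up first.
        steps.append("back up the data")
    steps.append("replace the disk")
    return steps


print(smart_alert_steps(True))
print(smart_alert_steps(False))
```

For event 2109, these steps apply only after the First Action (finding the reason for the excessive temperature) has not resolved the problem.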
2110 SMART warning degraded
Severity: Warning / Non-critical
Cause: A disk is degraded and has received a SMART alert (predictive
failure). The disk is likely to fail in the near future.
Action: Replace the disk that has received the SMART alert. If the array
disk is a member of a non-redundant virtual disk, then back up the data
before replacing the disk. Removing an array disk that is included in a
non-redundant virtual disk will cause the virtual disk to fail and may
cause data loss.

2111 Failure prediction threshold exceeded due to test - No action needed
Severity: Warning / Non-critical
Cause: A disk has received a SMART alert (predictive failure) due to test
conditions.
Action: None

2112 Enclosure was shut down
Severity: Critical / Failure / Error
Cause: The array disk enclosure is either hotter or cooler than the
maximum or minimum allowable temperature range.
Action: Check for factors that may cause overheating or excessive
cooling. For example, verify that the enclosure fan is working. You
should also check the thermostat settings and examine whether the
enclosure is located near a heat source. Make sure the enclosure has
enough ventilation and that the room temperature is not too hot or too
cold. See the enclosure documentation for more diagnostic information.

2114 A consistency check on a virtual disk has been paused (suspended)
Severity: Ok / Normal
Cause: The check consistency operation on a virtual disk was paused by a
user.
Action: To resume the check consistency operation, right-click the
virtual disk in the Storage Management tree view and select Resume Check
Consistency.
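The Severity column uses three labels: Ok / Normal, Warning / Non-critical, and Critical / Failure / Error. When several alerts are reviewed together, it can help to rank them. The Python sketch below is an illustration only: the numeric ranks and the `most_severe` helper are assumptions made for this example and are not defined by Server Administrator.

```python
# Severity labels as they appear in the tables; the ranks are assumed.
SEVERITY_RANK = {
    "Ok / Normal": 0,
    "Warning / Non-critical": 1,
    "Critical / Failure / Error": 2,
}


def most_severe(alerts):
    """Return the (event_id, severity) pair with the highest rank.

    `alerts` is a non-empty list of (event_id, severity) tuples.
    """
    return max(alerts, key=lambda alert: SEVERITY_RANK[alert[1]])


alerts = [
    (2114, "Ok / Normal"),
    (2112, "Critical / Failure / Error"),
    (2110, "Warning / Non-critical"),
]
print(most_severe(alerts))
```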
2115 A consistency check on a virtual disk has been resumed
Severity: Ok / Normal
Cause: The check consistency operation on a virtual disk has resumed
processing after being paused by a user.
Action: This alert is provided for informational purposes.

2116 A virtual disk and its mirror have been split
Severity: Ok / Normal
Cause: A user has caused a mirrored virtual disk to be split. When a
virtual disk is mirrored, its data is copied to another virtual disk in
order to maintain redundancy. After being split, both virtual disks
retain a copy of the data, although because the mirror is no longer
intact, updates to the data are no longer copied to the mirror.
Action: This alert is provided for informational purposes.

2117 A mirrored virtual disk has been unmirrored
Severity: Ok / Normal
Cause: A user has caused a mirrored virtual disk to be unmirrored. When a
virtual disk is mirrored, its data is copied to another virtual disk in
order to maintain redundancy. After being unmirrored, the disk formerly
used as the mirror returns to being an array disk and becomes available
for inclusion in another virtual disk.
Action: This alert is provided for informational purposes.

2118 Change write policy
Severity: Ok / Normal
Cause: A user has changed the write policy for a virtual disk.
Action: This alert is provided for informational purposes.
2120 Enclosure firmware mismatch
Severity: Warning / Non-critical
Cause: The firmware on the enclosure management modules (EMM) is not the
same version. It is required that both modules have the same version of
the firmware. This alert may be caused when a user attempts to insert an
EMM module that has a different firmware version than an existing module.
Action: Download the same version of the firmware to both EMM modules.

2121 Device returned to normal
Severity: Ok / Normal
Cause: A device that was previously in an error state has returned to a
normal state. For example, if an enclosure became too hot and
subsequently cooled down, then you may receive this alert.
Action: This alert is provided for informational purposes.

2122 Redundancy degraded
Severity: Warning / Non-critical
Cause: One or more of the enclosure components has failed. For example, a
fan or power supply may have failed. Although the enclosure is currently
operational, the failure of additional components could cause the
enclosure to fail.
Action: Identify and replace the failed component. To identify the failed
component, select the enclosure in the tree view and click the Health
subtab. Any failed component will be identified with a red X on the
enclosure’s Health subtab. Alternatively, you can select the Storage
object and click the Health subtab. The controller status displayed on
the Health subtab indicates whether a controller has a failed or degraded
component. See the enclosure documentation for information on replacing
enclosure components and for other diagnostic information.
2123 Redundancy lost
Severity: Warning / Non-critical
Cause: A virtual disk or an enclosure has lost data redundancy. In the
case of a virtual disk, one or more array disks included in the virtual
disk have failed. Due to the failed array disk or disks, the virtual disk
is no longer maintaining redundant (mirrored or parity) data. The failure
of an additional array disk will result in lost data. In the case of an
enclosure, more than one enclosure component has failed. For example, the
enclosure may have suffered the loss of all fans or all power supplies.
Action: Identify and replace the failed components. To identify the
failed component, select the Storage object and click the Health subtab.
The controller status displayed on the Health subtab indicates whether a
controller has a failed or degraded component. Click the controller that
displays a Warning or Failed status. This action displays the controller
Health subtab, which displays the status of the individual controller
components. Continue clicking the components with a Warning or Failed
status until you identify the failed component. See the online help for
more information. See the enclosure documentation for information on
replacing enclosure components and for other diagnostic information.

2124 Redundancy normal
Severity: Ok / Normal
Cause: Data redundancy has been restored to a virtual disk or an
enclosure that previously suffered a loss of redundancy.
Action: This alert is provided for informational purposes.
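Events 2122, 2123, and 2124 together describe how redundancy changes over time. As a rough illustration, the Python sketch below derives the current redundancy state from a list of event IDs. It assumes the list is in chronological order and concerns a single virtual disk or enclosure. The state names and the `redundancy_state` helper are hypothetical and not part of Server Administrator.

```python
# Event IDs come from the tables above; the state names are illustrative.
REDUNDANCY_EVENTS = {
    2122: "degraded",
    2123: "lost",
    2124: "normal",
}


def redundancy_state(event_ids, initial="normal"):
    """Return the state implied by the last redundancy event seen.

    Event IDs unrelated to redundancy leave the state unchanged.
    """
    state = initial
    for event_id in event_ids:
        state = REDUNDANCY_EVENTS.get(event_id, state)
    return state


print(redundancy_state([2122, 2123]))
print(redundancy_state([2122, 2115, 2124]))
```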
2126 SCSI sense sector reassign
Severity: Warning / Non-critical
Cause: A sector of the disk is corrupted and data cannot be maintained on
this portion of the disk.
Action: If the disk is part of a non-redundant virtual disk, then replace
the disk. Any data residing on the corrupt portion of the disk may be
lost and you may need to restore from backup. If the disk is part of a
redundant virtual disk, then any data residing on the corrupt portion of
the disk will be reallocated elsewhere in the virtual disk.

2127 Background initialization (BGI) started
Severity: Ok / Normal
Cause: BGI of a virtual disk has started. This alert is provided for
informational purposes.
Action: None

2128 BGI cancelled
Severity: Ok / Normal
Cause: BGI of a virtual disk has been cancelled. A user or the firmware
may have stopped BGI.
Action: None

2129 BGI failed
Severity: Critical / Failure / Error
Cause: BGI of a virtual disk has failed.
Action: None

2130 BGI completed
Severity: Ok / Normal
Cause: BGI of a virtual disk has completed. This alert is provided for
informational purposes.
Action: None

2131 Firmware version mismatch
Severity: Warning / Non-critical
Cause: The firmware on the controller is not a supported version.
Action: Install a supported version of the firmware. If you do not have a
supported version of the firmware available, it can be downloaded from
the Dell support site at support.dell.com. If you do not have a supported
version of the firmware available, check with your support provider for
information on how to obtain the most current firmware.
2132 Driver version mismatch
Severity: Warning / Non-critical
Cause: The controller driver is not a supported version.
Action: Install a supported version of the driver. If you do not have a
supported driver version available, it can be downloaded from the Dell
support site at support.dell.com. If you do not have a supported version
of the driver available, check with your support provider for information
on how to obtain the most current driver.

2135 Array Manager is installed on the system
Severity: Warning / Non-critical
Cause: Storage Management has been installed on a system that has an
Array Manager installation.
Action: Installing Storage Management and Array Manager on the same
system is not a supported configuration. Uninstall either Storage
Management or Array Manager.

2136 Virtual disk initialization
Severity: Ok / Normal
Cause: Virtual disk initialization is in progress. This alert is provided
for informational purposes.
2137 Communication timeout
Severity: Warning / Non-critical
Cause: The controller is unable to communicate with an enclosure. There
are several reasons why communication may be lost. For example, there may
be a bad or loose cable. An unusual amount of I/O may also interrupt
communication with the enclosure. In addition, communication loss may be
caused by software, hardware, or firmware problems, bad or failed power
supplies, and enclosure shutdown.
When viewed in the Alert Log, the description for this event displays
several variables. These variables are: Controller and enclosure names,
type of communication problem, return code, and SCSI status.
Action: Check for problems with the cables. See the online help for more
information on checking the cables. You should also check to see if the
enclosure has degraded or failed components. To do so, select the
enclosure object in the tree view and click the Health subtab. The Health
subtab displays the status of the enclosure components. Verify that the
controller has supported driver and firmware versions installed and that
the EMMs are each running the same version of supported firmware.

2138 Enclosure alarm enabled
Severity: Ok / Normal
Cause: A user has enabled the enclosure alarm. This alert is provided for
informational purposes.
Action: None

2139 Enclosure alarm disabled
Severity: Ok / Normal
Cause: A user has disabled the enclosure alarm.
Action: None

2140 Dead disk segments restored
Severity: Ok / Normal
Cause: Disk space that was formerly “dead” or inaccessible to a redundant
virtual disk has been restored. This alert is provided for informational
purposes.
2162 Communication regained
Severity: Ok / Normal
Cause: Communication with an enclosure has been restored. This alert is
provided for informational purposes.
Action: None

2163 Rebuild completed with errors
Severity: Ok / Normal
See the online help for more information.904690

2164 See the Readme file for a list of validated controller driver
versions
Severity: Ok / Normal
Cause: Storage Management is unable to determine whether the system has
the minimum required versions of the RAID controller drivers.
Action: This alert is generated for informational purposes. See the
Readme file for driver and firmware requirements. In particular, if
Storage Management experiences performance problems, you should verify
that you have the minimum supported versions of the drivers and firmware
installed.

2165 The RAID controller firmware and driver validation was not
performed. The configuration file cannot be opened.
Severity: Warning / Non-critical
Cause: Storage Management is unable to determine whether the system has
the minimum required versions of the RAID controller firmware and
drivers. This situation may occur for a variety of reasons. For example,
the installation directory path to the configuration file may not be
correct. The configuration file may also have been removed or renamed.
Action: Reinstall Storage Management

2166 The RAID controller firmware and driver validation was not
performed. The configuration file is out of date or corrupted.
Severity: Warning / Non-critical
Cause: Storage Management is unable to determine whether the system has
the minimum required versions of the RAID controller firmware and
drivers. This situation has occurred because a configuration file is
unreadable or missing data. The configuration file may be corrupted.
2167 The current kernel version and the non-RAID SCSI driver version are
older than the minimum required levels. See the Readme file for a list of
validated kernel and driver versions.
Severity: Warning / Non-critical
Cause: The version of the kernel and the driver do not meet the minimum
requirements. Storage Management may not be able to display the storage
or perform storage management functions until you have updated the system
to meet the minimum requirements.
Action: See the Readme file for kernel and driver requirements. Update
the system to meet the minimum requirements and then reinstall Storage
Management.

2168 The non-RAID SCSI driver version is older than the minimum required
level. See the Readme file for the validated driver version.
Severity: Warning / Non-critical
Cause: The version of the driver does not meet the minimum requirements.
Storage Management may not be able to display the storage or perform
storage management functions until you have updated the system to meet
the minimum requirements.
Action: See the Readme file for the driver requirements. Update the
system to meet the minimum requirements and then reinstall Storage
Management.

2169 The controller battery needs to be replaced.
Severity: Critical / Failure / Error
Cause: The controller battery cannot recharge. The battery may be old or
it may have been already recharged the maximum number of times. In
addition, the battery charger may not be working.

2170 The controller battery charge level is normal.
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
2171 The controller battery temperature is above normal.
Severity: Warning / Non-critical
Cause: The battery may be recharging, the room temperature may be too
hot, or the fan in the system may be degraded or failed.
Action: If this alert was generated due to a battery recharge, the
situation will correct when the recharge is complete. You should also
check if the room temperature is normal and that the system components
are functioning properly.

2172 The controller battery temperature is normal.
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None

2174 The controller battery has been removed.
Severity: Warning / Non-critical
Cause: The controller cannot communicate with the battery, the battery
may be removed, or the contact point between the controller and the
battery may be burnt or corroded.
Action: Replace the battery if it has been removed. If the contact point
between the battery and the controller is burnt or corroded, you will
need to replace either the battery or the controller, or both. See the
hardware documentation for information on how to safely access, remove,
and replace the battery.

2175 The controller battery has been replaced.
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.

2176 The controller battery Learn cycle has started.
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.

2177 The controller battery Learn cycle has completed.
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
2178 The controller battery Learn cycle has timed out.
Severity: Warning / Non-critical
Cause: The controller battery must be fully charged before the Learn
cycle can begin. The battery may be unable to maintain a full charge,
causing the Learn cycle to time out. Additionally, the battery must be
able to maintain cached data for a specified period of time in the event
of a power loss. For example, some batteries maintain cached data for
24 hours. If the battery is unable to maintain cached data for the
required period of time, then the Learn cycle will time out.
Action: Replace the battery pack as the battery is unable to maintain a
full charge.

2179 The controller battery Learn cycle has been postponed.
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None

2180 The controller battery Learn cycle will start in % days.
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None
1153None
1151None
1151None
NOTE: The % is a
variable that will be
replaced with the
number of days before
which the Learn cycle
will start. You can set
the duration to start
the Learn cycle.
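The NOTE above says that the % in the event 2180 description is replaced with a number of days. The Python sketch below shows that kind of substitution on the description text. How Server Administrator performs the replacement internally is not described in this guide, so the `fill_placeholder` helper is a hypothetical illustration only.

```python
def fill_placeholder(template: str, value: int) -> str:
    """Replace the single % placeholder in a description with a value.

    Hypothetical helper for illustration. It rejects templates that do
    not contain exactly one % so that a wrong template is not silently
    accepted.
    """
    if template.count("%") != 1:
        raise ValueError("expected exactly one % placeholder")
    return template.replace("%", str(value))


description = "The controller battery Learn cycle will start in % days."
print(fill_placeholder(description, 7))
```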
2181 The controller battery Learn cycle will start in % hours.
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None
1151None
NOTE: The % is a
variable that will be
replaced with the
number of hours
before which the
Learn cycle will start.
You can set the
duration to start the
Learn cycle.
2182 An invalid SAS configuration has been detected.
Severity: Critical / Failure / Error
Cause: The controller and attached enclosures are not cabled correctly.
Action: See the hardware documentation for information on correct cabling
configurations.

2186 The controller cache has been discarded.
Severity: Warning / Non-critical
Cause: The controller has flushed the cache and any data in the cache has
been lost. This may happen if the system has memory or battery problems
that cause the controller to distrust the cache. Although user data may
have been lost, this alert does not always indicate that relevant or user
data has been lost.
Action: Verify that the battery and memory are functioning properly.

2187 Single-bit ECC error limit exceeded.
Severity: Warning / Non-critical
2188 The controller write policy has been changed to Write Through.
Severity: Warning / Non-critical
Cause: The controller battery is unable to maintain cached data for the
required period of time. For example, if the required period of time is
24 hours, the battery is unable to maintain cached data for 24 hours. It
is normal to receive this alert during the battery Learn cycle as the
Learn cycle discharges the battery before recharging it. When discharged,
the battery cannot maintain cached data.
Action: Check the health of the battery. If the battery is weak, replace
the battery pack.

2189 The controller write policy has been changed to Write Back.
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None

2191 Multiple enclosures are attached to the controller. This is an
unsupported configuration.
Severity: Critical / Failure / Error
Cause: Many enclosures are attached to the controller port. When the
enclosure limit is exceeded, the controller loses contact with all
enclosures attached to the port.
Action: Remove the last enclosure. You must remove the enclosure that was
added last and is causing the enclosure limit to be exceeded.

2192 The virtual disk Check Consistency has made corrections and
completed.
Severity: Ok / Normal
Cause: The virtual disk Check Consistency has identified errors and made
corrections. For example, the Check Consistency may have encountered a
bad disk block and remapped the disk block to restore data consistency.
This alert is provided for informational purposes.
Action: Monitor the battery and cache health to make sure they are
functioning properly. Monitor the Alert Log for events related to the
battery and to write policy changes. You should also monitor the Alert
Log for events related to disk errors. If you suspect that the battery or
a disk has problems, replace the battery pack or the disk.
2193 The virtual disk reconfiguration has resumed.
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None

2194 The virtual disk Read policy has changed.
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None

2199 The virtual disk cache policy has changed.
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None

2201 A global hot spare failed.
Severity: Warning / Non-critical
Cause: The controller is unable to communicate with a disk that is
assigned as a global hot spare. The disk may have failed or been removed.
There may also be a bad or loose cable.
Action: Check if the disk is healthy and that it has not been removed.
Check the cables. If necessary, replace the disk and reassign the hot
spare.

2202 A global hot spare has been removed.
Severity: Warning / Non-critical
Cause: The controller is unable to communicate with a disk that is
assigned as a global hot spare. The disk may have been removed. There may
also be a bad or loose cable.
Action: Check if the disk is healthy and that it has not been removed.
Check the cables. If necessary, replace the disk and reassign the hot
spare.

2203 A dedicated hot spare failed.
Severity: Warning / Non-critical
Cause: The controller is unable to communicate with a disk that is
assigned as a dedicated hot spare. The disk may have failed or been
removed. There may also be a bad or loose cable.
Action: Check if the disk is healthy and that it has not been removed.
Check the cables. If necessary, replace the disk and reassign the hot
spare.
2204 A dedicated hot spare has been removed.
Severity: Warning / Non-critical
Cause: The controller is unable to communicate with a disk that is
assigned as a dedicated hot spare. The disk may have been removed. There
may also be a bad or loose cable.
Action: Check if the disk is healthy and that it has not been removed.
Check the cables. If necessary, replace the disk and reassign the hot
spare.

2205 A dedicated hot spare has been automatically unassigned.
Severity: Warning / Non-critical
Cause: The hot spare is no longer required because the virtual disk it
was assigned to has been deleted.
Action: None.

2206 The only hot spare available is a SATA disk. SATA disks cannot
replace SAS disks.
Severity: Warning / Non-critical
Cause: The only array disk available to be assigned as a hot spare is
using SATA technology. The array disks in the virtual disk are using SAS
technology. Because of this difference in technology, the hot spare
cannot rebuild data if one of the array disks in the virtual disk fails.
Action: Add a SAS disk that is large enough to be used as the hot spare
and assign the new disk as a hot spare.

2207 The only hot spare available is a SAS disk. SAS disks cannot replace
SATA disks.
Severity: Warning / Non-critical
Cause: The only array disk available to be assigned as a hot spare is
using SAS technology. The array disks in the virtual disk are using SATA
technology. Because of this difference in technology, the hot spare
cannot rebuild data if one of the array disks in the virtual disk fails.
Action: Add a SATA disk that is large enough to be used as the hot spare
and assign the new disk as a hot spare.
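Events 2206 and 2207 describe two conditions a hot spare must meet: it must use the same technology (SAS or SATA) as the array disks it protects, and it must be large enough. The Python sketch below checks both conditions. The `Disk` type and the `can_serve_as_hot_spare` helper are hypothetical, and the size rule used here (the spare is at least as large as every member disk) is an assumption, because this guide does not define "large enough".

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Disk:
    technology: str  # "SAS" or "SATA"
    size_gb: int


def can_serve_as_hot_spare(spare: Disk, members: List[Disk]) -> bool:
    """Check technology match and size for a candidate hot spare.

    Assumption: "large enough" means at least as large as each member
    disk of the virtual disk.
    """
    same_technology = all(spare.technology == m.technology for m in members)
    large_enough = all(spare.size_gb >= m.size_gb for m in members)
    return same_technology and large_enough


members = [Disk("SAS", 146), Disk("SAS", 146)]
print(can_serve_as_hot_spare(Disk("SATA", 250), members))
print(can_serve_as_hot_spare(Disk("SAS", 146), members))
```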
2211 The physical disk is not supported.
Severity: Warning / Non-critical
Cause: The physical disk may not have a supported version of the firmware
or the disk may not be supported by Dell.
Action: If the disk is supported by Dell, update the firmware to a
supported version. If the disk is not supported by Dell, replace the disk
with one that is supported.

2232 The controller alarm is silenced.
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.

2233 The background initialization (BGI) rate has changed.
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.

2234 The Patrol Read rate has changed.
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.

2235 The Check Consistency rate has changed.
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.

2237 A controller rescan has been initiated.
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.

2238 The controller debug log file has been exported.
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.

2239 A foreign configuration has been cleared.
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.

2240 A foreign configuration has been imported.
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
2241 The Patrol Read mode has changed.
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None

2242 The Patrol Read has started.
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None

2243 The Patrol Read has stopped.
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None

2244 A virtual disk blink has been initiated.
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None

2245 A virtual disk blink has ceased.
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None

2246 The controller battery is degraded.
Severity: Warning / Non-critical
Cause: The controller battery charge is weak.
Action: As the charge weakens, the charger should automatically recharge
the battery. If the battery has reached its recharge limit, replace the
battery pack. Monitor the battery to make sure that it recharges
successfully. If the battery does not recharge, replace the battery pack.

2247 The controller battery is charging.
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.

2248 The controller battery is executing a Learn cycle.
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.

2249 The array disk Clear operation has started.
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
2264 A device is missing.
Severity: Warning / Non-critical
Cause: The controller cannot communicate with a device. The device may be
removed. There may also be a bad or loose cable.
Action: Check that the device is present and has not been removed. If it
is present, check the cables. You should also check the connection to the
controller battery and the battery health. A battery with a weak or
depleted charge may cause this alert.

2265 A device is in an unknown state.
Severity: Warning / Non-critical
Cause: The controller cannot communicate with a device. The state of the
device cannot be determined. There may be a bad or loose cable. The
system may also be experiencing problems with the application programming
interface (API). There could also be a problem with the driver or
firmware.
Action: Check the cables. Check if the controller has a supported version
of the driver and firmware. You can download the most current version of
the driver and firmware from support.dell.com. Rebooting the system may
also resolve this problem.

2266 Controller log file entry: %1
NOTE: %1 is a substitution variable that will appear in the alert
description for specific details about the alert.
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.

2267 The controller reconstruct rate has changed.
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
2268 %1, Storage Management has lost communication with this RAID
controller and attached storage. An immediate reboot is strongly
recommended to avoid further problems. If the reboot does not restore
communication, there may be a hardware failure.
Severity: Critical / Failure / Error
Cause: Storage Management has lost communication with a device. There may
be faulty hardware or loose or defective cables.
Action: Reboot the system. If the problem is not resolved, check for
hardware failures. Any failed component must be replaced. Make sure the
cables are attached securely. See the hardware documentation for more
diagnostics information.
104None
NOTE: %1 is a
substitution variable
that will appear in the
alert description for
specific details about
the alert.
2269 The array disk Clear operation has completed.
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None

2270 The array disk Clear operation failed.
Severity: Critical / Failure / Error
Cause: A Clear task was being performed on an array disk, but it was
interrupted and did not complete successfully. The controller may have
lost communication with the disk. The disk may have been removed or the
cables may be loose or defective.
Action: Check that the disk is present and not in a Failed state. Make
sure the cables are attached securely. Restart the Clear task.

2271 The Patrol Read corrected a media error.
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
2272 Patrol Read found an uncorrectable media error.
Severity: Critical / Failure / Error
Cause: The Patrol Read task has encountered an error that cannot be
corrected. There may be a bad disk block that cannot be remapped.
Action: Replace the array disk to avoid future data loss.

2273 Bad media.
Severity: Critical / Failure / Error
Cause: A source (array) disk in a redundant virtual disk has a bad disk
block. The algorithm that maintains redundant data has created a similar
bad block on the target redundant disk to maintain consistency in disk
block addressing. Data has been lost.
Action: Restore from backup.

2274 The array disk rebuild has resumed.
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None

2276 The dedicated hot spare is too small.
Severity: Warning / Non-critical
Cause: The dedicated hot spare is not large enough to protect all virtual
disks that reside on the disk group.
Action: Assign a larger disk as the dedicated hot spare.

2277 The global hot spare is too small.
Severity: Warning / Non-critical
Cause: The global hot spare is not large enough to protect all virtual
disks that reside on the controller.
Action: Assign a larger disk as the global hot spare.
2278 The controller battery charge level is below a normal threshold.
Severity: Ok / Normal
Cause: The battery is discharging. A battery discharge is a normal
activity during the battery Learn cycle. Before completing, the battery
Learn cycle recharges the battery. You should receive alert 2179 when the
recharge occurs.
Action: Check if the battery Learn cycle is in progress. Alert 2176
indicates that the battery Learn cycle has initiated. The battery also
displays the Learn state while the Learn cycle is in progress. If a Learn
cycle is not in progress, replace the battery pack.

2279 The controller battery charge level is above a normal threshold.
Severity: Ok / Normal
Cause: This alert is provided for informational purposes. This alert
indicates that the battery is recharging during the battery Learn cycle.
Action: None

2280 A disk media error has been corrected.
Severity: Ok / Normal
Cause: A disk media error was detected while the controller was
completing a background task. A bad disk block was identified. The disk
block has been remapped.
Action: Consider replacing the disk. If you receive this alert
frequently, be sure to replace the disk. You should also routinely back
up your data.

2281 Virtual disk has inconsistent data.
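The Action for event 2278 depends on whether a battery Learn cycle is in progress. As a rough illustration, the Python sketch below infers this from a list of event IDs, using alert 2176 (Learn cycle has started) and alert 2177 (Learn cycle has completed). It assumes the list is in chronological order and concerns one controller battery. The helper names are hypothetical, and other Learn cycle events, such as 2178, are deliberately ignored to keep the sketch short.

```python
LEARN_CYCLE_STARTED = 2176
LEARN_CYCLE_COMPLETED = 2177


def learn_cycle_in_progress(event_ids):
    """Return True if the last Learn cycle event seen was a start."""
    in_progress = False
    for event_id in event_ids:
        if event_id == LEARN_CYCLE_STARTED:
            in_progress = True
        elif event_id == LEARN_CYCLE_COMPLETED:
            in_progress = False
    return in_progress


def action_for_low_charge(event_ids):
    """Suggest the follow-up for alert 2278 based on earlier alerts."""
    if learn_cycle_in_progress(event_ids):
        return "none: discharge is expected during the Learn cycle"
    return "replace the battery pack"


print(action_for_low_charge([2176, 2188]))
print(action_for_low_charge([2176, 2177]))
```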
2282 Hot spare SMART polling failed.
Severity: Critical / Failure / Error
Cause: The controller firmware attempted a SMART polling on the hot spare
but was unable to complete it. The controller has lost communication with
the hot spare.
Action: Check the health of the disk assigned as a hot spare. You may
need to replace the disk and reassign the hot spare. Make sure the cables
are attached securely. See the Cables Attached Correctly section in the
Dell OpenManage Server Administrator Storage Management User’s Guide for
more information on checking the cables.

2283 A redundant path is broken.
Severity: Warning / Non-critical
Cause: The controller has two connectors that are connected to the same
enclosure. The communication path on one connector has lost connection
with the enclosure. The communication path on the other connector is
reporting this loss.
Action: Make sure the cables are attached securely. Make sure both EMMs
are healthy.

2284 A redundant path has been restored.
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.

2285 A disk media error was corrected during recovery.
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.

2286 A Learn cycle start is pending while the battery charges.
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.

2287 The Patrol Read is paused.
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.

2288 The patrol read has resumed.
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
Event ID: 2289
Description: Multi-bit ECC error.
Severity: Critical / Failure / Error
Cause: An error involving multiple bits has been encountered during a read or write operation. The error correction algorithm recalculates parity data during read and write operations. If an error involves only a single bit, it may be possible for the error correction algorithm to correct the error and maintain parity data. An error involving multiple bits, however, usually indicates data loss. In some cases, if the multi-bit error occurs during a read operation, the data on the disk may be correct/valid. If the multi-bit error occurs during a write operation, data loss has occurred.
Action: Replace the dual in-line memory module (DIMM). The DIMM is a part of the controller battery pack. See your hardware documentation for information on replacing the DIMM. You may need to restore data from backup.

Event ID: 2290
Description: Single-bit ECC error.
Severity: Warning / Non-critical
Cause: An error involving a single bit has been encountered during a read or write operation. The error correction algorithm has corrected this error.
Action: None

Event ID: 2291
Description: An EMM has been discovered.
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None

Event ID: 2292
Description: Communication with the enclosure has been lost.
Severity: Critical / Failure / Error
Cause: The controller has lost communication with an EMM. The cables may be loose or defective.
Action: Make sure the cables are attached securely. Reboot the system.
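
The difference between alerts 2290 (single-bit, corrected) and 2289 (multi-bit, possible data loss) can be illustrated with a textbook extended Hamming code. This guide does not say which error correction algorithm the controller uses, so the sketch below is an assumption chosen only to show the principle: one flipped bit can be located and repaired, while two flipped bits can be detected but not repaired. The function names encode and decode are hypothetical.

```python
def encode(nibble):
    """Encode 4 data bits as an 8-bit extended Hamming (SECDED) codeword."""
    code = [0] * 8
    # Data bits live at positions 3, 5, 6 and 7.
    code[3] = nibble & 1
    code[5] = (nibble >> 1) & 1
    code[6] = (nibble >> 2) & 1
    code[7] = (nibble >> 3) & 1
    # Each parity bit covers the positions whose index has that bit set.
    code[1] = code[3] ^ code[5] ^ code[7]
    code[2] = code[3] ^ code[6] ^ code[7]
    code[4] = code[5] ^ code[6] ^ code[7]
    # Position 0 holds an overall parity bit, making the total parity even.
    code[0] = sum(code[1:]) % 2
    return code


def decode(code):
    """Return (status, nibble); nibble is None when the data cannot be trusted."""
    syndrome = 0
    for position in range(1, 8):
        if code[position]:
            syndrome ^= position
    odd_parity = sum(code) % 2 == 1
    fixed = list(code)
    if odd_parity:
        # An odd number of flips: assume one, located at `syndrome`
        # (0 means the overall parity bit itself was flipped).
        fixed[syndrome] ^= 1
        status = "corrected"      # comparable to a single-bit ECC error
    elif syndrome != 0:
        # Even parity but a non-zero syndrome: two bits were flipped.
        return "uncorrectable", None  # comparable to a multi-bit ECC error
    else:
        status = "ok"
    nibble = fixed[3] | fixed[5] << 1 | fixed[6] << 2 | fixed[7] << 3
    return status, nibble


word = encode(0b1011)

one_flip = list(word)
one_flip[6] ^= 1

two_flips = list(word)
two_flips[2] ^= 1
two_flips[6] ^= 1

print(decode(word))       # ('ok', 11)
print(decode(one_flip))   # ('corrected', 11)
print(decode(two_flips))  # ('uncorrectable', None)
```

In this sketch the corrected case still returns the original data, which matches the "Action: None" guidance for alert 2290, while the uncorrectable case returns no data at all.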
Event ID: 2293
Description: The EMM has failed.
Severity: Critical / Failure / Error
Cause: The failure may be caused by a loss of power to the EMM. The EMM self test may also have identified a failure. There could also be a firmware problem or a multi-bit error.
Action: Replace the EMM. See the hardware documentation for information on replacing the EMM.

Event ID: 2294
Description: A device has been inserted.
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None

Event ID: 2295
Description: A device has been removed.
Severity: Critical / Failure / Error
Cause: A device has been removed and the system is no longer functioning in optimal condition.
Action: Replace the device.

Event ID: 2296
Description: An EMM has been inserted.
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None

Event ID: 2297
Description: An EMM has been removed.
Severity: Critical / Failure / Error
Cause: An EMM has been removed.
Action: Replace the EMM. See the hardware documentation for information on replacing the EMM.

Event ID: 2298
Description: There is a bad sensor on an enclosure.
Severity: Warning / Non-critical
Cause: The enclosure has a bad sensor. The enclosure sensors monitor the fan speeds, temperature probes, etc.
Action: See the hardware documentation for more information.
Event ID: 2299
Description: Bad PHY %1
NOTE: %1 is a substitution variable that will appear in the alert description for specific details about the alert.
Severity: Critical / Failure / Error
Cause: There is a problem with a physical connection or PHY.
Action: Replace the EMM that contains the bad PHY. See the hardware documentation for information on replacing the EMM. Attach the storage to a different connector, if available. Make sure the cables are attached securely. See the Cables Attached Correctly section in the Dell OpenManage Server Administrator Storage Management User’s Guide for more information on checking the cables.

Event ID: 2300
Description: The enclosure is unstable.
Severity: Critical / Failure / Error
Cause: The controller is not receiving a consistent response from the enclosure. There could be a firmware problem or an invalid cabling configuration. If the cables are too long, they will degrade the signal.
Action: Power down all enclosures attached to the system and reboot the system. If the problem persists, upgrade the firmware to the latest supported version. You can download the most current version of the driver and firmware from support.dell.com. Make sure the cable configuration is valid. See the hardware documentation for valid cabling configurations.

Event ID: 2301
Description: The enclosure has a hardware error.
Severity: Critical / Failure / Error
Cause: The enclosure or an enclosure component is in a Failed or Degraded state.
Action: Check the health of the enclosure and its components. Replace any hardware that is in a Failed state. See the hardware documentation for more information.

Event ID: 2302
Description: The enclosure is not responding.
Severity: Critical / Failure / Error
Cause: The enclosure or an enclosure component is in a Failed or Degraded state.
Action: Check the health of the enclosure and its components. Replace any hardware that is in a Failed state. See the hardware documentation for more information.
Event ID: 2303
Description: The enclosure cannot support both SAS and SATA array disks. Array disks may be disabled.
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None

Event ID: 2304
Description: An attempt to hot plug an EMM has been detected. This type of hot plug is not supported.
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None

Event ID: 2305
Description: The array disk is too small to be used for a rebuild.
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None

Event ID: 2306
Description: Bad block table is 80% full.
Severity: Warning / Non-critical
Cause: The bad block table is used for remapping bad disk blocks. This table fills as bad disk blocks are remapped. When the table is full, bad disk blocks can no longer be remapped, and disk errors can no longer be corrected. At this point, data loss can occur. The bad block table is now 80% full.
Action: Back up your data. Replace the disk generating this alert and restore from backup.

Event ID: 2307
Description: Bad block table is full. Unable to log block %1
NOTE: %1 is a substitution variable that will appear in the alert description for specific details about the alert.
Severity: Critical / Failure / Error
Cause: The bad block table is used for remapping bad disk blocks. This table fills as bad disk blocks are remapped. When the table is full, bad disk blocks can no longer be remapped, and disk errors can no longer be corrected. At this point, data loss can occur.
Action: Replace the disk generating this alert and restore from backup. You may have lost data.
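
The two bad block table alerts amount to a pair of thresholds on how full the table is. The following is a minimal sketch of that relationship, assuming a hypothetical function bad_block_alert and hypothetical used_entries and capacity counts. The guide states only the 80% and full conditions; how the controller actually tracks the table is not documented here.

```python
def bad_block_alert(used_entries, capacity):
    """Return the alert ID this guide associates with the fill level, or None.

    used_entries and capacity are hypothetical counts of remapped blocks
    and total table slots; they are not values exposed by the guide.
    """
    if capacity <= 0:
        raise ValueError("capacity must be positive")
    if used_entries >= capacity:
        return 2307  # Bad block table is full.
    if used_entries * 100 >= capacity * 80:
        return 2306  # Bad block table is 80% full.
    return None      # Below both thresholds: no alert.


for used in (10, 80, 100):
    print(used, bad_block_alert(used, 100))
```

Integer arithmetic is used for the 80% comparison so that the boundary case (exactly 80 of 100 entries) is not affected by floating-point rounding.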
Event ID: 2309
Description: An array disk is incompatible.
Severity: Warning / Non-critical
Cause: You have attempted to replace a disk with another disk that is using an incompatible technology. For example, you may have replaced one side of a mirror with a SAS disk when the other side of the mirror is using SATA technology.
Action: See the hardware documentation for information on replacing disks.

Event ID: 2310
Description: A virtual disk is permanently degraded.
Severity: Critical / Failure / Error
Cause: A redundant virtual disk has lost redundancy. This may occur when the virtual disk suffers the failure of multiple array disks. In this case, both the source array disk and the target disk with redundant data have failed. A rebuild is not possible because there is no redundancy.
Action: Replace the failed disks and restore from backup.

Event ID: 2311
Description: The firmware on the EMMs is not the same version. EMM0 %1 EMM1 %2
NOTE: %1 and %2 are substitution variables that will appear in the alert description for specific details about the alert.
Severity: Warning / Non-critical
Cause: The firmware on the EMM modules is not the same version. It is required that both modules have the same version of the firmware. This alert may be caused if you attempt to insert an EMM module that has a different firmware version than an existing module.
Action: Upgrade to the same version of the firmware on both EMM modules.

Event ID: 2312
Description: A power supply in the enclosure has an AC failure.
Severity: Warning / Non-critical

Event ID: 2313
Description: A power supply in the enclosure has a DC failure.
Severity: Warning / Non-critical
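
Several descriptions in this section contain substitution variables such as %1 and %2. As a rough illustration of how such a template becomes the text that appears in the Alert log, the sketch below fills numbered placeholders with positional values. The function fill_placeholders and the firmware version strings are hypothetical; this is not how Server Administrator is known to implement the substitution.

```python
import re


def fill_placeholders(template, *values):
    """Replace %1, %2, ... in template with the matching positional value."""
    def replace(match):
        index = int(match.group(1)) - 1
        if 0 <= index < len(values):
            return values[index]
        return match.group(0)  # Leave placeholders without a value untouched.

    return re.sub(r"%(\d+)", replace, template)


message = fill_placeholders(
    "The firmware on the EMMs is not the same version. EMM0 %1 EMM1 %2",
    "1.01",  # hypothetical firmware version for EMM0
    "1.03",  # hypothetical firmware version for EMM1
)
print(message)
```

Placeholders that have no matching value are left in place, which makes a missing detail visible in the output instead of silently dropping it.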
Event ID: 2314
Description: The initialization sequence of SAS components failed during system startup. SAS management and monitoring is not possible.
Severity: Critical / Failure / Error
Cause: Storage Management is unable to monitor or manage SAS devices.
Action: Reboot the system. If the problem persists, make sure you have supported versions of the drivers and firmware. Also, you may need to reinstall Storage Management or Server Administrator because of some missing installation components.
SNMP Trap Number: 104
Array Manager Event Number: None

Event ID: 2315
Description: Diagnostic message %1
NOTE: %1 is a substitution variable that will appear in the alert description for specific details about the alert.
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None
SNMP Trap Number: 751
Array Manager Event Number: None

Event ID: 2316
Description: Diagnostic message %1
NOTE: %1 is a substitution variable that will appear in the alert description for specific details about the alert.
Severity: Critical / Failure / Error
Cause: A diagnostics test failed. The text for this alert is generated by the utility that ran the diagnostics.
Action: See the documentation for the utility that ran the diagnostics for more information.
SNMP Trap Number: 754
Array Manager Event Number: None

Event ID: 2317
Description: BGI terminated due to loss of ownership in a cluster configuration.
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None
SNMP Trap Number: 1201
Array Manager Event Number: None

Event ID: 2318
Description: Problems with the battery or the battery charger have been detected. The battery health is poor.
Severity: Critical / Failure / Error
Cause: The battery or the battery charger is not functioning properly.
Action: Replace the battery pack.
SNMP Trap Number: 1154
Array Manager Event Number: None
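
A monitoring script that receives these alerts might keep a small lookup table to decide which ones call for action. The sketch below covers alerts 2314 through 2318 only and pairs each event with a severity and SNMP trap number in the order they are listed in this section; that pairing is an assumption drawn from the listing order. The names AlertInfo, ALERTS, and needs_attention are hypothetical.

```python
from typing import NamedTuple, Optional


class AlertInfo(NamedTuple):
    severity: str
    snmp_trap: int


# Assumed pairing: severities and trap numbers taken in listing order.
ALERTS = {
    2314: AlertInfo("Critical / Failure / Error", 104),
    2315: AlertInfo("Ok / Normal", 751),
    2316: AlertInfo("Critical / Failure / Error", 754),
    2317: AlertInfo("Ok / Normal", 1201),
    2318: AlertInfo("Critical / Failure / Error", 1154),
}


def needs_attention(event_id: int) -> Optional[bool]:
    """Return True if the alert is not Ok / Normal, None if it is unknown."""
    info = ALERTS.get(event_id)
    if info is None:
        return None
    return info.severity != "Ok / Normal"


for event_id in (2315, 2318, 9999):
    print(event_id, needs_attention(event_id))
```

Returning None for an unknown event ID keeps "not listed here" distinct from "informational", so a caller can decide separately how to treat alerts outside this table.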