Reproduction in any manner whatsoever without the written permission of Dell Inc. is strictly forbidden.
Trademarks used in this text: Dell, the DELL logo and Dell OpenManage are trademarks of Dell Inc.; Microsoft and Windows are registered
trademarks and Windows Server is a trademark of Microsoft Corporation; Red Hat is a registered trademark of Red Hat, Inc.; SUSE is a
registered trademark of Novell, Inc. in the United States and other countries.
Other trademarks and trade names may be used in this document to refer to either the entities claiming the marks and names or their products.
Dell Inc. disclaims any proprietary interest in trademarks and trade names other than its own.
Dell OpenManage™ Server Administrator produces event messages stored primarily in the
operating system or Server Administrator event logs and sometimes in SNMP traps. This document
describes the event messages created by Server Administrator version 5.3 or later and displayed in
the Server Administrator Alert log.
Server Administrator creates events in response to sensor status changes and other monitored
parameters. The Server Administrator event monitor uses these status change events to add
descriptive messages to the operating system event log or the Server Administrator Alert log.
Each event message that Server Administrator adds to the Alert log consists of a unique identifier
called the event ID for a specific event source category and a descriptive message. The event
message includes the severity, cause of the event, and other relevant information, such as the event
location and the monitored item’s previous state.
Tables provided in this guide list all Server Administrator event IDs in numeric order. Each entry
includes the event ID’s corresponding description, severity level, and cause. Message text in angle
brackets (for example, <State>) describes the event-specific information provided by the
Server Administrator.
What’s New in this Release
Modifications have been made to the Storage Management Service events. For more information,
see "Alert Message Change History".
Messages Not Described in This Guide
This guide describes only event messages created by Server Administrator and displayed in the
Server Administrator Alert log. For information on other messages produced by your system, consult
one of the following sources:
•Your system’s Installation and Troubleshooting Guide
•Other system documentation
•Operating system documentation
•Application program documentation
Introduction5
Understanding Event Messages
This section describes the various types of event messages generated by the Server Administrator.
When an event occurs on your system, the Server Administrator sends information about one of the
following event types to the systems management console:
Table 1-1. Understanding Event Messages
Alert Severity: OK/Normal
Component Status: An event that describes the successful operation of a unit. The alert is provided for informational purposes and does not indicate an error condition. For example, the alert may indicate the normal start or stop of an operation, such as power supply or a sensor reading returning to normal.
Alert Severity: Warning/Non-critical
Component Status: An event that is not necessarily significant, but may indicate a possible future problem. For example, a Warning/Non-critical alert may indicate that a component (such as a temperature probe in an enclosure) has crossed a warning threshold.
Alert Severity: Critical/Failure/Error
Component Status: A significant event that indicates actual or imminent loss of data or loss of function. For example, crossing a failure threshold or a hardware failure such as an array disk.
Server Administrator generates events based on status changes in the following sensors:
•Temperature Sensor: Helps protect critical components by alerting the systems management console when temperatures become too high inside a chassis; also monitors a variety of locations in the chassis and in any attached systems.
•Fan Sensor: Monitors fans in various locations in the chassis and in any attached systems.
•Voltage Sensor: Monitors voltages across critical components in various chassis locations and in any attached systems.
•Current Sensor: Monitors the current (or amperage) output from the power supply (or supplies) in the chassis and in any attached systems.
•Chassis Intrusion Sensor: Monitors intrusion into the chassis and any attached systems.
•Redundancy Unit Sensor: Monitors redundant units (critical units such as fans, AC power cords, or power supplies) within the chassis; also monitors the chassis and any attached systems. For example, redundancy allows a second or nth fan to keep the chassis components at a safe temperature when another fan has failed. Redundancy is normal when the intended number of critical components are operating. Redundancy is degraded when a component fails, but others are still operating. Redundancy is lost when there is one less critical redundancy device than required.
•Power Supply Sensor: Monitors power supplies in the chassis and in any attached systems.
•Memory Prefailure Sensor: Monitors memory modules by counting the number of Error Correction Code (ECC) memory corrections.
•Fan Enclosure Sensor: Monitors protective fan enclosures by detecting their removal from and insertion into the system, and by measuring how long a fan enclosure is absent from the chassis. This sensor monitors the chassis and any attached systems.
•AC Power Cord Sensor: Monitors the presence of AC power for an AC power cord.
•Hardware Log Sensor: Monitors the size of a hardware log.
•Processor Sensor: Monitors the processor status in the system.
•Pluggable Device Sensor: Monitors the addition, removal, or configuration errors for some pluggable devices, such as memory cards.
•Battery Sensor: Monitors the status of one or more batteries in the system.
Sample Event Message Text
The following example shows the format of the event messages logged by Server Administrator.
EventID: 1000
Source: Server Administrator
Category: Instrumentation Service
Type: Information
Date and Time: Mon Oct 21 10:38:00 2002
Computer: <computer name>
Description: Server Administrator starting
Data: Bytes in Hex
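Because the sample message uses a simple "Name: value" layout, its fields can be separated mechanically. The following Python sketch is an illustration only: the function name parse_event_text is hypothetical, and it assumes that every field sits on its own line with the field name before the first colon.

```python
def parse_event_text(text):
    """Split the 'Name: value' lines of an event message into a dict."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Split on the first colon only; values such as the timestamp
        # contain further colons.
        name, sep, value = line.partition(":")
        if sep:
            fields[name.strip()] = value.strip()
    return fields


# Sample text copied from the example above.
SAMPLE = """\
EventID: 1000
Source: Server Administrator
Category: Instrumentation Service
Type: Information
Date and Time: Mon Oct 21 10:38:00 2002
Computer: <computer name>
Description: Server Administrator starting
Data: Bytes in Hex
"""

event = parse_event_text(SAMPLE)
print(event["EventID"], event["Type"], event["Description"])
```

Lines without a colon are skipped, so free-form text mixed into a message is ignored rather than raising an error.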
Viewing Alerts and Event Messages
An event log is used to record information about important events.
Server Administrator generates alerts that are added to the operating system event log and to the
Server Administrator Alert log. To view these alerts in Server Administrator:
1 Select the System object in the tree view.
2 Select the Logs tab.
3 Select the Alert subtab.
You can also view the event log using your operating system’s event viewer. Each operating system’s event
viewer accesses the applicable operating system event log.
The location of the event log file depends on the operating system you are using.
•In the Microsoft® Windows® 2000 Advanced Server and Windows Server™ 2003 operating systems,
messages are logged to the system event log and optionally to a unicode text file, dcsys32.log (viewable
using Notepad), that is located in the install_path\omsa\log directory. The default install_path is
C:\Program Files\Dell\SysMgt.
•In the Red Hat® Enterprise Linux and SUSE® Linux Enterprise Server operating systems, messages are
logged to the system log file. The default name of the system log file is /var/log/messages. You can view
the messages file using a text editor such as vi or emacs.
NOTE: Logging messages to a unicode text file is optional. By default, the feature is disabled. To enable this
feature, modify the Event Manager section of the dcemdy32.ini file as follows:
•In Windows, locate the file at <install_path>\dataeng\ini and set UnitextLog.enabled=True.
The default install_path is C:\Program Files\Dell\SysMgt. Restart the DSM SA Event Manager service.
•In Red Hat Enterprise Linux and SUSE Linux Enterprise Server, locate the file at <install_path>/dataeng/ini and
set UnitextLog.enabled=True. The default install_path is /opt/dell/srvadmin. Issue the
"/etc/init.d/dataeng restart" command to restart the Server Administrator event manager service. This will
also restart the Server Administrator data manager and SNMP services.
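As an illustration of the NOTE above, the following Python sketch changes the setting with the standard configparser module. It is not a Dell-provided tool: the function name, the default section name "Event Manager", and the assumption that dcemdy32.ini parses as a standard INI file are all hypothetical, so inspect the actual file and back it up before trying anything similar.

```python
import configparser


def enable_unitext_log(ini_path, section="Event Manager"):
    """Set UnitextLog.enabled=True in one section of an INI file.

    The section name is an assumption; pass the real one if it differs.
    Note that configparser drops comments when it rewrites a file.
    """
    parser = configparser.ConfigParser()
    # Keep option names case-sensitive; configparser lowercases them by default.
    parser.optionxform = str
    parser.read(ini_path)
    if not parser.has_section(section):
        parser.add_section(section)
    parser.set(section, "UnitextLog.enabled", "True")
    with open(ini_path, "w") as handle:
        # Write "name=value" without spaces around the delimiter.
        parser.write(handle, space_around_delimiters=False)


# Example call using the default Linux location given in this guide:
# enable_unitext_log("/opt/dell/srvadmin/dataeng/ini/dcemdy32.ini")
```

The change takes effect only after the event manager service is restarted as described in the NOTE.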
The following subsections explain how to open the Windows 2000 Advanced Server, Windows Server 2003,
and the Red Hat Enterprise Linux and SUSE Linux Enterprise Server event viewers.
Viewing Events in Windows 2000 Advanced Server and Windows Server 2003
1 Click the Start button, point to Settings, and click Control Panel.
2 Double-click Administrative Tools, and then double-click Event Viewer.
3 In the Event Viewer window, click the Tree tab and then click System Log.
The System Log window displays a list of recently logged events.
4 To view the details of an event, double-click one of the event items.
NOTE: You can also look up the dcsys32.log file, in the install_path\omsa\log directory, to view the separate
event log file. The default install_path is C:\Program Files\Dell\SysMgt.
Viewing Events in Red Hat Enterprise Linux and SUSE Linux Enterprise Server
1 Log in as root.
2 Use a text editor such as vi or emacs to view the file named /var/log/messages.
The following example shows the Red Hat Enterprise Linux (and SUSE Linux Enterprise Server)
message log, /var/log/messages. The text in boldface type indicates the message text.
NOTE: These messages are typically displayed as one long line. In the following example, the message is
displayed using line breaks to help you see the message text more clearly.
...
Feb 6 14:20:51 server01 Server Administrator: Instrumentation Service
EventID: 1000
Server Administrator starting
Feb 6 14:20:51 server01 Server Administrator: Instrumentation Service
EventID: 1001
Server Administrator startup complete
Feb 6 14:21:21 server01 Server Administrator: Instrumentation Service
EventID: 1254 Chassis intrusion detected Sensor location: Main chassis
intrusion Chassis location: Main System Chassis Previous state was: OK
(Normal) Chassis intrusion state: Open
Feb 6 14:21:51 server01 Server Administrator: Instrumentation Service
EventID: 1252 Chassis intrusion returned to normal Sensor location: Main
chassis intrusion Chassis location: Main System Chassis Previous state
was: Critical (Failed) Chassis intrusion state: Closed
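Entries like those in the example can be picked out of /var/log/messages with a regular expression. The sketch below is a minimal illustration: the pattern is inferred from the sample lines above rather than from a documented format, and the names parse_syslog_line, LINE_RE, and LABELS are hypothetical. It handles one log entry per line, which is how the NOTE above says the messages normally appear.

```python
import re

# Layout inferred from the sample entries above (an assumption).
LINE_RE = re.compile(
    r"^(?P<timestamp>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+"
    r"(?P<host>\S+)\s+Server Administrator:\s+"
    r"(?P<category>.+?)\s+EventID:\s+(?P<event_id>\d+)\s+"
    r"(?P<message>.*)$"
)

# A few description labels taken from this guide; extend as needed.
LABELS = [
    "Sensor location",
    "Chassis location",
    "Previous state was",
    "Chassis intrusion state",
]
LABEL_RE = re.compile("(" + "|".join(re.escape(l) for l in LABELS) + r"):\s*")


def parse_syslog_line(line):
    """Return a dict for a Server Administrator entry, or None."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    record = match.groupdict()
    record["event_id"] = int(record["event_id"])
    # Splitting on the labels yields [summary, label, value, label, value, ...].
    parts = LABEL_RE.split(record.pop("message"))
    record["summary"] = parts[0].strip()
    record["details"] = {
        parts[i]: parts[i + 1].strip() for i in range(1, len(parts) - 1, 2)
    }
    return record


line = (
    "Feb 6 14:21:21 server01 Server Administrator: Instrumentation Service "
    "EventID: 1254 Chassis intrusion detected Sensor location: Main chassis "
    "intrusion Chassis location: Main System Chassis Previous state was: OK "
    "(Normal) Chassis intrusion state: Open"
)
record = parse_syslog_line(line)
print(record["event_id"], record["summary"], record["details"])
```

Lines that do not come from Server Administrator return None, so the function can be applied to every line of the log file.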
Viewing the Event Information
The event log for each operating system contains some or all of the following information:
•Date: The date the event occurred.
•Time: The local time the event occurred.
•Type: A classification of the event severity: Information, Warning, or Error.
•User: The name of the user on whose behalf the event occurred.
•Computer: The name of the system where the event occurred.
•Source: The software that logged the event.
•Category: The classification of the event by the event source.
•Event ID: The number identifying the particular event type.
•Description: A description of the event. The format and contents of the event description vary, depending on the event type.
Understanding the Event Description
Table 1-2 lists in alphabetical order each line item that may appear in the event description.
Table 1-2. Event Description Reference
Description Line Item: Action performed was: <Action>
Explanation: Specifies the action that was performed, for example:
  Action performed was: Power cycle
Description Line Item: Action requested was: <Action>
Explanation: Specifies the action that was requested, for example:
  Action requested was: Reboot, shutdown OS first
Description Line Item: Additional Details: <Additional details for the event>
Explanation: Specifies additional details available for the hot plug event, for example:
  Memory device: DIMM1_A Serial number: FFFF30B1
Description Line Item: <Additional power supply status information>
Explanation: Specifies information pertaining to the event, for example:
  Power supply input AC is off, Power supply POK (power OK) signal is not normal, Power supply is turned off
Description Line Item: Chassis intrusion state: <Intrusion state>
Explanation: Specifies the chassis intrusion state (open or closed), for example:
  Chassis intrusion state: Open
Description Line Item: Chassis location: <Name of chassis>
Explanation: Specifies the name of the chassis that generated the message, for example:
  Chassis location: Main System Chassis
Description Line Item: Configuration error type: <type of configuration error>
Explanation: Specifies the type of configuration error that occurred, for example:
  Configuration error type: Revision mismatch
Description Line Item: Current sensor value (in Amps): <Reading>
Explanation: Specifies the current sensor value in amps, for example:
  Current sensor value (in Amps): 7.853
Description Line Item: Date and time of action: <Date and time>
Explanation: Specifies the date and time the action was performed, for example:
  Date and time of action: Sat Jun 12 16:20:33 2004
Description Line Item: Device location: <Location in chassis>
Explanation: Specifies the location of the device in the specified chassis, for example:
  Device location: Memory Card A
Description Line Item: Discrete current state: <State>
Explanation: Specifies the state of the current sensor, for example:
  Discrete current state: Good
Description Line Item: Discrete temperature state: <State>
Explanation: Specifies the state of the temperature sensor.
Description Line Item: Redundancy unit: <Redundancy location in chassis>
Explanation: Specifies the location of the redundant power supply or cooling unit in the chassis, for example:
  Redundancy unit: Fan Enclosure
Description Line Item: Sensor location: <Location in chassis>
Explanation: Specifies the location of the sensor in the specified chassis, for example:
  Sensor location: CPU1
Description Line Item: Temperature sensor value (in degrees Celsius): <Reading>
Explanation: Specifies the temperature in degrees Celsius, for example:
  Temperature sensor value (in degrees Celsius): 30
Description Line Item: Voltage sensor value (in Volts): <Reading>
Explanation: Specifies the voltage sensor value in volts, for example:
  Voltage sensor value (in Volts): 1.693
Event Message Reference
The following tables list, in numerical order, each event ID and its corresponding description, along
with its severity and cause.
NOTE: For corrective actions, see the appropriate documentation.
Miscellaneous Messages
Miscellaneous messages in Table 2-1 indicate that certain alert systems are up and working.
Table 2-1. Miscellaneous Messages
Event ID: 0000
Description: Log was cleared
Severity: Information
Cause: User cleared the log from Server Administrator.
Event ID: 0001
Description: Log backup created
Severity: Information
Cause: The log was full, copied to backup, and cleared.
Event ID: 1000
Description: Server Administrator starting
Severity: Information
Cause: Server Administrator is beginning to initialize.
Event ID: 1001
Description: Server Administrator startup complete
Severity: Information
Cause: Server Administrator completed its initialization.
Event ID: 1002
Description: A system BIOS update has been scheduled for the next reboot
Severity: Information
Cause: The user has chosen to update the flash basic input/output system (BIOS).
Event ID: 1003
Description: A previously scheduled system BIOS update has been canceled
Severity: Information
Cause: The user decides to cancel the flash BIOS update, or an error occurs during the flash.
Event ID: 1004
Description: Thermal shutdown protection has been initiated
Severity: Error
Cause: This message is generated when a system is configured for thermal shutdown due to an error event. If a temperature sensor reading exceeds the error threshold for which the system is configured, the operating system shuts down and the system powers off. This event may also be initiated on certain systems when a fan enclosure is removed from the system for an extended period of time.
Event ID: 1005
Description: SMBIOS data is absent
Severity: Warning
Cause: The system does not contain the required systems management BIOS version 2.2 or higher, or the BIOS is corrupted.
Event ID: 1006
Description: Automatic System Recovery (ASR) action was performed
  Action performed was: <Action>
  Date and time of action: <Date and time>
Severity: Error
Cause: This message is generated when an automatic system recovery action is performed due to a hung operating system. The action performed and the time of action are provided.
Event ID: 1007
Description: User initiated host system control action
  Action requested was: <Action>
Severity: Information
Cause: User requested a host system control action to reboot, power off, or power cycle the system. Alternatively the user had indicated protective measures to be initiated in the event of a thermal shutdown.
Event ID: 1008
Description: Systems Management Data Manager Started
Severity: Information
Cause: Systems Management Data Manager services were started.
Event ID: 1009
Description: Systems Management Data Manager Stopped
Severity: Information
Cause: Systems Management Data Manager services were stopped.
Event ID: 1011
Description: RCI table is corrupt
Severity: Warning
Cause: This message is generated when the BIOS Remote Configuration Interface (RCI) table is corrupted or cannot be read by the systems management software.
Event ID: 1012
Description: IPMI Status
  Interface: <the IPMI interface being used>, <additional information if available and applicable>
Severity: Information
Cause: This message is generated to indicate the Intelligent Platform Management Interface (IPMI) status of the system. Additional information, when available, includes Baseboard Management Controller (BMC) not present, BMC not responding, System Event Log (SEL) not present, and SEL Data Record (SDR) not present.
Temperature Sensor Messages
Temperature sensors listed in Table 2-2 help protect critical components by alerting the systems
management console when temperatures become too high inside a chassis. The temperature sensor
messages use additional variables: sensor location, chassis location, previous state, and temperature
sensor value or state.
Table 2-2. Temperature Sensor Messages
Event ID: 1050
Description: Temperature sensor has failed
  Sensor location: <Location in chassis>
  Chassis location: <Name of chassis>
  Previous state was: <State>
  If sensor type is not discrete:
    Temperature sensor value (in degrees Celsius): <Reading>
  If sensor type is discrete:
    Discrete temperature state: <State>
Severity: Information
Cause: A temperature sensor on the backplane board, system board, or the carrier in the specified system failed. The sensor location, chassis location, previous state, and temperature sensor value are provided.
Event ID: 1051
Description: Temperature sensor value unknown
  Sensor location: <Location in chassis>
  Chassis location: <Name of chassis>
  If sensor type is not discrete:
    Temperature sensor value (in degrees Celsius): <Reading>
  If sensor type is discrete:
    Discrete temperature state: <State>
Severity: Information
Cause: A temperature sensor on the backplane board, system board, or drive carrier in the specified system could not obtain a reading. The sensor location, chassis location, previous state, and a nominal temperature sensor value are provided.
Event ID: 1052
Description: Temperature sensor returned to a normal value
  Sensor location: <Location in chassis>
  Chassis location: <Name of chassis>
  Previous state was: <State>
  If sensor type is not discrete:
    Temperature sensor value (in degrees Celsius): <Reading>
  If sensor type is discrete:
    Discrete temperature state: <State>
Severity: Information
Cause: A temperature sensor on the backplane board, system board, or drive carrier in the specified system returned to a valid range after crossing a failure threshold. The sensor location, chassis location, previous state, and temperature sensor value are provided.
Event ID: 1053
Description: Temperature sensor detected a warning value
  Sensor location: <Location in chassis>
  Chassis location: <Name of chassis>
  Previous state was: <State>
  If sensor type is not discrete:
    Temperature sensor value (in degrees Celsius): <Reading>
  If sensor type is discrete:
    Discrete temperature state: <State>
Severity: Warning
Cause: A temperature sensor on the backplane board, system board, CPU, or drive carrier in the specified system exceeded its warning threshold. The sensor location, chassis location, previous state, and temperature sensor value are provided.
Event ID: 1054
Description: Temperature sensor detected a failure value
  Sensor location: <Location in chassis>
  Chassis location: <Name of chassis>
  Previous state was: <State>
  If sensor type is not discrete:
    Temperature sensor value (in degrees Celsius): <Reading>
  If sensor type is discrete:
    Discrete temperature state: <State>
Severity: Error
Cause: A temperature sensor on the backplane board, system board, or drive carrier in the specified system exceeded its failure threshold. The sensor location, chassis location, previous state, and temperature sensor value are provided.
Event ID: 1055
Description: Temperature sensor detected a non-recoverable value
  Sensor location: <Location in chassis>
  Chassis location: <Name of chassis>
  Previous state was: <State>
  If sensor type is not discrete:
    Temperature sensor value (in degrees Celsius): <Reading>
  If sensor type is discrete:
    Discrete temperature state: <State>
Severity: Error
Cause: A temperature sensor on the backplane board, system board, or drive carrier in the specified system detected an error from which it cannot recover. The sensor location, chassis location, previous state, and temperature sensor value are provided.
Cooling Device Messages
Cooling device sensors listed in Table 2-3 monitor how well a fan is functioning. Cooling device messages
provide status and warning information for fans in a particular chassis.
Table 2-3. Cooling Device Messages
Event ID: 1100
Description: Fan sensor has failed
  Sensor location: <Location in chassis>
  Chassis location: <Name of chassis>
  Previous state was: <State>
  Fan sensor value: <Reading>
Severity: Information
Cause: A fan sensor in the specified system is not functioning. The sensor location, chassis location, previous state, and fan sensor value are provided.
Event ID: 1101
Description: Fan sensor value unknown
  Sensor location: <Location in chassis>
  Chassis location: <Name of chassis>
  Previous state was: <State>
  Fan sensor value: <Reading>
Severity: Information
Cause: A fan sensor in the specified system could not obtain a reading. The sensor location, chassis location, previous state, and a nominal fan sensor value are provided.
Event ID: 1102
Description: Fan sensor returned to a normal value
  Sensor location: <Location in chassis>
  Chassis location: <Name of chassis>
  Previous state was: <State>
  Fan sensor value: <Reading>
Severity: Information
Cause: A fan sensor reading on the specified system returned to a valid range after crossing a warning threshold. The sensor location, chassis location, previous state, and fan sensor value are provided.
Event ID: 1103
Description: Fan sensor detected a warning value
  Sensor location: <Location in chassis>
  Chassis location: <Name of chassis>
  Previous state was: <State>
  Fan sensor value: <Reading>
Severity: Warning
Cause: A fan sensor reading in the specified system exceeded a warning threshold. The sensor location, chassis location, previous state, and fan sensor value are provided.
Event ID: 1104
Description: Fan sensor detected a failure value
  Sensor location: <Location in chassis>
  Chassis location: <Name of chassis>
  Previous state was: <State>
  Fan sensor value: <Reading>
Severity: Error
Cause: A fan sensor in the specified system detected the failure of one or more fans. The sensor location, chassis location, previous state, and fan sensor value are provided.
Event ID: 1105
Description: Fan sensor detected a non-recoverable value
  Sensor location: <Location in chassis>
  Chassis location: <Name of chassis>
  Previous state was: <State>
  Fan sensor value: <Reading>
Severity: Error
Cause: A fan sensor detected an error from which it cannot recover. The sensor location, chassis location, previous state, and fan sensor value are provided.
Voltage Sensor Messages
Voltage sensors listed in Table 2-4 monitor the number of volts across critical components. Voltage sensor
messages provide status and warning information for voltage sensors in a particular chassis.
Table 2-4. Voltage Sensor Messages
Event ID: 1150
Description: Voltage sensor has failed
  Sensor location: <Location in chassis>
  Chassis location: <Name of chassis>
  Previous state was: <State>
  If sensor type is not discrete:
    Voltage sensor value (in Volts): <Reading>
  If sensor type is discrete:
    Discrete voltage state: <State>
Severity: Information
Cause: A voltage sensor in the specified system failed. The sensor location, chassis location, previous state, and voltage sensor value are provided.
Event ID: 1151
Description: Voltage sensor value unknown
  Sensor location: <Location in chassis>
  Chassis location: <Name of chassis>
  Previous state was: <State>
  If sensor type is not discrete:
    Voltage sensor value (in Volts): <Reading>
  If sensor type is discrete:
    Discrete voltage state: <State>
Severity: Information
Cause: A voltage sensor in the specified system could not obtain a reading. The sensor location, chassis location, previous state, and a nominal voltage sensor value are provided.
Event ID: 1152
Description: Voltage sensor returned to a normal value
  Sensor location: <Location in chassis>
  Chassis location: <Name of chassis>
  Previous state was: <State>
  If sensor type is not discrete:
    Voltage sensor value (in Volts): <Reading>
  If sensor type is discrete:
    Discrete voltage state: <State>
Severity: Information
Cause: A voltage sensor in the specified system returned to a valid range after crossing a failure threshold. The sensor location, chassis location, previous state, and voltage sensor value are provided.
Event ID: 1153
Description: Voltage sensor detected a warning value
  Sensor location: <Location in chassis>
  Chassis location: <Name of chassis>
  Previous state was: <State>
  If sensor type is not discrete:
    Voltage sensor value (in Volts): <Reading>
  If sensor type is discrete:
    Discrete voltage state: <State>
Severity: Warning
Cause: A voltage sensor in the specified system exceeded its warning threshold. The sensor location, chassis location, previous state, and voltage sensor value are provided.
Event ID: 1154
Description: Voltage sensor detected a failure value
  Sensor location: <Location in chassis>
  Chassis location: <Name of chassis>
  Previous state was: <State>
  If sensor type is not discrete:
    Voltage sensor value (in Volts): <Reading>
  If sensor type is discrete:
    Discrete voltage state: <State>
Severity: Error
Cause: A voltage sensor in the specified system exceeded its failure threshold. The sensor location, chassis location, previous state, and voltage sensor value are provided.
Event ID: 1155
Description: Voltage sensor detected a non-recoverable value
  Sensor location: <Location in chassis>
  Chassis location: <Name of chassis>
  Previous state was: <State>
  If sensor type is not discrete:
    Voltage sensor value (in Volts): <Reading>
  If sensor type is discrete:
    Discrete voltage state: <State>
Severity: Error
Cause: A voltage sensor in the specified system detected an error from which it cannot recover. The sensor location, chassis location, previous state, and voltage sensor value are provided.
Current Sensor Messages
Current sensors listed in Table 2-5 measure the amount of current (in amperes) that is traversing critical
components. Current sensor messages provide status and warning information for current sensors in a
particular chassis.
Table 2-5. Current Sensor Messages
Event ID: 1200
Description: Current sensor has failed
  Sensor location: <Location in chassis>
  Chassis location: <Name of chassis>
  Previous state was: <State>
  If sensor type is not discrete:
    Current sensor value (in Amps): <Reading>
    OR
    Current sensor value (in Watts): <Reading>
  If sensor type is discrete:
    Discrete current state: <State>
Severity: Information
Cause: A current sensor in the specified system failed. The sensor location, chassis location, previous state, and current sensor value are provided.
Event ID: 1201
Description: Current sensor value unknown
  Sensor location: <Location in chassis>
  Chassis location: <Name of chassis>
  Previous state was: <State>
  If sensor type is not discrete:
    Current sensor value (in Amps): <Reading>
    OR
    Current sensor value (in Watts): <Reading>
  If sensor type is discrete:
    Discrete current state: <State>
Severity: Information
Cause: A current sensor in the specified system could not obtain a reading. The sensor location, chassis location, previous state, and a nominal current sensor value are provided.
Event ID: 1202
Description: Current sensor returned to a normal value
  Sensor location: <Location in chassis>
  Chassis location: <Name of chassis>
  Previous state was: <State>
  If sensor type is not discrete:
    Current sensor value (in Amps): <Reading>
    OR
    Current sensor value (in Watts): <Reading>
  If sensor type is discrete:
    Discrete current state: <State>
Severity: Information
Cause: A current sensor in the specified system returned to a valid range after crossing a failure threshold. The sensor location, chassis location, previous state, and current sensor value are provided.
Event ID: 1203
Description: Current sensor detected a warning value
  Sensor location: <Location in chassis>
  Chassis location: <Name of chassis>
  Previous state was: <State>
  If sensor type is not discrete:
    Current sensor value (in Amps): <Reading>
    OR
    Current sensor value (in Watts): <Reading>
  If sensor type is discrete:
    Discrete current state: <State>
Severity: Warning
Cause: A current sensor in the specified system exceeded its warning threshold. The sensor location, chassis location, previous state, and current sensor value are provided.
Event ID: 1204
Description: Current sensor detected a failure value
  Sensor location: <Location in chassis>
  Chassis location: <Name of chassis>
  Previous state was: <State>
  If sensor type is not discrete:
    Current sensor value (in Amps): <Reading>
    OR
    Current sensor value (in Watts): <Reading>
  If sensor type is discrete:
    Discrete current state: <State>
Severity: Error
Cause: A current sensor in the specified system exceeded its failure threshold. The sensor location, chassis location, previous state, and current sensor value are provided.
Event ID: 1205
Description: Current sensor detected a non-recoverable value
  Sensor location: <Location in chassis>
  Chassis location: <Name of chassis>
  Previous state was: <State>
  If sensor type is not discrete:
    Current sensor value (in Amps): <Reading>
    OR
    Current sensor value (in Watts): <Reading>
  If sensor type is discrete:
    Discrete current state: <State>
Severity: Error
Cause: A current sensor in the specified system detected an error from which it cannot recover. The sensor location, chassis location, previous state, and current sensor value are provided.
Chassis Intrusion Messages
Chassis intrusion messages listed in Table 2-6 are a security measure. Chassis intrusion means that
someone is opening the cover to a system’s chassis. Alerts are sent to prevent unauthorized removal of
parts from a chassis.
Table 2-6. Chassis Intrusion Messages

Event ID: 1250
Description:
  Chassis intrusion sensor has failed
  Sensor location: <Location in chassis>
  Chassis location: <Name of chassis>
  Previous state was: <State>
  Chassis intrusion state: <Intrusion state>
Severity: Information
Cause: A chassis intrusion sensor in the specified system failed. The sensor location, chassis location, previous state, and chassis intrusion state are provided.

Event ID: 1251
Description:
  Chassis intrusion sensor value unknown
  Sensor location: <Location in chassis>
  Chassis location: <Name of chassis>
  Previous state was: <State>
  Chassis intrusion state: <Intrusion state>
Severity: Information
Cause: A chassis intrusion sensor in the specified system could not obtain a reading. The sensor location, chassis location, previous state, and chassis intrusion state are provided.

Event ID: 1252
Description:
  Chassis intrusion returned to normal
  Sensor location: <Location in chassis>
  Chassis location: <Name of chassis>
  Previous state was: <State>
  Chassis intrusion state: <Intrusion state>
Severity: Information
Cause: A chassis intrusion sensor in the specified system detected that a cover was opened while the system was operating but has since been replaced. The sensor location, chassis location, previous state, and chassis intrusion state are provided.

Event ID: 1253
Description:
  Chassis intrusion in progress
  Sensor location: <Location in chassis>
  Chassis location: <Name of chassis>
  Previous state was: <State>
  Chassis intrusion state: <Intrusion state>
Severity: Warning
Cause: A chassis intrusion sensor in the specified system detected that a system cover is currently being opened and the system is operating. The sensor location, chassis location, previous state, and chassis intrusion state are provided.

Event ID: 1254
Description:
  Chassis intrusion detected
  Sensor location: <Location in chassis>
  Chassis location: <Name of chassis>
  Previous state was: <State>
  Chassis intrusion state: <Intrusion state>
Severity: Error
Cause: A chassis intrusion sensor in the specified system detected that the system cover was opened while the system was operating. The sensor location, chassis location, previous state, and chassis intrusion state are provided.

Event ID: 1255
Description:
  Chassis intrusion sensor detected a non-recoverable value
  Sensor location: <Location in chassis>
  Chassis location: <Name of chassis>
  Previous state was: <State>
  Chassis intrusion state: <Intrusion state>
Severity: Error
Cause: A chassis intrusion sensor in the specified system detected an error from which it cannot recover. The sensor location, chassis location, previous state, and chassis intrusion state are provided.
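The descriptions in Table 2-6 share one layout: a summary line followed by "label: value" fields. The following is a minimal Python sketch of splitting such a description into its parts. It assumes each field appears on its own line in the logged text, which this guide does not confirm. The function name and the sample values are hypothetical and are not part of Server Administrator.

```python
def parse_alert_description(text):
    """Split a description into its summary line and its labeled fields.

    Assumption: the summary is the first non-empty line and every later
    line has the form "label: value".
    """
    lines = [line.strip() for line in text.strip().splitlines() if line.strip()]
    summary = lines[0]
    fields = {}
    for line in lines[1:]:
        label, separator, value = line.partition(":")
        if separator:  # skip lines that carry no label
            fields[label.strip()] = value.strip()
    return {"summary": summary, "fields": fields}


# Hypothetical sample text modeled on event ID 1254.
sample = """Chassis intrusion detected
Sensor location: Main chassis intrusion
Chassis location: Main System Chassis
Previous state was: OK (Normal)
Chassis intrusion state: Open"""

parsed = parse_alert_description(sample)
print(parsed["summary"])
print(parsed["fields"]["Chassis intrusion state"])
```

Keeping the fields in a dictionary lets a script filter on, for example, the chassis location without depending on field order.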
Redundancy Unit Messages
Redundancy means that a system chassis has more than one of certain critical components. Fans and
power supplies, for example, are so important for preventing damage or disruption of a computer system
that a chassis may have “extra” fans or power supplies installed. Redundancy allows a second or nth fan
to keep the chassis components at a safe temperature when the primary fan has failed. Redundancy is
normal when the intended number of critical components are operating. Redundancy is degraded when
a component fails but others are still operating. Redundancy is lost when the number of components
functioning falls below the redundancy threshold.
Table 2-7 lists the redundancy unit messages.
The number of devices required for full redundancy is provided as part of the message, when applicable,
for the redundancy unit and the platform. For details on redundancy computation, see the respective
platform documentation.
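The three redundancy states described above can be sketched as a small Python function. This is an illustration only: the parameter names are hypothetical, and the "redundancy threshold" is assumed to mean the minimum number of functioning components for which the unit still counts as redundant. The actual computation is platform specific.

```python
def redundancy_state(working, intended, threshold):
    """Classify a redundancy unit as normal, degraded, or lost.

    working   -- number of components currently functioning
    intended  -- number of components expected for full redundancy
    threshold -- assumed minimum count that still provides redundancy
    """
    if working >= intended:
        return "normal"
    if working >= threshold:
        return "degraded"
    return "lost"


# Hypothetical unit: three fans intended, at least two needed for redundancy.
print(redundancy_state(3, 3, 2))
print(redundancy_state(2, 3, 2))
print(redundancy_state(1, 3, 2))
```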
Table 2-7. Redundancy Unit Messages

Event ID: 1300
Description:
  Redundancy sensor has failed
  Redundancy unit: <Redundancy location in chassis>
  Chassis location: <Name of chassis>
  Previous redundancy state was: <State>
Severity: Information
Cause: A redundancy sensor in the specified system failed. The redundancy unit location, chassis location, previous redundancy state, and the number of devices required for full redundancy are provided.

Event ID: 1301
Description:
  Redundancy sensor value unknown
  Redundancy unit: <Redundancy location in chassis>
  Chassis location: <Name of chassis>
  Previous redundancy state was: <State>
Severity: Information
Cause: A redundancy sensor in the specified system could not obtain a reading. The redundancy unit location, chassis location, previous redundancy state, and the number of devices required for full redundancy are provided.

Event ID: 1302
Description:
  Redundancy not applicable
  Redundancy unit: <Redundancy location in chassis>
  Chassis location: <Name of chassis>
  Previous redundancy state was: <State>
Severity: Information
Cause: A redundancy sensor in the specified system detected that a unit was not redundant. The redundancy location, chassis location, previous redundancy state, and the number of devices required for full redundancy are provided.

Event ID: 1303
Description:
  Redundancy is offline
  Redundancy unit: <Redundancy location in chassis>
  Chassis location: <Name of chassis>
  Previous redundancy state was: <State>
Severity: Information
Cause: A redundancy sensor in the specified system detected that a redundant unit is offline. The redundancy unit location, chassis location, previous redundancy state, and the number of devices required for full redundancy are provided.

Event ID: 1304
Description:
  Redundancy regained
  Redundancy unit: <Redundancy location in chassis>
  Chassis location: <Name of chassis>
  Previous redundancy state was: <State>
Severity: Information
Cause: A redundancy sensor in the specified system detected that a “lost” redundancy device has been reconnected or replaced; full redundancy is in effect. The redundancy unit location, chassis location, previous redundancy state, and the number of devices required for full redundancy are provided.

Event ID: 1305
Description:
  Redundancy degraded
  Redundancy unit: <Redundancy location in chassis>
  Chassis location: <Name of chassis>
  Previous redundancy state was: <State>
Severity: Warning
Cause: A redundancy sensor in the specified system detected that one of the components of the redundancy unit has failed but the unit is still redundant. The redundancy unit location, chassis location, previous redundancy state, and the number of devices required for full redundancy are provided.

Event ID: 1306
Description:
  Redundancy lost
  Redundancy unit: <Redundancy location in chassis>
  Chassis location: <Name of chassis>
  Previous redundancy state was: <State>
Severity: Error
Cause: A redundancy sensor in the specified system detected that one of the components in the redundant unit has been disconnected, has failed, or is not present. The redundancy unit location, chassis location, previous redundancy state, and the number of devices required for full redundancy are provided.
Power Supply Messages
Power supply sensors monitor how well a power supply is functioning. Power supply messages listed in
Table 2-8 provide status and warning information for power supplies present in a particular chassis.
Table 2-8. Power Supply Messages

Event ID: 1350
Description:
  Power supply sensor has failed
  Sensor location: <Location in chassis>
  Chassis location: <Name of chassis>
  Previous state was: <State>
  Power Supply type: <type of power supply>
  <Additional power supply status information>
  If in configuration error state:
  Configuration error type: <type of configuration error>
Severity: Information
Cause: A power supply sensor in the specified system failed. The sensor location, chassis location, previous state, and additional power supply status information are provided.

Event ID: 1351
Description:
  Power supply sensor value unknown
  Sensor location: <Location in chassis>
  Chassis location: <Name of chassis>
  Previous state was: <State>
  Power Supply type: <type of power supply>
  <Additional power supply status information>
  If in configuration error state:
  Configuration error type: <type of configuration error>
Severity: Information
Cause: A power supply sensor in the specified system could not obtain a reading. The sensor location, chassis location, previous state, and additional power supply status information are provided.

Event ID: 1352
Description:
  Power supply returned to normal
  Sensor location: <Location in chassis>
  Chassis location: <Name of chassis>
  Previous state was: <State>
  Power Supply type: <type of power supply>
  <Additional power supply status information>
  If in configuration error state:
  Configuration error type: <type of configuration error>
Severity: Information
Cause: A power supply has been reconnected or replaced. The sensor location, chassis location, previous state, and additional power supply status information are provided.

Event ID: 1353
Description:
  Power supply detected a warning
  Sensor location: <Location in chassis>
  Chassis location: <Name of chassis>
  Previous state was: <State>
  Power Supply type: <type of power supply>
  <Additional power supply status information>
  If in configuration error state:
  Configuration error type: <type of configuration error>
Severity: Warning
Cause: A power supply sensor reading in the specified system exceeded a user-definable warning threshold. The sensor location, chassis location, previous state, and additional power supply status information are provided.

Event ID: 1354
Description:
  Power supply detected a failure
  Sensor location: <Location in chassis>
  Chassis location: <Name of chassis>
  Previous state was: <State>
  Power Supply type: <type of power supply>
  <Additional power supply status information>
  If in configuration error state:
  Configuration error type: <type of configuration error>
Severity: Error
Cause: A power supply has been disconnected or has failed. The sensor location, chassis location, previous state, and additional power supply status information are provided.

Event ID: 1355
Description:
  Power supply sensor detected a non-recoverable value
  Sensor location: <Location in chassis>
  Chassis location: <Name of chassis>
  Previous state was: <State>
  Power Supply type: <type of power supply>
  <Additional power supply status information>
  If in configuration error state:
  Configuration error type: <type of configuration error>
Severity: Error
Cause: A power supply sensor in the specified system detected an error from which it cannot recover. The sensor location, chassis location, previous state, and additional power supply status information are provided.
Memory Device Messages
Memory device messages listed in Table 2-9 provide status and warning information for memory
modules present in a particular system. Memory devices determine health status by monitoring the ECC
memory correction rate and the type of memory events that have occurred.
NOTE: A critical status does not always indicate a system failure or loss of data. In some instances, the system has
exceeded the ECC correction rate. Although the system continues to function, you should perform system
maintenance as described in Table 2-9.
NOTE: In Table 2-9, <status> can be either critical or non-critical.
Table 2-9. Memory Device Messages

Event ID: 1403
Description:
  Memory device status is <status>
  Memory device location: <location in chassis>
  Possible memory module event cause: <list of causes>
Severity: Warning
Cause: A memory device correction rate exceeded an acceptable value. The memory device status and location are provided.

Event ID: 1404
Description:
  Memory device status is <status>
  Memory device location: <location in chassis>
  Possible memory module event cause: <list of causes>
Severity: Error
Cause: A memory device correction rate exceeded an acceptable value, a memory spare bank was activated, or a multibit ECC error occurred. The system continues to function normally (except for a multibit error). Replace the memory module identified in the message during the system’s next scheduled maintenance. Clear the memory error on multibit ECC error. The memory device status and location are provided.
Fan Enclosure Messages
Some systems are equipped with a protective enclosure for fans. Fan enclosure messages listed in
Table 2-10 monitor whether foreign objects are present in an enclosure and how long a fan enclosure is
missing from a chassis.
Table 2-10. Fan Enclosure Messages

Event ID: 1450
Description:
  Fan enclosure sensor has failed
  Sensor location: <Location in chassis>
  Chassis location: <Name of chassis>
Severity: Information
Cause: The fan enclosure sensor in the specified system failed. The sensor location and chassis location are provided.

Event ID: 1451
Description:
  Fan enclosure sensor value unknown
  Sensor location: <Location in chassis>
  Chassis location: <Name of chassis>
Severity: Information
Cause: The fan enclosure sensor in the specified system could not obtain a reading. The sensor location and chassis location are provided.

Event ID: 1452
Description:
  Fan enclosure inserted into system
  Sensor location: <Location in chassis>
  Chassis location: <Name of chassis>
Severity: Information
Cause: A fan enclosure has been inserted into the specified system. The sensor location and chassis location are provided.

Event ID: 1453
Description:
  Fan enclosure removed from system
  Sensor location: <Location in chassis>
  Chassis location: <Name of chassis>
Severity: Warning
Cause: A fan enclosure has been removed from the specified system. The sensor location and chassis location are provided.

Event ID: 1454
Description:
  Fan enclosure removed from system for an extended amount of time
  Sensor location: <Location in chassis>
  Chassis location: <Name of chassis>
Severity: Error
Cause: A fan enclosure has been removed from the specified system for a user-definable length of time. The sensor location and chassis location are provided.

Event ID: 1455
Description:
  Fan enclosure sensor detected a non-recoverable value
  Sensor location: <Location in chassis>
  Chassis location: <Name of chassis>
Severity: Error
Cause: A fan enclosure sensor in the specified system detected an error from which it cannot recover. The sensor location and chassis location are provided.
AC Power Cord Messages
AC power cord messages listed in Table 2-11 provide status and warning information for power cords that
are part of an AC power switch, if your system supports AC switching.
Table 2-11. AC Power Cord Messages

Event ID: 1500
Description:
  AC power cord sensor has failed
  Sensor location: <Location in chassis>
  Chassis location: <Name of chassis>
Severity: Information
Cause: An AC power cord sensor in the specified system failed. The AC power cord status cannot be monitored. The sensor location and chassis location information are provided.

Event ID: 1501
Description:
  AC power cord is not being monitored
  Sensor location: <Location in chassis>
  Chassis location: <Name of chassis>
Severity: Information
Cause: The AC power cord status is not being monitored. This occurs when a system’s expected AC power configuration is set to nonredundant. The sensor location and chassis location information are provided.

Event ID: 1502
Description:
  AC power has been restored
  Sensor location: <Location in chassis>
  Chassis location: <Name of chassis>
Severity: Information
Cause: An AC power cord that did not have AC power has had the power restored. The sensor location and chassis location information are provided.

Event ID: 1503
Description:
  AC power has been lost
  Sensor location: <Location in chassis>
  Chassis location: <Name of chassis>
Severity: Warning
Cause: An AC power cord has lost its power, but there is sufficient redundancy to classify this as a warning. The sensor location and chassis location information are provided.

Event ID: 1504
Description:
  AC power has been lost
  Sensor location: <Location in chassis>
  Chassis location: <Name of chassis>
Severity: Error
Cause: An AC power cord has lost its power, and lack of redundancy requires this to be classified as an error. The sensor location and chassis location information are provided.

Event ID: 1505
Description:
  AC power has been lost
  Sensor location: <Location in chassis>
  Chassis location: <Name of chassis>
Severity: Error
Cause: An AC power cord sensor in the specified system failed. The AC power cord status cannot be monitored. The sensor location and chassis location information are provided.
Hardware Log Sensor Messages
Hardware logs provide hardware status messages to systems management software. On certain systems,
the hardware log is implemented as a circular queue. When the log becomes full, the oldest status
messages are overwritten when new status messages are logged. On some systems, the log is not circular.
On these systems, when the log becomes full, subsequent hardware status messages are lost. Hardware
log sensor messages listed in Table 2-12 provide status and warning information about the noncircular
logs that may fill up, resulting in lost status messages.
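The difference between the two log behaviors described above can be illustrated with a short Python sketch. The class names and the capacity are hypothetical. This is a model of the behavior only, not of how any particular system stores its hardware log.

```python
from collections import deque


class CircularLog:
    """When the log is full, the oldest message is overwritten."""

    def __init__(self, capacity):
        # A deque with maxlen discards the oldest entry automatically.
        self.entries = deque(maxlen=capacity)

    def add(self, message):
        self.entries.append(message)


class NonCircularLog:
    """When the log is full, new messages are lost."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = []
        self.lost = 0  # count of messages dropped after the log filled up

    def add(self, message):
        if len(self.entries) >= self.capacity:
            self.lost += 1
            return False
        self.entries.append(message)
        return True


circular = CircularLog(3)
noncircular = NonCircularLog(3)
for message in ["a", "b", "c", "d"]:
    circular.add(message)
    noncircular.add(message)

print(list(circular.entries))
print(noncircular.entries, noncircular.lost)
```

With a capacity of three, the circular log ends up holding the three newest messages, while the noncircular log keeps the three oldest and records one lost message. The second case is the one the log sensor messages warn about.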
Table 2-12. Hardware Log Sensor Messages

Event ID: 1550
Description:
  Log monitoring has been disabled
  Log type: <Log type>
Severity: Information
Cause: A hardware log sensor in the specified system is disabled. The log type information is provided.

Event ID: 1551
Description:
  Log status is unknown
  Log type: <Log type>
Severity: Information
Cause: A hardware log sensor in the specified system could not obtain a reading. The log type information is provided.

Event ID: 1552
Description:
  Log size is no longer near or at capacity
  Log type: <Log type>
Severity: Information
Cause: The hardware log on the specified system is no longer near or at its capacity, usually as the result of clearing the log. The log type information is provided.

Event ID: 1553
Description:
  Log size is near or at capacity
  Log type: <Log type>
Severity: Warning
Cause: The size of a hardware log on the specified system is near or at the capacity of the hardware log. The log type information is provided.

Event ID: 1554
Description:
  Log size is full
  Log type: <Log type>
Severity: Error
Cause: The size of a hardware log on the specified system is full. The log type information is provided.

Event ID: 1555
Description:
  Log sensor has failed
  Log type: <Log type>
Severity: Error
Cause: A hardware log sensor in the specified system failed. The hardware log status cannot be monitored. The log type information is provided.
Processor Sensor Messages
Processor sensors monitor how well a processor is functioning. Processor messages listed in Table 2-13
provide status and warning information for processors in a particular chassis.
Table 2-13. Processor Sensor Messages

Event ID: 1600
Description:
  Processor sensor has failed
  Sensor Location: <Location in chassis>
  Chassis Location: <Name of chassis>
  Previous state was: <State>
  Processor sensor status: <status>
Severity: Information
Cause: A processor sensor in the specified system is not functioning. The sensor location, chassis location, previous state and processor sensor status are provided.

Event ID: 1601
Description:
  Processor sensor value unknown
  Sensor Location: <Location in chassis>
  Chassis Location: <Name of chassis>
  Previous state was: <State>
  Processor sensor status: <status>
Severity: Information
Cause: A processor sensor in the specified system could not obtain a reading. The sensor location, chassis location, previous state and processor sensor status are provided.

Event ID: 1602
Description:
  Processor sensor returned to a normal value
  Sensor Location: <Location in chassis>
  Chassis Location: <Name of chassis>
  Previous state was: <State>
  Processor sensor status: <status>
Severity: Information
Cause: A processor sensor in the specified system transitioned back to a normal state. The sensor location, chassis location, previous state and processor sensor status are provided.

Event ID: 1603
Description:
  Processor sensor detected a warning value
  Sensor Location: <Location in chassis>
  Chassis Location: <Name of chassis>
  Previous state was: <State>
  Processor sensor status: <status>
Severity: Warning
Cause: A processor sensor in the specified system is in a throttled state. The sensor location, chassis location, previous state and processor sensor status are provided.

Event ID: 1604
Description:
  Processor sensor detected a failure value
  Sensor Location: <Location in chassis>
  Chassis Location: <Name of chassis>
  Previous state was: <State>
  Processor sensor status: <status>
Severity: Error
Cause: A processor sensor in the specified system is disabled, has a configuration error, or experienced a thermal trip. The sensor location, chassis location, previous state and processor sensor status are provided.

Event ID: 1605
Description:
  Processor sensor detected a non-recoverable value
  Sensor Location: <Location in chassis>
  Chassis Location: <Name of chassis>
  Previous state was: <State>
  Processor sensor status: <status>
Severity: Error
Cause: A processor sensor in the specified system has failed. The sensor location, chassis location, previous state and processor sensor status are provided.
Pluggable Device Messages
The pluggable device messages listed in Table 2-14 provide status and error information when some
devices, such as memory cards, are added or removed.
Table 2-14. Pluggable Device Messages

Event ID: 1650
Description:
  <Device plug event type unknown>
  Device location: <Location in chassis, if available>
  Chassis location: <Name of chassis, if available>
  Additional details: <Additional details for the events, if available>
Severity: Information
Cause: A pluggable device event message of unknown type was received. The device location, chassis location, and additional event details, if available, are provided.

Event ID: 1651
Description:
  Device added to system
  Device location: <Location in chassis>
  Chassis location: <Name of chassis>
  Additional details: <Additional details for the events>
Severity: Information
Cause: A device was added in the specified system. The device location, chassis location, and additional event details, if available, are provided.

Event ID: 1652
Description:
  Device removed from system
  Device location: <Location in chassis>
  Chassis location: <Name of chassis>
  Additional details: <Additional details for the events>
Severity: Information
Cause: A device was removed from the specified system. The device location, chassis location, and additional event details, if available, are provided.

Event ID: 1653
Description:
  Device configuration error detected
  Device location: <Location in chassis>
  Chassis location: <Name of chassis>
  Additional details: <Additional details for the events>
Severity: Error
Cause: A configuration error was detected for a pluggable device in the specified system. The device may have been added to the system incorrectly.
Battery Sensor Messages
Battery sensors monitor how well a battery is functioning. Battery messages listed in Table 2-15 provide
status and warning information for batteries in a particular chassis.
Table 2-15. Battery Sensor Messages

Event ID: 1700
Description:
  Battery sensor has failed
  Sensor location: <Location in chassis>
  Chassis location: <Name of chassis>
  Previous state was: <State>
  Battery sensor status: <status>
Severity: Information
Cause: A battery sensor in the specified system is not functioning. The sensor location, chassis location, previous state, and battery sensor status are provided.

Event ID: 1701
Description:
  Battery sensor value unknown
  Sensor Location: <Location in chassis>
  Chassis Location: <Name of chassis>
  Previous state was: <State>
  Battery sensor status: <status>
Severity: Information
Cause: A battery sensor in the specified system could not retrieve a reading. The sensor location, chassis location, previous state, and battery sensor status are provided.

Event ID: 1702
Description:
  Battery sensor returned to a normal value
  Sensor Location: <Location in chassis>
  Chassis Location: <Name of chassis>
  Previous state was: <State>
  Battery sensor status: <status>
Severity: Information
Cause: A battery sensor in the specified system detected that a battery transitioned back to a normal state. The sensor location, chassis location, previous state, and battery sensor status are provided.

Event ID: 1703
Description:
  Battery sensor detected a warning value
  Sensor Location: <Location in chassis>
  Chassis Location: <Name of chassis>
  Previous state was: <State>
  Battery sensor status: <status>
Severity: Warning
Cause: A battery sensor in the specified system detected that a battery is in a predictive failure state. The sensor location, chassis location, previous state, and battery sensor status are provided.

Event ID: 1704
Description:
  Battery sensor detected a failure value
  Sensor Location: <Location in chassis>
  Chassis Location: <Name of chassis>
  Previous state was: <State>
  Battery sensor status: <status>
Severity: Error
Cause: A battery sensor in the specified system detected that a battery has failed. The sensor location, chassis location, previous state, and battery sensor status are provided.

Event ID: 1705
Description:
  Battery sensor detected a non-recoverable value
  Sensor Location: <Location in chassis>
  Chassis Location: <Name of chassis>
  Previous state was: <State>
  Battery sensor status: <status>
Severity: Error
Cause: A battery sensor in the specified system detected that a battery has failed. The sensor location, chassis location, previous state, and battery sensor status are provided.
System Event Log Messages for IPMI Systems
The following tables list the system event log (SEL) messages, their severity, and cause.
NOTE: For corrective actions, see the appropriate documentation.
Temperature Sensor Events
The temperature sensor event messages help protect critical components by alerting the systems
management console when the temperature rises inside the chassis. These event messages use
additional variables, such as sensor location, chassis location, previous state, and temperature sensor
value or state.
Table 3-1. Temperature Sensor Events

Event Message: <Sensor Name/Location> temperature sensor detected a failure <Reading> where <Sensor Name/Location> is the entity that this sensor is monitoring. For example, "PROC Temp" or "Planar Temp." Reading is specified in degree Celsius. For example 100 C.
Severity: Critical
Cause: Temperature of the backplane board, system board, or the carrier in the specified system <Sensor Name/Location> exceeded the critical threshold.

Event Message: <Sensor Name/Location> temperature sensor detected a warning <Reading>.
Severity: Warning
Cause: Temperature of the backplane board, system board, or the carrier in the specified system <Sensor Name/Location> exceeded the non-critical threshold.

Event Message: <Sensor Name/Location> temperature sensor returned to warning state <Reading>.
Severity: Warning
Cause: Temperature of the backplane board, system board, or the carrier in the specified system <Sensor Name/Location> returned from critical state to non-critical state.

Event Message: <Sensor Name/Location> temperature sensor returned to normal state <Reading>.
Severity: Information
Cause: Temperature of the backplane board, system board, or the carrier in the specified system <Sensor Name/Location> returned to normal operating range.
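The relationship between thresholds, state changes, and the four events in Table 3-1 can be sketched in Python as follows. This is an illustration under stated assumptions: the function names, the state names, and the threshold values are hypothetical, and the sketch treats a reading strictly above a threshold as exceeding it. Actual thresholds are platform specific.

```python
def temperature_state(reading_c, non_critical_c, critical_c):
    """Map a reading in degrees Celsius to a state, given two thresholds."""
    if reading_c > critical_c:
        return "critical"
    if reading_c > non_critical_c:
        return "non-critical"
    return "normal"


def describe_transition(previous, current):
    """Return (message fragment, severity) for a state change, or None."""
    if current == previous:
        return None
    if current == "critical":
        return ("detected a failure", "Critical")
    if current == "non-critical":
        if previous == "critical":
            return ("returned to warning state", "Warning")
        return ("detected a warning", "Warning")
    return ("returned to normal state", "Information")


# Hypothetical thresholds: 70 C non-critical, 85 C critical.
before = temperature_state(90, 70, 85)
after = temperature_state(75, 70, 85)
print(describe_transition(before, after))
```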
Voltage Sensor Events
The voltage sensor event messages monitor the number of volts across critical components. These
messages provide status and warning information for voltage sensors for a particular chassis.
Table 3-2. Voltage Sensor Events

Event Message: <Sensor Name/Location> voltage sensor detected a failure <Reading> where <Sensor Name/Location> is the entity that this sensor is monitoring. Reading is specified in volts. For example, 3.860 V.
Severity: Critical
Cause: The voltage of the monitored device has exceeded the critical threshold.

Event Message: <Sensor Name/Location> voltage sensor state asserted.
Severity: Critical
Cause: The voltage specified by <Sensor Name/Location> is in critical state.

Event Message: <Sensor Name/Location> voltage sensor state de-asserted.
Severity: Information
Cause: The voltage of a previously reported <Sensor Name/Location> is returned to normal state.

Event Message: <Sensor Name/Location> voltage sensor detected a warning <Reading>.
Severity: Warning
Cause: Voltage of the monitored entity <Sensor Name/Location> exceeded the warning threshold.

Event Message: <Sensor Name/Location> voltage sensor returned to normal <Reading>.
Severity: Information
Cause: The voltage of a previously reported <Sensor Name/Location> is returned to normal state.
Fan Sensor Events
The cooling device sensors monitor how well a fan is functioning. These messages provide status, warning,
and failure information for fans in a particular chassis.
Table 3-3. Fan Sensor Events

Event Message: <Sensor Name/Location> Fan sensor detected a failure <Reading> where <Sensor Name/Location> is the entity that this sensor is monitoring. For example "BMC Back Fan" or "BMC Front Fan." Reading is specified in RPM. For example, 100 RPM.
Severity: Critical
Cause: The speed of the specified <Sensor Name/Location> fan is not sufficient to provide enough cooling to the system.

Event Message: <Sensor Name/Location> Fan sensor returned to normal state <Reading>.
Severity: Information
Cause: The fan specified by <Sensor Name/Location> has returned to its normal operating speed.

Event Message: <Sensor Name/Location> Fan sensor detected a warning <Reading>.
Severity: Warning
Cause: The speed of the specified <Sensor Name/Location> fan may not be sufficient to provide enough cooling to the system.

Event Message: <Sensor Name/Location> Fan Redundancy sensor redundancy degraded.
Severity: Information
Cause: The fan specified by <Sensor Name/Location> may have failed and hence, the redundancy has been degraded.

Event Message: <Sensor Name/Location> Fan Redundancy sensor redundancy lost.
Severity: Critical
Cause: The fan specified by <Sensor Name/Location> may have failed and hence, the redundancy that was degraded previously has been lost.

Event Message: <Sensor Name/Location> Fan Redundancy sensor redundancy regained
Severity: Information
Cause: The fan specified by <Sensor Name/Location> may have started functioning again and hence, the redundancy has been regained.
Processor Status Events
The processor status messages monitor the functionality of the processors in a system. These messages
provide processor health and warning information for a system.
Table 3-4. Processor Status Events

Event Message: <Processor Entity> status processor sensor IERR, where <Processor Entity> is the processor that generated the event. For example, PROC for a single processor system and PROC # for multiprocessor system.
Severity: Critical
Cause: IERR internal error generated by the <Processor Entity>.

Event Message: <Processor Entity> status processor sensor Thermal Trip.
Severity: Critical
Cause: The processor generates this event before it shuts down because of excessive heat caused by lack of cooling or heat synchronization.

Event Message: <Processor Entity> status processor sensor recovered from IERR.
Severity: Information
Cause: This event is generated when a processor recovers from the internal error.

Event Message: <Processor Entity> status processor sensor disabled.
Severity: Warning
Cause: This event is generated for all processors that are disabled.

Event Message: <Processor Entity> status processor sensor terminator not present.
Severity: Information
Cause: This event is generated if the terminator is missing on an empty processor slot.

Event Message: <Processor Entity> presence was deasserted.
Severity: Critical
Cause: This event is generated when the system could not detect the processor.

Event Message: <Processor Entity> presence was asserted.
Severity: Information
Cause: This event is generated when the earlier processor detection error was corrected.

Event Message: <Processor Entity> thermal tripped was deasserted.
Severity: Information
Cause: This event is generated when the processor has recovered from an earlier thermal condition.

Event Message: <Processor Entity> configuration error was asserted.
Severity: Critical
Cause: This event is generated when the processor configuration is incorrect.

Event Message: <Processor Entity> configuration error was deasserted.
Severity: Information
Cause: This event is generated when the earlier processor configuration error was corrected.

Event Message: <Processor Entity> throttled was asserted.
Severity: Warning
Cause: This event is generated when the processor slows down to prevent overheating.

Event Message: <Processor Entity> throttled was deasserted.
Severity: Information
Cause: This event is generated when the earlier processor throttled event was corrected.
Power Supply Events
The power supply sensors monitor the functionality of the power supplies. These messages provide status
and warning information for power supplies for a particular system.
Table 3-5. Power Supply Events

Event Message: <Power Supply Sensor Name> power supply sensor removed.
Severity: Critical
Cause: This event is generated when the power supply sensor is removed.

Event Message: <Power Supply Sensor Name> power supply sensor AC recovered.
Severity: Information
Cause: This event is generated when the power supply has been replaced.

Event Message: <Power Supply Sensor Name> power supply sensor returned to normal state.
Severity: Information
Cause: This event is generated when the power supply that failed or was removed has been replaced and the state has returned to normal.

Event Message: <Entity Name> PS Redundancy sensor redundancy degraded.
Severity: Information
Cause: Power supply redundancy is degraded if one of the power supply sources is removed or has failed.

Event Message: <Entity Name> PS Redundancy sensor redundancy lost.
Severity: Critical
Cause: Power supply redundancy is lost if only one power supply is functional.

Event Message: <Entity Name> PS Redundancy sensor redundancy regained.
Severity: Information
Cause: This event is generated if the power supply has been reconnected or replaced.

Event Message: <Power Supply Sensor Name> predictive failure was asserted
Severity: Warning
Cause: This event is generated when the power supply is about to fail.

Event Message: <Power Supply Sensor Name> input lost was asserted
Severity: Critical
Cause: This event is generated when the power supply is unplugged.

Event Message: <Power Supply Sensor Name> predictive failure was deasserted
Severity: Information
Cause: This event is generated when the power supply has recovered from an earlier predictive failure event.

Event Message: <Power Supply Sensor Name> input lost was deasserted
Severity: Information
Cause: This event is generated when the power supply is plugged in.
Memory ECC Events
The memory ECC event messages monitor the memory modules in a system. These messages monitor
the ECC memory correction rate and the type of memory events that occurred.
Table 3-6. Memory ECC Events

Event Message: ECC error correction detected on Bank # DIMM [A/B].
Severity: Information
Cause: This event is generated when there is a memory error correction on a particular Dual Inline Memory Module (DIMM).

Event Message: ECC uncorrectable error detected on Bank # [DIMM].
Severity: Critical
Cause: This event is generated when the chipset is unable to correct the memory errors. Usually, a bank number is provided and DIMM may or may not be identifiable, depending on the error.

Event Message: Correctable memory error logging disabled.
Severity: Critical
Cause: This event is generated when the chipset in the ECC error correction rate exceeds a predefined limit.
BMC Watchdog Events
The BMC watchdog operations are performed when the system hangs or crashes. These messages
monitor the status and occurrence of these events in a system.
Table 3-7. BMC Watchdog Events
Event Message | Severity | Cause
BMC OS Watchdog timer expired. | Information | This event is generated when the BMC watchdog timer expires and no action is set.
BMC OS Watchdog performed system reboot. | Critical | This event is generated when the BMC watchdog detects that the system has crashed (timer expired because no response was received from Host) and the action is set to reboot.
BMC OS Watchdog performed system power off. | Critical | This event is generated when the BMC watchdog detects that the system has crashed (timer expired because no response was received from Host) and the action is set to power off.
BMC OS Watchdog performed system power cycle. | Critical | This event is generated when the BMC watchdog detects that the system has crashed (timer expired because no response was received from Host) and the action is set to power cycle.
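The relationship between the configured watchdog action and the logged message can be written as a small lookup. The following Python sketch only restates the rows of Table 3-7. The action labels ("none", "reboot", and so on) and the function name watchdog_event are illustrative assumptions and are not part of Server Administrator or the BMC firmware.

```python
# Rows of Table 3-7, keyed by the configured watchdog action.
# The action keys are illustrative labels, not values defined by the BMC.
WATCHDOG_EVENTS = {
    "none": ("BMC OS Watchdog timer expired.", "Information"),
    "reboot": ("BMC OS Watchdog performed system reboot.", "Critical"),
    "power off": ("BMC OS Watchdog performed system power off.", "Critical"),
    "power cycle": ("BMC OS Watchdog performed system power cycle.", "Critical"),
}


def watchdog_event(action):
    """Return the (message, severity) pair logged when the timer expires."""
    try:
        return WATCHDOG_EVENTS[action]
    except KeyError:
        raise ValueError(f"unknown watchdog action: {action!r}") from None
```

Only the case with no configured action is logged as Information. Every case in which the BMC acted on the system is logged as Critical.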
Memory Events
The memory modules can be configured in different ways in particular systems. These messages monitor
the status, warning, and configuration information about the memory modules in the system.
Table 3-8. Memory Events
Event Message | Severity | Cause
Memory RAID redundancy degraded. | Information | This event is generated when there is a memory failure in a RAID-configured memory configuration.
Memory RAID redundancy lost. | Critical | This event is generated when redundancy is lost in a RAID-configured memory configuration.
Memory RAID redundancy regained | Information | This event is generated when the redundancy lost or degraded earlier is regained in a RAID-configured memory configuration.
Memory Mirrored redundancy degraded. | Information | This event is generated when there is a memory failure in a mirrored memory configuration.
Memory Mirrored redundancy lost. | Critical | This event is generated when redundancy is lost in a mirrored memory configuration.
Memory Mirrored redundancy regained. | Information | This event is generated when the redundancy lost or degraded earlier is regained in a mirrored memory configuration.
Memory Spared redundancy degraded. | Information | This event is generated when there is a memory failure in a spared memory configuration.
Memory Spared redundancy lost. | Critical | This event is generated when redundancy is lost in a spared memory configuration.
Memory Spared redundancy regained. | Information | This event is generated when the redundancy lost or degraded earlier is regained in a spared memory configuration.
Hardware Log Sensor Events
The hardware logs provide hardware status messages to the system management software. On particular
systems, the subsequent hardware messages are not displayed when the log is full. These messages
provide status and warning messages when the logs are full.
Table 3-9. Hardware Log Sensor Events
Event Message | Severity | Cause
Log full detected. | Critical | This event is generated when the SEL device detects that only one entry can be added to the SEL before it is full.
Log cleared. | Information | This event is generated when the SEL is cleared.
Drive Events
The drive event messages monitor the health of the drives in a system. These events are generated when
there is a fault in the drives indicated.
Table 3-10. Drive Events
Event Message | Severity | Cause
Drive <Drive #> asserted fault state. | Critical | This event is generated when the specified drive in the array is faulty.
Drive <Drive #> de-asserted fault state. | Information | This event is generated when the specified drive recovers from a faulty condition.
Drive <Drive #> drive presence was asserted | Informational | This event is generated when the drive is installed.
Drive <Drive #> predictive failure was asserted | Warning | This event is generated when the drive is about to fail.
Drive <Drive #> predictive failure was deasserted | Informational | This event is generated when the earlier predictive failure of the drive is corrected.
Drive <Drive #> hot spare was asserted | Warning | This event is generated when the drive is placed in a hot spare.
Drive <Drive #> hot spare was deasserted | Informational | This event is generated when the drive is taken out of hot spare.
Drive <Drive #> consistency check in progress was asserted | Warning | This event is generated when the drive is placed in consistency check.
Drive <Drive #> consistency check in progress was deasserted | Informational | This event is generated when the consistency check of the drive is completed.
Drive <Drive #> in critical array was asserted | Critical | This event is generated when the drive is placed in critical array.
Drive <Drive #> in critical array was deasserted | Informational | This event is generated when the drive is removed from critical array.
Drive <Drive #> in failed array was asserted | Critical | This event is generated when the drive is placed in the fail array.
Drive <Drive #> in failed array was deasserted | Informational | This event is generated when the drive is removed from the fail array.
Drive <Drive #> rebuild in progress was asserted | Informational | This event is generated when the drive is rebuilding.
Drive <Drive #> rebuild aborted was asserted | Warning | This event is generated when the drive rebuilding process is aborted.
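Most drive messages in Table 3-10 share one pattern: a drive number, a condition, and whether that condition was asserted or deasserted. The following Python sketch splits a single log line along those lines. The function name parse_drive_event and the returned tuple layout are illustrative assumptions. The sketch also assumes that the drive number appears as a plain integer, and it does not cover the two "fault state" messages, which use a different wording.

```python
import re

# Matches lines such as "Drive 3 predictive failure was asserted".
# Assumption: <Drive #> is substituted with a bare integer.
_DRIVE_EVENT = re.compile(
    r"^Drive\s+(?P<drive>\d+)\s+(?P<condition>.+?)\s+was\s+"
    r"(?P<state>asserted|deasserted)\.?$"
)


def parse_drive_event(message):
    """Return (drive_number, condition, asserted), or None if the line does not match."""
    match = _DRIVE_EVENT.match(message.strip())
    if match is None:
        return None
    return (
        int(match.group("drive")),
        match.group("condition"),
        match.group("state") == "asserted",
    )
```

A deasserted event normally clears the condition raised by the matching asserted event, which is why Table 3-10 lists the deasserted messages with an informational severity.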
Intrusion Events
The chassis intrusion messages are a security measure. Chassis intrusion alerts are generated when the
system's chassis is opened. Alerts are sent to prevent unauthorized removal of parts from the chassis.
Table 3-11. Intrusion Events
Event Message | Severity | Cause
<Intrusion sensor Name> sensor detected an intrusion. | Critical | This event is generated when the intrusion sensor detects an intrusion.
<Intrusion sensor Name> sensor returned to normal state. | Information | This event is generated when the earlier intrusion has been corrected.
<Intrusion sensor Name> sensor intrusion was asserted while system was ON | Critical | This event is generated when the intrusion sensor detects an intrusion while the system is on.
<Intrusion sensor Name> sensor intrusion was asserted while system was OFF | Critical | This event is generated when the intrusion sensor detects an intrusion while the system is off.
BIOS Generated System Events
The BIOS generated messages monitor the health and functionality of the chipsets, I/O channels, and
other BIOS-related functions. These system events are generated by the BIOS.
Table 3-12. BIOS Generated System Events
Event Message | Severity | Cause
System Event I/O channel chk. | Critical | This event is generated when a critical interrupt is generated in the I/O Channel.
System Event PCI Parity Err. | Critical | This event is generated when a parity error is detected on the PCI bus.
System Event Chipset Err. | Critical | This event is generated when a chip error is detected.
System Event PCI System Err. | Information | This event indicates historical data, and is generated when the system has crashed and recovered.
System Event PCI Fatal Err. | Critical | This error is generated when a fatal error is detected on the PCI bus.
System Event PCIE Fatal Err. | Critical | This error is generated when a fatal error is detected on the PCIE bus.
POST Err POST fatal error #<number> | Critical | This event is generated when an error occurs during system boot. See the system documentation for more information on the error code.
Memory Spared redundancy lost | Critical | This event is generated when memory spare is no longer redundant.
Memory Mirrored redundancy lost | Critical | This event is generated when memory mirroring is no longer redundant.
Memory RAID redundancy lost | Critical | This event is generated when memory RAID is no longer redundant.
Err Reg Pointer OEM Diagnostic data event was asserted | Information | This event is generated when an OEM event occurs.
System Board PFault Fail Safe state asserted | Critical | This event is generated when the system board voltages are not at normal levels.
System Board PFault Fail Safe state deasserted | Information | This event is generated when the earlier PFault Fail Safe system voltages return to a normal level.
Memory Add (BANK# DIMM#) presence was asserted | Information | This event is generated when memory is added to the system.
Memory Removed (BANK# DIMM#) presence was asserted | Information | This event is generated when memory is removed from the system.
Memory Cfg Err configuration error (BANK# DIMM#) was asserted | Critical | This event is generated when memory configuration is incorrect for the system.
Mem Redun Gain redundancy regained | Information | This event is generated when memory redundancy is regained.
Mem ECC Warning transition to non-critical from OK | Warning | This event is generated when correctable ECC errors have increased from a normal rate.
Mem ECC Warning transition to critical from less severe | Critical | This event is generated when correctable ECC errors reach a critical rate.
Mem CRC Err transition to non-recoverable | Critical | This event is generated when CRC errors enter a non-recoverable state.
Mem Fatal SB CRC uncorrectable ECC was asserted | Critical | This event is generated when CRC errors occur while storing to memory.
Mem Fatal NB CRC uncorrectable ECC was asserted | Critical | This event is generated when CRC errors occur while removing from memory.
Mem Overtemp critical over temperature was asserted | Critical | This event is generated when system memory reaches critical temperature.
USB Over-current transition to non-recoverable | Critical | This event is generated when the USB exceeds a predefined current level.
Hdwr version err hardware incompatibility (BMC Firmware and CPU mismatch) was asserted | Critical | This event is generated when there is a mismatch between the BMC firmware and the processor in use or vice versa.
Hdwr version err hardware incompatibility (BMC Firmware and CPU mismatch) was deasserted | Information | This event is generated when the earlier mismatch between the BMC firmware and the processor is corrected.
Hdwr version err hardware incompatibility (BMC Firmware and other mismatch) was asserted | Critical | This event is generated when there is a mismatch between the BMC firmware and the processor in use or vice versa.
Hdwr version err hardware incompatibility (BMC Firmware and CPU mismatch) was deasserted | Information | This event is generated when an earlier hardware mismatch is corrected.
SBE Log Disabled correctable memory error logging disabled was asserted | Critical | This event is generated when the ECC single bit error rate is exceeded.
CPU Protocol Err transition to non-recoverable | Critical | This event is generated when the processor protocol enters a non-recoverable state.
CPU Bus PERR transition to non-recoverable | Critical | This event is generated when the processor bus PERR enters a non-recoverable state.
CPU Init Err transition to non-recoverable | Critical | This event is generated when the processor initialization enters a non-recoverable state.
CPU Machine Chk transition to non-recoverable | Critical | This event is generated when the processor machine check enters a non-recoverable state.
Logging Disabled all event logging disabled was asserted | Critical | This event is generated when all event logging is disabled.
Unknown system event sensor unknown system hardware failure was asserted | Critical | This event is generated when an unknown hardware failure is detected.
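Several messages in Table 3-12 end in a "transition to ..." phrase, and the table pairs each phrase with a fixed severity. The following Python sketch captures only that pairing. The dictionary and function names are illustrative assumptions, and messages that do not end in one of the three listed phrases are reported as unknown rather than guessed.

```python
# Severity implied by the trailing "transition to ..." phrase in Table 3-12.
TRANSITION_SEVERITY = {
    "transition to non-critical from OK": "Warning",
    "transition to critical from less severe": "Critical",
    "transition to non-recoverable": "Critical",
}


def transition_severity(message):
    """Return the severity for a message ending in a known transition, else None."""
    # Collapse wrapped lines and repeated spaces, and ignore a trailing period.
    text = " ".join(message.split()).rstrip(".")
    for suffix, severity in TRANSITION_SEVERITY.items():
        if text.endswith(suffix):
            return severity
    return None
```

For example, transition_severity("Mem ECC Warning transition to non-critical from OK") returns "Warning", which matches the row for a rising correctable ECC error rate.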
R2 Generated System Events
Table 3-13. R2 Generated Events
Description | Severity | Cause
System Event: OS stop event OS graceful shutdown detected | Information | The OS was shut down or restarted normally.
OEM Event data record (after OS graceful shutdown/restart event) | Information | Comment string accompanying an OS shutdown/restart.
System Event: OS stop event runtime critical stop | Critical | The OS encountered a critical error and was stopped abnormally.
OEM Event data record (after OS bugcheck event) | Information | OS bugcheck code and parameters.
Cable Interconnect Events
The cable interconnect messages are used for detecting errors in the hardware cabling.
Table 3-14. Cable Interconnect Events
Description | Severity | Cause
<Cable sensor Name/Location> Configuration error was asserted. | Critical | This event is generated when the cable is not connected or is incorrectly connected.
<Cable sensor Name/Location> Connection was asserted. | Information | This event is generated when the earlier cable connection error was corrected.
Battery Events
Table 3-15. Battery Events
Description | Severity | Cause
<Battery sensor Name/Location> Failed was asserted | Critical | This event is generated when the sensor detects a failed or missing battery.
<Battery sensor Name/Location> Failed was deasserted | Information | This event is generated when the earlier failed battery was corrected.
<Battery sensor Name/Location> is low was asserted | Warning | This event is generated when the sensor detects a low battery condition.
<Battery sensor Name/Location> is low was deasserted | Information | This event is generated when the earlier low battery condition was corrected.
Entity Presence Events
The entity presence messages are used for detecting different hardware devices.
Table 3-16. Entity Presence Events
Description | Severity | Cause
<Device Name> presence was asserted | Information | This event is generated when the device was detected.
<Device Name> absent was asserted | Critical | This event is generated when the device was not detected.
Storage Management Message Reference
The Dell OpenManage™ Server Administrator Storage Management’s alert or event management
features let you monitor the health of storage resources such as controllers, enclosures, physical
disks, and virtual disks.
Alert Monitoring and Logging
The Storage Management Service performs alert monitoring and logging. By default, the Storage
Management Service starts when the managed system starts up. If you stop the Storage
Management Service, the alert monitoring and logging stops. Alert monitoring does the following:
•Updates the status of the storage object that generated the alert.
•Propagates the storage object’s status to all the related higher objects in the storage hierarchy. For
example, the status of a lower-level object will be propagated up to the status displayed on the
Health tab for the top-level storage object.
•Logs an alert in the Alert log and the operating system (OS) application log.
•Sends an SNMP trap if the operating system’s SNMP service is installed and enabled.
NOTE: Dell OpenManage Server Administrator Storage Management does not log alerts regarding the data
I/O path. These alerts are logged by the respective RAID drivers in the system alert log.
See the Storage Management Online Help and the Dell OpenManage Server Administrator Storage
Management User’s Guide for updated information.
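The status propagation described above can be pictured as a roll-up over the storage hierarchy. The following Python sketch assumes that the most severe status found at or below an object is the one shown for it. This guide states only that status is propagated upward, so the worst-status rule, the StorageObject class, and the three-level severity ordering are assumptions made for illustration.

```python
from dataclasses import dataclass, field

# Ordered from least to most severe (assumed ordering for this sketch).
SEVERITY_ORDER = ["Ok", "Warning", "Critical"]


@dataclass
class StorageObject:
    """Hypothetical node in the storage hierarchy (controller, enclosure, disk)."""
    name: str
    status: str = "Ok"
    children: list = field(default_factory=list)

    def rolled_up_status(self):
        """Return the worst status among this object and everything below it."""
        statuses = [self.status]
        statuses.extend(child.rolled_up_status() for child in self.children)
        return max(statuses, key=SEVERITY_ORDER.index)


# A failed physical disk surfaces at the top-level storage object.
disk = StorageObject("Physical Disk 0:0:14", status="Critical")
enclosure = StorageObject("Enclosure 0:0", children=[disk])
controller = StorageObject("Controller 1", children=[enclosure])
storage = StorageObject("Storage", children=[controller])
```

With this tree, storage.rolled_up_status() returns "Critical" even though only the physical disk itself has failed.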
Alert Message Format with Substitution Variables
When you view an alert in the Server Administrator alert log, the alert identifies the specific
components such as the controller name or the virtual disk name to which the alert applies. In an
actual operating environment, a storage system can have many combinations of controllers and disks
as well as user-defined names for virtual disks and other components. Because each environment is
unique in its storage configuration and user-defined names, an accurate alert message requires that
the Storage Management Service be able to insert the environment-specific names of storage
components into an alert message.
This environment-specific information is inserted after the alert message text as shown for
alert 2127 in Table 4-1.
For other alerts, the alert message text is constructed from information passed directly from the
controller (or another storage component) to the Alert Log. In these cases, the variable information is
represented with a % (percent sign) in the Storage Management documentation. An example of such an
alert is shown for alert 2334 in Table 4-1.
Table 4-1. Alert Message Format
Alert ID | Message Text Displayed in the Storage Management Service Documentation | Message Text Displayed in the Alert Log with Variable Information Supplied
2127 | Background Initialization started | Background Initialization started: Virtual Disk 3 (Virtual Disk 3) Controller 1 (PERC 5/E Adapter)
2334 | Controller event log % | Controller event log: Current capacity of the battery is above threshold.: Controller 1 (PERC 5/E Adapter)
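The two rows of Table 4-1 can be reproduced with a short Python sketch. It assumes that the storage object identification is appended after a colon, and that text supplied by the controller replaces the % placeholder in the same way. The function name format_alert and its parameters are hypothetical. The sketch illustrates the documented message layout and does not show how the Storage Management Service builds messages internally.

```python
def format_alert(message_text, storage_object, detail=None):
    """Build Alert Log text from the documented text and the object it applies to.

    detail is text passed from the controller; it stands in for the "%"
    placeholder used in the documentation.
    """
    if detail is not None:
        # Assumption: the controller text follows a colon where "%" appears.
        message_text = message_text.replace(" %", f": {detail}", 1)
    return f"{message_text}: {storage_object}"


alert_2127 = format_alert(
    "Background Initialization started",
    "Virtual Disk 3 (Virtual Disk 3) Controller 1 (PERC 5/E Adapter)",
)
alert_2334 = format_alert(
    "Controller event log %",
    "Controller 1 (PERC 5/E Adapter)",
    detail="Current capacity of the battery is above threshold.",
)
```

Both results match the Alert Log column of Table 4-1, including the ".:" sequence in alert 2334 where the controller text ends with a period.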
The variables required to complete the message vary depending on the type of storage object and
whether the storage object is in a SCSI or SAS configuration. The following table identifies the possible
variables used to identify each storage object.
NOTE: Some alert messages relating to an enclosure or an enclosure component, such as a fan or EMM, are
generated by the controller when the enclosure or enclosure component ID cannot be determined.
Table 4-2. Message Format with Variables for Each Storage Object
Storage Object and Message Variables
A, B, C and X, Y, Z in the following examples are variables representing the storage object
name or number.
Controller
    Message Format: Controller A (Name)
    Message Format: Controller A
    Example: 2326 A foreign configuration has been detected.: Controller 1 (PERC 5/E Adapter)
    NOTE: The controller name is not always displayed.
Battery
    Message Format: Battery X Controller A
    Example: 2174 The controller battery has been removed: Battery 0 Controller 1
SCSI Physical Disk
    Message Format: Physical Disk X:Y Controller A, Connector B
    Example: 2049 Physical disk removed: Physical Disk 0:14 Controller 1, Connector 0
SAS Physical Disk
    Message Format: Physical Disk X:Y:Z Controller A, Connector B
    Example: 2049 Physical disk removed: Physical Disk 0:0:14 Controller 1, Connector 0
Virtual Disk
    Message Format: Virtual Disk X (Name) Controller A (Name)
    Message Format: Virtual Disk X Controller A
    Example: 2057 Virtual disk degraded: Virtual Disk 11 (Virtual Disk 11) Controller 1 (PERC 5/E Adapter)
    NOTE: The virtual disk and controller names are not always displayed.
Enclosure
    Message Format: Enclosure X:Y Controller A, Connector B
SCSI Power Supply
    Message Format: Power Supply X Controller A, Connector B, Target ID C
    where "C" is the SCSI ID number of the enclosure management module (EMM) managing the power supply.
    Example: 2122 Redundancy degraded: Power Supply 1, Controller 1, Connector 0, Target ID 6
SAS Power Supply
    Message Format: Power Supply X Controller A, Connector B, Enclosure C
    Example: 2312 A power supply in the enclosure has an AC failure.: Power Supply 1, Controller 1, Connector 0, Enclosure 2
SCSI Temperature Probe
    Message Format: Temperature Probe X Controller A, Connector B, Target ID C
    where "C" is the SCSI ID number of the EMM managing the temperature probe.
    Example: 2101 Temperature dropped below the minimum warning threshold: Temperature Probe 1, Controller 1, Connector 0, Target ID 6
SAS Temperature Probe
    Message Format: Temperature Probe X Controller A, Connector B, Enclosure C
    Example: 2101 Temperature dropped below the minimum warning threshold: Temperature Probe 1, Controller 1, Connector 0, Enclosure 2
SCSI Fan
    Message Format: Fan X Controller A, Connector B, Target ID C
    where "C" is the SCSI ID number of the EMM managing the fan.
    Example: 2121 Device returned to normal: Fan 1, Controller 1, Connector 0, Target ID 6
SAS Fan
    Message Format: Fan X Controller A, Connector B, Enclosure C
    Example: 2121 Device returned to normal: Fan 1, Controller 1, Connector 0, Enclosure 2
SCSI EMM
    Message Format: EMM X Controller A, Connector B, Target ID C
    where "C" is the SCSI ID number of the EMM.
    Example: 2121 Device returned to normal: EMM 1, Controller 1, Connector 0, Target ID 6
SAS EMM
    Message Format: EMM X Controller A, Connector B, Enclosure C
    Example: 2121 Device returned to normal: EMM 1, Controller 1, Connector 0, Enclosure 2
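The formats in Table 4-2 are regular enough to be split back into their parts, for example when filtering exported alerts. The following Python sketch assumes that each line has the shape used in the Example entries: a four-digit alert ID, the message text, and the storage object after the last colon followed by a space. The function names and the returned field names are hypothetical, and a user-defined name that itself contains a colon followed by a space would defeat this simple split.

```python
import re


def parse_alert(line):
    """Split an alert line into (alert_id, message_text, storage_object)."""
    match = re.match(r"^(\d{4})\s+(.*)$", line.strip())
    if match is None:
        raise ValueError(f"no alert ID found in: {line!r}")
    alert_id, rest = int(match.group(1)), match.group(2)
    # Disk IDs such as 0:0:14 contain colons but no colon-space,
    # so splitting on the last ": " leaves them intact.
    message_text, separator, storage_object = rest.rpartition(": ")
    if not separator:
        return alert_id, rest, ""
    return alert_id, message_text, storage_object


_PHYSICAL_DISK = re.compile(
    r"^Physical Disk (\d+(?::\d+){1,2}) Controller (\d+), Connector (\d+)$"
)


def parse_physical_disk(storage_object):
    """Return the disk ID, controller, and connector, or None if not a physical disk."""
    match = _PHYSICAL_DISK.match(storage_object)
    if match is None:
        return None
    disk_id = tuple(int(part) for part in match.group(1).split(":"))
    return {
        # X:Y is the SCSI form and X:Y:Z is the SAS form in Table 4-2.
        "bus": "SAS" if len(disk_id) == 3 else "SCSI",
        "disk_id": disk_id,
        "controller": int(match.group(2)),
        "connector": int(match.group(3)),
    }
```

Applied to the SAS example for alert 2049, parse_alert returns the object string "Physical Disk 0:0:14 Controller 1, Connector 0", which parse_physical_disk then reports as a SAS disk with ID (0, 0, 14).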
Alert Message Change History
The following table describes changes made to the Storage Management alerts from the previous release
of Storage Management to the current release.
Table 4-3. Alert Message Change History
Alert Message Change History
Storage Management 2.3
Product Versions to which Changes Apply: Storage Management 2.3, Server Administrator 3.2, Dell OpenManage™ 5.3
New Alerts: 2369
Modified Alerts:
    2095: Added SNMP traps 751 and 851.
    2294: Removed SNMP traps 752, 802, 852, 902, 952, 1002, 1052, 1102, 1152, and 1202. Added SNMP trap 851.
    2295: Removed SNMP traps 754, 804, 904, 954, 1004, 1054, 1104, 1154, and 1204. Remaining SNMP trap is 854.
Obsolete Alerts: 2317, 2363
Documentation Changes: Documentation updated to indicate related alerts and Local Response Agent (LRA) alerts.
    2095: Changed documentation for cause.
    2305: Changed documentation for cause and corrective action. Changed SNMP trap number to 903. This change was only made in the Dell OpenManage Server Administrator Messages Reference Guide to reflect existing Storage Management online help.
    2312: Changed documentation for corrective action in the Storage Management online help. The Dell OpenManage Server Administrator Messages Reference Guide already has updated corrective action.
    2367: Changed documentation for cause and corrective action.
Storage Management 2.2
Product Versions to which Changes Apply: Storage Management 2.2, Server Administrator 3.2, Dell OpenManage™ 5.2
Reduction of unnecessary alert generation: Enhancements to Storage Management avoid numerous redundant or inappropriate alerts posted to the Alert Log after an unexpected system shutdown. In previous versions of Storage Management, an unexpected system shutdown may have caused the controller to repost a large number of alerts to the Alert Log when restarting the system.
Modified Alerts:
    2095: Severity changed to Informational. SNMP trap changed to 901.
    2153: Severity changed to Informational. SNMP trap changed to 851.
    2188: Severity changed to Informational. SNMP trap changed to 1151.
    2192: Changed documentation for cause and corrective action.
    2202: Severity changed to Informational. SNMP trap changed to 901.
    2204: Severity changed to Informational. SNMP trap changed to 901.
    2205: Severity changed to Informational. SNMP trap changed to 901.
    2266: SNMP traps changed to 751, 801, 851, 901, 951, 1001, 1051, 1101, 1151, 1201.
    2272: Severity changed to Critical. SNMP trap changed to 904. Changed corrective action information in the documentation.
    2273: Changed alert message text and documentation for cause and corrective action.
    2279: Changed alert message text.
    2299: Changed corrective action information in the documentation.
    2305: Changed severity to Warning. Changed SNMP trap number to 903.
    2331: Changed severity to Informational. Changed SNMP trap number to 901.
    2367: Changed severity to Warning. Changed SNMP trap number to 903.
Obsolete Alerts:
    2333
    2354: 2354 replaced by 2368.
    2355
    2365
    2370
Documentation Changes:
    Severity for alert 2163 changed from Ok/Normal to Critical/Failure/Error. Documentation change only made in the Dell OpenManage Server Administrator Messages Reference Guide to reflect the severity displayed in the Server Administrator Alert Log and documented in the Storage Management online help.
    Severity for alert 2318 changed from Critical/Failure/Error to Warning/Noncritical. Documentation change only made in the Dell OpenManage Server Administrator Messages Reference Guide to reflect the severity displayed in the Server Administrator Alert Log and documented in the Storage Management online help.
    Removed alert 2344. Replaced by alert 2070. Documentation change only made in the Dell OpenManage Server Administrator Messages Reference Guide to reflect existing Storage Management online help.
Documentation updated to indicate clear
alert status.
Reference to SNMP trap variables
removed.
Corresponding Array Manager event
numbers removed (see comments).
Documentation change only made in the Dell
OpenManage Server Administrator Messages
Reference Guide to reflect existing Storage
Management online help.
The alert numbers for the new alerts
2062–2260 were previously unassigned.
Alert numbers 2370 and 2371 are new.
NOTE: Alerts 2062 and 2260 were previously
undocumented in the Storage Management
online help, Dell OpenManage Server
Administrator Storage Management User’s
Guide, and the Dell OpenManage Server
Administrator Messages Reference Guide.
The term “array disk” has been changed to
“physical disk” throughout Storage
Management. This change affects the message
text of the modified alerts.
2160 replaced by 2195.
2161 replaced by 2196.
Starting with Dell OpenManage 5.0, Array
Manager is no longer an installable option. If
you have an Array Manager installation and
wish to see how the Array Manager events
correspond to the Storage Management alerts,
refer to the product documentation prior to
Storage Management 2.1 or Dell OpenManage
5.1.
Alert Descriptions and Corrective Actions
The following sections describe alerts generated by the RAID or SCSI controllers supported by Storage
Management. The alerts are displayed in the Server Administrator Alert subtab or through Windows
Event Viewer. These alerts can also be forwarded as SNMP traps to other applications.
SNMP traps are generated for the alerts listed in the following sections. These traps are included in the
Dell OpenManage Server Administrator Storage Management management information base (MIB).
The SNMP traps for these alerts use all of the SNMP trap variables. For more information on SNMP
support and the MIB, see the SNMP Reference Guide.
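The trap numbers quoted in this guide fall into groups that are 50 apart (751, 801, 851, and so on up to 1201), and the final digit appears to follow the alert severity: entries reported as Ok / Normal or Informational use numbers ending in 1, Warning entries use 3, and Critical entries use 4. The following Python sketch decodes a trap number on that basis. The mapping is inferred only from the numbers that appear in this guide and is an assumption. The MIB described in the SNMP Reference Guide remains the authority, and the meaning of each group of 50 is not covered here.

```python
# Final digit of the trap number paired with the severity reported alongside it
# in this guide. Inferred from the guide's own examples, not taken from the MIB.
TRAP_SEVERITY = {
    1: "Ok / Normal",
    3: "Warning / Non-critical",
    4: "Critical / Failure / Error",
}


def split_trap_number(trap):
    """Return (group_base, severity) for a trap number such as 903."""
    offset = trap % 50
    # Severity is None for offsets this guide does not pair with a severity.
    return trap - offset, TRAP_SEVERITY.get(offset)
```

For example, split_trap_number(903) returns (900, "Warning / Non-critical"), and split_trap_number(1151) returns (1150, "Ok / Normal").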
To locate an alert, scroll through the following table to find the alert number displayed on the Server
Administrator Alert tab or search this file for the alert message text or number. See
Event Messages" for more information on severity levels.
For more information regarding alert descriptions and the appropriate corrective actions, see the online
help.
Table 4-4. Storage Management Messages
Event ID | Description | Severity | Cause and Action | Related Alert Information | SNMP Trap Numbers
Event ID: 2048
Description: Device failed
Severity: Critical / Failure / Error
Cause: A storage component such as a physical disk or an enclosure has failed. The failed
component may have been identified by the controller while performing a task such as a rescan or
a check consistency.
Action: Replace the failed component. You can identify which disk has failed by locating the disk
that has a red “X” for its status. Perform a rescan after replacing the disk.
Event ID: 2049
Description: Physical disk removed
Severity: Warning / Non-critical
Cause: A physical disk has been removed from the disk group. This alert can also be caused by
loose or defective cables or by problems with the enclosure.
Action: If a physical disk was removed from the disk group, either replace the disk or restore the
original disk. On some controllers, a removed disk has a red "X" for its status. On other
controllers, a removed disk may have an Offline status or may not be displayed on the user
interface. Perform a rescan after replacing or restoring the disk. If a disk has not been removed
from the disk group, then check for problems with the cables. See the online help for more
information on checking the cables. Make sure that the enclosure is powered on. If the problem
persists, check the enclosure documentation for further diagnostic information.
Event ID: 2050
Description: Physical disk offline
Severity: Warning / Non-critical
Cause: A physical disk in the disk group is offline. A user may have manually put the physical
disk offline.
Action: Perform a rescan. You can also select the offline disk and perform a Make Online operation.
Event ID: 2051
Description: Physical disk degraded
Severity: Warning / Non-critical
Cause: A physical disk has reported an error condition and may be degraded. The physical disk may
have reported the error condition in response to a consistency check or other operation.
Action: Replace the degraded physical disk. You can identify which disk is degraded by locating
the disk that has a red "X" for its status. Perform a rescan after replacing the disk.
Related Alert Information: Clear Alert Number: None. Related Alert Number: 2070. LRA Number: None.
SNMP Trap Numbers: 903
Event ID: 2052
Description: Physical disk inserted
Severity: Ok / Normal
Cause: This alert is for informational purposes.
Action: None
Related Alert Information: Clear Alert Number: None. Related Alert Number: 2065, 2305, 2367. LRA Number: None.
SNMP Trap Numbers: 901
Event ID: 2053
Description: Virtual disk created
Severity: Ok / Normal
Cause: This alert is for informational purposes.
Action: None
Event ID: 2054
Description: Virtual disk deleted
Severity: Warning / Non-critical
Cause: A virtual disk has been deleted. "Performing a Reset Configuration" may detect that a
virtual disk has been deleted and generate this alert.
Event ID: 2055
Description: Virtual disk configuration changed
Severity: Ok / Normal
Cause: This alert is for informational purposes.
Event ID: 2056
Description: Virtual disk failed
Severity: Critical / Failure / Error
Cause: One or more physical disks included in the virtual disk have failed. If the virtual disk is
non-redundant (does not use mirrored or parity data), then the failure of a single physical disk
can cause the virtual disk to fail. If the virtual disk is redundant, then more physical disks
have failed than can be rebuilt using mirrored or parity information.
Event ID: 2057
Description: Virtual disk degraded
Severity: Warning / Non-critical
Cause 1: This alert message occurs when a physical disk included in a redundant virtual disk
fails. Because the virtual disk is redundant (uses mirrored or parity information) and only one
physical disk has failed, the virtual disk can be rebuilt.
Action 1: Configure a hot spare for the virtual disk if one is not already configured. Rebuild the
virtual disk. When using a PowerEdge Expandable RAID Controller (PERC) 3/SC, 3/DCL, 3/DC, 3/QC,
4/SC, 4/DC, 4e/DC, 4/Di, CERC ATA100/4ch, PERC 5/E, PERC 5/i, or a Serial Attached SCSI (SAS)
5/iR controller, rebuild the virtual disk by first configuring a hot spare for the disk, and then
initiating a write operation to the disk. The write operation will initiate a rebuild of the disk.
Cause 2: A physical disk in the disk group has been removed.
Action 2: If a physical disk was removed from the disk group, either replace the disk or restore
the original disk. You can identify which disk has been removed by locating the disk that has a
red “X” for its status. Perform a rescan after replacing the disk.
Event ID: 2058
Description: Virtual disk check consistency started
Severity: Ok / Normal
Cause: This alert is for informational purposes.
Event ID: 2059
Description: Virtual disk format started
Severity: Ok / Normal
Cause: This alert is for informational purposes.
Event IDDescriptionSeverityCause and ActionRelated Alert InformationSNMP
Tr ap
Numbers
2067Virtual disk
check
consistency
cancelled
2070Virtual disk
initialization
cancelled
2074Physical disk
rebuild
cancelled
Ok / Normal Cause: The check consistency
operation cancelled because a
physical disk in the array has failed or
because a user cancelled the check
consistency operation.
Action: If the physical disk failed, then
replace the physical disk. You can
identify which disk failed by locating
the disk that has a red “X” for its
status. Perform a rescan after
replacing the disk. When performing
a consistency check, be aware that
the consistency check can take a long
time. The time it takes depends on
the size of the physical disk or
the virtual disk.
Ok / Normal Cause: The virtual disk initialization
was cancelled because a physical disk
included in the virtual disk has failed
or because a user cancelled the virtual
disk initialization.
Action: If a physical disk failed, then
replace the physical disk. You can
identify which disk has failed by
locating the disk that has a red “X”
for its status. Perform a rescan after
replacing the disk. Restart the format
physical disk operation. Restart the
virtual disk initialization.
2076 Virtual disk check consistency failed
2077 Virtual disk format failed.
2079 Virtual disk initialization failed
2080 Physical disk initialize failed
Critical / Failure / Error
Critical / Failure / Error
Critical / Failure / Error
Critical / Failure / Error
Cause: A physical disk included in
the virtual disk failed or there is an
error in the parity information. A
failed physical disk can cause errors in
parity information.
Action: Replace the failed physical
disk. You can identify which disk has
failed by locating the disk that has a
red “X” for its status. Rebuild the
physical disk. When finished, restart
the check consistency operation.
Cause: A physical disk included in
the virtual disk failed.
Action: Replace the failed physical
disk. You can identify which physical
disk has failed by locating the disk
that has a red "X" for its status.
Rebuild the physical disk. When
finished, restart the virtual disk
format operation.
Cause: A physical disk included in
the virtual disk has failed or a user has
cancelled the initialization.
Action: If a physical disk has failed,
then replace the physical disk.
Cause: The physical disk has failed or
is corrupt.
Action: Replace the failed or corrupt
disk. You can identify a disk that has
failed by locating the disk that has a
red “X” for its status. Restart the
initialization.
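The severity levels listed so far can be collected into a small lookup for triage. The following is a minimal sketch, not part of Server Administrator: the `Severity` enum, the `SEVERITY_BY_EVENT_ID` table, and the `triage_order` function are hypothetical names, and the table covers only a few of the event IDs documented in this section.

```python
from enum import IntEnum


class Severity(IntEnum):
    """Severity levels used in this guide, ordered from least to most severe."""
    OK = 0        # Ok / Normal
    WARNING = 1   # Warning / Non-critical
    CRITICAL = 2  # Critical / Failure / Error


# Hypothetical lookup table; values are taken from the entries in this section.
SEVERITY_BY_EVENT_ID = {
    2057: Severity.WARNING,   # Virtual disk degraded
    2067: Severity.OK,        # Virtual disk check consistency cancelled
    2070: Severity.OK,        # Virtual disk initialization cancelled
    2076: Severity.CRITICAL,  # Virtual disk check consistency failed
    2077: Severity.CRITICAL,  # Virtual disk format failed
    2079: Severity.CRITICAL,  # Virtual disk initialization failed
    2080: Severity.CRITICAL,  # Physical disk initialize failed
}


def triage_order(event_ids):
    """Return the known event IDs sorted from most to least severe.

    Unknown IDs are dropped; IDs of equal severity keep their input order.
    """
    known = [e for e in event_ids if e in SEVERITY_BY_EVENT_ID]
    return sorted(known, key=lambda e: SEVERITY_BY_EVENT_ID[e], reverse=True)
```

Sorting by severity first lets an operator address Critical alerts before Warning and informational ones.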
2081 Virtual disk reconfiguration failed
2082 Virtual disk rebuild failed
2083 Physical disk rebuild failed
2085 Virtual disk check consistency completed
Critical / Failure / Error
Critical / Failure / Error
Critical / Failure / Error
Ok / Normal Cause: This alert is for informational
Cause: A physical disk included in
the virtual disk has failed or is
corrupt. A user may also have
cancelled the reconfiguration.
Action: Replace the failed or corrupt
disk. You can identify a disk that has
failed by locating the disk that has a
red “X” for its status.
If the physical disk is part of a
redundant array, then rebuild the
physical disk. When finished, restart
the reconfiguration.
Cause: A physical disk included in
the virtual disk has failed or is
corrupt. A user may also have
cancelled the rebuild.
Action: Replace the failed or corrupt
disk. You can identify a disk that has
failed by locating the disk that has a
red “X” for its status. Restart the
virtual disk rebuild.
Cause: A physical disk included in
the virtual disk has failed or is
corrupt. A user may also have
cancelled the rebuild.
Action: Replace the failed or corrupt
disk. You can identify a disk that has
failed by locating the disk that has a
red “X” for its status. Restart the
physical disk rebuild.
2094 Predictive Failure reported.
Warning / Non-critical
Cause: The physical disk is predicted
to fail. Many physical disks support
Self-Monitoring, Analysis, and
Reporting Technology (SMART).
When enabled, SMART monitors the
health of the disk based on
indications such as the number of
write operations that have been
performed on the disk.
Action: Replace the physical disk.
Even though the disk may not have
failed yet, it is strongly recommended
that you replace the disk.
If this disk is part of a redundant
virtual disk, perform the Offline task
on the disk; replace the disk; and then
assign a hot spare and the rebuild will
start automatically.
If this disk is a hot spare, then
unassign the hot spare; perform the
Prepare to Remove task on the disk;
replace the disk; and assign the new
disk as a hot spare.
Clear Alert Number: None.
Related Alert Number: None.
LRA Number: 2070
SNMP Trap Numbers: 903
NOTICE: If this disk is part of a
nonredundant virtual disk, back up your
data immediately. If the disk fails,
you will not be able to recover
the data.
2095 SCSI sense data.
2098 Global hot spare assigned
Ok / Normal Cause: A SCSI device experienced an
error, but may have recovered.
Action: None.
Ok / Normal Cause: A user has assigned a physical
disk as a global hot spare. This alert is
for informational purposes.
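The replacement procedure for alert 2094 depends on how the disk is being used. The following is a minimal sketch of that decision, assuming a hypothetical `DiskRole` enum and `predictive_failure_steps` function; the step text paraphrases the Action and NOTICE for alert 2094 and is not an interface provided by Storage Management.

```python
from enum import Enum


class DiskRole(Enum):
    """Hypothetical roles a physical disk can have when alert 2094 is raised."""
    REDUNDANT_MEMBER = "member of a redundant virtual disk"
    NONREDUNDANT_MEMBER = "member of a nonredundant virtual disk"
    HOT_SPARE = "hot spare"


def predictive_failure_steps(role):
    """Return the ordered steps for a disk that reported a predictive failure."""
    if role is DiskRole.REDUNDANT_MEMBER:
        return [
            "Perform the Offline task on the disk",
            "Replace the disk",
            "Assign a hot spare (the rebuild starts automatically)",
        ]
    if role is DiskRole.HOT_SPARE:
        return [
            "Unassign the hot spare",
            "Perform the Prepare to Remove task on the disk",
            "Replace the disk",
            "Assign the new disk as a hot spare",
        ]
    if role is DiskRole.NONREDUNDANT_MEMBER:
        # Per the NOTICE: data cannot be recovered if this disk fails.
        return ["Back up your data immediately", "Replace the disk"]
    raise ValueError(f"unknown disk role: {role!r}")
```

For example, `predictive_failure_steps(DiskRole.HOT_SPARE)` returns the four hot spare steps in the order given for alert 2094.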
2099 Global hot spare unassigned
2100 Temperature exceeded the maximum warning threshold
2101 Temperature dropped below the minimum warning threshold
Ok / Normal Cause: A user has unassigned a
physical disk as a global hot spare.
This alert is for informational
purposes.
Action: None
Warning /
Non-critical
Warning /
Non-critical
Cause: The physical disk enclosure is
too hot. A variety of factors can cause
the excessive temperature. For
example, a fan may have failed, the
thermostat may be set too high,
or the room temperature may be too
hot.
Action: Check for factors that may
cause overheating. For example,
verify that the enclosure fan is
working. You should also check the
thermostat settings and examine
whether the enclosure is located near
a heat source. Make sure the
enclosure has enough ventilation and
that the room temperature is not too
hot. See the physical disk enclosure
documentation for more diagnostic
information.
Cause: The physical disk enclosure is
too cool.
Action: Check if the thermostat
setting is too low and if the room
temperature is too cool.
2102 Temperature exceeded the maximum failure threshold
2103 Temperature dropped below the minimum failure threshold
2104 Controller battery is reconditioning
Critical / Failure / Error
Critical / Failure / Error
Ok / Normal Cause: This alert is for informational
Cause: The physical disk enclosure is
too hot. A variety of factors can cause
the excessive temperature. For
example, a fan may have failed, the
thermostat may be set too high, or
the room temperature may be too
hot.
Action: Check for factors that may
cause overheating. For example,
verify that the enclosure fan is
working. You should also check the
thermostat settings and examine
whether the enclosure is located near
a heat source. Make sure the
enclosure has enough ventilation and
that the room temperature is not too
hot. See the physical disk enclosure
documentation for more diagnostic
information.
Cause: The physical disk enclosure is
too cool.
Action: Check if the thermostat
setting is too low and if the room
temperature is too cool.
purposes.
Action: None
Clear Alert Number: None.
Related Alert Number: None.
LRA Number: 2091
Clear Alert Number: None.
Related Alert Number: 2112
LRA Number: 2091
Clear Alert Number: 2105.
Related Alert Number: None.
LRA Number: None.
1054
1054
1151
2105Controller
battery
recondition is
completed
Ok / Normal Cause: This alert is for informational
purposes.
Action: None
Clear Alert Status: Alert 2105 is
a clear alert for alert 2104.
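The four enclosure temperature alerts (2100 through 2103) follow from comparing a reading against two warning thresholds and two failure thresholds. The following is a minimal sketch of that comparison. The `Thresholds` class, the `temperature_alert` function, and the example threshold values are assumptions for illustration; actual thresholds are set per enclosure and are not given in this guide.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Thresholds:
    """Hypothetical enclosure temperature thresholds (same unit as the reading)."""
    min_failure: float
    min_warning: float
    max_warning: float
    max_failure: float


def temperature_alert(temp: float, t: Thresholds) -> Optional[int]:
    """Map a temperature reading to alert 2100-2103, or None if within range.

    Failure thresholds are checked first so the more severe alert wins.
    """
    if temp > t.max_failure:
        return 2102  # exceeded the maximum failure threshold
    if temp < t.min_failure:
        return 2103  # dropped below the minimum failure threshold
    if temp > t.max_warning:
        return 2100  # exceeded the maximum warning threshold
    if temp < t.min_warning:
        return 2101  # dropped below the minimum warning threshold
    return None


# Example values only; not taken from any enclosure documentation.
EXAMPLE = Thresholds(min_failure=3, min_warning=8, max_warning=45, max_failure=55)
```

With the example values, a reading of 50 maps to alert 2100 and a reading of 60 maps to alert 2102.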
2106 Smart FPT exceeded
2107 Smart configuration change
Warning / Non-critical
Critical / Failure / Error
Cause: A disk on the specified
controller has received a SMART
alert (predictive failure) indicating
that the disk is likely to fail in the
near future.
Action: Replace the disk that has
received the SMART alert. If the
physical disk is a member of a nonredundant virtual disk, then back up
the data before replacing the disk.
NOTICE: Removing a physical
disk that is included in a nonredundant virtual disk will cause
the virtual disk to fail and may
cause data loss.
Cause: A disk has received a SMART
alert (predictive failure) after a
configuration change. The disk is
likely to fail in the near future.
Action: Replace the disk that has
received the SMART alert. If the
physical disk is a member of a nonredundant virtual disk, then back up
the data before replacing the disk.
Clear Alert Number: None.
Related Alert Number: None.
LRA Number: 2070
Clear Alert Number: None.
Related Alert Number: None.
LRA Number: 2071
903
904
NOTICE: Removing a physical
disk that is included in a nonredundant virtual disk will cause
the virtual disk to fail and may
cause data loss.
2108 Smart warning
Warning / Non-critical
Cause: A disk has received a SMART
alert (predictive failure). The disk is
likely to fail in the near future.
Action: Replace the disk that has
received the SMART alert. If the
physical disk is a member of a
non-redundant virtual disk, then back
up the data before replacing the disk.
NOTICE: Removing a physical
disk that is included in a nonredundant virtual disk will cause
the virtual disk to fail and may
cause data loss.
2109 SMART warning temperature
Warning / Non-critical
Cause: A disk has reached an
unacceptable temperature and
received a SMART alert (predictive
failure). The disk is likely to fail in the
near future.
Action 1: Determine why the physical
disk has reached an unacceptable
temperature. A variety of factors can
cause the excessive temperature. For
example, a fan may have failed, the
thermostat may be set too high, or
the room temperature may be too hot
or cold. Verify that the fans in the
server or enclosure are working. If the
physical disk is in an enclosure, you
should check the thermostat settings
and examine whether the enclosure is
located near a heat source. Make sure
the enclosure has enough ventilation
and that the room temperature is not
too hot. See the physical disk
enclosure documentation for more
diagnostic information.
Action 2: If you cannot identify why
the disk has reached an unacceptable
temperature, then replace the disk. If
the physical disk is a member of a
non-redundant virtual disk, then back
up the data before replacing the disk.
Clear Alert Number: None.
Related Alert Number: None.
LRA Number: 2070
SNMP Trap Numbers: 903
NOTICE: Removing a physical
disk that is included in a nonredundant virtual disk will cause
the virtual disk to fail and may
cause data loss.
2110 SMART warning degraded
2111 Failure prediction threshold exceeded due to test - No action needed
2112 Enclosure was shut down
Warning / Non-critical
Warning / Non-critical
Critical / Failure / Error
Cause: A disk is degraded and has
received a SMART alert (predictive
failure). The disk is likely to fail in the
near future.
Action: Replace the disk that has
received the SMART alert. If the
physical disk is a member of a nonredundant virtual disk, then back up
the data before replacing the disk.
NOTICE: Removing a physical
disk that is included in a nonredundant virtual disk will cause
the virtual disk to fail and may
cause data loss.
Cause: A disk has received a SMART
alert (predictive failure) due to test
conditions.
Action: None
Cause: The physical disk enclosure is
either hotter or cooler than the
maximum or minimum allowable
temperature range.
Action: Check for factors that may
cause overheating or excessive cooling.
For example, verify that the enclosure
fan is working. You should also check
the thermostat settings and examine
whether the enclosure is located near
a heat source. Make sure the
enclosure has enough ventilation and
that the room temperature is not too
hot or too cold. See the enclosure
documentation for more diagnostic
information.
2114 A consistency check on a virtual disk has been paused (suspended)
2115 A consistency check on a virtual disk has been resumed
2116 A virtual disk and its mirror have been split
2117 A mirrored virtual disk has been unmirrored
Ok / Normal Cause: The check consistency
operation on a virtual disk was paused
by a user.
Action: To resume the check
consistency operation, right-click the
virtual disk in the tree view and select
Resume Check Consistency.
Ok / Normal Cause: This alert is for informational
purposes. The check consistency
operation on a virtual disk has
resumed processing after being
paused by a user.
Action: None
Ok / Normal Cause: This alert is for informational
purposes. A user has caused a
mirrored virtual disk to be split. When
a virtual disk is mirrored, its data is
copied to another virtual disk in order
to maintain redundancy. After being
split, both virtual disks retain a copy
of the data, although because the
mirror is no longer intact, updates to
the data are no longer copied to the
mirror.
Action: None
Ok / Normal Cause: This alert is for informational
purposes. A user has caused a
mirrored virtual disk to be
unmirrored. When a virtual disk is
mirrored, its data is copied to another
virtual disk in order to maintain
redundancy. After being unmirrored,
the disk formerly used as the mirror
returns to being a physical disk and
becomes available for inclusion in
another virtual disk.
2118 Change write policy
Ok / Normal Cause: This alert is for informational
purposes. A user has changed the
write policy for a virtual disk.
Action: None
Clear Alert Number: None.
Related Alert Number: None.
LRA Number: None.
SNMP Trap Numbers: 1201
2120 Enclosure firmware mismatch
Warning / Non-critical
Cause: The firmware on the enclosure
management modules (EMMs) is
not the same version. Both modules
must have the same
version of the firmware. This alert
may be caused when a user attempts
to insert an EMM that has a
different firmware version than an
existing module.
Action: Download the same version
of the firmware to both EMMs.
2121 Device returned to normal
Ok / Normal Cause: This alert is for informational
purposes. A device that was
previously in an error state has
returned to a normal state.
For example, if an enclosure became
too hot and subsequently cooled
down, then you may receive this alert.
2122 Redundancy degraded
Warning /
Non-critical
Cause: One or more of the enclosure
components has failed.
For example, a fan or power supply
may have failed. Although the
enclosure is currently operational, the
failure of additional components
could cause the enclosure to fail.
Action: Identify and replace the failed
component. To identify the failed
component, select the enclosure in
the tree view and click the Health
subtab. Any failed component will be
identified with a red "X" on the
enclosure’s Health subtab.
Alternatively, you can select the
Storage object and click the Health
subtab. The controller status
displayed on the Health subtab
indicates whether a controller has a
failed or degraded component.
See the enclosure documentation for
information on replacing enclosure
components and for other diagnostic
information.
2123 Redundancy lost
Warning / Non-critical
Cause: A virtual disk or an enclosure
has lost data redundancy. In the case
of a virtual disk, one or more physical
disks included in the virtual disk have
failed. Due to the failed physical disk
or disks, the virtual disk is no longer
maintaining redundant (mirrored or
parity) data. The failure of an
additional physical disk will result in
lost data. In the case of an enclosure,
more than one enclosure component
has failed. For example, the enclosure
may have suffered the loss of all fans
or all power supplies.
Action: Identify and replace the
failed components. To identify the
failed component, select the Storage
object and click the Health subtab.
The controller status displayed on the
Health subtab indicates whether a
controller has a failed or degraded
component. Click the controller that
displays a Warning or Failed status.
This action displays the controller
Health subtab which displays the
status of the individual controller
components. Continue clicking the
components with a Warning or
Failed status until you identify the
failed component.
See the online help for more
information. See the enclosure
documentation for information on
replacing enclosure components and
for other diagnostic information.
2124 Redundancy normal
Ok / Normal Cause: This alert is for informational
purposes. Data redundancy has been
restored to a virtual disk or an
enclosure that previously suffered a
loss of redundancy.
Action: None
Clear Alert Number: Alert 2124
is a clear alert for alerts 2122 and
2123.
Related Alert Number: None.
LRA Number: None.
SNMP Trap Numbers: 1304
2126 SCSI sense sector reassign
2127 Background initialization (BGI) started
Warning /
Non-critical
Cause: A sector of the physical disk is
corrupted and data cannot be
maintained on this portion of the
disk. This alert is for informational
purposes.
NOTICE: Any data residing on
the corrupt portion of the disk
may be lost and you may need to
restore your data from backup.
Action: If the physical disk is part of a
nonredundant virtual disk, then back
up the data and replace the physical
disk.
NOTICE: Removing a physical
disk that is included in a
nonredundant virtual disk will
cause the virtual disk to fail and
may cause data loss.
If the disk is part of a redundant
virtual disk, then any data residing on
the corrupt portion of the disk will be
reallocated elsewhere in the virtual
disk.
Ok / Normal Cause: BGI of a virtual disk has
started. This alert is for informational
purposes.
2128 BGI cancelled
Ok / Normal Cause: BGI of a virtual disk has been
cancelled. A user or the firmware may
have stopped BGI.
Action: None
Clear Alert Number: None.
Related Alert Number: None.
LRA Number: None.
SNMP Trap Numbers: 1201
2129 BGI failed
Critical / Failure / Error
Cause: BGI of a virtual disk has
failed.
Action: None
2130 BGI completed
Ok / Normal Cause: BGI of a virtual disk has
completed. This alert is for
informational purposes.
Action: None
2131 Firmware version mismatch
Warning / Non-critical
Cause: The firmware on the
controller is not a supported version.
Action: Install a supported version of
the firmware. If you do not have one
available, download it from
the Dell support site at
support.dell.com, or check with your
support provider for information on
how to obtain the most current firmware.
2132 Driver version mismatch
Warning / Non-critical
Cause: The controller driver is not a
supported version.
Action: Install a supported version of
the driver. If you do not have one
available, download it from the
Dell support site at support.dell.com,
or check with your support provider for
information on how to obtain the
most current driver.
2135 Array Manager is installed on the system
Warning / Non-critical
Cause: Storage Management has been
installed on a system that has an Array
Manager installation.
Action: Installing Storage
Management and Array Manager on
the same system is not a supported
configuration. Uninstall either
Storage Management or Array
Manager.
2136 Virtual disk initialization
Ok / Normal Cause: This alert is for informational
purposes. Virtual disk initialization is
in progress.
2137 Communication timeout
2138 Enclosure alarm enabled
Warning / Non-critical
Ok / Normal Cause: This alert is for informational
Cause: The controller is unable to
communicate with an enclosure.
There are several reasons why
communication may be lost. For
example, there may be a bad or loose
cable. An unusual amount of I/O may
also interrupt communication with
the enclosure. In addition,
communication loss may be caused
by software, hardware, or firmware
problems, bad or failed power
supplies, and enclosure shutdown.
When viewed in the Alert Log, the
description for this event displays
several variables. These variables are:
Controller and enclosure names, type
of communication problem, return
code, and SCSI status.
Action: Check for problems with the
cables. See the online help for more
information on checking the cables.
You should also check to see if the
enclosure has degraded or failed
components. To do so, select the
enclosure object in the tree view and
click the Health subtab. The Health
subtab displays the status of the
enclosure components. Verify that
the controller has supported driver
and firmware versions installed and
that the EMMs are each running the
same version of supported firmware.
2139 Enclosure alarm disabled
Ok / Normal Cause: A user has disabled the
enclosure alarm.
Action: None
Clear Alert Number: None.
Related Alert Number: None.
LRA Number: None.
SNMP Trap Numbers: 851
2140 Dead disk segments restored
2141 Physical disk dead segments recovered
2142 Controller rebuild rate has changed
2143 Controller alarm enabled
Ok / Normal Cause: This alert is for informational
purposes. Disk space that was
formerly “dead” or inaccessible to a
redundant virtual disk has been
restored.
Action: None
Ok / Normal Cause: This alert is for informational
purposes. Portions of the physical
disk were formerly inaccessible. The
disk space from these dead segments
has been recovered and is now usable.
Any data residing on these dead
segments has been lost.
Action: None
Ok / Normal Cause: This alert is for informational
purposes. A user has changed the
controller rebuild rate.
Action: None
Ok / Normal Cause: This alert is for informational
purposes. A user has enabled the
controller alarm.
Action: None
Clear Alert Number: None.
Related Alert Number: None.
LRA Number: None.
Clear Alert Number: None.
Related Alert Number: None.
LRA Number: None.
Clear Alert Number: None.
Related Alert Number: None.
LRA Number: None.
Clear Alert Number: None.
Related Alert Number: None.
LRA Number: None.
1201
901
751
751
2144 Controller alarm disabled
Ok / Normal Cause: This alert is for informational
purposes. A user has disabled the
controller alarm.
Ok / Normal Cause: This alert is for informational
purposes. A user has renamed a
virtual disk.
When renaming a virtual disk on a
PERC 3/SC, 3/DCL, 3/DC, 3/QC,
4/SC, 4/DC, 4e/DC, 4/Di, CERC
ATA100/4ch, PERC 5/E, PERC 5/i or
SAS 5/iR controller, this alert displays
the new virtual disk name.
On the PERC 3/SC, 3/DCL, 3/DC,
3/QC, 4/SC, 4/DC, 4e/DC, 4/Di,
4/IM, 4e/Si, 4e/Di, and CERC ATA
100/4ch controllers, this alert displays
the original virtual disk name.
Action: None
Ok / Normal Cause: This alert is for informational
purposes. Communication with an
enclosure has been restored.
2163 Rebuild completed with errors
Critical / Failure / Error
Cause: This alert is documented in the
Storage Management online help.
Action: See the online help for more
information.
Clear Alert Number: None.
Related Alert Number: None.
LRA Number: 2071
SNMP Trap Numbers: 904
2164 See the Readme file for a list of validated controller driver versions
2165 The RAID controller firmware and driver validation was not performed. The configuration file cannot be opened.
Ok / Normal Cause: This alert is for informational
purposes. Storage Management is
unable to determine whether the
system has the minimum required
versions of the RAID controller
drivers.
Action: See the Readme file for driver
and firmware requirements.
In particular, if Storage Management
experiences performance problems,
you should verify that you have the
minimum supported versions of the
drivers and firmware installed.
Warning /
Non-critical
Cause: Storage Management is
unable to determine whether the
system has the minimum required
versions of the RAID controller
firmware and drivers. This situation
may occur for a variety of reasons. For
example, the installation directory
path to the configuration file may not
be correct. The configuration file may
also have been removed or renamed.
2166 The RAID controller firmware and driver validation was not performed. The configuration file is out of date or corrupted.
2167 The current kernel version and the non-RAID SCSI driver version are older than the minimum required levels. See readme.txt for a list of validated kernel and driver versions.
Warning / Non-critical
Warning / Non-critical
Cause: Storage Management is
unable to determine whether the
system has the minimum required
versions of the RAID controller
firmware and drivers. This situation
has occurred because a configuration
file is unreadable or missing data.
The configuration file may be
corrupted.
Action: Reinstall Storage
Management.
Cause: The versions of the kernel and
the driver do not meet the minimum
requirements. Storage Management
may not be able to display the storage
or perform storage management
functions until you have updated the
system to meet the minimum
requirements.
Action: See the Readme file for a list
of validated kernel and driver
versions. Update the system to meet
the minimum requirements and then
reinstall Storage Management.
Clear Alert Number: None.
Related Alert Number: None.
LRA Number: 2060
Clear Alert Number: None.
Related Alert Number: None.
LRA Number: 2050
753
103
2168 The non-RAID SCSI driver version is older than the minimum required level. See readme.txt for the validated driver version.
Warning / Non-critical
Cause: The version of the driver does
not meet the minimum
requirements. Storage Management
may not be able to display the storage
or perform storage management
functions until you have updated the
system to meet the minimum
requirements.
Action: See the Readme file for the
validated driver version. Update the
system to meet the minimum
requirements and then reinstall
Storage Management.
2169 The controller battery needs to be replaced.
Critical / Failure / Error
Cause: The controller battery cannot
recharge. The battery may be old or it
may already have been recharged the
maximum number of times. In
addition, the battery charger may not
be working.
Action: Replace the battery pack.
2170 The controller battery charge level is normal.
Ok / Normal Cause: This alert is for informational
purposes.
Action: None
Clear Alert Number: None.
Related Alert Number: 2118
LRA Number: 2101
Clear Alert Number: None.
Related Alert Number: None.
LRA Number: None.
1154
1151
2171 The controller battery temperature is above normal.
2172 The controller battery temperature is normal.
Warning / Non-critical
Ok / Normal Cause: This alert is for informational
Cause: The battery may be
recharging, the room temperature
may be too hot, or the fan in the
system may be degraded or failed.
Action: If this alert was generated due
to a battery recharge, the situation
will correct itself when the recharge is
complete. You should also check that
the room temperature is normal and
that the system components are
functioning properly.
2173 Unsupported configuration detected. The SCSI rate of the enclosure management modules (EMMs) is not the same. EMM0 %1 EMM1 %2
2174 The controller battery has been removed.
2175 The controller battery has been replaced.
Warning / Non-critical
Warning / Non-critical
Ok / Normal Cause: This alert is for informational
Cause: The EMMs in the enclosure
have a different SCSI rate. This is an
unsupported configuration. All
EMMs in the enclosure should have
the same SCSI rate. The % (percent
sign) indicates a substitution
variable. The text for this
substitution variable is displayed with
the alert in the Alert Log and can vary
depending on the situation.
Action: The EMMs in the enclosure
have a different SCSI rate. This is an
unsupported configuration. All
EMMs in the enclosure should have
the same SCSI rate.
Cause: The controller cannot
communicate with the battery, the
battery may be removed, or the
contact point between the controller
and the battery may be burnt or
corroded.
Action: Replace the battery if it has
been removed. If the contact point
between the battery and the controller
is burnt or corroded, you will need to
replace either the battery or the
controller, or both. See the hardware
documentation for information on
how to safely access, remove, and
replace the battery.
2176 The controller battery Learn cycle has started.
Ok / Normal Cause: This alert is for informational
purposes.
Action: None
Clear Alert Number: 2177.
Related Alert Number: None.
LRA Number: None.
SNMP Trap Numbers: 1151
2177 The controller battery Learn cycle has completed.
2178 The controller battery Learn cycle has timed out.
2179 The controller battery Learn cycle has been postponed.
Ok / Normal Cause: This alert is for informational
purposes.
Action: None
Warning /
Non-critical
Ok / Normal Cause: This alert is for informational
Cause: The controller battery must
be fully charged before the Learn
cycle can begin. The battery may be
unable to maintain a full charge
causing the Learn cycle to time out.
Additionally, the battery must be able
to maintain cached data for a
specified period of time in the event
of a power loss. For example, some
batteries maintain cached data for
24 hours. If the battery is unable to
maintain cached data for the required
period of time, then the Learn cycle
will time out.
Action: Replace the battery pack as
the battery is unable to maintain a
full charge.
purposes.
Action: None
Clear Alert Status: Alert 2177 is
a clear alert for alert 2176.
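The clear alert relationships documented in this section (2105 clears 2104, 2124 clears 2122 and 2123, 2177 clears 2176) can be used to work out which alerts are still outstanding. The following is a minimal sketch under that assumption. The `CLEARED_BY` table and the `active_alerts` function are hypothetical names, and the input is assumed to be a chronological list of event IDs, not a real Alert log format.

```python
# Hypothetical table: clear alert ID -> the alert IDs it clears,
# taken from the Clear Alert entries in this section.
CLEARED_BY = {
    2105: {2104},        # battery recondition completed clears reconditioning
    2124: {2122, 2123},  # redundancy normal clears degraded and lost
    2177: {2176},        # Learn cycle completed clears Learn cycle started
}


def active_alerts(event_ids):
    """Return the alerts still outstanding after applying clear alerts in order.

    Clear alerts remove the alerts they clear and are not themselves kept.
    """
    active = []
    for event_id in event_ids:
        cleared = CLEARED_BY.get(event_id)
        if cleared is not None:
            active = [e for e in active if e not in cleared]
        else:
            active.append(event_id)
    return active
```

For example, `active_alerts([2104, 2122, 2105, 2176])` leaves 2122 and 2176 outstanding because only 2104 has been cleared.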
2180 The controller battery Learn cycle will start in %1 days.
2181 The controller battery Learn cycle will start in %1 hours.
2182 An invalid SAS configuration has been detected.
2186 The controller cache has been discarded.
2187 Single-bit ECC error limit exceeded.
Ok / Normal Cause: This alert is for informational
purposes. The %1 indicates a
substitution variable. The text for
this substitution variable is displayed
with the alert in the Alert Log and
can vary depending on the situation.
Action: None
Ok / Normal Cause: This alert is for informational
purposes. The %1 indicates a
substitution variable. The text for
this substitution variable is displayed
with the alert in the Alert Log and
can vary depending on the situation.
Action: None
Critical / Failure / Error
Warning / Non-critical
Warning / Non-critical
Cause: The controller and attached
enclosures are not cabled correctly.
Action: See the hardware
documentation for information on
correct cabling configurations.
Cause: The controller has flushed the
cache and any data in the cache has
been lost. This may happen if the
system has memory or battery
problems that cause the controller to
distrust the cache. Although user data
may have been lost, this alert does not
always indicate that relevant or user
data has been lost.
Action: Verify that the battery and
memory are functioning properly.
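
The %1 substitution variable described for alerts 2180 and 2181 can be illustrated with a short sketch. This is not Server Administrator code; the `TEMPLATES` dictionary and the `render_alert` helper are hypothetical names introduced only for illustration, and the message texts are copied from the entries above.

```python
import re

# Message templates copied from the entries above; %1 is the substitution variable.
TEMPLATES = {
    2180: "The controller battery Learn cycle will start in %1 days.",
    2181: "The controller battery Learn cycle will start in %1 hours.",
}


def render_alert(event_id, *values):
    """Fill %1, %2, ... in an alert template (hypothetical helper)."""
    template = TEMPLATES[event_id]

    def substitute(match):
        index = int(match.group(1)) - 1
        # Leave the placeholder untouched if no value was supplied for it.
        if index < 0 or index >= len(values):
            return match.group(0)
        return str(values[index])

    return re.sub(r"%(\d+)", substitute, template)


print(render_alert(2180, 7))
# The controller battery Learn cycle will start in 7 days.
```

The value passed in (7 here) stands in for whatever text the Alert Log displays in place of %1, which, as noted above, varies with the situation.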
Event ID: 2188
Description: The controller write policy has been changed to Write Through.
Severity: Ok / Normal
Cause: The controller battery is unable to maintain cached data for the required period of time. For example, if the required period of time is 24 hours, the battery is unable to maintain cached data for 24 hours. It is normal to receive this alert during the battery Learn cycle as the Learn cycle discharges the battery before recharging it. When discharged, the battery cannot maintain cached data.
Action: Check the health of the battery. If the battery is weak, replace the battery pack.
Related Alert Information: Clear Alert Number: None. Related Alert Number: None. LRA Number: None.
SNMP Trap Numbers: 1151

Event ID: 2189
Description: The controller write policy has been changed to Write Back.
Severity: Ok / Normal
Cause: This alert is for informational purposes.
Action: None
Related Alert Information: Clear Alert Number: None. Related Alert Number: None. LRA Number: None.
SNMP Trap Numbers: 1151

Event ID: 2191
Description: Multiple enclosures are attached to the controller. This is an unsupported configuration.
Severity: Critical / Failure / Error
Cause: Too many enclosures are attached to the controller port. When the enclosure limit is exceeded, the controller loses contact with all enclosures attached to the port.
Action: Remove the enclosure that was added last, which caused the enclosure limit to be exceeded.

Event ID: 2192
Description: The virtual disk Check Consistency has made corrections and completed.
Severity: Ok / Normal
Cause: This alert is for informational purposes. The virtual disk Check Consistency has identified errors and made corrections. For example, the Check Consistency may have encountered a bad disk block and remapped the disk block to restore data consistency.
Action: This alert is for informational purposes only and no additional action is required. As a precaution, monitor the Alert Log for other errors related to this virtual disk. If problems persist, contact Dell Technical Support.
Related Alert Information: Clear Alert Number: None. Related Alert Number: None. LRA Number: None.
SNMP Trap Numbers: 1203

Event ID: 2193
Description: The virtual disk reconfiguration has resumed.
Severity: Ok / Normal
Cause: This alert is for informational purposes.
Action: None
Related Alert Information: Clear Alert Number: None. Related Alert Number: None. LRA Number: None.
SNMP Trap Numbers: 1201

Event ID: 2194
Description: The virtual disk Read policy has changed.
Severity: Ok / Normal
Cause: This alert is for informational purposes.
Action: None
Related Alert Information: Clear Alert Number: None. Related Alert Number: None. LRA Number: None.
SNMP Trap Numbers: 1201

Event ID: 2195
Description: Dedicated hot spare assigned. Physical disk %1
Severity: Ok / Normal
Cause: This alert is for informational purposes.
Action: None
Related Alert Information: Clear Alert Number: 2196. Related Alert Number: None. LRA Number: None.
SNMP Trap Numbers: 1201

Event ID: 2196
Description: Dedicated hot spare unassigned. Physical disk %1
Severity: Ok / Normal
Cause: This alert is for informational purposes.
Action: None
Related Alert Information: Clear Alert Status: Alert 2196 is a clear alert for alert 2195. Related Alert Number: None. LRA Number: None.
SNMP Trap Numbers: 1201
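
The clear-alert relationships listed in this reference (alert 2177 clears 2176, and alert 2196 clears 2195) can be sketched as a small lookup. This is an illustrative assumption about how a monitoring script might use the Clear Alert fields, not a description of how Server Administrator itself processes them; `CLEARS` and `active_alerts` are hypothetical names.

```python
# Clear-alert pairs stated in this reference: the key clears the value.
CLEARS = {2177: 2176, 2196: 2195}


def active_alerts(event_ids):
    """Return the alert IDs still outstanding after processing events in order."""
    active = []
    for event_id in event_ids:
        cleared = CLEARS.get(event_id)
        if cleared is not None:
            # A clear alert removes its counterpart and is not itself outstanding.
            active = [a for a in active if a != cleared]
        else:
            active.append(event_id)
    return active


print(active_alerts([2195, 2192, 2196]))
# [2192]
```

In this example the hot spare assignment (2195) is cancelled by the later unassignment (2196), leaving only the Check Consistency alert (2192) in the list.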