Dell OpenManage Server Administrator Version 6.5 Messages Reference Guide

Notes and Cautions

NOTE: A NOTE indicates important information that helps you make better use of your computer.

CAUTION: A CAUTION indicates potential damage to hardware or loss of data if instructions are not followed.

____________________

Reproduction of these materials in any manner whatsoever without the written permission of Dell Inc. is strictly forbidden.

Trademarks used in this text: Dell™, the DELL logo, PowerEdge™, PowerVault™, and OpenManage™ are trademarks of Dell Inc. Microsoft®, Windows®, and Windows Server® are either trademarks or registered trademarks of Microsoft Corporation in the United States and/or other countries. Red Hat® and Red Hat Enterprise Linux® are registered trademarks of Red Hat, Inc. in the United States and/or other countries. Novell® is a registered trademark and SUSE™ is a trademark of Novell Inc. in the United States and other countries. Oracle® is a registered trademark of Oracle Corporation and/or its affiliates. Citrix®, Xen®, and XenServer® are either registered trademarks or trademarks of Citrix Systems, Inc. in the United States and/or other countries. VMware® is a registered trademark or trademark of VMware, Inc. in the United States or other countries.

Other trademarks and trade names may be used in this document to refer to either the entities claiming the marks and names or their products. Dell Inc. disclaims any proprietary interest in trademarks and trade names other than its own.

January 2011

Contents

1 Introduction
What’s New in this Release
Messages Not Described in This Guide
Understanding Event Messages
Sample Event Message Text
Viewing Alerts and Event Messages
Logging Messages to a Unicode Text File
Viewing Events in Windows Server 2003 and Windows Server 2008
Viewing Events in Red Hat Enterprise Linux and SUSE Linux Enterprise Server
Viewing Events in VMware ESX/ESXi
Viewing the Event Information
Understanding the Event Description

2 Server Management Messages
Server Administrator General Messages
Temperature Sensor Messages
Cooling Device Messages
Voltage Sensor Messages
Current Sensor Messages
Chassis Intrusion Messages
Redundancy Unit Messages
Power Supply Messages
Memory Device Messages
Fan Enclosure Messages
AC Power Cord Messages
Hardware Log Sensor Messages
Processor Sensor Messages
Pluggable Device Messages
Battery Sensor Messages
Secure Digital (SD) Card Device Messages
Chassis Management Controller Messages

3 Storage Management Message Reference
Alert Monitoring and Logging
Alert Message Format with Substitution Variables
Alert Message Change History
Alert Descriptions and Corrective Actions

4 System Event Log Messages for IPMI Systems
Temperature Sensor Events
Voltage Sensor Events
Fan Sensor Events
Processor Status Events
Power Supply Events
Memory ECC Events
BMC Watchdog Events
Memory Events
Hardware Log Sensor Events
Drive Events
Intrusion Events
BIOS Generated System Events
Operating System Generated System Events
Cable Interconnect Events
Battery Events
Power And Performance Events
Entity Presence Events
Miscellaneous

Index

Introduction

Dell OpenManage Server Administrator generates event messages stored primarily in the operating system or Server Administrator event logs and sometimes in Simple Network Management Protocol (SNMP) traps. This document describes the event messages that are created by Server Administrator version 6.5 and displayed in the Server Administrator alert log.

Server Administrator creates events in response to sensor status changes and other monitored parameters. The Server Administrator event monitor uses these status change events to add descriptive messages to the operating system event log or the Server Administrator alert log.

Each event message that Server Administrator adds to the alert log consists of a unique identifier called the event ID for a specific event source category and a descriptive message. The event message includes the severity, cause of the event, and other relevant information, such as the event location and the previous state of the monitored item.

The tables in this guide list all Server Administrator event IDs in numeric order. Each entry includes the description, severity level, and cause of the event ID. The message text in angle brackets (for example, <State>) describes the event-specific information provided by the Server Administrator.


What’s New in this Release

No new alerts have been added. The existing alerts 2081, 2347, and 2388 are modified to include additional information.

Messages Not Described in This Guide

This guide describes only event messages logged by Server Administrator and Storage Management that are displayed in the Server Administrator alert log. For information on other messages generated by your system, see one of the following sources:

• The Installation and Troubleshooting Guide or Hardware Owner's Manual shipped with your system

• Operating system documentation

• Application program documentation

Understanding Event Messages

This section describes the various types of event messages generated by the Server Administrator. When an event occurs on your system, Server Administrator sends information about one of the following event types to the systems management console:

Table 1-1. Understanding Event Messages

Alert Severity: OK / Normal / Informational
Component Status: An event that describes the successful operation of a unit. The alert is provided for informational purposes and does not indicate an error condition. For example, the alert may indicate the normal start or stop of an operation, such as power supply or a sensor reading returning to normal.

Alert Severity: Warning / Non-critical
Component Status: An event that is not necessarily significant, but may indicate a possible future problem. For example, a Warning/Non-critical alert may indicate that a component (such as a temperature probe in an enclosure) has crossed a warning threshold.

Alert Severity: Critical / Failure / Error
Component Status: A significant event that indicates actual or imminent loss of data or loss of function. For example, crossing a failure threshold or a hardware failure such as an array disk.

Server Administrator generates events based on status changes in the following sensors:

• Temperature Sensor: Helps protect critical components by alerting the systems management console when temperatures become too high inside a chassis; also monitors the temperature in a variety of locations in the chassis and in attached system(s).

• Fan Sensor: Monitors fans in various locations in the chassis and in attached system(s).

• Voltage Sensor: Monitors voltages across critical components in various chassis locations and in attached system(s).

• Current Sensor: Monitors the current (or amperage) output from the power supply (or supplies) in the chassis and in attached system(s).

• Chassis Intrusion Sensor: Monitors intrusion into the chassis and attached system(s).

• Redundancy Unit Sensor: Monitors redundant units (critical units such as fans, AC power cords, or power supplies) within the chassis; also monitors the chassis and attached system(s). For example, redundancy allows a second or third fan to keep the chassis components at a safe temperature when another fan has failed. Redundancy is normal when the intended number of critical components are operating. Redundancy is degraded when a component fails, but others are still operating. Redundancy is lost when there is one less critical redundancy device than required.

• Power Supply Sensor: Monitors power supplies in the chassis and in attached system(s).

• Memory Prefailure Sensor: Monitors memory modules by counting the number of Error Correction Code (ECC) memory corrections.

• Fan Enclosure Sensor: Monitors protective fan enclosures by detecting their removal from and insertion into the system, and by measuring how long a fan enclosure is absent from the chassis. This sensor monitors the chassis and attached system(s).

• AC Power Cord Sensor: Monitors the presence of AC power for an AC power cord.

• Hardware Log Sensor: Monitors the size of a hardware log.

• Processor Sensor: Monitors the processor status in the system.

• Pluggable Device Sensor: Monitors the addition, removal, or configuration errors for some pluggable devices, such as memory cards.

• Battery Sensor: Monitors the status of one or more batteries in the system.

• SD Card Device Sensor: Monitors instrumented Secure Digital (SD) card devices in the system.

Sample Event Message Text

The following example shows the format of the event messages logged by Server Administrator.

EventID: 1000

Source: Server Administrator

Category: Instrumentation Service

Type: Information

Date and Time: Mon Oct 21 10:38:00 2002

Computer: <computer name>

Description:

Server Administrator starting

Data: Bytes in Hex
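As a rough illustration, a record in the format shown above can be split into named fields with a few lines of Python. This is only a sketch: the helper name parse_event_record and the rule that unlabeled lines continue the previous field are assumptions made for this example, not part of Server Administrator.

```python
# Field labels taken from the sample event message above.
FIELD_NAMES = ("EventID", "Source", "Category", "Type",
               "Date and Time", "Computer", "Description", "Data")


def parse_event_record(text):
    """Split a 'Label: value' event record into a dict (hypothetical helper)."""
    record = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        key, sep, value = line.partition(":")
        if sep and key.strip() in FIELD_NAMES:
            # A known label starts a new field.
            current = key.strip()
            record[current] = value.strip()
        elif current is not None:
            # Any other line continues the previous field (for example, the
            # description text that follows the "Description:" label).
            record[current] = (record[current] + " " + line).strip()
    return record


SAMPLE = """\
EventID: 1000
Source: Server Administrator
Category: Instrumentation Service
Type: Information
Date and Time: Mon Oct 21 10:38:00 2002
Computer: <computer name>
Description:
Server Administrator starting
Data: Bytes in Hex
"""

record = parse_event_record(SAMPLE)
```

Only the first colon on a line is treated as the label separator, so the colons inside the time stamp stay in the field value.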

Viewing Alerts and Event Messages

An event log is used to record information about important events.

Server Administrator generates alerts that are added to the operating system event log and to the Server Administrator alert log. To view these alerts in Server Administrator:

1. Select the System object in the tree view.

2. Select the Logs tab.

3. Select the Alert tab.

You can also view the event log using your operating system’s event viewer. Each operating system’s event viewer accesses the applicable operating system event log.

The location of the event log file depends on the operating system you are using.

• On systems running the Microsoft Windows operating systems, event messages are logged in the operating system event log and the Server Administrator event log. The Server Administrator event log file is named dcsys32.xml and is located in the <install_path>\omsa\log directory. The default install_path is C:\Program Files\Dell\SysMgt.

• On systems running the Red Hat Enterprise Linux, SUSE Linux Enterprise Server, Citrix XenServer, VMware ESX, and VMware ESXi operating systems, the event messages are logged in the operating system log file and the Server Administrator event log. The default name of the operating system log file is /var/log/messages, and you can view the operating system log file using a text editor such as vi or emacs. The Server Administrator event log file is named dcsys<xx>.xml, where xx is either 32 or 64 bit depending on the operating system. In the Red Hat Enterprise Linux, SUSE Linux Enterprise Server, Citrix XenServer and VMware ESX operating systems, the Server Administrator event log file is located in the /opt/dell/srvadmin/var/log/openmanage directory. In the VMware ESXi operating system, the Server Administrator event log file is located in the /etc/cim/dell/srvadmin/log/openmanage directory.
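The locations described above can be summarized in a small lookup, sketched here in Python. The function name event_log_path and the os_family labels ("windows", "linux", "esxi") are hypothetical names chosen for this example; the directories and file name pattern are the ones given in this section.

```python
import ntpath
import posixpath

# Default Windows install path as stated in this guide.
DEFAULT_WINDOWS_INSTALL_PATH = r"C:\Program Files\Dell\SysMgt"


def event_log_path(os_family, bits=32,
                   install_path=DEFAULT_WINDOWS_INSTALL_PATH):
    """Return the expected Server Administrator event log path.

    os_family is a hypothetical label used only in this sketch:
      "windows" - Microsoft Windows
      "linux"   - Red Hat Enterprise Linux, SUSE Linux Enterprise Server,
                  Citrix XenServer, or VMware ESX
      "esxi"    - VMware ESXi
    """
    if bits not in (32, 64):
        raise ValueError("bits must be 32 or 64")
    file_name = "dcsys%d.xml" % bits
    if os_family == "windows":
        return ntpath.join(install_path, "omsa", "log", file_name)
    if os_family == "linux":
        return posixpath.join("/opt/dell/srvadmin/var/log/openmanage",
                              file_name)
    if os_family == "esxi":
        return posixpath.join("/etc/cim/dell/srvadmin/log/openmanage",
                              file_name)
    raise ValueError("unknown os_family: %r" % (os_family,))
```

For example, event_log_path("linux", bits=64) returns /opt/dell/srvadmin/var/log/openmanage/dcsys64.xml. The function only builds the path string; it does not check that the file exists.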

Logging Messages to a Unicode Text File

Logging messages to a Unicode text file is optional. By default, the feature is disabled in the Server Administrator. To enable this feature, modify the Event Manager section of the dcemdy<xx>.ini configuration file where xx is 32 or 64 bit depending on the operating system, as follows:

• On systems running Microsoft Windows operating systems, you can locate the configuration file in the <install_path>\dataeng\ini directory and set the property UnitextLog.enabled=true. The default install_path is C:\Program Files\Dell\SysMgt. Restart the DSM SA Event Manager service to enable the setting. The Server Administrator Unicode text event log file is named dcsys32.log and is located in the <install_path>\omsa\log directory.

• On systems running the Red Hat Enterprise Linux, SUSE Linux Enterprise Server, Citrix XenServer and VMware ESX operating systems, you can locate the configuration file in the /opt/dell/srvadmin/etc/srvadmin-deng/ini directory and set the property UnitextLog.enabled=true. Run the /etc/init.d/dataeng restart command to restart the Server Administrator Event Manager service and enable the setting. This also restarts the Server Administrator Data Manager and SNMP services. The Server Administrator Unicode text event log file is named dcsys<xx>.log, where xx is 32 or 64 bit depending on the operating system, and is located in the /opt/dell/srvadmin/var/log/openmanage directory.
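The property change described above could be scripted along the following lines. This is a sketch under several assumptions: that the configuration file can be read by Python's configparser module, that the section header is spelled exactly "Event Manager", and that losing any comments in the file on rewrite is acceptable. None of this is confirmed by this guide, so back up the file and verify the section name before trying anything similar. The function name enable_unicode_text_log is hypothetical.

```python
import configparser


def enable_unicode_text_log(ini_path, section="Event Manager"):
    """Set UnitextLog.enabled=true in the given INI file (hypothetical helper).

    The section name is an assumption; pass the real header if it differs.
    """
    parser = configparser.ConfigParser(interpolation=None)
    # Keep option names case-sensitive; configparser lowercases them by default.
    parser.optionxform = str
    parser.read(ini_path)
    if not parser.has_section(section):
        parser.add_section(section)
    parser.set(section, "UnitextLog.enabled", "true")
    with open(ini_path, "w") as handle:
        # Write "key=value" without spaces, matching the property as shown.
        parser.write(handle, space_around_delimiters=False)
```

The service restart described above is still required afterward for the setting to take effect.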

The following sub-sections explain how to launch the Windows Server 2003, Windows Server 2008, Red Hat Enterprise Linux, SUSE Linux Enterprise Server, VMware ESX, and VMware ESXi event viewers.

Viewing Events in Windows Server 2003 and Windows Server 2008

1. Click the Start button, point to Settings, and click Control Panel.

2. Double-click Administrative Tools, and then double-click Event Viewer.

3. In the Event Viewer window, click the Tree tab and then click System Log.

The System Log window displays a list of recently logged events.

4. To view the details of an event, double-click one of the event items.

NOTE: You can also look up the dcsys<xx>.xml file, in the <install_path>\omsa\log directory, to view the separate event log file, where the default install_path is C:\Program Files\Dell\SysMgt and xx is 32 or 64 depending on the operating system that is installed.

Viewing Events in Red Hat Enterprise Linux and SUSE Linux Enterprise Server

1. Log in as root.

2. Use a text editor such as vi or emacs to view the file named /var/log/messages.

The following example shows the Red Hat Enterprise Linux and SUSE Linux Enterprise Server message log, /var/log/messages. The text in boldface type indicates the message text.

NOTE: These messages are typically displayed as one long line. In the following example, the message is displayed using line breaks to help you see the message text more clearly.

...


Feb 6 14:20:51 server01 Server Administrator: Instrumentation Service EventID: 1000

Server Administrator starting

Feb 6 14:20:51 server01 Server Administrator: Instrumentation Service EventID: 1001

Server Administrator startup complete

Feb 6 14:21:21 server01 Server Administrator: Instrumentation Service EventID: 1254

Chassis intrusion detected Sensor location: Main chassis intrusion Chassis location: Main System Chassis Previous state was: OK (Normal) Chassis intrusion state: Open

Feb 6 14:21:51 server01 Server Administrator: Instrumentation Service EventID: 1252

Chassis intrusion returned to normal Sensor location: Main chassis intrusion Chassis location: Main System Chassis Previous state was: Critical (Failed) Chassis intrusion state: Closed
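Because each entry is normally a single line in /var/log/messages, the Server Administrator entries can be picked out with a regular expression. The sketch below assumes the line layout shown in the example above (time stamp, host name, "Server Administrator:", category, "EventID:", message text); the pattern and the function name parse_syslog_line are illustrative assumptions, and real log lines may differ between syslog configurations.

```python
import re

# Pattern modeled on the example log entries above (an assumption, not a
# documented format).
EVENT_LINE = re.compile(
    r"^(?P<timestamp>\w{3}\s+\d{1,2}\s+\d{2}:\d{2}:\d{2})\s+"
    r"(?P<host>\S+)\s+Server Administrator:\s+"
    r"(?P<category>.+?)\s+EventID:\s*(?P<event_id>\d+)\s*(?P<message>.*)$"
)


def parse_syslog_line(line):
    """Return a dict for a Server Administrator entry, or None otherwise."""
    match = EVENT_LINE.match(line.strip())
    if match is None:
        return None
    event = match.groupdict()
    event["event_id"] = int(event["event_id"])
    return event


line = ("Feb 6 14:21:21 server01 Server Administrator: Instrumentation Service "
        "EventID: 1254 Chassis intrusion detected Sensor location: Main chassis "
        "intrusion Chassis location: Main System Chassis Previous state was: "
        "OK (Normal) Chassis intrusion state: Open")
event = parse_syslog_line(line)
```

Lines written by other programs do not match the pattern and yield None, so the function can be applied to every line of the log file as a filter.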

Viewing Events in VMware ESX/ESXi

1. Click View > Administration > System Logs.

2. Select the Server Log /var/log/messages entry from the drop-down list.

Viewing the Event Information

The event log for each operating system contains some or all of the following information:

• Date: The date the event occurred.

• Time: The local time the event occurred.

• Type: A classification of the event severity: Information, Warning, or Error.

• User: The name of the user on whose behalf the event occurred.

• Computer: The name of the system where the event occurred.

• Source: The software that logged the event.

•

Server Management Messages

The following tables list, in numerical order, each event ID and its corresponding description, along with its severity and cause.

NOTE: For corrective actions, see the appropriate documentation.

Server Administrator General Messages

The messages in Table 2-1 indicate that certain alert systems are up and working.

Table 2-1. Server Administrator General Messages

Event ID: 0000
Description: Log was cleared
Severity: Information
Cause: User cleared the log from Server Administrator. A user can clear the OpenManage Server Administrator log. This operation does not clear the operating system event log. Therefore, this event is not logged in the operating system event log. This is logged in the OpenManage Server Administrator alert log.

Event ID: 0001
Description: Log backup created
Severity: Information
Cause: The log was full, copied to backup, and cleared.

Event ID: 1000
Description: Server Administrator starting
Severity: Information
Cause: Server Administrator is beginning to initialize.

Event ID: 1001
Description: Server Administrator startup complete
Severity: Information
Cause: Server Administrator completed its initialization.

Event ID: 1002
Description: A system BIOS update has been scheduled for the next reboot
Severity: Information
Cause: The user has chosen to update the flash basic input/output system (BIOS).

Event ID: 1003
Description: A previously scheduled system BIOS update has been canceled
Severity: Information
Cause: The user decides to cancel the flash BIOS update, or an error occurs during the flash.

Event ID: 1004
Description: Thermal shutdown protection has been initiated
Severity: Error
Cause: This message is generated when a system is configured for thermal shutdown due to an error event. If a temperature sensor reading exceeds the error threshold for which the system is configured, the operating system shuts down and the system powers off. This event may also be initiated on certain systems when a fan enclosure is removed from the system for an extended period of time.

Event ID: 1005
Description: SMBIOS data is absent
Severity: Error
Cause: The system does not contain the required systems management BIOS version 2.2 or higher, or the BIOS is corrupted.

Event ID: 1006
Description: Automatic System Recovery (ASR) action was performed
  Action performed was: <Action>
  Date and time of action: <Date and time>
Severity: Error
Cause: This message is generated when an automatic system recovery action is performed due to a hung operating system. The action performed and the time of the action are provided.

Event ID: 1007
Description: User initiated host system control action
  Action requested was: <Action>
Severity: Information
Cause: User requested a host system control action to reboot, power off, or power cycle the system. Alternatively, the user had indicated protective measures to be initiated in the event of a thermal shutdown.

Event ID: 1008
Description: Systems Management Data Manager Started
Severity: Information
Cause: Systems Management Data Manager services were started.

Event ID: 1009
Description: Systems Management Data Manager Stopped
Severity: Information
Cause: Systems Management Data Manager services were stopped.

Event ID: 1011
Description: RCI table is corrupt
Severity: Error
Cause: This message is generated when the BIOS Remote Configuration Interface (RCI) table is corrupted or cannot be read by the systems management software.

Event ID: 1012
Description: IPMI Status
  Interface: <the IPMI interface being used>, <additional information if available and applicable>
Severity: Information
Cause: This message is generated to indicate the Intelligent Platform Management Interface (IPMI) status of the system. Additional information, when available, includes Baseboard Management Controller (BMC) not present, BMC not responding, System Event Log (SEL) not present, and Sensor Data Record (SDR) not present.

Event ID: 1013
Description: System Peak Power detected new peak value
  Peak value (in Watts):<Reading>
Severity: Information
Cause: The system peak power sensor detected a new peak value in power consumption. The new peak value in Watts is provided.

Event ID: 1014
Description: System software event:<Description>
  Date and time of action:<Date and time>
Severity: Warning
Cause: This event is generated when the systems management agent detects a critical system software generated event in the system event log which could have been resolved.

Temperature Sensor Messages

The temperature sensors listed in Table 2-2 help protect critical components by alerting the systems management console when temperatures become too high inside a chassis. The temperature sensor messages use additional variables: sensor location, chassis location, previous state, and temperature sensor value or state.

Table 2-2. Temperature Sensor Messages

Event ID: 1050
Description: Temperature sensor has failed
  Sensor location: <Location in chassis>
  Chassis location: <Name of chassis>
  Previous state was: <State>
  If sensor type is not discrete:
    Temperature sensor value (in degrees Celsius): <Reading>
  If sensor type is discrete:
    Discrete temperature state: <State>
Severity: Error
Cause: A temperature sensor on the backplane board, system board, or the carrier in the specified system failed. The sensor location, chassis location, previous state, and temperature sensor value are provided.

Event ID: 1051
Description: Temperature sensor value unknown
  Sensor location: <Location in chassis>
  Chassis location: <Name of chassis>
  If sensor type is not discrete:
    Temperature sensor value (in degrees Celsius): <Reading>
  If sensor type is discrete:
    Discrete temperature state: <State>
Severity: Information
Cause: A temperature sensor on the backplane board, system board, or drive carrier in the specified system could not obtain a reading. The sensor location, chassis location, previous state, and a nominal temperature sensor value information is provided.

Event ID: 1052
Description: Temperature sensor returned to a normal value
  Sensor location: <Location in chassis>
  Chassis location: <Name of chassis>
  Previous state was: <State>
  If sensor type is not discrete:
    Temperature sensor value (in degrees Celsius): <Reading>
  If sensor type is discrete:
    Discrete temperature state: <State>
Severity: Information
Cause: A temperature sensor on the backplane board, system board, or drive carrier in the specified system returned to a valid range after crossing a failure threshold. The sensor location, chassis location, previous state, and temperature sensor value are provided.

Event ID: 1053
Description: Temperature sensor detected a warning value
  Sensor location: <Location in chassis>
  Chassis location: <Name of chassis>
  Previous state was: <State>
  If sensor type is not discrete:
    Temperature sensor value (in degrees Celsius): <Reading>
  If sensor type is discrete:
    Discrete temperature state: <State>
Severity: Warning
Cause: A temperature sensor on the backplane board, system board, CPU, or drive carrier in the specified system exceeded its warning threshold. The sensor location, chassis location, previous state, and temperature sensor value are provided.

Event ID: 1054
Description: Temperature sensor detected a failure value
  Sensor location: <Location in chassis>
  Chassis location: <Name of chassis>
  Previous state was: <State>
  If sensor type is not discrete:
    Temperature sensor value (in degrees Celsius): <Reading>
  If sensor type is discrete:
    Discrete temperature state: <State>
Severity: Error
Cause: A temperature sensor on the backplane board, system board, or drive carrier in the specified system exceeded its failure threshold. The sensor location, chassis location, previous state, and temperature sensor value are provided.

Event ID: 1055
Description: Temperature sensor detected a non-recoverable value
  Sensor location: <Location in chassis>
  Chassis location: <Name of chassis>
  Previous state was: <State>
  If sensor type is not discrete:
    Temperature sensor value (in degrees Celsius): <Reading>
  If sensor type is discrete:
    Discrete temperature state: <State>
Severity: Error
Cause: A temperature sensor on the backplane board, system board, or drive carrier in the specified system detected an error from which it cannot recover. The sensor location, chassis location, previous state, and temperature sensor value information is provided.
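When a temperature sensor message arrives as one flattened line, the additional variables listed in Table 2-2 can be separated by splitting on their labels. The following Python sketch uses only the label strings that appear in the table; the function name split_message_variables and the sample values ("System Board Ambient Temp", "45.0") are hypothetical and chosen purely for illustration.

```python
import re

# Labels as they appear in the Table 2-2 descriptions.
LABELS = (
    "Sensor location",
    "Chassis location",
    "Previous state was",
    "Temperature sensor value (in degrees Celsius)",
    "Discrete temperature state",
)


def split_message_variables(message, labels=LABELS):
    """Split a flattened message into its summary and labeled variables."""
    pattern = re.compile(
        "(" + "|".join(re.escape(label) for label in labels) + r"):\s*"
    )
    # With one capturing group, re.split returns
    # [summary, label1, value1, label2, value2, ...].
    parts = pattern.split(message)
    fields = {"summary": parts[0].strip()}
    for label, value in zip(parts[1::2], parts[2::2]):
        fields[label] = value.strip()
    return fields


# Hypothetical message text; the values are invented for this example.
message = ("Temperature sensor detected a warning value "
           "Sensor location: System Board Ambient Temp "
           "Chassis location: Main System Chassis "
           "Previous state was: OK (Normal) "
           "Temperature sensor value (in degrees Celsius): 45.0")
fields = split_message_variables(message)
```

The same approach applies to the other sensor tables by passing a different labels tuple, for example one containing "Fan sensor value" for the cooling device messages.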

Cooling Device Messages

The cooling device sensors listed in Table 2-3 monitor how well a fan is functioning. Cooling device messages provide status and warning information for fans in a particular chassis.

Table 2-3. Cooling Device Messages

Event ID: 1100
Description: Fan sensor has failed
  Sensor location: <Location in chassis>
  Chassis location: <Name of chassis>
  Previous state was: <State>
  Fan sensor value: <Reading>
Severity: Error
Cause: A fan sensor in the specified system is not functioning. The sensor location, chassis location, previous state, and fan sensor value information is provided.

Event ID: 1101
Description: Fan sensor value unknown
  Sensor location: <Location in chassis>
  Chassis location: <Name of chassis>
  Previous state was: <State>
  Fan sensor value: <Reading>
Severity: Error
Cause: A fan sensor in the specified system could not obtain a reading. The sensor location, chassis location, previous state, and a nominal fan sensor value information is provided.

Event ID: 1102
Description: Fan sensor returned to a normal value
  Sensor location: <Location in chassis>
  Chassis location: <Name of chassis>
  Previous state was: <State>
  Fan sensor value: <Reading>
Severity: Information
Cause: A fan sensor reading on the specified system returned to a valid range after crossing a warning threshold. The sensor location, chassis location, previous state, and fan sensor value information is provided.

Event ID: 1103
Description: Fan sensor detected a warning value
  Sensor location: <Location in chassis>
  Chassis location: <Name of chassis>
  Previous state was: <State>
  Fan sensor value: <Reading>
Severity: Warning
Cause: A fan sensor reading in the specified system exceeded a warning threshold. The sensor location, chassis location, previous state, and fan sensor value information is provided.

Event ID: 1104
Description: Fan sensor detected a failure value
  Sensor location: <Location in chassis>
  Chassis location: <Name of chassis>
  Previous state was: <State>
  Fan sensor value: <Reading>
Severity: Error
Cause: A fan sensor in the specified system detected the failure of one or more fans. The sensor location, chassis location, previous state, and fan sensor value information is provided.

Event ID: 1105
Description: Fan sensor detected a non-recoverable value
  Sensor location: <Location in chassis>
  Chassis location: <Name of chassis>
  Previous state was: <State>
  Fan sensor value: <Reading>
Severity: Error
Cause: A fan sensor detected an error from which it cannot recover. The sensor location, chassis location, previous state, and fan sensor value information is provided.

Voltage Sensor Messages

The voltage sensors listed in Table 2-4 monitor the number of volts across critical components. Voltage sensor messages provide status and warning information for voltage sensors in a particular chassis.

Table 2-4. Voltage Sensor Messages

Event ID: 1150
Description: Voltage sensor has failed
  Sensor location: <Location in chassis>
  Chassis location: <Name of chassis>
  Previous state was: <State>
  If sensor type is not discrete:
    Voltage sensor value (in Volts): <Reading>
  If sensor type is discrete:
    Discrete voltage state: <State>
Severity: Error
Cause: A voltage sensor in the specified system failed. The sensor location, chassis location, previous state, and voltage sensor value information is provided.

Event ID: 1151
Description: Voltage sensor value unknown
  Sensor location: <Location in chassis>
  Chassis location: <Name of chassis>
  Previous state was: <State>
  If sensor type is not discrete:
    Voltage sensor value (in Volts): <Reading>
  If sensor type is discrete:
    Discrete voltage state: <State>
Severity: Information
Cause: A voltage sensor in the specified system could not obtain a reading. The sensor location, chassis location, previous state, and a nominal voltage sensor value are provided.

Event ID: 1152
Description: Voltage sensor returned to a normal value
  Sensor location: <Location in chassis>
  Chassis location: <Name of chassis>
  Previous state was: <State>
  If sensor type is not discrete:
    Voltage sensor value (in Volts): <Reading>
  If sensor type is discrete:
    Discrete voltage state: <State>
Severity: Information
Cause: A voltage sensor in the specified system returned to a valid range after crossing a failure threshold. The sensor location, chassis location, previous state, and voltage sensor value information is provided.

Event ID: 1153
Description: Voltage sensor detected a warning value
  Sensor location: <Location in chassis>
  Chassis location: <Name of chassis>
  Previous state was: <State>
  If sensor type is not discrete:
    Voltage sensor value (in Volts): <Reading>
  If sensor type is discrete:
    Discrete voltage state: <State>
Severity: Warning
Cause: A voltage sensor in the specified system exceeded its warning threshold. The sensor location, chassis location, previous state, and voltage sensor value information is provided.

Event ID: 1154
Description: Voltage sensor detected a failure value
  Sensor location: <Location in chassis>
  Chassis location: <Name of chassis>
  Previous state was: <State>
  If sensor type is not discrete:
    Voltage sensor value (in Volts): <Reading>
  If sensor type is discrete:
    Discrete voltage state: <State>
Severity: Error
Cause: A voltage sensor in the specified system exceeded its failure threshold. The sensor location, chassis location, previous state, and voltage sensor value information is provided.

Event ID: 1155
Description: Voltage sensor detected a non-recoverable value
  Sensor location: <Location in chassis>
  Chassis location: <Name of chassis>
  Previous state was: <State>
  If sensor type is not discrete:
    Voltage sensor value (in Volts): <Reading>
  If sensor type is discrete:
    Discrete voltage state: <State>
Severity: Error
Cause: A voltage sensor in the specified system detected an error from which it cannot recover. The sensor location, chassis location, previous state, and voltage sensor value information is provided.
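The temperature, cooling device, and voltage tables above follow the same numbering pattern: each sensor family starts at a base ID (1050, 1100, 1150), and the offset from the base selects the condition. The Python sketch below encodes that observation. It is an inference from the three tables in this chapter rather than a documented rule, it covers only those three families, and the function name describe_sensor_event is hypothetical.

```python
# Base event IDs and message prefixes, taken from Tables 2-2, 2-3, and 2-4.
SENSOR_BASE_IDS = {
    1050: "Temperature sensor",
    1100: "Fan sensor",
    1150: "Voltage sensor",
}

# Condition text by offset from the base ID, as worded in the tables.
CONDITIONS = (
    "has failed",
    "value unknown",
    "returned to a normal value",
    "detected a warning value",
    "detected a failure value",
    "detected a non-recoverable value",
)


def describe_sensor_event(event_id):
    """Return the first line of the message for a known sensor event ID.

    Returns None for IDs outside the three families covered here.
    """
    offset = event_id % 50
    base = event_id - offset
    if base not in SENSOR_BASE_IDS or offset >= len(CONDITIONS):
        return None
    return "%s %s" % (SENSOR_BASE_IDS[base], CONDITIONS[offset])
```

For example, describe_sensor_event(1153) yields "Voltage sensor detected a warning value", which matches the description for event 1153 in Table 2-4. Severity is deliberately not derived from the offset, because the tables assign different severities to the "value unknown" condition in different families.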

Current Sensor Messages

The current sensors listed in Table 2-5 measure the amount of current (in amperes) that is traversing critical components. Current sensor messages provide status and warning information for current sensors in a particular chassis.

Table 2-5. Current Sensor Messages

Event ID: 1200
Description: Current sensor has failed
  Sensor location: <Location in chassis>
  Chassis location: <Name of chassis>
  Previous state was: <State>
  If sensor type is not discrete:
    Current sensor value (in Amps): <Reading> OR
    Current sensor value (in Watts): <Reading>
  If sensor type is discrete:
    Discrete current state: <State>
Severity: Error
Cause: A current sensor in the specified system failed. The sensor location, chassis location, previous state, and current sensor value are provided.

Event ID: 1201
Description: Current sensor value unknown
  Sensor location: <Location in chassis>
  Chassis location: <Name of chassis>
  Previous state was: <State>
  If sensor type is not discrete:
    Current sensor value (in Amps): <Reading> OR
    Current sensor value (in Watts): <Reading>
  If sensor type is discrete:
    Discrete current state: <State>
Severity: Error
Cause: A current sensor in the specified system could not obtain a reading. The sensor location, chassis location, previous state, and a nominal current sensor value information is provided.

Event ID: 1202
Description: Current sensor returned to a normal value
  Sensor location: <Location in chassis>
  Chassis location: <Name of chassis>
  Previous state was: <State>
  If sensor type is not discrete:
    Current sensor value (in Amps): <Reading> OR
    Current sensor value (in Watts): <Reading>
  If sensor type is discrete:
    Discrete current state: <State>
Severity: Information
Cause: A current sensor in the specified system returned to a valid range after crossing a failure threshold. The sensor location, chassis location, previous state, and current sensor value information is provided.

Server Management Messages

Table 2-5. Current Sensor Messages

Event IDDescription Severity Cause

(continued)

1203 Current sensor detected a

warning value

Sensor location: <Location in chassis>

Chassis location: <Name of chassis>

Previous state was: <State>

If sensor type is not discrete:

Current sensor value (in Amps): <Reading> OR

Current sensor value (in Watts): <Reading>

If sensor type is discrete:

Discrete current state: <State>

1204 Current sensor detected a

failure value

Sensor location: <Location in chassis>

Chassis location: <Name of chassis>

Previous state was: <State>

If sensor type is not discrete:

Current sensor value (in Amps): <Reading> OR

Current sensor value (in Watts): <Reading>

If sensor type is discrete:

Discrete current state: <State>

Warning A current sensor

in the specified system exceeded its warning threshold. The sensor location, chassis location, previous state, and current sensor value are provided.

Error A current sensor

in the specified system exceeded its failure threshold. The sensor location, chassis location, previous state, and current sensor value are provided.

Server Management Messages

Table 2-5. Current Sensor Messages

Event IDDescription Severity Cause

(continued)

1205 Current sensor detected a

non-recoverable value

Sensor location: <Location in chassis>

Chassis location: <Name of chassis>

Previous state was: <State>

If sensor type is not discrete:

Current sensor value (in Amps): <Reading> OR

Current sensor value (in Watts): <Reading>

If sensor type is discrete:

Discrete current state: <State>

Error A current sensor

in the specified system detected an error from which it cannot recover. The sensor location, chassis location, previous state, and current sensor value are provided.
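The sensor messages in Table 2-5 share one layout: a summary line followed by "Label: value" lines. The following is a minimal Python sketch of splitting such message text into its parts, assuming the text is available as a multi-line string. The function name parse_event_text and the sample values are hypothetical and only illustrate the layout; they are not part of Server Administrator.

```python
def parse_event_text(text):
    """Split event message text into a summary and its labelled fields.

    Lines before the first "Label: value" pair are treated as the summary.
    Lines that end in a colon with no value (for example
    "If sensor type is discrete:") are skipped.
    """
    summary_parts = []
    fields = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        label, sep, value = line.partition(":")
        if sep and value.strip():
            fields[label.strip()] = value.strip()
        elif not fields:
            summary_parts.append(line)
    return " ".join(summary_parts), fields


# Hypothetical sample text; the values are invented for illustration.
sample = """Current sensor detected a warning value
Sensor location: PS 1 Current 1
Chassis location: Main System Chassis
Previous state was: OK (Normal)
Current sensor value (in Amps): 7.900"""

summary, fields = parse_event_text(sample)
print(summary)
for label, value in fields.items():
    print(f"{label} = {value}")
```

Keeping the labels as dictionary keys means the same sketch applies to the other sensor tables in this chapter, whose messages use different labels but the same layout.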

Chassis Intrusion Messages

The chassis intrusion messages listed in Table 2-6 are a security measure. Chassis intrusion means that someone is opening the cover to a system’s chassis. Alerts are sent to prevent unauthorized removal of parts from a chassis.

Table 2-6. Chassis Intrusion Messages

Event ID: 1250
Description: Chassis intrusion sensor has failed
  Sensor location: <Location in chassis>
  Chassis location: <Name of chassis>
  Previous state was: <State>
  Chassis intrusion state:
Severity: Error
Cause: A chassis intrusion sensor in the specified system failed. The sensor location, chassis location, previous state, and chassis intrusion state are provided.

Event ID: 1251
Description: Chassis intrusion sensor value unknown
  Sensor location: <Location in chassis>
  Chassis location: <Name of chassis>
  Previous state was: <State>
  Chassis intrusion state:
Severity: Error
Cause: A chassis intrusion sensor in the specified system could not obtain a reading. The sensor location, chassis location, previous state, and chassis intrusion state are provided.

Event ID: 1252
Description: Chassis intrusion returned to normal
  Sensor location: <Location in chassis>
  Chassis location: <Name of chassis>
  Previous state was: <State>
  Chassis intrusion state:
Severity: Information
Cause: A chassis intrusion sensor in the specified system detected that a cover was opened while the system was operating but has since been replaced. The sensor location, chassis location, previous state, and chassis intrusion state information is provided.

Event ID: 1253
Description: Chassis intrusion in progress
  Sensor location: <Location in chassis>
  Chassis location: <Name of chassis>
  Previous state was: <State>
  Chassis intrusion state:
Severity: Warning
Cause: A chassis intrusion sensor in the specified system detected that a system cover is currently being opened and the system is operating. The sensor location, chassis location, previous state, and chassis intrusion state information is provided.

Event ID: 1254
Description: Chassis intrusion detected
  Sensor location: <Location in chassis>
  Chassis location: <Name of chassis>
  Previous state was: <State>
  Chassis intrusion state:
Severity: Critical
Cause: A chassis intrusion sensor in the specified system detected that the system cover was opened while the system was operating. The sensor location, chassis location, previous state, and chassis intrusion state information is provided.

Event ID: 1255
Description: Chassis intrusion sensor detected a non-recoverable value
  Sensor location: <Location in chassis>
  Chassis location: <Name of chassis>
  Previous state was: <State>
  Chassis intrusion state:
Severity: Error
Cause: A chassis intrusion sensor in the specified system detected an error from which it cannot recover. The sensor location, chassis location, previous state, and chassis intrusion state information is provided.


Redundancy Unit Messages

Redundancy means that a system chassis has more than one of certain critical components. Fans and power supplies, for example, are so important for preventing damage or disruption of a computer system that a chassis may have “extra” fans or power supplies installed. Redundancy allows a second or nth fan to keep the chassis components at a safe temperature when the primary fan has failed. Redundancy is normal when the intended number of critical components are operating. Redundancy is degraded when a component fails but others are still operating. Redundancy is lost when the number of components functioning falls below the redundancy threshold. Table 2-7 lists the redundancy unit messages.

The number of devices required for full redundancy is provided as part of the message, when applicable, for the redundancy unit and the platform. For details on redundancy computation, see the respective platform documentation.
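The three redundancy states described above can be sketched as a small Python function. This is only an illustration of the definitions in this section, assuming the intended component count and the redundancy threshold are known. The function and parameter names are hypothetical, and the actual computation is platform-specific, as noted above.

```python
def redundancy_status(functioning, intended, threshold):
    """Classify a redundancy unit from component counts.

    functioning: number of components currently operating
    intended:    number of components intended to be operating (assumed input)
    threshold:   redundancy threshold, the smallest count at which the
                 unit is still considered redundant (assumed input)
    """
    if functioning >= intended:
        return "normal"    # the intended number of components is operating
    if functioning >= threshold:
        return "degraded"  # a component failed but others still operate
    return "lost"          # fewer components than the redundancy threshold


# Example: three fans intended, at least two needed to remain redundant.
print(redundancy_status(3, 3, 2))
print(redundancy_status(2, 3, 2))
print(redundancy_status(1, 3, 2))
```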

Table 2-7. Redundancy Unit Messages

Event ID: 1300
Description: Redundancy sensor has failed
  Redundancy unit: <Redundancy location in chassis>
  Chassis location: <Name of chassis>
  Previous redundancy state was: <State>
Severity: Warning
Cause: A redundancy sensor in the specified system failed. The redundancy unit location, chassis location, previous redundancy state, and the number of devices required for full redundancy are provided.

Event ID: 1301
Description: Redundancy sensor value unknown
  Redundancy unit: <Redundancy location in chassis>
  Chassis location: <Name of chassis>
  Previous redundancy state was: <State>
Severity: Warning
Cause: A redundancy sensor in the specified system could not obtain a reading. The redundancy unit location, chassis location, previous redundancy state, and the number of devices required for full redundancy are provided.

Event ID: 1302
Description: Redundancy not applicable
  Redundancy unit:
  Chassis location: <Name of chassis>
  Previous redundancy state was: <State>
Severity: Information
Cause: A redundancy sensor in the specified system detected that a unit was not redundant. The redundancy location, chassis location, previous redundancy state, and the number of devices required for full redundancy information is provided.

Event ID: 1303
Description: Redundancy is offline
  Redundancy unit:
  Chassis location: <Name of chassis>
  Previous redundancy state was: <State>
Severity: Information
Cause: A redundancy sensor in the specified system detected that a redundant unit is offline. The redundancy unit location, chassis location, previous redundancy state, and the number of devices required for full redundancy information is provided.

Event ID: 1304
Description: Redundancy regained
  Redundancy unit:
  Chassis location: <Name of chassis>
  Previous redundancy state was: <State>
Severity: Information
Cause: A redundancy sensor in the specified system detected that a “lost” redundancy device has been reconnected or replaced; full redundancy is in effect. The redundancy unit location, chassis location, previous redundancy state, and the number of devices required for full redundancy information is provided.

Event ID: 1305
Description: Redundancy degraded
  Redundancy unit:
  Chassis location: <Name of chassis>
  Previous redundancy state was: <State>
Severity: Warning
Cause: A redundancy sensor in the specified system detected that one of the components of the redundancy unit has failed but the unit is still redundant. The redundancy unit location, chassis location, previous redundancy state, and the number of devices required for full redundancy information is provided.

Event ID: 1306
Description: Redundancy lost
  Redundancy unit:
  Chassis location: <Name of chassis>
  Previous redundancy state was: <State>
Severity: Error
Cause: A redundancy sensor in the specified system detected that one of the components in the redundant unit has been disconnected, has failed, or is not present. The redundancy unit location, chassis location, previous redundancy state, and the number of devices required for full redundancy are provided.


Power Supply Messages

The power supply sensors monitor how well a power supply is functioning. The power supply messages listed in Table 2-8 provide status and warning information for power supplies present in a particular chassis.

Table 2-8. Power Supply Messages

Event ID: 1350
Description: Power supply sensor has failed
  Sensor location: <Location in chassis>
  Chassis location: <Name of chassis>
  Previous state was: <State>
  Power Supply type: <type of power supply>
  If in configuration error state:
    Configuration error type: <type of configuration error>
Severity: Error
Cause: A power supply sensor in the specified system failed. The sensor location, chassis location, previous state, power supply type, additional power supply status, and configuration error type information are provided.

Event ID: 1351
Description: Power supply sensor value unknown
  Sensor location: <Location in chassis>
  Chassis location: <Name of chassis>
  Previous state was: <State>
  Power Supply type: <type of power supply>
  If in configuration error state:
    Configuration error type: <type of configuration error>
Severity: Information
Cause: A power supply sensor in the specified system could not obtain a reading. The sensor location, chassis location, previous state, power supply type, additional power supply status, and configuration error type information are provided.

Event ID: 1352
Description: Power supply returned to normal
  Sensor location: <Location in chassis>
  Chassis location: <Name of chassis>
  Previous state was: <State>
  Power Supply type: <type of power supply>
  If in configuration error state:
    Configuration error type: <type of configuration error>
Severity: Information
Cause: A power supply has been reconnected or replaced. The sensor location, chassis location, previous state, power supply type, additional power supply status, and configuration error type information are provided.

Event ID: 1353
Description: Power supply detected a warning
  Sensor location: <Location in chassis>
  Chassis location: <Name of chassis>
  Previous state was: <State>
  Power Supply type: <type of power supply>
  If in configuration error state:
    Configuration error type: <type of configuration error>
Severity: Warning
Cause: A power supply sensor reading in the specified system exceeded a user-definable warning threshold. The sensor location, chassis location, previous state, power supply type, additional power supply status, and configuration error type information are provided.

Event ID: 1354
Description: Power supply detected a failure
  Sensor location: <Location in chassis>
  Chassis location: <Name of chassis>
  Previous state was: <State>
  Power Supply type: <type of power supply>
  If in configuration error state:
    Configuration error type: <type of configuration error>
Severity: Error
Cause: A power supply has been disconnected or has failed. The sensor location, chassis location, previous state, power supply type, additional power supply status, and configuration error type information are provided.

Event ID: 1355
Description: Power supply sensor detected a non-recoverable value
  Sensor location: <Location in chassis>
  Chassis location: <Name of chassis>
  Previous state was: <State>
  Power Supply type: <type of power supply>
  If in configuration error state:
    Configuration error type: <type of configuration error>
Severity: Error
Cause: A power supply sensor in the specified system detected an error from which it cannot recover. The sensor location, chassis location, previous state, power supply type, additional power supply status, and configuration error type information is provided.


Memory Device Messages

The memory device messages listed in Table 2-9 provide status and warning information for memory modules present in a particular system. Memory devices determine health status by monitoring the ECC memory correction rate and the type of memory events that have occurred.

NOTE: A critical status does not always indicate a system failure or loss of data. In some instances, the system has exceeded the ECC correction rate. Although the system continues to function, you should perform system maintenance as described in Table 2-9.

NOTE: In Table 2-9, <status> can be either critical or non-critical.

Table 2-9. Memory Device Messages

Event ID: 1403
Description: Memory device status is <status>
  Memory device location:
  Possible memory module event cause: <list of causes>
Severity: Warning
Cause: A memory device correction rate exceeded an acceptable value. The memory device status and possible memory module event cause information is provided.

Event ID: 1404
Description: Memory device status is <status>
  Memory device location:
  Possible memory module event cause: <list of causes>
Severity: Error
Cause: A memory device correction rate exceeded an acceptable value, a memory spare bank was activated, or a multibit ECC error occurred. The system continues to function normally (except for a multibit error). Replace the memory module identified in the message during the system’s next scheduled maintenance. Clear the memory error on multibit ECC error. The memory device status and possible memory module event cause information is provided.


Fan Enclosure Messages

Some systems are equipped with a protective enclosure for fans. Fan enclosure messages listed in Table 2-10 monitor whether foreign objects are present in an enclosure and how long a fan enclosure is missing from a chassis.

Table 2-10. Fan Enclosure Messages

Event ID: 1450
Description: Fan enclosure sensor has failed
  Sensor location: <Location in chassis>
  Chassis location: <Name of chassis>
Severity: Critical/Failure/Error
Cause: The fan enclosure sensor in the specified system failed. The sensor and chassis location information is provided.

Event ID: 1451
Description: Fan enclosure sensor value unknown
  Sensor location: <Location in chassis>
  Chassis location: <Name of chassis>
Severity: Warning
Cause: The fan enclosure sensor in the specified system could not obtain a reading. The sensor and chassis location information is provided.

Event ID: 1452
Description: Fan enclosure inserted into system
  Sensor location: <Location in chassis>
  Chassis location: <Name of chassis>
Severity: Information
Cause: A fan enclosure has been inserted into the specified system. The sensor and chassis location information is provided.

Event ID: 1453
Description: Fan enclosure removed from system
  Sensor location: <Location in chassis>
  Chassis location: <Name of chassis>
Severity: Warning
Cause: A fan enclosure has been removed from the specified system. The sensor and chassis location information is provided.

Event ID: 1454
Description: Fan enclosure removed from system for an extended amount of time
  Sensor location: <Location in chassis>
  Chassis location: <Name of chassis>
Severity: Error
Cause: A fan enclosure has been removed from the specified system for a user-definable length of time. The sensor and chassis location information is provided.

Event ID: 1455
Description: Fan enclosure sensor detected a nonrecoverable value
  Sensor location: <Location in chassis>
  Chassis location: <Name of chassis>
Severity: Error
Cause: A fan enclosure sensor in the specified system detected an error from which it cannot recover. The sensor and chassis location are provided.


AC Power Cord Messages

The AC power cord messages listed in Table 2-11 provide status and warning information for power cords that are part of an AC power switch, if your system supports AC switching.

Table 2-11. AC Power Cord Messages

Event ID: 1500
Description: AC power cord sensor has failed
  Sensor location: <Location in chassis>
  Chassis location: <Name of chassis>
Severity: Critical/Failure/Error
Cause: An AC power cord sensor in the specified system failed. The AC power cord status cannot be monitored. The sensor and chassis location information is provided.

Event ID: 1501
Description: AC power cord is not being monitored
  Sensor location: <Location in chassis>
  Chassis location: <Name of chassis>
Severity: Information
Cause: The AC power cord status is not being monitored. This occurs when a system’s expected AC power configuration is set to nonredundant. The sensor and chassis location information is provided.

Event ID: 1502
Description: AC power has been restored
  Sensor location: <Location in chassis>
  Chassis location: <Name of chassis>
Severity: Information
Cause: Power is restored in an AC power cord that did not have AC power. The sensor and chassis location information is provided.

Event ID: 1503
Description: AC power has been lost
  Sensor location: <Location in chassis>
  Chassis location: <Name of chassis>
Severity: Critical/Failure/Error
Cause: Power supply is disrupted to the AC power cord or an AC power cord is not transmitting power, but there is sufficient redundancy to classify this as a warning. The sensor and chassis location information is provided.

Event ID: 1504
Description: AC power has been lost
  Sensor location: <Location in chassis>
  Chassis location: <Name of chassis>
Severity: Error
Cause: Power supply is disrupted to the AC power cord or an AC power cord is not transmitting power, and lack of redundancy requires this to be classified as an error. The sensor and chassis location information is provided.

Event ID: 1505
Description: AC power has been lost
  Sensor location: <Location in chassis>
  Chassis location: <Name of chassis>
Severity: Error
Cause: An AC power cord sensor in the specified system failed. The AC power cord status cannot be monitored. The sensor and chassis location information is provided.

Hardware Log Sensor Messages

The hardware logs provide hardware status messages to systems management software. On certain systems, the hardware log is implemented as a circular queue. When the log becomes full, the oldest status messages are overwritten when new status messages are logged. On some systems, the log is not circular. On these systems, when the log becomes full, subsequent hardware status messages are lost. Hardware log sensor messages listed in Table 2-12 provide status and warning information about the noncircular logs that may fill up, resulting in lost status messages.
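The difference between the two log behaviors described above can be sketched in Python. This is an illustrative model only, not how any particular system implements its hardware log. The class names and the capacity of three entries are hypothetical.

```python
from collections import deque


class CircularLog:
    """Log implemented as a circular queue: when full, the oldest entry is overwritten."""

    def __init__(self, capacity):
        # A deque with maxlen discards the oldest item on append when full.
        self.entries = deque(maxlen=capacity)

    def log(self, message):
        self.entries.append(message)
        return True  # the new message is always stored


class NonCircularLog:
    """Log that is not circular: when full, new messages are lost."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = []

    def log(self, message):
        if len(self.entries) >= self.capacity:
            return False  # log is full, so the message is lost
        self.entries.append(message)
        return True


circular = CircularLog(3)
linear = NonCircularLog(3)
for n in range(1, 6):
    circular.log(f"event {n}")
    linear.log(f"event {n}")

print(list(circular.entries))  # the three newest messages remain
print(linear.entries)          # the three oldest messages remain
```

In the non-circular case the newest messages are the ones dropped, which is why warnings that a log is near or at capacity matter most on those systems.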


Table 2-12. Hardware Log Sensor Messages

Event ID: 1550
Description: Log monitoring has been disabled
  Log type: <Log type>
Severity: Warning
Cause: A hardware log sensor in the specified system is disabled. The log type information is provided.

Event ID: 1551
Description: Log status is unknown
  Log type: <Log type>
Severity: Information
Cause: A hardware log sensor in the specified system could not obtain a reading. The log type information is provided.

Event ID: 1552
Description: Log size is no longer near or at capacity
  Log type: <Log type>
Severity: Information
Cause: The hardware log on the specified system is no longer near or at its capacity, usually as the result of clearing the log. The log type information is provided.

Event ID: 1553
Description: Log size is near capacity
  Log type: <Log type>
Severity: Warning
Cause: The size of a hardware log on the specified system is near or at the capacity of the hardware log. The log type information is provided.

Event ID: 1554
Description: Log size is full
  Log type: <Log type>
Severity: Error
Cause: The size of a hardware log on the specified system is full. The log type information is provided.

Event ID: 1555
Description: Log sensor has failed
  Log type: <Log type>
Severity: Error
Cause: A hardware log sensor in the specified system failed. The hardware log status cannot be monitored. The log type information is provided.


Processor Sensor Messages

The processor sensors monitor how well a processor is functioning. Processor messages listed in Table 2-13 provide status and warning information for processors in a particular chassis.

Table 2-13. Processor Sensor Messages

Event ID: 1600
Description: Processor sensor has failed
  Sensor Location:
  Chassis Location:
  Previous state was: <State>
  Processor sensor status: <status>
Severity: Critical/Failure/Error
Cause: A processor sensor in the specified system is not functioning. The sensor location, chassis location, previous state and processor sensor status information is provided.

Event ID: 1601
Description: Processor sensor value unknown
  Sensor Location:
  Chassis Location:
  Previous state was: <State>
  Processor sensor status: <status>
Severity: Critical/Failure/Error
Cause: A processor sensor in the specified system could not obtain a reading. The sensor location, chassis location, previous state and processor sensor status information is provided.

Event ID: 1602
Description: Processor sensor returned to a normal value
  Sensor Location:
  Chassis Location:
  Previous state was: <State>
  Processor sensor status: <status>
Severity: Information
Cause: A processor sensor in the specified system transitioned back to a normal state. The sensor location, chassis location, previous state and processor sensor status are provided.

Event ID: 1603
Description: Processor sensor detected a warning value
  Sensor Location:
  Chassis Location:
  Previous state was: <State>
  Processor sensor status: <status>
Severity: Warning
Cause: A processor sensor in the specified system is in a throttled state. The sensor location, chassis location, previous state and processor sensor status information is provided.

Event ID: 1604
Description: Processor sensor detected a failure value
  Sensor Location:
  Chassis Location:
  Previous state was: <State>
  Processor sensor status: <status>
Severity: Error
Cause: A processor sensor in the specified system is disabled, has a configuration error, or experienced a thermal trip. The sensor location, chassis location, previous state and processor sensor status are provided.

Event ID: 1605
Description: Processor sensor detected a nonrecoverable value
  Sensor Location:
  Chassis Location:
  Previous state was: <State>
  Processor sensor status: <status>
Severity: Error
Cause: A processor sensor in the specified system has failed. The sensor location, chassis location, previous state and processor sensor status are provided.


Pluggable Device Messages

The pluggable device messages listed in Table 2-14 provide status and error information when some devices, such as memory cards, are added or removed.

Table 2-14. Pluggable Device Messages

Event ID: 1650
Description: <Device plug event type unknown>
  Device location:
  Chassis location:
  Additional details:
Severity: Information
Cause: A pluggable device event message of unknown type was received. The device location, chassis location, and additional event details, if available, are provided.

Event ID: 1651
Description: Device added to system
  Device location:
  Chassis location:
  Additional details:
Severity: Information
Cause: A device was added in the specified system. The device location, chassis location, and additional event details, if available, are provided.

Event ID: 1652
Description: Device removed from system
  Device location:
  Chassis location:
  Additional details:
Severity: Information
Cause: A device was removed from the specified system. The device location, chassis location, and additional event details, if available, are provided.

Event ID: 1653
Description: Device configuration error detected
  Device location:
  Chassis location:
  Additional details:
Severity: Error
Cause: A configuration error was detected for a pluggable device in the specified system. The device may have been added to the system incorrectly.


Battery Sensor Messages

The battery sensors monitor how well a battery is functioning. The battery messages listed in Table 2-15 provide status and warning information for batteries in a particular chassis.

Table 2-15. Battery Sensor Messages

Event ID: 1700
Description: Battery sensor has failed
  Sensor location: <Location in chassis>
  Chassis location: <Name of chassis>
  Previous state was: <State>
  Battery sensor status: <status>
Severity: Critical/Failure/Error
Cause: A battery sensor in the specified system is not functioning. The sensor location, chassis location, previous state, and battery sensor status information is provided.

Event ID: 1701
Description: Battery sensor value unknown
  Sensor Location: <Location in chassis>
  Chassis Location: <Name of chassis>
  Previous state was: <State>
  Battery sensor status:
Severity: Warning
Cause: A battery sensor in the specified system could not retrieve a reading. The sensor location, chassis location, previous state, and battery sensor status information is provided.

Event ID: 1702
Description: Battery sensor returned to a normal value
  Sensor Location: <Location in chassis>
  Chassis Location: <Name of chassis>
  Previous state was: <State>
  Battery sensor status:
Severity: Information
Cause: A battery sensor in the specified system detected that a battery transitioned back to a normal state. The sensor location, chassis location, previous state, and battery sensor status information is provided.

Event ID: 1703
Description: Battery sensor detected a warning value
  Sensor Location: <Location in chassis>
  Chassis Location: <Name of chassis>
  Previous state was: <State>
  Battery sensor status:
Severity: Warning
Cause: A battery sensor in the specified system detected that a battery is in a predictive failure state. The sensor location, chassis location, previous state, and battery sensor status information is provided.

Event ID: 1704
Description: Battery sensor detected a failure value
  Sensor Location: <Location in chassis>
  Chassis Location: <Name of chassis>
  Previous state was: <State>
  Battery sensor status:
Severity: Error
Cause: A battery sensor in the specified system detected that a battery has failed. The sensor location, chassis location, previous state, and battery sensor status information is provided.

Event ID: 1705
Description: Battery sensor detected a non-recoverable value
  Sensor Location: <Location in chassis>
  Chassis Location: <Name of chassis>
  Previous state was: <State>
  Battery sensor status:
Severity: Error
Cause: A battery sensor in the specified system could not retrieve a value. The sensor location, chassis location, previous state, and battery sensor status information is provided.

Secure Digital (SD) Card Device Messages

Secure Digital (SD) Card Device Messages

The SD card device sensors monitor instrumented SD card devices in the system. Table 2-16 lists the messages that provide status and error information for SD card devices present in a chassis.

Table 2-16. SD Card Device Messages

Event ID: 1750
Description: SD card device sensor has failed
  Sensor location: <Location in chassis>
  Chassis location: <Name of chassis>
  Previous state was: <State>
  SD card device type: <Type of SD card device>
  SD card state: <State of SD card>
Severity: Error
Cause: An SD card device sensor in the specified system failed. The sensor location, chassis location, previous state, and SD card device type information is provided. The SD card state is provided if an SD card is present in the SD card device.

Event ID: 1751
Description: SD card device sensor value unknown
  Sensor location: <Location in chassis>
  Chassis location: <Name of chassis>
  Previous state was: <State>
  SD card device type: <Type of SD card device>
  SD card state: <State of SD card>
Severity: Information
Cause: An SD card device sensor in the specified system could not obtain a reading. The sensor location, chassis location, previous state, and SD card device type information is provided. The SD card state is provided if an SD card is present in the SD card device.

Event ID: 1752
Description: SD card device returned to normal
  Sensor location: <Location in chassis>
  Chassis location: <Name of chassis>
  Previous state was: <State>
  SD card device type: <Type of SD card device>
  SD card state: <State of SD card>
Severity: Information
Cause: An SD card device sensor in the specified system detected that an SD card transitioned back to a normal state. The sensor location, chassis location, previous state, and SD card device type information is provided. The SD card state is provided if an SD card is present in the SD card device.

Event ID: 1753
Description: SD card device detected a warning
  Sensor location: <Location in chassis>
  Chassis location: <Name of chassis>
  Previous state was: <State>
  SD card device type: <Type of SD card device>
  SD card state: <State of SD card>
Severity: Warning
Cause: An SD card device sensor in the specified system detected a warning condition. The sensor location, chassis location, previous state, and SD card device type information is provided. The SD card state is provided if an SD card is present in the SD card device.

Event ID: 1754
Description: SD card device detected a failure
  Sensor location: <Location in chassis>
  Chassis location: <Name of chassis>
  Previous state was: <State>
  SD card device type: <Type of SD card device>
  SD card state: <State of SD card>
Severity: Error
Cause: An SD card device sensor in the specified system detected an error. The sensor location, chassis location, previous state, and SD card device type information is provided. The SD card state is provided if an SD card is present in the SD card device.

Event ID: 1755
Description: SD card device sensor detected a non-recoverable value
  Sensor location: <Location in chassis>
  Chassis location: <Name of chassis>
  Previous state was: <State>
  SD card device type: <Type of SD card device>
  SD card state: <State of SD card>
Severity: Error
Cause: An SD card device sensor in the specified system detected an error from which it cannot recover. The sensor location, chassis location, previous state, and SD card device type information is provided. The SD card state is provided if an SD card is present in the SD card device.
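Each table from Table 2-5 through Table 2-16 covers its own block of event IDs, so an event ID alone identifies the kind of message. The following Python sketch shows one possible lookup built from the IDs listed in those tables. The names EVENT_ID_BLOCKS and event_category are hypothetical, and the blocks contain only the IDs that appear in those tables.

```python
# (first ID, last ID, message category), taken from Tables 2-5 through 2-16.
EVENT_ID_BLOCKS = [
    (1200, 1205, "Current sensor"),
    (1250, 1255, "Chassis intrusion"),
    (1300, 1306, "Redundancy unit"),
    (1350, 1355, "Power supply"),
    (1403, 1404, "Memory device"),
    (1450, 1455, "Fan enclosure"),
    (1500, 1505, "AC power cord"),
    (1550, 1555, "Hardware log sensor"),
    (1600, 1605, "Processor sensor"),
    (1650, 1653, "Pluggable device"),
    (1700, 1705, "Battery sensor"),
    (1750, 1755, "SD card device"),
]


def event_category(event_id):
    """Return the message category for an event ID, or None if it is not listed."""
    for first, last, category in EVENT_ID_BLOCKS:
        if first <= event_id <= last:
            return category
    return None


print(event_category(1203))
print(event_category(1754))
```

Returning None for unlisted IDs keeps the sketch from guessing about events that these tables do not describe.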


Chassis Management Controller Messages

Alerts sent by the Dell M1000e Chassis Management Controller (CMC) are organized by severity: the event ID of the CMC trap indicates the severity (informational, warning, critical, or non-recoverable) of the alert. Each CMC alert includes the originating system name, location, and event message text. The alert message text matches the corresponding Chassis Event Log message text that the sending CMC logs for that event.

Table 2-17. Chassis Management Controller Messages

Event ID Description Severity Cause

2000 CMC generated a test trap
Severity: Informational
Cause: A user-initiated test trap was issued, through the CMC GUI or RACADM CLI.

2002 CMC reported a return-to-normal or informational event
Severity: Informational
Cause: CMC informational event, as described in the drsCAMessage variable binding supplied with the alert.

2003 CMC reported a warning
Severity: Warning
Cause: CMC warning event, as described in the drsCAMessage variable supplied with the alert.

2004 CMC reported a critical event
Severity: Critical
Cause: CMC critical event, as described in the drsCAMessage variable binding supplied with the alert.

2005 CMC reported a non-recoverable event
Severity: Non-Recoverable
Cause: CMC non-recoverable event, as described in the drsCAMessage variable binding supplied with the alert.
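Because each CMC event ID corresponds to one severity, a trap receiver can classify an alert from the ID alone. The following is a minimal sketch of that lookup, built only from the rows of Table 2-17. The dictionary and function names are hypothetical and are not part of any Dell tool.

```python
# Event ID to severity, copied from Table 2-17.
# CMC_TRAP_SEVERITY and cmc_severity are illustrative names, not a Dell API.
CMC_TRAP_SEVERITY = {
    2000: "Informational",    # user-initiated test trap
    2002: "Informational",    # return-to-normal or informational event
    2003: "Warning",
    2004: "Critical",
    2005: "Non-Recoverable",
}


def cmc_severity(event_id):
    """Return the severity for a CMC event ID, or "Unknown" if it is not in the table."""
    return CMC_TRAP_SEVERITY.get(event_id, "Unknown")
```

An ID that is not listed in the table, such as 2001, falls through to "Unknown" rather than being guessed.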


Storage Management Message Reference

The Dell OpenManage Server Administrator Storage Management’s alert or event management features let you monitor the health of storage resources such as controllers, enclosures, physical disks, and virtual disks.

Alert Monitoring and Logging

The Storage Management Service performs alert monitoring and logging. By default, the Storage Management Service starts when the managed system starts up. If you stop the Storage Management Service, alert monitoring and logging stop. Alert monitoring does the following:

• Updates the status of the storage object that generated the alert.

• Propagates the storage object’s status to all the related higher objects in the storage hierarchy. For example, the status of a lower-level object is propagated up to the status displayed on the Health tab for the top-level Storage object.

• Logs an alert in the alert log and the operating system application log.

• Sends an SNMP trap if the operating system’s SNMP service is installed and enabled.
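The status propagation described in the list above can be pictured as a roll-up over a tree of storage objects. The sketch below is illustrative only. It assumes that a parent displays the most severe status found among itself and its descendants, and it assumes the ordering OK, Warning, Critical. Neither the ordering nor the data structure is taken from the Storage Management implementation, and all names are hypothetical.

```python
# Assumed severity ordering for this sketch: a higher rank is more severe.
SEVERITY_RANK = {"OK": 0, "Warning": 1, "Critical": 2}


def rolled_up_status(node):
    """Return the most severe status found in a node and all of its descendants."""
    worst = node["status"]
    for child in node.get("children", []):
        child_status = rolled_up_status(child)
        if SEVERITY_RANK[child_status] > SEVERITY_RANK[worst]:
            worst = child_status
    return worst


# Hypothetical hierarchy: one degraded virtual disk under a healthy controller.
storage = {
    "name": "Storage",
    "status": "OK",
    "children": [
        {
            "name": "Controller 1",
            "status": "OK",
            "children": [
                {"name": "Virtual Disk 11", "status": "Warning", "children": []},
                {"name": "Physical Disk 0:0:14", "status": "OK", "children": []},
            ],
        }
    ],
}
```

With this sample tree, the Warning on the virtual disk is what the top-level Storage object would report.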

NOTE:

Dell OpenManage Server Administrator Storage Management does not log alerts regarding the data I/O path. These alerts are logged by the respective RAID drivers in the system alert log.

See the Dell OpenManage Server Administrator Storage Management Online Help for updated information.


Alert Message Format with Substitution Variables

When you view an alert in the Server Administrator alert log, the alert identifies the specific components, such as the controller name or the virtual disk name, to which the alert applies. In an actual operating environment, a storage system can have many combinations of controllers and disks as well as user-defined names for virtual disks and other components. Each environment is unique in its storage configuration and user-defined names. To produce an accurate alert message, the Storage Management service must be able to insert the environment-specific names of storage components into the alert message.

This environment-specific information is inserted after the alert message text as shown for alert 2127 in Table 3-1.

For other alerts, the alert message text is constructed from information passed directly from the controller (or another storage component) to the alert log. In these cases, the variable information is represented with a percent symbol in the Storage Management documentation. An example of such an alert is shown for alert 2334 in Table 3-1.

Table 3-1. Alert Message Format

Alert ID: 2127
Message Text Displayed in the Storage Management Service Documentation: Background Initialization started
Message Text Displayed in the Alert Log with Variable Information Supplied: Background Initialization started: Virtual Disk 3 (Virtual Disk 3) Controller 1 (PERC 5/E Adapter)

Alert ID: 2334
Message Text Displayed in the Storage Management Service Documentation: Controller event log%
Message Text Displayed in the Alert Log with Variable Information Supplied: Controller event log: Current capacity of the battery is above threshold.: Controller 1 (PERC 5/E Adapter)
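The two rows of Table 3-1 can be reproduced with a small sketch. The function and parameter names are hypothetical. The joining rules used here (a colon and a space before the storage object identifier, and the percent symbol replaced by a colon, a space, and the controller-supplied text) are inferred from the two examples in the table, not taken from the Storage Management implementation.

```python
def build_alert_text(documented_text, storage_object, controller_text=None):
    """Compose alert-log text from the documented message text (illustrative helper)."""
    if documented_text.endswith("%"):
        # Assumption: a trailing '%' stands for text passed directly from the controller.
        if controller_text is None:
            raise ValueError("controller_text is required when the message ends with '%'")
        message = documented_text[:-1].rstrip() + ": " + controller_text
    else:
        message = documented_text
    # The environment-specific object identifier is appended after the message text.
    return message + ": " + storage_object
```

For alert 2127, `build_alert_text("Background Initialization started", "Virtual Disk 3 (Virtual Disk 3) Controller 1 (PERC 5/E Adapter)")` yields the alert-log text shown in Table 3-1.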

The variables required to complete the message vary depending on the type of storage object and whether the storage object is in a SCSI or SAS configuration. The following table identifies the possible variables used to identify each storage object.

NOTE:

Some alert messages relating to an enclosure or an enclosure component, such as a fan or EMM, are generated by the controller when the enclosure or enclosure component ID cannot be determined.


NOTE:

A, B, C and X, Y, Z in the following examples are variables representing the storage object name or number.

Table 3-2. Message Format with Variables for Each Storage Object

Storage Object Message Variables

Controller
Message Format: Controller A (Name)
Message Format: Controller A
For example, 2326 A foreign configuration has been detected: Controller 1 (PERC 5/E Adapter)
NOTE: The controller name is not always displayed.

Battery
Message Format: Battery X Controller A
For example, 2174 The controller battery has been removed: Battery 0 Controller 1

SCSI Physical Disk
Message Format: Physical Disk X:Y Controller A, Connector B
For example, 2049 Physical disk removed: Physical Disk 0:14 Controller 1, Connector 0

SAS Physical Disk
Message Format: Physical Disk X:Y:Z Controller A, Connector B
For example, 2049 Physical disk removed: Physical Disk 0:0:14 Controller 1, Connector 0

Virtual Disk
Message Format: Virtual Disk X (Name) Controller A (Name)
Message Format: Virtual Disk X Controller A
For example, 2057 Virtual disk degraded: Virtual Disk 11 (Virtual Disk 11) Controller 1 (PERC 5/E Adapter)
NOTE: The virtual disk and controller names are not always displayed.

Enclosure
Message Format: Enclosure X:Y Controller A, Connector B
For example, 2112 Enclosure shutdown: Enclosure 0:2 Controller 1, Connector 0

SCSI Power Supply
Message Format: Power Supply X Controller A, Connector B, Target ID C
where "C" is the SCSI ID number of the enclosure management module (EMM) managing the power supply.
For example, 2122 Redundancy degraded: Power Supply 1, Controller 1, Connector 0, Target ID 6

SAS Power Supply
Message Format: Power Supply X Controller A, Connector B, Enclosure C
For example, 2312 A power supply in the enclosure has an AC failure: Power Supply 1, Controller 1, Connector 0, Enclosure 2

SCSI Temperature Probe
Message Format: Temperature Probe X Controller A, Connector B, Target ID C
where C is the SCSI ID number of the EMM managing the temperature probe.
For example, 2101 Temperature dropped below the minimum warning threshold: Temperature Probe 1, Controller 1, Connector 0, Target ID 6

SAS Temperature Probe
Message Format: Temperature Probe X Controller A, Connector B, Enclosure C
For example, 2101 Temperature dropped below the minimum warning threshold: Temperature Probe 1, Controller 1, Connector 0, Enclosure 2

SCSI Fan
Message Format: Fan X Controller A, Connector B, Target ID C
where C is the SCSI ID number of the EMM managing the fan.
For example, 2121 Device returned to normal: Fan 1, Controller 1, Connector 0, Target ID 6

SAS Fan
Message Format: Fan X Controller A, Connector B, Enclosure C
For example, 2121 Device returned to normal: Fan 1, Controller 1, Connector 0, Enclosure 2

SCSI EMM
Message Format: EMM X Controller A, Connector B, Target ID C
where C is the SCSI ID number of the EMM.
For example, 2121 Device returned to normal: EMM 1, Controller 1, Connector 0, Target ID 6

SAS EMM
Message Format: EMM X Controller A, Connector B, Enclosure C
For example, 2121 Device returned to normal: EMM 1, Controller 1, Connector 0, Enclosure 2
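When alert text is processed by a script, the physical disk formats in Table 3-2 can be told apart by the number of colon-separated values in the disk identifier: two for SCSI (X:Y) and three for SAS (X:Y:Z). The following is a minimal sketch under that reading of the table. The pattern and function names are hypothetical, and the pattern handles only the physical disk form without a controller name in parentheses.

```python
import re

# Physical disk formats from Table 3-2:
#   SCSI: Physical Disk X:Y Controller A, Connector B
#   SAS:  Physical Disk X:Y:Z Controller A, Connector B
PHYSICAL_DISK = re.compile(
    r"Physical Disk (?P<disk>\d+:\d+(?::\d+)?) "
    r"Controller (?P<controller>\d+), Connector (?P<connector>\d+)"
)


def parse_physical_disk(alert_text):
    """Extract the physical disk identifiers from alert text, or return None."""
    match = PHYSICAL_DISK.search(alert_text)
    if match is None:
        return None
    disk_id = tuple(int(part) for part in match.group("disk").split(":"))
    return {
        # Three values means the SAS format, two means the SCSI format.
        "interface": "SAS" if len(disk_id) == 3 else "SCSI",
        "disk_id": disk_id,
        "controller": int(match.group("controller")),
        "connector": int(match.group("connector")),
    }
```

Text for other storage objects, such as an enclosure, does not match the pattern and returns None.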


Alert Message Change History

The following table describes the changes made to the Storage Management alerts from the previous release of Storage Management to the current release.

Table 3-3. Alert Message Change History

Storage Management 3.5
Product Versions to which changes apply: Storage Management 3.5.0, Server Administrator 4.5.0, Dell OpenManage 6.5.0
New Alerts: None
Deleted Alerts: None
Modified Alerts: 2388, 2347, 2081

Storage Management 3.4
Product Versions to which changes apply: Storage Management 3.4.0, Server Administrator 4.4.0, Dell OpenManage 6.4.0
New Alerts: 2405, 2406, 2407, 2408, 2409, 2410, 2411, 2412, 2413, 2414, 2415, 2416, 2417, 2418
NOTE: The Dell Key Manager (DKM) and CacheCade features are available from calendar year 2011.
Deleted Alerts: None
Modified Alerts: None

Storage Management 3.3
Product Versions to which changes apply: Storage Management 3.3.0, Server Administrator 4.3.0, Dell OpenManage 6.3.0
New Alerts: 2394, 2395, 2396, 2397, 2398, 2399, 2400, 2401, 2402, 2403, 2404
Deleted Alerts: None
Modified Alerts: Alert severity changed for 1151 and 1351

Storage Management 3.2
Product Versions to which changes apply: Storage Management 3.2.0, Server Administrator 4.2.0, Dell OpenManage 6.2.0
New Alerts: 2387, 2388, 2389, 2390, 2392, 2393
Deleted Alerts: None
Modified Alerts: None

Alert Descriptions and Corrective Actions

The following sections describe alerts generated by the RAID or SCSI controllers supported by Storage Management. The alerts are displayed in the Server Administrator Alert tab or through Windows Event Viewer. These alerts can also be forwarded as SNMP traps to other applications.

SNMP traps are generated for the alerts listed in the following sections. These traps are included in the Dell OpenManage Server Administrator Storage Management management information base (MIB). The SNMP traps for these alerts use all of the SNMP trap variables. For more information on SNMP support and the MIB, see the Dell OpenManage SNMP Reference Guide.

To locate an alert, scroll through the following table to find the alert number displayed on the Server Administrator Alert tab or search this file for the alert message text or number. See “Understanding Event Messages” on page 8 for more information on severity levels.

For more information regarding alert descriptions and the appropriate corrective actions, see the online help.
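Several entries in Table 3-4 state that one alert is a clear alert for another (for example, alert 2085 is a clear alert for alert 2058). A script that consumes the alert log could use those pairs to track which conditions are still outstanding. The sketch below is illustrative only: the pairs are copied from the Clear Alert Status entries in Table 3-4, while the names and the tracking logic are assumptions and not part of Storage Management.

```python
# Clear alert ID -> alert IDs it clears, copied from Clear Alert Status entries in Table 3-4.
CLEARS = {
    2085: {2058},
    2086: {2059},
    2088: {2061, 2136},
    2089: {2062},
    2090: {2063},
    2091: {2064},
    2092: {2065},
    2105: {2104},
}


def outstanding_alerts(alert_ids):
    """Return the alert IDs that remain after applying clear alerts in log order."""
    outstanding = []
    for alert_id in alert_ids:
        cleared = CLEARS.get(alert_id)
        if cleared:
            # A clear alert removes the alerts it clears and is not kept itself.
            outstanding = [a for a in outstanding if a not in cleared]
        else:
            outstanding.append(alert_id)
    return outstanding
```

For the sequence 2064, 2048, 2091, the rebuild-started alert 2064 is cleared by 2091 and only 2048 remains.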


Table 3-4. Storage Management Messages

Event ID Description Severity Cause and Action Related Alert Information SNMP Trap Numbers

2048 Device failed
Severity: Critical / Failure / Error
Cause: A storage component such as a physical disk or an enclosure has failed. The failed component may have been identified by the controller while performing a task such as a rescan or a check consistency.
Action: Replace the failed component. You can identify which disk has failed by locating the disk that has a red “X” for its status. Perform a rescan after replacing the failed component.
Clear Alert Number: 2121
Related Alert Number: 2095, 2201, 2203
Local Response Agent (LRA) Number: 2051, 2061, 2071, 2081, 2091, 2101
SNMP Trap Numbers: 754, 804, 854, 904, 954, 1004, 1054, 1104, 1154, 1204

2049 Physical disk removed
Severity: Warning / Non-critical
Cause: A physical disk has been removed from the disk group. This alert can also be caused by loose or defective cables or by problems with the enclosure.
Action: If a physical disk was removed from the disk group, either replace the disk or restore the original disk. On some controllers, a removed disk has a red X for its status. On other controllers, a removed disk may have an Offline status or is not displayed on the user interface. Perform a rescan after replacing or restoring the disk. If a disk has not been removed from the disk group, then check for problems with the cables. See the online help for more information on checking the cables. Ensure that the enclosure is powered on. If the problem persists, check the enclosure documentation for further diagnostic information.
Clear Alert Number: 2052
Related Alert Number: 2054, 2057, 2056, 2076, 2079, 2081, 2083, 2129, 2202, 2204, 2270, 2292, 2299, 2369
LRA Number: 2070
SNMP Trap Numbers: 903

2050 Physical disk offline
Severity: Warning / Non-critical
Cause: A physical disk in the disk group is offline. The user may have manually put the physical disk offline.
Action: Perform a rescan. You can also select the offline disk and perform a Make Online operation.
Clear Alert Number: 2158
Related Alert Number: 2099, 2196
LRA Number: 2070
SNMP Trap Numbers: 903

2051 Physical disk degraded
Severity: Warning / Non-critical
Cause: A physical disk has reported an error condition and may be degraded. The physical disk may have reported the error condition in response to a consistency check or other operation.
Action: Replace the degraded physical disk. You can identify which disk is degraded by locating the disk that has a red X for its status. Perform a rescan after replacing the disk.
Clear Alert Number: None
Related Alert Number: 2070
LRA Number: None
SNMP Trap Numbers: 903

2052 Physical disk inserted
Severity: OK / Normal / Informational
Cause: This alert is for informational purposes.
Action: None
Clear Alert Number: None
Related Alert Number: 2065, 2305, 2367
LRA Number: None
SNMP Trap Numbers: 901

2053 Virtual disk created
Severity: OK / Normal / Informational
Cause: This alert is for informational purposes.
Action: None
Clear Alert Number: None
Related Alert Number: None
LRA Number: None
SNMP Trap Numbers: 1201

2054 Virtual disk deleted
Severity: Warning / Non-critical
Cause: A virtual disk has been deleted. Performing a Reset Configuration may detect that a virtual disk has been deleted.
Action: None
Clear Alert Number: None
Related Alert Number: None
LRA Number: 2080
SNMP Trap Numbers: 1203

2055 Virtual disk configuration changed
Severity: OK / Normal / Informational
Cause: This alert is for informational purposes.
Action: None
Clear Alert Number: None
Related Alert Number: None
LRA Number: None
SNMP Trap Numbers: 1201

2056 Virtual disk failed
Severity: Critical / Failure / Error
Cause: One or more physical disks included in the virtual disk have failed. If the virtual disk is non-redundant (does not use mirrored or parity data), then the failure of a single physical disk can cause the virtual disk to fail. If the virtual disk is redundant, then more physical disks have failed than can be rebuilt using mirrored or parity information.
Action: Create a new virtual disk and restore from a backup.
The disk controller rebuilds the virtual disk by first configuring a hot spare for the disk, and then initiating a write operation to the disk. The write operation initiates a rebuild of the disk.
Clear Alert Number: None
Related Alert Number: 2048, 2049, 2050, 2076, 2079, 2081, 2129, 2346
LRA Number: 2081
SNMP Trap Numbers: 1204

2057 Virtual disk degraded
Severity: Warning / Non-critical
Cause 1: This alert message occurs when a physical disk included in a redundant virtual disk fails. Because the virtual disk is redundant (uses mirrored or parity information) and only one physical disk has failed, the virtual disk can be rebuilt.
Action 1: Configure a hot spare for the virtual disk, if one is not already configured. Rebuild the virtual disk. If you are using an Expandable RAID Controller (PERC) PERC 3/SC, 3/DCL, 3/DC, 3/QC, 4/SC, 4/DC, 4e/DC, 4/Di, CERC ATA100/4ch, PERC 5/E, PERC 5/i or a Serial Attached SCSI (SAS) 5/iR controller, rebuild the virtual disk by first configuring a hot spare for the disk, and then initiating a write operation to the disk. The write operation initiates a rebuild of the disk.
Clear Alert Number: None
Related Alert Number: 2048, 2049, 2050, 2076, 2079, 2081, 2123, 2129, 2346
LRA Number: 2080
SNMP Trap Numbers: 1203

2057 (continued)
Cause 2: A physical disk in the disk group has been removed.
Action 2: If a physical disk was removed from the disk group, either replace the disk or restore the original disk. You can identify which disk has been removed by locating the disk that has a red “X” for its status. Perform a rescan after replacing the disk.

2058 Virtual disk check consistency started
Severity: OK / Normal / Informational
Cause: This alert is for informational purposes.
Action: None
Clear Alert Number: 2085
Related Alert Number: None
LRA Number: None
SNMP Trap Numbers: 1201

2059 Virtual disk format started
Severity: OK / Normal / Informational
Cause: This alert is for informational purposes.
Action: None
Clear Alert Number: 2086
Related Alert Number: None
LRA Number: None
SNMP Trap Numbers: 1201

2060 Copy of data started on physical disk 1 from physical disk 2.
Severity: OK / Normal / Informational
Cause: This alert is for informational purposes.
Action: None
Clear Alert Number: None
Related Alert Number: 2075
LRA Number: None
SNMP Trap Numbers: 901

2061 Virtual disk initialization started
Severity: OK / Normal / Informational
Cause: This alert is for informational purposes.
Action: None
Clear Alert Number: 2088
Related Alert Number: None
LRA Number: None
SNMP Trap Numbers: 1201

2062 Physical disk initialization started
Severity: OK / Normal / Informational
Cause: This alert is for informational purposes.
Action: None
Clear Alert Number: 2089
Related Alert Number: None
LRA Number: None
SNMP Trap Numbers: 901

2063 Virtual disk reconfiguration started
Severity: OK / Normal / Informational
Cause: This alert is for informational purposes.
Action: None
Clear Alert Number: 2090
Related Alert Number: None
LRA Number: None
SNMP Trap Numbers: 1201

2064 Virtual disk rebuild started
Severity: OK / Normal / Informational
Cause: This alert is for informational purposes.
Action: None
Clear Alert Number: 2091
Related Alert Number: None
LRA Number: None
SNMP Trap Numbers: 1201

2065 Physical disk rebuild started
Severity: OK / Normal / Informational
Cause: This alert is for informational purposes.
Action: None
Clear Alert Number: 2092
Related Alert Number: 2099, 2121, 2196
LRA Number: None
SNMP Trap Numbers: 901

2067 Virtual disk check consistency cancelled
Severity: OK / Normal / Informational
Cause: The check consistency operation was cancelled because a physical disk in the array has failed or because a user cancelled the check consistency operation.
Action: If the physical disk failed, then replace the physical disk. You can identify which disk failed by locating the disk that has a red “X” for its status. Perform a rescan after replacing the disk. When performing a consistency check, be aware that the consistency check can take a long time. The time it takes depends on the size of the physical disk or the virtual disk.
Clear Alert Number: None
Related Alert Number: None
LRA Number: None
SNMP Trap Numbers: 1201

2070 Virtual disk initialization cancelled
Severity: OK / Normal / Informational
Cause: The virtual disk initialization was cancelled because a physical disk included in the virtual disk has failed or because a user cancelled the virtual disk initialization.
Action: If a physical disk failed, then replace the physical disk. You can identify which disk has failed by locating the disk that has a red “X” for its status. Perform a rescan after replacing the disk. Restart the format physical disk operation. Restart the virtual disk initialization.
Clear Alert Number: None
Related Alert Number: None
LRA Number: None
SNMP Trap Numbers: 1201

2074 Physical disk rebuild cancelled
Severity: OK / Normal / Informational
Cause: The user has cancelled the rebuild operation.
Action: Restart the rebuild operation.
Clear Alert Number: None
Related Alert Number: None
LRA Number: None
SNMP Trap Numbers: 901

2075 Copy of data completed on physical disk %2 from physical disk %1
Severity: OK / Normal / Informational
Cause: This alert is provided for informational purposes.
Action: None
Clear Alert Number: None
Related Alert Number: 2060
LRA Number: None
SNMP Trap Numbers: 901

2076 Virtual disk Check Consistency failed
Severity: Critical / Failure / Error
Cause: A physical disk included in the virtual disk failed or there is an error in the parity information. A failed physical disk can cause errors in parity information.
Action: Replace the failed physical disk. You can identify which disk has failed by locating the disk that has a red “X” for its status. Rebuild the physical disk. When finished, restart the check consistency operation.
Clear Alert Number: None
Related Alert Number: None
LRA Number: 2081
SNMP Trap Numbers: 1204

2077 Virtual disk format failed
Severity: Critical / Failure / Error
Cause: A physical disk included in the virtual disk failed.
Action: Replace the failed physical disk. You can identify which physical disk has failed by locating the disk that has a red X for its status. Rebuild the physical disk. When finished, restart the virtual disk format operation.
Clear Alert Number: None
Related Alert Number: None
LRA Number: 2081
SNMP Trap Numbers: 1204

2079 Virtual disk initialization failed
Severity: Critical / Failure / Error
Cause: A physical disk included in the virtual disk has failed or a user has cancelled the initialization.
Action: If a physical disk has failed, then replace the physical disk.
Clear Alert Number: None
Related Alert Number: None
LRA Number: 2081
SNMP Trap Numbers: 1204

2080 Physical disk initialization failed
Severity: Critical / Failure / Error
Cause: The physical disk has failed or is corrupt.
Action: Replace the failed or corrupt disk. You can identify a disk that has failed by locating the disk that has a red “X” for its status. Restart the initialization.
Clear Alert Number: None
Related Alert Number: None
LRA Number: 2071
SNMP Trap Numbers: 904

2081 Virtual disk reconfiguration failed
Severity: Critical / Failure / Error
Hardware RAID:
Cause: A physical disk included in the virtual disk has failed or is corrupt. A user may also have cancelled the reconfiguration.
Action: Replace the failed or corrupt disk. You can identify a disk that has failed by locating the disk that displays a red X in the status field.
If the physical disk is part of a redundant array, then rebuild the physical disk. When finished, restart the reconfiguration.
Clear Alert Number: None
Related Alert Number: None
LRA Number: 2081
SNMP Trap Numbers: 1204

2081 Virtual disk reconfiguration failed (continued)
Severity: Critical / Failure / Error
Software RAID:
• Perform a backup with the Verify option.
• If the file backup fails, try to restore the failed file from a previous backup.
• When the backup with the Verify option is complete without any errors, delete the Virtual Disk.
• Recreate a new Virtual Disk with new drives.
• Restore the data from backup.
Clear Alert Number: None
Related Alert Number: None
LRA Number: 2081
SNMP Trap Numbers: 1204

2082 Virtual disk rebuild failed
Severity: Critical / Failure / Error
Cause: A physical disk included in the virtual disk has failed or is corrupt. A user may also have cancelled the rebuild.
Action: Replace the failed or corrupt disk. You can identify a disk that has failed by locating the disk that has a red “X” for its status. Restart the virtual disk rebuild.
Clear Alert Number: None
Related Alert Number: 2048
LRA Number: 2081
SNMP Trap Numbers: 1204

2083 Physical disk rebuild failed
Severity: Critical / Failure / Error
Cause: A physical disk included in the virtual disk has failed or is corrupt. A user may also have cancelled the rebuild.
Action: Replace the failed or corrupt disk. You can identify a disk that has failed by locating the disk that has a red “X” for its status. Restart the rebuild.
Clear Alert Number: None
Related Alert Number: None
LRA Number: 2071
SNMP Trap Numbers: 904

2085 Virtual disk check consistency completed
Severity: OK / Normal / Informational
Cause: This alert is for informational purposes.
Action: None
Clear Alert Status: Alert 2085 is a clear alert for alert 2058.
Related Alert Number: None
LRA Number: None
SNMP Trap Numbers: 1201

2086 Virtual disk format completed
Severity: OK / Normal / Informational
Cause: This alert is for informational purposes.
Action: None
Clear Alert Status: Alert 2086 is a clear alert for alert 2059.
Related Alert Number: None
LRA Number: None
SNMP Trap Numbers: 1201

2087 Copy of data resumed from physical disk %2 to physical disk %1
Severity: OK / Normal / Informational
Cause: This alert is for informational purposes.
Action: None
Clear Alert Status: None
Related Alert Number: 260
LRA Number: None
SNMP Trap Numbers: 901

2088 Virtual disk initialization completed
Severity: OK / Normal / Informational
Cause: This alert is for informational purposes.
Action: None
Clear Alert Status: Alert 2088 is a clear alert for alerts 2061 and 2136.
Related Alert Number: None
LRA Number: None
SNMP Trap Numbers: 1201

2089 Physical disk initialization completed
Severity: OK / Normal / Informational
Cause: This alert is for informational purposes.
Action: None
Clear Alert Status: Alert 2089 is a clear alert for alert 2062.
Related Alert Number: None
LRA Number: None
SNMP Trap Numbers: 901

2090 Virtual disk reconfiguration completed
Severity: OK / Normal / Informational
Cause: This alert is for informational purposes.
Action: None
Clear Alert Status: Alert 2090 is a clear alert for alert 2063.
Related Alert Number: None
LRA Number: None
SNMP Trap Numbers: 1201

2091 Virtual disk rebuild completed
Severity: OK / Normal / Informational
Cause: This alert is for informational purposes.
Action: None
Clear Alert Status: Alert 2091 is a clear alert for alert 2064.
Related Alert Number: None
LRA Number: None
SNMP Trap Numbers: 1201

2092 Physical disk rebuild completed
Severity: OK / Normal / Informational
Cause: This alert is for informational purposes.
Action: None
Clear Alert Status: Alert 2092 is a clear alert for alert 2065.
Related Alert Number: None
LRA Number: None
SNMP Trap Numbers: 901

2094 Predictive Failure reported.
Severity: Warning / Non-critical
Cause: The physical disk is predicted to fail. Many physical disks contain Self Monitoring Analysis and Reporting Technology (SMART). When enabled, SMART monitors the health of the disk based on indications such as the number of write operations that have been performed on the disk.
Action: Replace the physical disk. Even though the disk may not have failed yet, it is strongly recommended that you replace the disk.
If this disk is part of a redundant virtual disk, perform the Offline task on the disk; replace the disk; and then assign a hot spare and the rebuild starts automatically.
Clear Alert Number: None
Related Alert Number: None
LRA Number: 2070
SNMP Trap Numbers: 903

2094 (continued)
If this disk is a hot spare, then unassign the hot spare; perform the Prepare to Remove task on the disk; replace the disk; and assign the new disk as a hot spare.
CAUTION: If this disk is part of a nonredundant disk, back up your data immediately. If the disk fails, you cannot recover the data.

2095 SCSI sense data.
Severity: OK / Normal / Informational
Cause: A SCSI device experienced an error, but may have recovered.
Action: None
Clear Alert Number: None
Related Alert Number: 2273
LRA Number: None
SNMP Trap Numbers: 751, 851, 901

2098 Global hot spare assigned
Severity: OK / Normal / Informational
Cause: A user has assigned a physical disk as a global hot spare. This alert is for informational purposes.
Action: None
Clear Alert Number: None
Related Alert Number: 2277
LRA Number: None
SNMP Trap Numbers: 901

2099 Global hot spare unassigned
Severity: OK / Normal / Informational
Cause: A user has unassigned a physical disk as a global hot spare. This alert is for informational purposes.
Action: None
Clear Alert Number: None
Related Alert Number: None
LRA Number: None
SNMP Trap Numbers: 901

2100 Temperature exceeded the maximum warning threshold
Severity: Warning / Non-critical
Cause: The physical disk enclosure is too hot. A variety of factors can cause the excessive temperature. For example, a fan may have failed, the thermostat may be set too high, or the room temperature may be too hot.
Action: Check for factors that may cause overheating. For example, verify that the enclosure fan is working. You should also check the thermostat settings and examine whether the enclosure is located near a heat source. Make sure the enclosure has enough ventilation and that the room temperature is not too hot. See the physical disk enclosure documentation for more diagnostic information.
Clear Alert Number: 2353
Related Alert Number: 2112
LRA Number: 2090
SNMP Trap Numbers: 1053

2101 Temperature dropped below the minimum warning threshold
Severity: Warning / Non-critical
Cause: The physical disk enclosure is too cool.
Action: Check if the thermostat setting is too low and if the room temperature is too cool.
Clear Alert Number: 2353
Related Alert Number: None
LRA Number: 2090
SNMP Trap Numbers: 1053

2102 Temperature exceeded the maximum failure threshold
Severity: Critical / Failure / Error
Cause:
Action:
Clear Alert Number: None
Related Alert Number: None
LRA Number: 2091
SNMP Trap Numbers: 1054

Event ID: 2103
Description: Temperature dropped below the minimum failure threshold
Severity: Critical / Failure / Error
Cause: The physical disk enclosure is too cool.
Action: Check if the thermostat setting is too low and if the room temperature is too cool.
Clear Alert Number: None
Related Alert Number: 2112
LRA Number: 2091
SNMP Trap Numbers: 1054

Event ID: 2104
Description: Controller battery is reconditioning
Severity: OK / Normal / Informational
Cause: This alert is for informational purposes.
Action: None
Clear Alert Number: 2105
Related Alert Number: None
LRA Number: None
SNMP Trap Numbers: 1151

Event ID: 2105
Description: Controller battery recondition is completed
Severity: OK / Normal / Informational
Cause: This alert is for informational purposes.
Action: None
Clear Alert Status: Alert 2105 is a clear alert for alert 2104.
Related Alert Number: None
LRA Number: None
SNMP Trap Numbers: 1151
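The clear-alert relationships listed above (alert 2105 clears alert 2104, and 2353 is given as the clear alert for 2100 and 2101) can be illustrated with a small tracker. This is only a sketch in Python, not part of Server Administrator: the `CLEARED_BY` mapping holds just the pairs stated in these entries, and the class and method names are hypothetical.

```python
# Hypothetical helper, not a Server Administrator API.
# Maps an alert ID to the ID of the alert that clears it, taken from the
# "Clear Alert Number" fields of the entries above.
CLEARED_BY = {
    2100: 2353,
    2101: 2353,
    2104: 2105,
}


class ActiveAlerts:
    """Track alerts that have been raised but not yet cleared."""

    def __init__(self):
        self._active = set()

    def handle(self, event_id):
        """Record an incoming event and return the alerts it cleared."""
        cleared = {a for a in self._active if CLEARED_BY.get(a) == event_id}
        self._active -= cleared
        # Only alerts that have a known clear alert are kept as pending.
        if event_id in CLEARED_BY:
            self._active.add(event_id)
        return sorted(cleared)

    def pending(self):
        """Return the alert IDs still waiting for their clear alert."""
        return sorted(self._active)
```

For example, calling `handle(2104)` and then `handle(2105)` on one tracker leaves no pending alerts, because the second call clears the first.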

Event ID: 2106
Description: SMART FPT exceeded
Severity: Warning / Non-critical
Cause: A disk on the specified controller has received a SMART alert (predictive failure) indicating that the disk is likely to fail in the near future.
Action: Replace the disk that has received the SMART alert. If the physical disk is a member of a non-redundant virtual disk, then back up the data before replacing the disk.
CAUTION: Removing a physical disk that is included in a non-redundant virtual disk causes the virtual disk to fail and may cause data loss.
Clear Alert Number: None
Related Alert Number: None
LRA Number: 2070
SNMP Trap Numbers: 903

Event ID: 2107
Description: SMART configuration change
Severity: Critical / Failure / Error
Cause: A disk has received a SMART alert (predictive failure) after a configuration change. The disk is likely to fail in the near future.
Action: Replace the disk that has received the SMART alert. If the physical disk is a member of a non-redundant virtual disk, then back up the data before replacing the disk.
CAUTION: Removing a physical disk that is included in a non-redundant virtual disk causes the virtual disk to fail and may cause data loss.
Clear Alert Number: None
Related Alert Number: None
LRA Number: 2071
SNMP Trap Numbers: 904

Event ID: 2108
Description: SMART warning
Severity: Warning / Non-critical
Cause: A disk has received a SMART alert (predictive failure). The disk is likely to fail in the near future.
Action: Replace the disk that has received the SMART alert. If the physical disk is a member of a non-redundant virtual disk, then back up the data before replacing the disk.
CAUTION: Removing a physical disk that is included in a non-redundant virtual disk causes the virtual disk to fail and may cause data loss.
Clear Alert Number: None
Related Alert Number: None
LRA Number: 2070
SNMP Trap Numbers: 903

Event ID: 2109
Description: SMART warning temperature
Severity: Warning / Non-critical
Cause: A disk has reached an unacceptable temperature and received a SMART alert (predictive failure). The disk is likely to fail in the near future.
Action 1: Determine why the physical disk has reached an unacceptable temperature. A variety of factors can cause the excessive temperature. For example, a fan may have failed, the thermostat may be set too high, or the room temperature may be too hot or cold. Verify that the fans in the server or enclosure are working. If the physical disk is in an enclosure, check the thermostat settings and examine whether the enclosure is located near a heat source. Make sure the enclosure has enough ventilation and that the room temperature is not too hot. See the physical disk enclosure documentation for more diagnostic information.
Action 2: If you cannot identify why the disk has reached an unacceptable temperature, then replace the disk. If the physical disk is a member of a non-redundant virtual disk, then back up the data before replacing the disk.
CAUTION: Removing a physical disk that is included in a non-redundant virtual disk causes the virtual disk to fail and may cause data loss.
Clear Alert Number: None
Related Alert Number: None
LRA Number: 2070
SNMP Trap Numbers: 903

Event ID: 2110
Description: SMART warning degraded
Severity: Warning / Non-critical
Cause: A disk is degraded and has received a SMART alert (predictive failure). The disk is likely to fail in the near future.
Action: Replace the disk that has received the SMART alert. If the physical disk is a member of a non-redundant virtual disk, then back up the data before replacing the disk.
CAUTION: Removing a physical disk that is included in a non-redundant virtual disk causes the virtual disk to fail and may cause data loss.
Clear Alert Number: None
Related Alert Number: None
LRA Number: 2070
SNMP Trap Numbers: 903

Event ID: 2111
Description: Failure prediction threshold exceeded due to test
Severity: Warning / Non-critical
Cause: A disk has received a SMART alert (predictive failure) due to test conditions.
Action: None
Clear Alert Number: None
Related Alert Number: None
LRA Number: 2070
SNMP Trap Numbers: 903
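Most of the SMART alerts above share one corrective pattern: replace the disk, and back up the data first if the disk belongs to a non-redundant virtual disk. The Python sketch below illustrates that decision under stated assumptions: the function name and the `in_non_redundant_vd` flag are hypothetical, and the caller is assumed to already know the disk's virtual disk membership. Alert 2109 is left out because its first action is a temperature check rather than a replacement.

```python
# Hypothetical sketch; only the event IDs and actions come from the table.
REPLACE_DISK_EVENTS = {2106, 2107, 2108, 2110}
NO_ACTION_EVENTS = {2111}  # raised due to test conditions; Action is None


def smart_alert_steps(event_id, in_non_redundant_vd):
    """Return the ordered corrective steps for a SMART alert."""
    if event_id in NO_ACTION_EVENTS:
        return []
    if event_id not in REPLACE_DISK_EVENTS:
        raise ValueError(f"event {event_id} is not covered by this sketch")
    steps = []
    if in_non_redundant_vd:
        # Removing the disk fails a non-redundant virtual disk,
        # so the data has to be saved before the disk is pulled.
        steps.append("back up data")
    steps.append("replace disk")
    return steps
```

Returning the steps as an ordered list keeps the backup ahead of the replacement, which is the point of the CAUTION attached to these alerts.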

Event ID: 2112
Description: Enclosure was shut down
Severity: Critical / Failure / Error
Cause: The physical disk enclosure is either hotter or cooler than the maximum or minimum allowable temperature range.
Action: Check for factors that may cause overheating or excessive cooling. For example, verify that the enclosure fan is working. Also check the thermostat settings and whether the enclosure is located near a heat source. Make sure the enclosure has enough ventilation and that the room temperature is not too hot or too cold. See the enclosure documentation for more diagnostic information.
Clear Alert Number: None
Related Alert Number: None
LRA Number: 2091
SNMP Trap Numbers: 854
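The enclosure temperature alerts in this table (2100 through 2103) each correspond to one of four thresholds. The Python sketch below shows how a reading could map to those event IDs. It is an illustration only: the `Thresholds` type, the function name, and any threshold values passed in are hypothetical, and checking the failure thresholds before the warning thresholds is an assumption, not documented Server Administrator behavior.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Thresholds:
    """Hypothetical container for an enclosure's four temperature limits."""
    min_failure: float
    min_warning: float
    max_warning: float
    max_failure: float


def temperature_event(reading: float, t: Thresholds) -> Optional[int]:
    """Return the event ID for a reading, or None if it is in range."""
    # Failure thresholds are checked first (assumed precedence).
    if reading > t.max_failure:
        return 2102  # exceeded the maximum failure threshold
    if reading < t.min_failure:
        return 2103  # dropped below the minimum failure threshold
    if reading > t.max_warning:
        return 2100  # exceeded the maximum warning threshold
    if reading < t.min_warning:
        return 2101  # dropped below the minimum warning threshold
    return None
```

With made-up limits such as `Thresholds(3, 8, 45, 55)`, a reading between the two warning limits produces no event, while a reading past a failure limit maps to the critical alert rather than the warning.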
