This document supports the version of each product listed and
supports all subsequent versions until the document is
replaced by a new edition. To check for more recent editions
of this document, see http://www.vmware.com/support/pubs.
EN- 000929-01
VMware vCenter Operations Manager Getting Started Guide
You can find the most up-to-date technical documentation on the VMware Web site at:
http://www.vmware.com/support/
The VMware Web site also provides the latest product updates.
If you have comments about this documentation, submit your feedback to:
VMware is a registered trademark or trademark of VMware, Inc. in the United States and/or other jurisdictions. All other marks
and names mentioned herein may be trademarks of their respective companies.
VMware, Inc.
3401 Hillview Ave.
Palo Alto, CA 94304
www.vmware.com
2 VMware, Inc.
Contents
VMware vCenter Operations Manager Getting Started Guide5
Introducing Custom User Interface Features and Concepts7
1
Viewing Resources 7
Understanding How vCenter Operations Manager Collects Data 8
Understanding Alerts 8
Understanding Health Ratings 10
Monitoring Virtual Resources 11
Analyzing Performance and Capacity 13
Introducing Common Tasks17
2
Logging In and Using vCenter Operations Manager 17
Monitoring Day-to-Day Operations 19
Handling Alerts 24
Optimizing Your Resources 33
Designing Your Workspace41
3
Working with Dashboards 41
Working with Widgets 49
Using and Configuring Widgets53
4
Edit a Widget Configuration 54
Supported Widget Interactions 55
Configure Widget Interactions 61
Advanced Health Tree Widget 62
Alerts Widget 64
Application Detail Widget 66
Application Overview Widget 67
Configuration Overview Widget 68
Custom Relationship Widget 69
Data Distribution Analysis Widget 71
Generic Scoreboard Widget 73
GEO Widget 76
Health Status Widget 77
Health Tree Widget 78
Health-Workload Scoreboard Widget 80
Heat Map Widget 81
Mashup Charts Widget 84
Metric Graph Widget 86
Metric Graph (Rolling View) Widget 90
Metric Selector Widget 93
VMware, Inc.
3
VMware vCenter Operations Manager Getting Started Guide
Metric Sparklines Widget 94
Metric Weather Map Widget 97
Resources Widget 99
Root Cause Ranking Widget 101
Tag Selector Widget 102
Text Widget 103
Top-N Analysis Widget 104
VC Relationship Widget 107
VC Relationship (Planning) Widget 108
Define Metric Sets for a Widget 109
Index111
4 VMware, Inc.
VMware vCenter Operations Manager Getting
Started Guide
The VMware vCenter Operations Manager Getting Started Guide (Custom User Interface) introduces the
VMware® vCenter™ Operations Manager Custom user interface, explains key terms and concepts, and
describes how to set up dashboards and configure widgets.
Intended Audience
This information is intended for anyone who uses or administers vCenter Operations Manager by using the
Custom user interface.
VMware, Inc. 5
VMware vCenter Operations Manager Getting Started Guide
6 VMware, Inc.
Introducing Custom User Interface
Features and Concepts1
vCenter Operations Manager collects performance data from monitored software and hardware resources in
your enterprise and provides predictive analysis and real-time information about problems. The Custom
user interface presents data and analysis through alerts, in configurable dashboards, on predefined pages,
and in several predefined reports.
Before you start using the Custom user interface, become familiar with key features, concepts, and
terminology.
This chapter includes the following topics:
“Viewing Resources,” on page 7
n
“Understanding How vCenter Operations Manager Collects Data,” on page 8
n
“Understanding Alerts,” on page 8
n
“Understanding Health Ratings,” on page 10
n
“Monitoring Virtual Resources,” on page 11
n
“Analyzing Performance and Capacity,” on page 13
n
Viewing Resources
In vCenter Operations Manager, a resource is an entity in your computing environment for which
vCenter Operations Manager collects data. A resource can be a single entity, such as a router or database, or
a container that holds other resources.
Applications and tiers are types of container resources. An application defines an interdependent set of
hardware and software components. A tier is a group of resources that performs a specific task in an
application. An application can contain multiple tiers. With applications and tiers, you can combine, track,
and analyze metrics for related resources over a period of time.
To make resources easier to find in the Custom user interface, an administrator assigns resources to resource
tag values. A resource tag is a type of information, and a resource tag value is an individual instance of that
type of information.
You can view resources in many places in the Custom user interface, including the Environment Overview
page. The list on the left side of the Environment Overview page contains the resource tags and resource tag
values. The List tab contains the resources.
In the example, the resource tag value called VMware vCenter Operations is selected in the left pane under
the Application resource tag, and the List tab shows the resources assigned to the VMware vCenter
Operations resource tag value. The resources are tiers in the VMware vCenter Operations application.
VMware, Inc.
7
VMware vCenter Operations Manager Getting Started Guide
Figure 1‑1. Example of Resources on the Environment Overview Page
Understanding How vCenter Operations Manager Collects Data
vCenter Operations Manager can collect several types of data for a single resource. For example, for a
database server it might receive data on free disk space, CPU use, and average response time. Each type of
data that vCenter Operations Manager collects is called an attribute.
A metric is an instance of an attribute for a particular resource. For each metric,
vCenter Operations Manager collects and stores multiple readings over time. Each piece of data that
vCenter Operations Manager collects is called a metric observation or value.
A vCenter Operations Manager administrator creates attribute packages to define combinations of attributes
and assigns attribute packages to resources. An attribute package specifies the attributes to collect for the
resource to which it is assigned.
An administrator identifies the attributes that are most important in your enterprise as key performance
indicators (KPIs). A KPI is a high-priority metric that might indicate a severe problem in your infrastructure
if it exceeds its normal value range. vCenter Operations Manager treats KPIs differently from other
attributes.
A vCenter Operations Manager administrator might also create super metrics and super metric packages. A
super metric is useful when a single metric cannot tell you what you need to know about the behavior of
your enterprise. For example, you might have a super metric that tracks the average free disk space for all of
the database servers in your enterprise by averaging the free disk space metric for all servers. A super metric
package is similar to an attribute package, except that it defines combinations of super metrics.
Understanding Alerts
For each attribute, vCenter Operations Manager maintains thresholds of normal behavior and generates
anomalies when a metric violates a threshold. If vCenter Operations Manager determines that the current
combination of anomalies indicates a real problem, or if a KPI violates a threshold, it generates an alert.
An alert is a notification that informs you of an abnormal condition that might require attention. An alert
can describe a problem in a resource, including applications and tiers. Different combinations of conditions
cause different types of alerts.
For example, if CPU use for all of the servers in a tier exceeds a threshold, vCenter Operations Manager
generates an anomaly for each out-of-threshold metric value and sends an alert to notify you of the problem.
The alert lists all of the anomalies for each metric.
8 VMware, Inc.
Chapter 1 Introducing Custom User Interface Features and Concepts
You can view alerts in several places in the Custom user interface, including the Alerts Overview page. If a
vCenter Operations Manager administrator sets up the alert notification feature, you might also receive
alerts in email messages.
On the Alerts Overview page, the list on the left side of the page contains resource tags and resource tag
values and the right pane contains alerts. By default, the alert list includes alerts for all resources. If you
select a resource tag value in the left pane, the alert list contains only the alerts for the resources that have
the selected resource tag value.
Figure 1‑2. Example of Alerts on the Alerts Overview Page
Using Thresholds to Identify Abnormal Behavior
A threshold is a value that marks the boundary between normal and abnormal behavior for a metric. When
a metric crosses a threshold, vCenter Operations Manager generates an anomaly.
vCenter Operations Manager uses dynamic thresholds and hard thresholds.
With dynamic thresholds, vCenter Operations Manager determines whether to generate an anomaly based
on how often a metric has violated its thresholds and by the amount of the violation.
vCenter Operations Manager calculates and continually adjusts a degree of abnormality for the metric. If the
metric value is within this degree of abnormality, vCenter Operations Manager does not generate an
anomaly, even if the value is outside of the dynamic threshold.
A hard threshold is a value that a vCenter Operations Manager administrator defines for a metric. A hard
threshold changes only when an administrator changes it.
The dynamic threshold for a metric appears as a shaded area in a metric graph. Out-of-range values that
generated anomalies appear as yellow areas. You can view metric graphs in several places in the Custom
user interface, including the Metric Graph widget.
Figure 1‑3. Example of a Dynamic Threshold for a Metric
VMware, Inc. 9
VMware vCenter Operations Manager Getting Started Guide
Understanding KPI Alert Generation
When a KPI violates a threshold, vCenter Operations Manager always generates an alert. When they
configure attribute packages, vCenter Operations Manager administrators can identify any attribute for any
resource as a KPI.
The type of alert that vCenter Operations Manager generates depends on the type of threshold that the KPI
uses. When a KPI violates an internally calculated dynamic threshold, vCenter Operations Manager sends a
Smart KPI Breach alert. When a KPI violates a user-defined hard threshold, vCenter Operations Manager
sends a Classic KPI HT Breach alert.
Identifying the Root Causes of Alert Symptoms
The root cause of an alert is the condition or symptom that was the first step in the chain of events that led to
the alert. For example, a slowdown in network traffic through a particular router could lead to an increased
time per transaction for users of your Web site, which in turn could cause vCenter Operations Manager to
generate an alert for the Web server resource.
For each alert, vCenter Operations Manager lists the most likely root causes of the symptoms that caused the
alert and ranks the causes in order of importance.
You can view root causes in the Root Cause column on the Alerts Overview page and in the Root Cause
Ranking widget.
Figure 1‑4. Example of Root Causes on the Alerts Overview Page
Understanding Health Ratings
vCenter Operations Manager examines internally generated metrics and uses its proprietary analytics
formulas to determine an overall health rating for a resource. The health rating, which ranges from 0 to 100,
gives you a quick overview of the current state of a resource.
vCenter Operations Manager generates and stores internally generated metrics for every resource. Internally
generated metrics include the total number of alerts and anomalies and the number of active alerts.
The health rating appears as a numeric rating and as a colored indicator in the Custom user interface. The
color is based on the range of the health rating. You can view the health rating for a resource anywhere that
a resource is listed and in the Health widget. For resources that VMware vCenter Server™ manages, health
ratings appear in the VC Relationship widget and on the Resource Detail page.
10 VMware, Inc.
Chapter 1 Introducing Custom User Interface Features and Concepts
Figure 1‑5. Example of Health Ratings on the Environment Overview Page
Monitoring Virtual Resources
vCenter Operations Manager provides performance, relationship, and capacity data for objects in your
virtual environment. It uses badges to represent derived metrics and give you a high-level, broad view of
the performance and condition of your virtual environment.
Interpreting Workload Scores for Virtual Resources
vCenter Operations Manager combines the metrics that show demand on virtual machines and other virtual
objects in a single value called a workload. These metrics include CPU use and memory use.
vCenter Operations Manager indicates workload as a numeric score and as a colored circle. The color is
based on the range of the workload score.
Workload scores typically range from 0 to 100. A score of 0 indicates that an object is not being used. A score
greater than 100 indicates that an object is trying to access more resources than are currently available.
When an object's workload score is greater than 100, you might need to allocate more resources to the object
or move some tasks to other objects.
You can view workload scores for virtual objects in the VC Relationship widget and on the Resource Detail
page.
VMware, Inc. 11
VMware vCenter Operations Manager Getting Started Guide
Figure 1‑6. Example of a Workload Score on the Resource Detail Page
Interpreting Capacity Scores for Virtual Resources
The capacity score indicates how close a virtual object is to exhausting its available computing resources.
The computing resources include disk space, memory size, and network capacity.
To calculate the capacity score, vCenter Operations Manager combines applicable metrics from the object
and applies its analytics algorithms to determine long-term cycles and trends. It uses these results to
calculate when an object might run out of a type of resource.
vCenter Operations Manager indicates capacity as a numerical score and a colored hexagon. The color is
based on the range of the capacity score. You can view capacity scores for virtual objects in the VC
Relationship (Planning) widget.
Figure 1‑7. Example of a Capacity Score in the VC Relationship (Planning) Widget
Viewing Change Events for Virtual Resources
A change event is any change to the virtual infrastructure. It can include changes on virtual machines or
ESX hosts, such as adding, removing, connecting, or disconnecting an ESX host, and starting, stopping, or
reconfiguring a virtual machine.
vCenter Operations Manager can show change events on its performance graph on the Resource Detail page
for a virtual object. A vCenter Operations Manager administrator can configure whether change events
appear on performance graphs.
12 VMware, Inc.
Analyzing Performance and Capacity
With vCenter Operations Manager forensics features, you can analyze the performance and capacity of your
resources and use this information to balance the resources in your environment. Forensics features include
cross-silo analysis, top-n analysis, the problem fingerprint library, capacity analysis, and VC analysis.
Examining Anomalies with Cross-Silo Analysis
With cross-silo analysis, you can examine graphs that show the number of anomalies over time for
particular resources. You can zoom in on a graph to focus on a specific period of time, such as immediately
before an alert. You can also click on a location in a graph to see a ranking of the likely root causes for the
anomalies at that specific time.
Cross-silo analysis information appears on the Cross-Silo Analysis page. For applications, the anomaly
graph includes a line that shows the internally determined 90 percent threshold. If the number of anomalies
exceeds this threshold, vCenter Operations Manager generates an early warning Smart Alert.
Figure 1‑8. Example of Cross-Silo Analysis
Chapter 1 Introducing Custom User Interface Features and Concepts
Using Top-N Analysis to Determine Best and Worst Performers
With top-n analysis, you can identify the top or bottom resources, metrics, or alerts in selected categories,
such as the five most or least healthy resources in a tier. You can also select the resource tag and time frame
to analyze and the categories to show.
Top-n analysis information appears on the Top-N Analysis page and in the Top-N Analysis widget. The
Resource Health pane shows the health of the resource. The information that you select in the Tools pane
appears in the right pane.
VMware, Inc. 13
VMware vCenter Operations Manager Getting Started Guide
Figure 1‑9. Example of Top-N Analysis
Isolating and Resolving Issues with Problem Fingerprinting
When a KPI for an application or tier violates a threshold, vCenter Operations Manager examines the events
that preceded the violation. If it finds enough related information, it captures the set of events that preceded
the violation as a fingerprint. If vCenter Operations Manager finds a similar series of events in the future, it
can issue a predictive alert to warn of a likely KPI violation.
Fingerprinting helps you to quickly isolate and resolve problems by reducing the number of possible silos
and tiers in which a problem might have occurred, capturing the events that precede a problem for rootcause analysis, and notifying you of problems before they occur.
You can see fingerprint information on the Problem Fingerprint Library page. If you select a fingerprint
branch in the left pane, the details for that branch appear in the right pane.
Figure 1‑10. Example of Problem Fingerprints
14 VMware, Inc.
Chapter 1 Introducing Custom User Interface Features and Concepts
Performing Capacity Analysis
Some metrics, such as disk space use or network traffic measurements, frequently show long-term trends.
Short-term fluctuations can hide this data or make it difficult to calculate.
The capacity analysis feature looks at the overall value trend for a selected metric, tells you when a metric is
likely to reach a threshold, and indicates how confident vCenter Operations Manager is of the prediction.
This information can help you plan for infrastructure upgrades, such as adding additional storage and
network routers.
Capacity analysis information appears on the Capacity Analysis page.
Figure 1‑11. Example of Capacity Analysis Information
Analyzing and Balancing Virtual Resources
With the VC analysis feature, you can compare the metric values of different objects in your virtual
environment by using predefined heat maps or by creating your own custom heat maps.
A heat map contains rectangles of different colors and sizes, and each rectangle represents an object in your
virtual environment. The color of a rectangle represents the value of one metric, and the size of a rectangle
represents the value of another metric. For example, one of the predefined heat maps shows the total
memory and percentage of memory use for each virtual machine. Larger rectangles are virtual machines
that have more total memory. Green indicates low memory use and red indicates high memory use.
vCenter Operations Manager updates heat maps in real time as its collects new values for each object and
metric. The colored bar below a heat map is the legend. The legend identifies the values that the endpoints
represent and the midpoint of the color range.
Heat map objects are grouped by parent. A heat map that shows virtual machine performance groups
virtual machines by the ESX hosts on which they run.
VMware, Inc. 15
VMware vCenter Operations Manager Getting Started Guide
16 VMware, Inc.
Introducing Common Tasks2
With vCenter Operations Manager, you can perform a large number of monitoring and troubleshooting
tasks, including tracking anomalies, handling alerts, and optimizing resources based on predictive
information that vCenter Operations Manager generates. The best way to learn the full feature set of
vCenter Operations Manager is to get hands-on experience with the system.
Before you can perform these tasks, a vCenter Operations Manager administrator must configure the
resources that vCenter Operations Manager monitors and start data collection. For information about
configuring vCenter Operations Manager, see the VMware vCenter Operations Manager Administration Guide(Custom User Interface).
For information about all monitoring and troubleshooting tasks, see the vCenter Operations Manager online
help.
NOTE Your ability to use any vCenter Operations Manager feature depends on the access rights that a
vCenter Operations Manager administrator assigns to you. If you cannot use a feature, contact your
administrator to find out if your access rights should be adjusted.
This chapter includes the following topics:
“Logging In and Using vCenter Operations Manager,” on page 17
n
“Monitoring Day-to-Day Operations,” on page 19
n
“Handling Alerts,” on page 24
n
“Optimizing Your Resources,” on page 33
n
Logging In and Using vCenter Operations Manager
The vCenter Operations Manager client is a Web-based application. You use vCenter Operations Manager
by typing the URL of the Custom user interface in a Web browser.
Log In to the Custom User Interface
After you install vCenter Operations Manager and the vCenter Operations Manager services are running,
you can use a supported Web browser to connect to the vCenter Operations Manager server or vApp and
use the Custom user interface.
Prerequisites
Verify that you have a supported Web browser.
VMware, Inc.
17
VMware vCenter Operations Manager Getting Started Guide
Procedure
1In your Web browser, type the URL for the Custom user interface.
OptionDescription
Standalone version
vApp version
ip_address is the IP address or fully qualified host name of the vCenter Operations Manager server or
vApp.
2Type your user name and password.
3Click Login.
After you log in, the Home page appears in your browser window.
NOTE If your session is inactive for 30 minutes, it times out and you must log in again.
Using the Custom User Interface
After you log in to the Custom user interface, your Home page appears in the browser window.
Figure 2‑1. Example Home Page
https://ip_address
https://ip_address/vcops-custom
The Home page contains the following components.
Dashboards
The tabs near the top of the Home page are your dashboards. In the example,
the dashboards are Operations and Alerts by Type. The user groups to
which you belong determine which dashboards are available to you. A
vCenter Operations Manager administrator assigns you to one or more user
groups when he or she creates your user account. You can switch to a
different dashboard by clicking its tab or selecting it from the Dashboards
menu. You can click Home at any time to return to your Home page.
Widgets
The panes on a dashboard are called widgets. A widget is a collection of
related information about attributes, resources, applications, or the overall
processes in your environment. Each dashboard contains one or more
widgets. In the example, the Operations dashboard contains the Resource
Selector and Active Alerts widgets. If your user account has the necessary
access rights, you can customize dashboards and widgets.
18 VMware, Inc.
Chapter 2 Introducing Common Tasks
Menus
You use the menus at the top of your Home page to select and use
vCenter Operations Manager features. For example, you use the Reports
menu to generate reports, the Alerts menu to view alerts, and the Forensics
menu to use forensics features.
Icons
You click icons on pages and widgets to perform tasks in the Custom user
interface. For example, you can click the icons in a widget to expand,
collapse, resize, or remove the widget. When you point to an icon, a tooltip
appears that describes the function of the icon.
Using Breadcrumbs
When you move away from your Home page, a breadcrumb appears in the top left corner of the page under
the menu bar. Breadcrumbs help you navigate the Custom user interface and determine where you are.
You click the link in a breadcrumb to return to a specific page in the user interface.
Modify User Preferences
You can change the color scheme for your workspace, specify the time zone that
vCenter Operations Manager uses when it displays times in your workspace, or edit the password for your
user account.
Procedure
1Click User Preferences at the top of your Home page.
2Modify your user preferences on the Manage User Account Settings window.
OptionAction
Password
Scheme
Time Preference
Click change to change your password.
Select light or dark to change the color scheme for your workspace.
Select Browser to use the time settings on your computer or Host to sync
your vCenter Operations Manager session with the time on the
vCenter Operations Manager server.
You cannot change your user name, first name, last name, email address, or account description on the
Manage User Account Settings window. Only a vCenter Operations Manager administrator can change
these values for a user account.
3Click OK to save your changes.
Monitoring Day-to-Day Operations
Monitoring day-to-day operations involves evaluating the overall health of your enterprise and identifying
health problems for specific resources.
For each resource, vCenter Operations Manager determines a health score, which is a 0 to 100 ranking. One
of the ways that vCenter Operations Manager indicates the health of a resource is to show a colored
indicator. The color is based on the range of the health score.
VMware, Inc. 19
VMware vCenter Operations Manager Getting Started Guide
Table 2‑1. Default Health Color Ranges
ColorRange
Green76 to 100
Yellow51 to 75
Orange26 to 50
Red1 to 25
Blue0
View an Overview of Resource Health
You can view a graphical representation of the health of all vCenter Operations Manager resources that
have a specific resource tag value on the Environment Overview page.
Procedure
1Select Environment > Environment Overview to view the Environment Overview page.
2Select one or more resource tag values in the left pane.
3Click the Group tab.
The Group tab shows a colored icon that represents the current health of each resource that has the
selected tag value. If you point to a colored icon, a tooltip appears that describes the resource that the
icon represents.
OptionAction
View the health icons for any time
in the past six hours
View detailed information for a
resource
Move the slider at the bottom of the Group tab to the left. The slider moves
in five-minute increments.
Click the icon for the resource and click the Show Detail icon on the
toolbar.
Identify Health Problems for a Specific Resource
You can use the Resource Detail page to identify health problems for a specific resource. For a global
resource, the Resource Detail page contains information about the current state of the resource, its metrics,
and its place in the resource tree. For a virtual resource, the Resource Detail page shows information about
the main performance characteristics, key metrics, and events for the virtual resource.
Procedure
1Select Environment > Environment Overview.
2(Optional) In the left pane, expand a resource tag and select a tag value that is assigned to the resource.
Selecting a tag value can shorten the resource list and make it easier to find a specific resource.
3Select the resource on the List tab.
4Click the Show Detail icon.
The Resource Detail page appears for the resource.
20 VMware, Inc.
Chapter 2 Introducing Common Tasks
Resource Detail Information for Global Resources
For global resources, the Resource Detail page contains information about the current state of the resource,
its metrics, and its place in the resource tree. Because problems might be related to or caused by the
behavior of related resources, the Resource Detail page also shows details for child or parent resources.
For complete information about using the Resource Detail page, see the online help.
Health Status Pane
Located in the upper left of the Resource Detail page, this pane shows the current health score of the
resource. A health score is a 0 to 100 ranking that vCenter Operations Manager determines for each
resource. The Health Status pane shows the current health score of a global resource and a graph that shows
how the health score has changed over the last six hours.
NOTE If vCenter Operations Manager does not receive metrics for a resource at a particular time, it cannot
calculate the health score for that time and it shows a score of -1 on the graph. A -1 score can occur if
collection is turned off for a resource or if vCenter Operations Manager encounters a data gathering
problem.
Root Cause Ranking Pane
This pane shows information about metrics on related resources that contributed to alerts, including the
percentage likelihood that the metrics contributed to a root cause. vCenter Operations Manager bases the
percentage likelihood that a metric contributed to a root cause on the number of symptoms and when the
symptoms occurred relative to the alert.
For more information about the information in the Root Cause Ranking pane, see “Understanding Health
Symptoms,” on page 23.
Health Tree Pane
This pane shows the section of the resource hierarchy around a global resource, including all of the parent
container resources that hold the resource. If the resource is a container, the health tree also shows all of the
child resources that the container resource holds.
Metric Selector and Metric Graph Panes
When you click a resource in the Health Tree pane, the metric groups for the resource appear in the Metric
Selector pane. You can expand a metric group to view the individual metrics in the group. You can doubleclick a metric in the Metric Selector pane to view a graph for the metric in the Metric Graph pane. Metric
graphs show the recent performance and predicted future performance of metrics for a resource.
Resource Detail Information for Virtual Resources
For virtual resources, the Resource Detail page shows information about performance characteristics, key
metrics, and events. The information on the Resource Detail page is organized in several panes.
For resource pools and folders, the Resource Detail page shows the same information that it does for global
resources.
Status Pane
Located in the upper left of the Resource Detail page, this pane contains aggregated information about the
health, workload, anomalies, and faults of the selected resource. The default history graph period is six
hours. You can click metric icons to switch the metric to view.
Metric Details Pane
This pane occupies the middle of the Resource Detail page. The information that appears in this pane
depends on the metric that you select to view in the Status pane.
VMware, Inc. 21
VMware vCenter Operations Manager Getting Started Guide
Table 2‑2. Information in the Metric Details Pane
MetricDescription
HealthShows information for workload, anomalies, and faults.
WorkloadShows information for CPU and memory used by the currently selected resource, by the
hypervisor, and by the child objects of the selected resource. You can point to colored sections in
the bars to view information about the objects that they represent.
AnomaliesContains a list of symptoms for all child container objects of the currently selected resource. A
symptom is a metric that contributes to the health state of an objects. See “Understanding Health
Symptoms,” on page 23.
FaultsShows information for faults. A fault score indicates the degree of problems that the object is
experiencing. It includes events such as loss of redundancy in NICs or HBAs, memory checksum
errors, HA failover problems, and CIM events.
Workload Pane
This pane shows information about space use and read and write capacity. The Space bar represents the
amount of space used. The IOPs (input/output operations per second), Throughput, and Latency bars
provide read and write capacity information.
Key Metrics Pane
This pane contains the metrics of greatest interest related to the performance characteristic that you select.
In the Default key metrics view, vCenter Operations Manager selects the four most interesting metrics,
through bubbling, by following these criteria.
From all metrics, vCenter Operations Manager selects KPIs that are violating their thresholds. It sorts
n
KPIs by display order (CPU, memory, network I/O, and disk I/O).
From all remaining metrics, vCenter Operations Manager adds non-KPI metrics that are violating their
n
thresholds to the list by display order.
vCenter Operations Manager adds other metrics to the list by display order.
n
You can click All Metrics to view health status and resource hierarchy information, identify root causes of
health degradation, and view metric graphs that show recent performance and predicted performance of
metrics for the selected resource.
Resources Pane
This pane shows the current list of properties for the resource. You can click the More Details link to show
more available properties for the selected resource. The More Details link is not provided for certain virtual
resources.
Related Objects Pane
This pane shows values for the currently selected performance characteristic of all objects that are related to
the selected resource. Depending on the resource type that you selected (virtual machine, datacenter, or
vCenter Server system), this pane shows parent objects, peer objects, and child objects.
Events and Health Pane
This pane appears in the lower third of the Resource Detail page when you click Health in the Status pane.
You can expand this pane to view the graph of the current health metric values. If an administrator
configures it, the graph contains events that might affect the selected resource. You can use the icons at the
top of the pane to change the display.
22 VMware, Inc.
Chapter 2 Introducing Common Tasks
Events and Workload Pane
This pane appears in the lower third of the Resource Detail page when you click Workload in the Status
pane. You can expand this pane to view the graph of recent workload metric values. If an administrator
configures it, the graph shows events that might affect the selected resource. You can use the icons at the top
of the pane to change the display.
Events and Anomalies Pane
This pane appears in the lower third of the Resource Detail page when you click Anomalies in the Status
pane. You can expand this pane to view the graph of anomalies. If an administrator configures it, the graph
shows corresponding events that might affect the selected resource. You can use the icons at the top of the
pane to change the display.
Events and Faults Pane
This pane appears in the lower third of the Resource Detail page when you click Faults in the Status pane.
You can expand this pane to view a graph of faults. If an administrator configures it, the graph shows
corresponding events that might affect the selected resource. You can use the icons at the top of the pane to
change the display.
Storage and Network Pane
For objects that have storage and network resources, this pane shows basic storage-related metrics. The pie
chart uses both volume and color to present information. The volume of the pie chart represents the amount
of used disk space. The color coding visualizes the nearness of the moment when disk space is exhausted.
Understanding Health Symptoms
A symptom is a metric that contributes to the health state of an object. For a global resource, you view health
symptoms in the Root Cause Ranking pane on the Resource Detail page. For a virtual resource, you view
health symptoms in the Metric Details pane on the Resource Detail page.
The Resource Detail page lists symptoms by child resource kinds. The parentheses after the resource kind
name contain information about the number of symptoms that are violating their thresholds for the resource
group.
Figure 2‑2. Example of a Symptom Group
The example shows a portion of the type of information that you might see when you view health
symptoms. Metrics that are violating their thresholds appear in metric groups. The parentheses after the
metric group name contain the number of violations for the metrics in the metric group.
When you expand a metrics group, the list of metrics that are violating their thresholds appears. In each
metric row, you can check the percentage of objects that have threshold violations for the metric. A vertical
blue line represents the point in time when the first symptom became active.
Figure 2‑3. Example of an Expanded Symptom Group
VMware, Inc. 23
VMware vCenter Operations Manager Getting Started Guide
The icons in a metric row add information about the metric values that the row contains. When you point to
an icon, a tooltip appears that describes the meaning of the icon. You can double-click a metric row to view
details about the selected symptom.
Handling Alerts
An alert is a notification that informs you of an abnormal condition that might require attention. Handling
alerts involves viewing alerts, determining which alerts to respond to, managing alerts in the Custom user
interface, and identifying alert trends.
View Alerts on the Alerts Overview Page
By default, the Alerts Overview page shows alerts for all resources. You can filter the alert list by resource,
alert type and subtype, and alert status. You can also search for alerts generated for particular resources and
during a specific time period.
Procedure
1Select Alerts > Alerts Overview.
2(Optional) Filter the alert list.
OptionAction
Show alerts for resources that have
a specific value
Show alerts for resources that do
not have a specific value
Show alerts for resources that have
a specific resource name or
resource kind
Show alerts for a specific time
period
Show alerts that have a specific
type or subtype
Show alerts that have a specific
status
Remove an alert type or alert status
filter
Remove all resource filters
3(Optional) Click a column header and use the controls in the pop-up menu to sort the alert list or add or
remove columns from the display.
Select one or more resource tag values in the resource tag list. If you select
more than one value for the same tag, the list includes resources that have
either value. If you select values for two or more different tags, the list
includes only resources that have all of the selected values.
Select one or more resource tag values in the resource tag list and click the
Invert Result icon. For example, if you select New York and London, alerts
for all resources that are not in either city appear in the list.
Type a full or partial name in the Search text box and click the right angle
bracket (>).
Select a date from the Start Date and End Date menus and click the right
angle bracket (>).
Click one or more of the alert type icons at the top of the alert list. For
example, click the Smart (Early Warning), Smart (KPI Breach), and Smart(KPI Prediction) icons to show all smart alerts.
Click one or more of the alert status icons at the top of the alert list. For
example, click the Active Alerts icon and the Own Alerts icon to show the
active alerts assigned to you.
Click the icon again to toggle it off.
Click the Deselect All icon at the top of the resource tag list.
Alerts Overview Page Information
The information on the Alerts Overview page is organized in multiple columns.
Table 2‑3. Columns on the Alerts Overview Page
Column NameDescription
Critical LevelA color-coded icon that indicates the criticality level of the alert. See “Alert Criticality Levels,” on
page 25.
Sub-TypeAn icon that indicates the subtype of the alert. See “Alert Types and Subtypes,” on page 26.
24 VMware, Inc.
Chapter 2 Introducing Common Tasks
Table 2‑3. Columns on the Alerts Overview Page (Continued)
Column NameDescription
Start TimeThe date and time that vCenter Operations Manager first generated the alert.
DurationHow long the alert lasted.
StatusAn icon that indicates whether the alert is active or canceled.
Resource NameThe resource that is associated with the alert.
Resource KindThe kind of resource for which vCenter Operations Manager generated the alert.
MetricThe metric that violated a threshold and triggered the alert.
Root CauseThe symptom that most likely indicates the start of the chain of events that led to the alert.
Worst Sub
Containers
IdA sequential alert ID number.
TypeAn icon that indicates the type of alert. See “Alert Types and Subtypes,” on page 26.
Cancel TimeThe date and time that a user canceled the alert.
InfoFor external notification alerts, this column might contain additional information that was sent
Update TimeThe date and time that the alert was most recently updated.
Resource
Identifiers
Control StateAn icon that indicates whether the alert is open, assigned, suspended, or suppressed.
Resource IdThe ID of the resource that is associated with the alert.
User NameIf the alert is assigned to a user, the name of the user.
If the alert is for a resource that contains other container resources, such as an application that
contains tiers, this column contains icons that represent the subcontainers that most likely
contributed to the alert.
along with the alert message.
Up to five identifiers that uniquely identify the resource. Resource identifiers are often the same as
the resource name. Multiple resources can have the same name. The adapter type determines the
identifiers that appear.
Alert Criticality Levels
Every alert has a criticality level that specifies how serious the problem is and how quickly you should take
action.
Table 2‑4. Alert Criticality Levels
LevelColorValue
CriticalRed4
ImmediateOrange3
WarningYellow2
InfoBlue1
None0
vCenter Operations Manager predictive Smart Alerts, KPI prediction, and early warning alerts are always
critical alerts. Other types of alerts might be of any criticality, depending on the alert type and how the
attribute that trigged the alert is configured.
You can color code alerts by criticality on the Alerts Overview page and in the Alerts widget.
VMware, Inc. 25
VMware vCenter Operations Manager Getting Started Guide
Alert Types and Subtypes
vCenter Operations Manager generates several types of alerts, and each alert type has its own triggers and
contents. An alert might be caused by abnormal behavior of one or more metrics for a resource, including a
tier or application, or when a fingerprint predicts an upcoming problem.
By default, vCenter Operations Manager generates all types of alerts except for classic abnormality alerts.
Because they do not involve KPIs, classic abnormality alerts are unlikely to require any action.
Smart Alerts
A Smart Alert is triggered when an internally calculated value indicates that a problem occurs, or soon will
occur. The internally calculated value can be a correlated prediction of future behavior or a dynamic
threshold breach on a KPI. Except for metrics that are designated as KPIs, Smart Alerts do not depend on
user-defined values. Smart Alerts have certain subtypes.
Table 2‑5. Smart Alert Subtypes
SubtypeDescriptionDetermined By
KPI DT BreachA KPI breached one of its internally
calculated dynamic thresholds.
KPI PredictionvCenter Operations Manager predicts that a
KPI will soon breach a threshold. The current
combination of metrics might match a stored
metric fingerprint that predicts the breach, or
a metric that is correlated with the KPI has
breached its threshold.
KPI Prediction alerts are always critical.
Early WarningvCenter Operations Manager correlation
algorithms calculate a greater than 90 percent
chance above the noise threshold that there is
a problem with an application.
Early Warning alerts are always critical.
Analytics based on a user-defined
KPI.
Analytics based on a user-defined
KPI.
Analytics based on application
topology.
Classic Alerts
A classic alert is an alert that is generated by classic monitoring software. It relies on user-defined hard
thresholds, not vCenter Operations Manager dynamic thresholds or other advanced calculations. Classic
alerts have certain subtypes.
Table 2‑6. Classic Alert Subtypes
SubtypeDescriptionReason
KPI HT BreachA user-defined KPI breached a user-
defined hard threshold.
AbnormalityA non-KPI metric breached one of its
internally calculated dynamic
thresholds.
NotificationA non-KPI metric breached a user-
defined hard threshold. Typical system
and network management applications
send this type of alert.
A user-defined KPI breached one of its
user-defined hard thresholds.
A single non-KPI metric breached one of its
dynamic thresholds.
A single non-KPI metric breached one of its
internally calculated dynamic thresholds.
Administrative Alerts
An administrative alert indicates a problem with vCenter Operations Manager, the monitoring software
from which it collects data, or the enterprise network. Administrative alerts have certain subtypes.
26 VMware, Inc.
Chapter 2 Introducing Common Tasks
Table 2‑7. Administrative Alert Subtypes
SubtypeDescription
SystemA vCenter Operations Manager component failed.
EnvironmentvCenter Operations Manager stopped receiving data from
one or more resources. A problem might exist with the
resource, the monitoring software, or the network
infrastructure.
Resolve an Alert
How you resolve an alert depends on the alert's type and criticality level and your organization's rules,
procedures, and priorities. You can view information about the event that triggered an alert, its effects, and
its likely causes, on the Alert Summary page. The Alert Summary page usually contains enough information
to determine who in your enterprise should respond to the alert.
The top of the Alert Summary page shows the alert type, when the alert started, the duration of the alert,
and the associated resource or metric. The rest of the page is divided into separate panes.
NOTE Do not follow this procedure to resolve administrative system alerts.
Procedure
1Select Alerts > Alerts Overview and double-click the row for the alert in the alert list.
2Examine the information in the Reason pane on the Alert Summary page and determine the action to
take.
The Reason pane contains specific information about the alert, including the type of trigger that caused
it, the resource or metric for the trigger, and details about the trigger. You can click More next to the
resource name to view all of the properties for the resource.
OptionAction
The alert is for an application, tier,
or container
The alert is for a KPI
Identify the owner of the resource.
Identify the service level agreement (SLA) that is associated with the KPI.
3Examine the information in the Impact pane on the Alert Summary page.
The Impact pane shows health information for the last six hours for the resource for which the alert was
generated, including any subcontainers in the resource. For a tier, the pane shows a health graph for the
application that contains the tier and the current health of all of the tiers in the application. The pane
also shows a graph for up to five metrics. Breaching KPIs appear first, followed by breaching super
metrics, non-breaching KPIs, and non-breaching super metrics.
4If the information in the Impact pane indicates that KPIs were breached, identify the SLAs that are
associated with the KPIs.
VMware, Inc. 27
VMware vCenter Operations Manager Getting Started Guide
5Examine the information in the Root Cause pane on the Alert Summary page.
The Root Cause pane shows the likely root cause container resources, ranked by analytical algorithm on
the container resource. You can perform actions on the Root Cause pane to view symptom information.
OptionAction
Show the symptom groups for a
resource
Show the top five individual
symptoms in a symptom group
List the individual anomalies that
comprise a symptom
6Identify the owner of the root cause resource.
OptionAction
The Root Cause pane indicates
multiple tiers
The Root Cause pane indicates a
single resource
7Hand off resolution of the alert to the owner that you identified in Step 6.
Double-click the resource. Symptom groups are ranked by percentage of
possible resources that exhibit the symptoms in the group.
Double-click the symptom group. Symptoms are ranked by percentage of
possible resources that exhibit the symptom.
Double-click the symptom. A pop-up window opens that lists the
individual anomalies.
To copy the list to the clipboard, click the Copy to Clipboard icon.
n
This feature is useful if you are opening a problem ticket or sending an
email message about the alert.
To graph an anomaly, select an anomaly in the pop-up window and
n
click the Dynamic Dashboard icon.
Examine the first tier and identify its owner. Select all metrics that have 50
percent or greater probability.
Select all metrics that have 50 percent or greater probability.
Provide the highest-probability abnormal metrics from the Root Cause pane and the URL of the Alert
Summary page. For predictive alerts, also provide due time and probability information from the
Reason pane.
8Notify the owner that you identified in Step 2 and any other interested parties.
Provide SLA and KPI breach information, the URL of the Alert Summary page, and the name of the
person responsible for resolving the alert.
Resolving Administrative System Alerts
An administrative system alert indicates a problem with one of the vCenter Operations Manager
components. When you resolve an administrative system alert, follow certain recommended procedures.
For information about resolving administrative system alerts, see the VMware vCenter Operations ManagerAdministration Guide (Custom User Interface) or the online help.
Troubleshoot an Alert
The person responsible for fixing the condition that caused an alert typically uses the Alert Detail page to
troubleshoot the problem.
The Alert Detail page contains information that can help you diagnose the cause of the behavior that
resulted in the alert and determine how to prevent the alert from happening again.
Procedure
1Select Alerts > Alerts Overview and double-click the row for the alert in the alert list.
2On the Alert Summary page, click the Troubleshoot button.
The Alert Detail page appears for the alert.
28 VMware, Inc.
Chapter 2 Introducing Common Tasks
Alert Detail Information
The Alert Detail page contains information that can help you diagnose the cause of the behavior that
resulted in the alert and determine how to prevent the alert from happening again.
For detailed information about using each pane and tab on the Alert Detail page, see the online help.
Reason Pane
This pane contains specific information about the alert, including the type of trigger that caused it, the
resource or metric for the trigger, and details about the trigger.
Impact Pane
This pane shows health information for the last six hours for the resource for which the alert was generated,
including any subcontainers in the resource. For a tier, the pane shows a health graph for the application
that contains the tier and the current health of all of the tiers in the application. The pane also shows a graph
for up to five metrics. Breaching KPIs appear first, followed by breaching super metrics, non-breaching
KPIs, and non-breaching super metrics.
Mashup Tab
This tab contains mashup charts, which show different aspects of the behavior of a resource. Mashup charts
include a health chart, an anomaly count graph, and metric graphs.
The health chart includes each alert for the specified time period.
n
The anomaly count graph shows the number of anomalies for the resource and its children at a specific
n
time. For an application, the anomaly graph shows the anomaly count for tiers that contain root cause
metrics for the resource. A red line marks the noise threshold for the resource. An anomaly count
higher than the red line indicates a 90 percent probability of a problem and triggers an early warning
alert.
Metric graphs appear for all of the KPIs for any resource that is listed as a root-cause resource. For an
n
application, the root-cause resource is the application and tiers that contain root causes.
Timeline Tab
This tab shows all of the anomalies that contributed to an alert. The anomalies appear in a dual-drag
timeline that starts with the beginning of the first anomaly and ends with the current time or, if the alert was
canceled, the cancel time.
Relationships Tab
This tab shows a resource relationship chart for the resource for which an alert was generated. Resource
relationship charts show the structure of the topography around a specific resource, including parent and
child resources.
Metric Charts Tab
This tab shows metric graphs for the resource for which an alert was generated. Metric graphs show the
recent performance and predicted future performance of metrics.
Correlations Tab
You can show the behavior or anomaly correlations to the KPI metrics that contributed to an alert on this
tab. Each metric has a percentage correlation. The larger the number, the more closely the metrics are
related.
Notes Tab
You can add and view notes for an alert on this tab.
VMware, Inc. 29
VMware vCenter Operations Manager Getting Started Guide
Managing Alerts
You can suspend, suppress, and take or release ownership of multiple alerts on the Alerts Overview page.
When you perform an action on an alert, the alert's status changes. You can also manage a single alert on the
Alert Detail page.
Take Ownership of an Alert
When you take ownership of an alert, you acknowledge that the alert is yours. Taking ownership of an alert
is important if multiple operators manage alerts in your environment. You can take ownership of multiple
alerts on the Alerts Overview page.
You can also take ownership of alerts in the Alerts widget.
Procedure
1Select Alerts > Alerts Overview and select the alert or alerts in the alert list.
You can press Ctrl+click to select multiple alerts or Shift+click to select a range of alerts.
2Click the Take Ownership icon.
3Click Yes in the confirmation dialog box.
When you own the alert, the Assigned Alerts icon appears in the Control State column and your user name
appears in the User Name column in the alert list.
Suspend an Alert
When you suspend an alert, you can specify a number of minutes. If the problem condition exists after the
time elapses, vCenter Operations Manager reactivates the alert. You can suspend multiple alerts on the
Alerts Overview page.
When you suspend an alert, you take ownership of it. You cannot suspend an alert that another user owns.
You can also suspend alerts in the Alerts widget.
NOTE Suspending an alert does not cancel its cancel cycle. The alert is still canceled according to the cancel
cycle value set when the resource or application was configured.
Procedure
1Select Alerts > Alerts Overview and select the alert or alerts in the alert list.
You can press Ctrl+click to select multiple alerts or Shift+click to select a range of alerts.
2Click the Suspend icon.
3Type the number of minutes to suspend the alert and click OK.
When the alert is suspended, the Suspended Alerts icon appears in the Control State column in the alert list.
Suppress an Alert
When you suppress an alert, you can specify a specific number of days. If the problem condition exists after
the time has elapsed, vCenter Operations Manager reactivates the alert. You can suppress multiple alerts on
the Alerts Overview page.
When you suppress an alert, you take ownership of it. You cannot suppress an alert that another user owns.
30 VMware, Inc.
Loading...
+ 82 hidden pages
You need points to download manuals.
1 point = 1 manual.
You can buy points or you can get point for every manual you upload.