VMware vCenter Operations Manager - 5.8.5 Getting Started Guide

VMware vCenter Operations Manager
Getting Started Guide
vSphere User Interface
vCenter Operations Manager 5.8.5
This document supports the version of each product listed and supports all subsequent versions until the document is replaced by a new edition. To check for more recent editions of this document, see http://www.vmware.com/support/pubs.
You can find the most up-to-date technical documentation on the VMware Web site at:
http://www.vmware.com/support/
The VMware Web site also provides the latest product updates.
If you have comments about this documentation, submit your feedback to:
docfeedback@vmware.com
Copyright © 2015 VMware, Inc. All rights reserved. Copyright and trademark information.
VMware, Inc.
3401 Hillview Ave. Palo Alto, CA 94304 www.vmware.com

Contents

VMware vCenter Operations Manager Getting Started Guide
1 Introducing Key Features and Concepts
  vCenter Operations Manager Features
  Main Concepts of vCenter Operations Manager
  Metric Concepts for vCenter Operations Manager
2 Beginning to Monitor the Virtual Environment
  Object Type Icons in the Inventory Pane
  Badge Concepts for vCenter Operations Manager
  Major Badges in vCenter Operations Manager
  Working with Metrics and Charts on the All Metrics Tab
3 Viewing Members and Relationships in the Monitored Environment
  Check the Performance of Your Virtual Environment
  Balancing the Resources in Your Virtual Environment
  Find an ESX Host that Has Resources for More Virtual Machines
  Find a Cluster that Has Resources Available for More Virtual Machines
  Ranking the Health, Risk, and Efficiency Scores
  View the Compliance Details
  View a List of Members
  Overview of Relationships
4 Troubleshooting with vCenter Operations Manager
  Troubleshooting Overview
  Troubleshooting a Help Desk Problem
  Troubleshooting an Alert
  Finding Problems in the Virtual Environment
  Finding the Cause of the Problem
  Fix the Cause of the Problem
5 Assessing Risk and Efficiency in vCenter Operations Manager
  Analyzing Data for Capacity Risk
  Optimizing Data for Capacity
  Forecasting Data for Capacity Risk
6 Working with Faults and Alerts
  Events that Generate Faults
  Monitoring Alerts in vCenter Operations Manager
7 Working with Groups
  Create a Group Type
  Edit a Group Type
  Delete a Group Type
  Create a Group
  Managing Groups
  Application Custom Group
8 Set How Data Appears in vCenter Operations Manager
  Create a New Policy
  Modify an Existing Policy
  Modify Summary, Views, and Reports Settings
9 Monitor the Performance of vCenter Operations Manager
  Check the Health State of vCenter Operations Manager
  Monitor Specific Metrics for vCenter Operations Manager
  Monitor Specific Metrics for a vCenter Operations Manager Component
Index

VMware vCenter Operations Manager Getting Started Guide

The VMware vCenter Operations Manager Getting Started Guide provides information about using VMware vCenter™ Operations Manager to monitor your virtual environment.
Intended Audience
This guide is intended for administrators of VMware vSphere who want to familiarize themselves with workflow tasks to monitor and manage the performance of the vCenter Operations Manager virtual environment.
Chapter 1 Introducing Key Features and Concepts
vCenter Operations Manager provides monitoring functionality for your virtual environment. Understanding important features and concepts of vCenter Operations Manager helps you use the product effectively.
This chapter includes the following topics:
• “vCenter Operations Manager Features”
• “Main Concepts of vCenter Operations Manager”
• “Metric Concepts for vCenter Operations Manager”

vCenter Operations Manager Features

vCenter Operations Manager collects performance data from each object at every level of your virtual environment, from individual virtual machines and disk drives to entire clusters and datacenters. It stores and analyzes the data, and uses that analysis to provide real-time information about problems, or potential problems, anywhere in your virtual environment.
vCenter Operations Manager works with existing VMware products to add the following functions:
• Combines key metrics into single scores for environmental health and efficiency and capacity risk.
• Calculates the range of normal behavior for every metric and highlights abnormalities. Adjusts the dynamic thresholds as incoming data allows it to better define the normal values for a metric.
• Presents graphical representations of current and historical states of your entire virtual environment or selected parts of it.
• Displays information about changes in the hierarchy of your virtual environment. For example, when a virtual machine is moved to a different ESX host, you can see how these changes affect the performance of the objects involved.
• Allows you to define "group" containers to organize monitored objects in accordance with the structure of your environment.

Main Concepts of vCenter Operations Manager

vCenter Operations Manager uses certain concepts that can help you understand the product, its interface, and how to use it.
Attributes and Metrics
vCenter Operations Manager collects several kinds of data for each inventory object. For example, for a virtual machine, vCenter Operations Manager might receive data about free disk space, CPU load, and available memory. Each type of data that vCenter Operations Manager collects is called an attribute. An instance of an attribute for a specific inventory object is called a metric. For example, free memory for a specific virtual machine is a metric.
For each metric, vCenter Operations Manager collects and stores multiple readings over time. For example, the vCenter Operations Manager server polls for information about the CPU load for each virtual machine once every five minutes. Each piece of data that vCenter Operations Manager collects is called a metric value.
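The relationship between attributes, metrics, and metric values can be pictured as a small data model. The following Python sketch is illustrative only. The class names, field names, and the virtual machine name are assumptions made for this example and are not part of any vCenter Operations Manager interface.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass(frozen=True)
class Attribute:
    """A type of data that is collected, for example CPU load."""
    name: str


@dataclass
class Metric:
    """An instance of an attribute for one specific inventory object."""
    attribute: Attribute
    object_name: str
    # Metric values: (timestamp, reading) pairs collected over time.
    values: list[tuple[datetime, float]] = field(default_factory=list)

    def add_value(self, timestamp: datetime, reading: float) -> None:
        """Store one more reading for this metric."""
        self.values.append((timestamp, reading))


cpu_load = Attribute("CPU load")
vm_cpu = Metric(cpu_load, "vm-01")  # hypothetical virtual machine name
vm_cpu.add_value(datetime(2015, 1, 5, 9, 0), 42.0)
vm_cpu.add_value(datetime(2015, 1, 5, 9, 5), 47.5)  # five minutes later
```

In this model, one attribute is shared by many metrics, and each metric accumulates its own series of metric values.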
Dynamic Thresholds
vCenter Operations Manager defines dynamic thresholds for every metric based on the current and historical values of the metric. The normal range of values for a metric can differ on different days at different times because of regular cycles of use and behavior. vCenter Operations Manager tracks these normal value cycles and sets the dynamic thresholds accordingly. High metric values that are normal at one time might indicate potential problems at other times. For example, high CPU use on Friday afternoons, when weekly reports are generated, is normal. The same value on Sunday morning, when nobody is at the office, might indicate a problem.
vCenter Operations Manager continuously adjusts the dynamic thresholds. The new incoming data allows vCenter Operations Manager to better define what value is normal for a metric. The dynamic thresholds add context to metrics that allows vCenter Operations Manager to distinguish between normal and abnormal behavior.
Dynamic thresholds eliminate the need for the manual effort required to configure hard thresholds for hundreds or thousands of metrics. More importantly, they are more accurate than hard thresholds. Dynamic thresholds allow vCenter Operations Manager to detect deviations based on the actual normal behavior of an object and not on an arbitrary set of limits.
The analytics algorithms take seven days to calculate the initial values for dynamic thresholds. Dynamic thresholds appear as line segments under the bar graphs for use metrics on the Details page and on the Scoreboard page. The length and the position of the dynamic threshold line segment depend on the calculated normal values for the selected use metrics. Dynamic thresholds also appear as shaded gray areas of the use metrics graphs on the All Metrics page.
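The idea of a normal range that varies with the day and time can be sketched in a few lines of Python. This is not the analytics that vCenter Operations Manager uses, which this guide does not describe. It is a simplified illustration that assumes the normal range for each hour of the week is the band between two percentiles of the historical values seen in that hour. The function names and the percentile choices are hypothetical.

```python
from collections import defaultdict
from statistics import quantiles


def dynamic_thresholds(samples, low_pct=5, high_pct=95):
    """Return {(weekday, hour): (low, high)} from (timestamp, value) samples.

    Assumption: the normal range for each hour of the week is the band
    between two percentiles of the historical values seen in that hour.
    """
    buckets = defaultdict(list)
    for timestamp, value in samples:
        buckets[(timestamp.weekday(), timestamp.hour)].append(value)

    ranges = {}
    for slot, values in buckets.items():
        if len(values) < 2:
            continue  # not enough history to define a range
        cuts = quantiles(values, n=100, method="inclusive")
        ranges[slot] = (cuts[low_pct - 1], cuts[high_pct - 1])
    return ranges


def is_abnormal(ranges, timestamp, value):
    """Return True if the value is outside the normal range for its slot."""
    slot = (timestamp.weekday(), timestamp.hour)
    if slot not in ranges:
        return False  # no threshold yet, so nothing to compare against
    low, high = ranges[slot]
    return value < low or value > high
```

With history in which CPU use is high on Friday afternoons and low on Sunday mornings, the same high reading falls inside the Friday range but outside the Sunday range.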
Hard Thresholds
Unlike dynamic thresholds, hard thresholds are fixed values that you enter to define what is normal behavior for an object. These arbitrary values do not change over time unless you change them manually. You cannot fix hard thresholds with vCenter Operations Manager.
Key Performance Indicators
vCenter Operations Manager defines attributes that are critical to the performance of an object as key performance indicators (KPIs). KPIs are weighted more heavily in the calculations that determine the health of an object. Graphs of KPI performance appear before other metrics in several areas of the product.
Alerts and Faults
vCenter Operations Manager generates alerts when events occur on the monitored objects, when data analysis indicates deviations from normal metric values, or when a problem occurs with one of the vCenter Operations Manager components.
Events that the vCenter Server publishes are the main source for faults. These events might originate in the vCenter Server itself, or ESX servers might generate them and the vCenter Server publishes them externally. Only a subset of vCenter events is considered important for fault generation.

Metric Concepts for vCenter Operations Manager

Preparing to monitor your environment with vCenter Operations Manager requires some familiarity with the metrics that help you to identify a problem.
vCenter Server presents a use-based model of metrics. vCenter Operations Manager presents a demand-based model of metrics. Some knowledge of the metrics that affect the data and graphs is useful to determine what to do next in a workflow.
Table 1-1. Major Metric Concepts
Metric | Description
Provisioned | Amount of a resource that the user configures. The provisioned metric might apply to the amount of physical memory for a host or the number of vCPUs for a virtual machine.
Usable | Actual amount of a resource that the object can use. The usable amount is less than or equal to the provisioned amount. The difference between the provisioned amount and usable amount stems from virtualization overhead. This overhead might include the memory that an ESX host uses to run the host, to support reservations for virtual machines, and to add a buffer for high availability. The usable metric does not apply to virtual machines.
Usage | Amount of a resource that an object consumes. The usage amount is less than or equal to the usable amount.
Demand | Amount of a physical resource that the object might consume without any existing constraints. An object becomes constrained because of under-provisioning or contention with other consumers of the resource. A virtual machine might require 10GB of memory but can only get 5GB because the virtual machine must share resources with other virtual machines on the host. When the demand amount is less than the usage amount, the environment might have wasted resources. When the demand amount is greater than the usage amount, the environment might incur latency and exhibit decreased performance.
Contention | Difference between the amount of the resource that the object requires and the amount of the resource that the object gets. This metric measures the effect of conflict for a resource between consumers. Contention measures latency or the amount of time it takes to gain access to a resource. This measurement accounts for dropped packets for networking.
Limit | Maximum amount that an object can obtain from a resource. The limit sets the upper bound for CPU, memory, or disk I/O resources that you allocate and configure in vCenter Server. The usage amount is less than or equal to the limit amount. The demand amount can be greater than the limit amount. The limit amount is less than or equal to the provisioned amount. The default limit amount is unlimited. Rules: Usage <= Limit. Demand can be greater than Limit.
Reservation | Guaranteed amount of resources for an object. The object does not start without this reserved amount. The default amount is 0.
Entitlement | Amount of a resource that a virtual machine can use based on the relative priority of that consumer set by the virtualization configuration. This metric is a function of provisioned, limit, reservation, shares, and demand. Shares involve proportional weighting that indicates the importance of a virtual machine. The entitlement amount is less than or equal to the limit amount. The entitlement metric applies only to virtual machines. Rules: Entitlement <= Limit <= Provisioned. Entitlement >= Reservations.
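The rules stated for the limit, reservation, and entitlement metrics can be expressed as a simple consistency check. The following Python sketch is a hypothetical helper written for this guide, not a product function. It assumes that all amounts describe one resource in the same unit, and that an unlimited limit can be treated as equal to the provisioned amount. Demand is deliberately not checked, because demand can be greater than the limit.

```python
def check_metric_rules(provisioned, limit, reservation, entitlement, usage):
    """Return the list of metric rules that the given amounts violate.

    All arguments are amounts of one resource in the same unit.
    Pass limit=None to represent the default, unlimited limit.
    """
    # Assumption for this sketch: with no limit set, the provisioned
    # amount acts as the effective upper bound.
    effective_limit = provisioned if limit is None else limit

    violations = []
    if usage > effective_limit:
        violations.append("Usage <= Limit")
    if effective_limit > provisioned:
        violations.append("Limit <= Provisioned")
    if entitlement > effective_limit:
        violations.append("Entitlement <= Limit")
    if entitlement < reservation:
        violations.append("Entitlement >= Reservation")
    return violations
```

An empty list means that the amounts are consistent with the rules. Any returned rule names point to values that are worth rechecking.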
Chapter 2 Beginning to Monitor the Virtual Environment
To use vCenter Operations Manager to monitor your virtual environment, you must understand the icons, badges, and key metric concepts used in the product.
This chapter includes the following topics:
• “Object Type Icons in the Inventory Pane”
• “Badge Concepts for vCenter Operations Manager”
• “Major Badges in vCenter Operations Manager”
• “Working with Metrics and Charts on the All Metrics Tab”

Object Type Icons in the Inventory Pane

All objects that vCenter Operations Manager monitors are listed in the inventory pane.
vCenter Operations Manager uses specific icons so that you can distinguish between virtual machines, ESX hosts, and other objects in the inventory.
Table 2-1. Object Type Icons
Description
World. The World object is a logical container for all monitored objects in vCenter Operations Manager.
vCenter Server system
Datacenter
Cluster
ESX host that is in powered-off state
ESX host that is in powered-on state
Datastore
Virtual machine that is in powered-off state
Virtual machine that is in powered-on state
Custom group created by vCenter Operations Manager or by administrators
By default, objects in the inventory pane are grouped by hosts and clusters. You can select Datastores from the drop-down menu at the top of the inventory pane to switch the way objects are grouped.

Badge Concepts for vCenter Operations Manager

vCenter Operations Manager uses badges to illustrate derived metrics to provide an overview of the state of the virtual environment or an individual object. These badges serve as focus points to narrow the scope of a potential problem and provide details about the cause of the problem.
vCenter Operations Manager provides major and minor badges that are color coded and range from a healthy green to a potentially problematic yellow, orange, or red. Badges are organized in a simple hierarchy in which the scores of minor badges contribute to the scores of major badges.
Scores might reflect a healthy state or a potential problem depending on the type of badge. For example, low scores for health, time remaining, and capacity remaining might indicate potential problems, while low scores for faults, stress, or anomalies indicate a normal state.
You can use the Dashboard tab for an overview of the performance and condition of your virtual infrastructure. The information you see on the Dashboard tab depends on the object you select in the navigation tree. See “Object Type Icons in the Inventory Pane.”
You can expand the panes on the Dashboard to view information about a specific badge. You can also double-click badges to view details about the metrics that affect badge scores.

Major Badges in vCenter Operations Manager

vCenter Operations Manager generates major badges that start a workflow and help you to identify health, capacity risk, and efficiency issues.
Each major badge contains minor badges. vCenter Operations Manager calculates major badges based on the state of the minor badges.

Using Health to Measure the Overall State of the Environment

The vCenter Operations Manager Health badge serves as the first high-level indicator of the state of the virtual environment.
The Health badge indicates immediate problems that might require your attention. It helps you identify the current health of your system. vCenter Operations Manager combines workload, anomalies, and faults to assess the overall health and to determine whether the workload level is expected in that environment. A low health score might indicate a potential problem.
vCenter Operations Manager calculates the Health score by using the scores of the sub-badges that the Health badge contains. Faults are given precedence in the Health score because they describe existing problems, while Workload and Anomalies are combined to identify performance problems. This approach ensures that the score of the Health badge reflects the actual state of the object, without exaggerating or underestimating problems.
The Health score ranges between 0 (bad) and 100 (good). The badge changes its color based on the badge score thresholds that are set by the vCenter Operations Manager administrator.
Table 2-2. Object Health States
Description | User Action
The health of the object is normal. | No attention required.
The object is experiencing some level of problems. | Check the Details tab and take appropriate action.
The object might have serious problems. | Check the Details tab and take appropriate action as soon as possible.
The object is either not functioning properly or will stop functioning soon. | Check the Details tab to identify the most probable cause of the problem and act immediately.
No data is available. | -
The object is offline. | -
A vCenter Operations Manager administrator can change the badge score thresholds. For example, a green Health badge can indicate a score above 80 instead of 75, as set by default.
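The effect of changing a badge score threshold can be illustrated with a short Python sketch. Only the default green cutoff of 75 comes from this guide. The yellow and orange cutoffs, the function name, and the parameter names are hypothetical placeholders chosen for illustration.

```python
def health_badge_color(score, green_above=75, yellow_above=50, orange_above=25):
    """Map a Health score (0 = bad, 100 = good) to a badge color.

    green_above=75 is the default mentioned in this guide. The yellow and
    orange cutoffs are hypothetical placeholders for illustration.
    """
    if not 0 <= score <= 100:
        raise ValueError("Health score must be between 0 and 100")
    if score > green_above:
        return "green"
    if score > yellow_above:
        return "yellow"
    if score > orange_above:
        return "orange"
    return "red"
```

With these assumed cutoffs, a score of 78 is green by default, but it is no longer green after the administrator raises the green threshold to 80.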
The Health Weather Map
The Health Weather Map displays the health status of all the objects related to the object that you have selected in the inventory pane.
The Health Weather Map is available for all objects in the inventory except virtual machines. For virtual machines, vCenter Operations Manager displays a Health Trend graph.
Each square in the Weather Map represents a related object, directly or indirectly connected to the selected object. For example, if you selected an ESX host in the inventory pane, the number of squares in the Health Weather Map equals the sum of all virtual machines and datastores under this ESX host plus the Datacenter, vCenter Server, and the World object that are above the ESX host in the inventory pane. The squares in the Health Weather Map are not sized, so there is no visible correspondence between object types in the inventory and the squares in the map. You can use the Health Weather Map for a quick overview of the current condition and how it has changed during the past 6 hours. If you notice a red or yellow square, you can click it to navigate to the Details tab of the object that corresponds to this square, and view more information about its health.
By default, the Health Weather Map displays the current badge values. You can click the time line at the bottom of the map to switch to earlier periods.
NOTE Selecting an earlier time period that is one or more hours away from the current moment displays the condition of the monitored system as it was on the hour. For example, if the current time is 3:15 p.m. and you click -1, vCenter Operations Manager displays the Health Weather Map for 3:00 p.m. If you click -2, vCenter Operations Manager displays the Health Weather Map for 2:00 p.m.
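The on-the-hour behavior in the note can be expressed as a small calculation. The following Python sketch is an illustration with a hypothetical function name. It assumes that an offset of 0 means the current badge values and that each negative offset on the time line corresponds to one full hour.

```python
from datetime import datetime, timedelta


def weather_map_time(now, offset):
    """Return the point in time shown for a time line offset such as -1 or -2.

    offset=0 means the current badge values. A negative offset snaps to the
    top of the current hour, then steps back one hour per additional unit.
    """
    if offset > 0:
        raise ValueError("offset must be 0 or negative")
    if offset == 0:
        return now
    top_of_hour = now.replace(minute=0, second=0, microsecond=0)
    return top_of_hour - timedelta(hours=-offset - 1)
```

For a current time of 3:15 p.m., an offset of -1 yields 3:00 p.m. and an offset of -2 yields 2:00 p.m., which matches the example in the note.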
Using the Workload Badge Under the Health Badge
The vCenter Operations Manager Workload badge measures how hard an object must work for resources. A workload score of 0 indicates that a resource is not being used. A score that approaches or exceeds 100 indicates a workload that might cause performance problems.
Workload is an absolute measurement that calculates the demand for a resource divided by the capacity of an object. Resources might include CPU, memory, disk I/O, or network I/O.
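The ratio of demand to capacity can be sketched as follows. This Python example assumes that the ratio is expressed as a percentage, which is consistent with a score that can exceed 100. How vCenter Operations Manager combines the per-resource values into a single badge score is not described here, so the sketch only reports each resource and the most constrained one. The function names and resource keys are hypothetical.

```python
def workload_scores(demand, capacity):
    """Return per-resource workload as demand divided by capacity, in percent.

    demand and capacity are dicts keyed by resource name, using the same
    unit for both values of a given resource.
    """
    scores = {}
    for resource, amount in demand.items():
        if capacity[resource] <= 0:
            raise ValueError(f"capacity for {resource} must be positive")
        scores[resource] = 100.0 * amount / capacity[resource]
    return scores


def most_constrained(scores):
    """Return the (resource, score) pair with the highest workload."""
    return max(scores.items(), key=lambda item: item[1])
```

For example, a demand of 10 GB of memory against a capacity of 8 GB produces a value above 100, which signals that demand exceeds what the object can supply.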
The Workload score ranges from 0 (good) to over 100 (bad). The badge changes its color based on the badge score thresholds that are set by the vCenter Operations Manager administrator.
Table 2-3. Object Workload States
Description | User Action
Workload on the object is not excessive. | No attention required.
The object is experiencing some high resource workloads. | 1. At the upper right of the Health tab, click the View details icon to open the VMware vCenter Operations Manager user interface. 2. Check the Details tab to identify which resources experience abnormal workload.
Workload on the object is approaching its capacity in at least one area. | 1. At the upper right of the Health tab, click the View details icon to open the VMware vCenter Operations Manager user interface. 2. Check the Details tab to identify which resources are limited, and take appropriate action as soon as possible.
Workload on the object is at or over its capacity in one or more areas. | 1. At the upper right of the Health tab, click the View details icon to open the VMware vCenter Operations Manager user interface. 2. Check the Details tab to identify which resources are exhausted. Act immediately to avoid or correct problems.
No data is available. | -
The object is offline. | -
A vCenter Operations Manager administrator can change the badge score thresholds. For example, a green Workload badge can indicate a score below 80 instead of 85, as set by default.
Using the Anomalies Badge Under the Health Badge
The vCenter Operations Manager Anomalies badge measures the extent of abnormal behavior for an object based on historical metrics data. A high number of anomalies might indicate a potential issue.
A low Anomalies score indicates that an object is behaving in accordance with its established historical parameters. Most or all of the object metrics, especially its KPIs, are within their thresholds. Because changes in behavior often indicate developing problems, if the metrics of an object go outside the calculated thresholds, the anomalies score for the object grows. As more metrics breach the thresholds, anomalies continue to increase. Violations by KPI metrics increase the Anomalies score more than violations by non-KPI metrics. A high number of anomalies usually indicates a problem or at least a situation that requires your attention.
Anomalies involves the number of statistics that fall outside of the expected behavior trends, while Workload involves an absolute measurement of how hard an object works for resources. Both Anomalies and Workload are useful when you attempt to find a probable cause and troubleshoot performance problems.
The Anomalies score ranges between 0 (good) and 100 (bad). The badge changes its color based on the badge score thresholds that are set by the vCenter Operations Manager administrator.
Table 2-4. Object Anomalies States
Description | User Action
The Anomalies score is normal. | No attention required.
The Anomalies score exceeds the normal range. | Check the Details tab to identify what causes the abnormal number of anomalies, and take appropriate action.
The Anomalies score is very high. | Check the Details tab to identify the cause of the abnormal behavior, and take appropriate action as soon as possible.
Most of the metrics are beyond their thresholds. This object might not be working properly or might stop working soon. | Check the Details tab, and act immediately to avoid or correct problems.
No data is available. | -
The object is offline. | -
A vCenter Operations Manager administrator can change the badge score thresholds. For example, a green Anomalies badge can indicate a score below 60 instead of 50, as set by default.
Using the Faults Badge Under the Health Badge
The Faults badge measures the degree of problems that the object might experience based on events retrieved from the vCenter Server.
The events that might generate faults include the loss of redundancy in NICs or HBAs, memory checksum errors, high availability failover, or Common Information Model (CIM) events, which require your immediate attention.
NOTE vCenter Operations Manager calculates the Faults score for the World object by using the Faults scores of the vCenter Server systems, the ESX hosts and clusters, and the datastores that belong to the World inventory.
Therefore, if you have one vCenter Server with a Faults score of 100 and another vCenter Server with a Faults score of 0, this might not necessarily result in a Faults score of 50 for the World object. The final Faults score for the World object also depends on the number of datastores, ESX hosts and clusters in the environment, and the Fault scores of these objects.
Fault scores of virtual machines are not taken into account when calculating the Faults score for the World object.
While the Faults score ranges from 0 to 100, the badge changes color based on the badge score thresholds that are set by the vCenter Operations Manager administrator. For example, a green Faults badge can indicate a score below 40 instead of a score below 25 (the system default).
Table 2-5. Object Faults States
Description | User Action
No faults are registered on the selected object. | No attention required.
Faults of low importance are registered on the selected object. | Check the Details tab to find more information and take appropriate action.
Faults of high importance are registered on the selected object. | Check the Details tab to find more information about the reported faults, and take appropriate action as soon as possible.
Faults of critical importance are registered on the selected object. | Check the Details tab to find more information about the reported faults, and act immediately to avoid or correct problems.
No data is available. | -
The object is offline. | -

Defining Risk to Assess Future Problems in vCenter Operations Manager

The vCenter Operations Manager Risk badge indicates a potential performance problem in the near future that might affect the virtual environment.
Risk involves the time remaining, capacity remaining, and stress factors that account for the time buffer, remaining virtual machines, and degree of habitual high workload.
vCenter Operations Manager calculates the risk score using the scores of the sub-badges that the Risk badge contains. The formula applied to calculate the risk score is an inverse geometric weighted mean.
The overall risk score for an object ranges from 0 (no risk) to 100 (serious risk). The badge changes its color based on the badge score thresholds that are set by the vCenter Operations Manager administrator.
Table 2-6. Object Risk States
Description | User Action
The selected object has no current problems. No problems are expected in the future. | No attention required.
A low chance of future problems exists or a potential problem might occur in the far future. | Navigate to the Scoreboard tab to check which resources are likely to exhaust and plan for appropriate actions.
A chance of a more serious problem exists or a problem might occur in the medium-term future. | Navigate to the Scoreboard tab to check which resources are close to exhausting and take appropriate actions as soon as possible.
The chances of a serious future problem are high or a problem might occur in the near future. | Navigate to the Scoreboard tab to check which resources are exhausted and act immediately to resolve or prevent problems.
No data is available. | -
The object is offline. | -
Using the Time Remaining Badge Under the Risk Badge
The vCenter Operations Manager Time Remaining badge measures the time before a resource associated with an object reaches capacity. This badge indicates the available timeframe to provision or load balance the physical or virtual resources for a selected object.
vCenter Operations Manager calculates the Time Remaining score as a percentage of time that is remaining for each compute resource compared to the provisioning buffer you set in the Configuration dialog box. By default, the Time Remaining score provisioning buffer is 30 days. If even one of the compute resources has less time remaining than the provisioning buffer, the Time Remaining score is 0.
For example, if the provisioning buffer is set to 30 days, and the object that you selected has CPU resources for 81 days, memory resources for 5 days, disk I/O resources for 200 days, and network I/O resources for more than one year, the Time Remaining score is 0, because one of the resources has capacity for less than 30 days.
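The rule that a single resource below the provisioning buffer forces the score to 0 can be sketched in Python. The example covers only that rule, because the exact percentage calculation for scores above 0 is not detailed here. The function names and resource keys are hypothetical.

```python
def resources_below_buffer(days_remaining, buffer_days=30):
    """Return the resources whose remaining time is below the buffer.

    days_remaining maps a resource name to its remaining days of capacity.
    buffer_days defaults to the 30-day provisioning buffer.
    """
    return sorted(
        name for name, days in days_remaining.items() if days < buffer_days
    )


def score_forced_to_zero(days_remaining, buffer_days=30):
    """Return True if at least one resource has less time than the buffer."""
    return bool(resources_below_buffer(days_remaining, buffer_days))


# Values from the example above; 400 stands in for "more than one year".
example = {"cpu": 81, "memory": 5, "disk_io": 200, "network_io": 400}
```

For the example values, memory is the only resource below the 30-day buffer, so the Time Remaining score is 0 even though the other resources have ample time left.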
The Time Remaining score ranges between 0 (bad) and 100 (good). The badge changes its color based on the badge score thresholds that are set by the vCenter Operations Manager administrator.
Table 2-7. Time Remaining States
Description | User Action
The number of days that remain is much higher than the score provisioning buffer you specified. | No attention required.
The number of days that remain is higher than the score provisioning buffer, but is less than two times the buffer you specified. | Check and take appropriate action.
The number of days that remain is higher than the score provisioning buffer, but approaches the buffer you specified. | Check and take appropriate action as soon as possible.
The number of days that remain is lower than the score provisioning buffer you specified. The selected object might have exhausted some of its resources or will exhaust them soon. | Act immediately.
No data is available for the Time Remaining score. | -
The object is offline. | -
Using the Capacity Remaining Badge Under the Risk Badge
The vCenter Operations Manager Capacity Remaining badge measures the number of additional virtual machines that the object can handle before reaching capacity.
The remaining virtual machines count represents the number of virtual machines that can be deployed on the selected object, based on the current amount of unused resources and the average virtual machine profile for the last "n" weeks. The remaining virtual machines count is a function of the same compute resources of CPU, Mem, Disk I/O, Net I/O, and Disk Space that are used to calculate the Time Remaining score.
vCenter Operations Manager calculates the Capacity Remaining score as a percentage of the remaining virtual machines count compared to the total number of virtual machines that can be deployed on the selected object.
The Capacity Remaining score ranges between 0 (bad) and 100 (good). The badge changes its color based on the badge score thresholds that are set by the vCenter Operations Manager administrator.
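The two steps described above, estimating the remaining virtual machines count and converting it to a percentage, can be sketched as follows. This Python example is a simplified illustration. It assumes that the scarcest resource determines how many more average-sized virtual machines fit, which this guide does not state explicitly. The function names and resource keys are hypothetical.

```python
import math


def remaining_vm_count(unused, average_vm):
    """Estimate how many more average-sized virtual machines fit.

    unused and average_vm map resource names to amounts in the same unit.
    Assumption: the scarcest resource decides the count.
    """
    return min(
        math.floor(unused[resource] / average_vm[resource])
        for resource in average_vm
    )


def capacity_remaining_score(remaining_vms, total_vms):
    """Return remaining VMs as a percentage of the total deployable VMs."""
    if total_vms <= 0:
        raise ValueError("total_vms must be positive")
    if not 0 <= remaining_vms <= total_vms:
        raise ValueError("remaining_vms must be between 0 and total_vms")
    return 100.0 * remaining_vms / total_vms
```

For example, if 8 more virtual machines fit on an object that can hold 32 in total, the sketch reports that 25 percent of the capacity remains.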
Table 2-8. Object Capacity States
Description | User Action
The capacity remaining for the object is at normal level. | No attention required.
The capacity remaining for the object is less than the normal level. | Check and take appropriate action.
The capacity remaining for the object is at seriously low level. | Check and take appropriate action as soon as possible.
The object is expected to run out of capacity soon or has already run out of capacity. | Act immediately.
No data is available for any of the metrics for the time period. | -
The object is offline. | -
Using the Stress Badge Under the Risk Badge
The vCenter Operations Manager Stress badge measures a long-term workload that might involve undersized virtual machines or ESX hosts or an excessive number of virtual machines. These conditions might generate performance problems over time.
While workload is based on an instantaneous value, stress measures statistics over a longer period of time.
The Stress score helps you identify hosts and virtual machines that do not have enough resources allocated, or hosts that are running too many virtual machines. A high Stress score does not imply a current performance problem, but highlights potential for future performance problems.
The Stress score ranges between 0 (good) and 100 (bad). The badge changes its color based on the badge score thresholds that are set by the vCenter Operations Manager administrator.
Table 2-9. Stress States
Description | User Action
The Stress score is normal. | No attention required.
Some of the object resources are not enough to meet the demands. | Check and take appropriate action.
The object is experiencing regular resource shortage. | Check and take appropriate action as soon as possible.
Most of the resources on the object are constantly insufficient. The object might stop functioning properly. | Act immediately.
No data is available for the Stress score. |
The object is offline. |

The Compliance Badge

The Compliance badge value is based on the results of compliance templates that are run in vCenter Configuration Manager (VCM) and are pulled into vCenter Operations Manager to contribute to the Risk badge calculation.
The Compliance badge value is a score based on one or more compliance templates that you run in VCM against the data collected from the vCenter Server, datacenter, cluster, host system, and virtual machine objects that are managed by both vCenter Operations Manager and VCM. The scores are calculated based on configured VCM settings.
A VCM compliance template comprises one or more rules that define the configuration standards for different object groups or all your objects. A rule is defined with one or more conditions that are run against objects to determine if the configuration meets the required standards. The success or failure of the conditions determines whether the rule is successful or unsuccessful. The compliance templates are not run against each object, but are run against the collected configuration data for each object.
The Compliance score ranges from 0 (completely non-compliant) to 100 (completely compliant). The color of the badge changes based on the badge score thresholds that are set by the vCenter Operations Manager administrator.
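The relationship between templates, rules, and conditions can be sketched in Python as follows. This is an illustration only, not the VCM implementation. It assumes that a rule succeeds only when all of its conditions succeed, which this guide does not state explicitly. The rule names and configuration keys are made up, and the sketch does not compute a score because scoring depends on the configured VCM settings.

```python
from typing import Callable, Dict, List

# A condition inspects collected configuration data and reports success or failure.
Condition = Callable[[Dict[str, object]], bool]


def evaluate_template(template: Dict[str, List[Condition]],
                      collected_config: Dict[str, object]) -> Dict[str, bool]:
    """Run every rule in a template against collected configuration data.

    Assumption: a rule succeeds only when all of its conditions succeed.
    """
    return {
        rule_name: all(condition(collected_config) for condition in conditions)
        for rule_name, conditions in template.items()
    }


# Hypothetical template with two rules; the setting names are invented.
template = {
    "ssh-disabled": [lambda cfg: cfg.get("ssh_enabled") is False],
    "ntp-configured": [
        lambda cfg: bool(cfg.get("ntp_servers")),
        lambda cfg: cfg.get("ntp_running") is True,
    ],
}

# Collected data for one object, not the live object itself.
host_config = {"ssh_enabled": False, "ntp_servers": [], "ntp_running": True}
results = evaluate_template(template, host_config)
print(results)  # {'ssh-disabled': True, 'ntp-configured': False}
```

The rules run against a dictionary of previously collected values, mirroring the statement that templates are run against collected configuration data rather than against each object.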
Table 2-10. Object Compliance States
Description | User Action
The object is in compliance with the defined standards. | No attention required.
The object is not as compliant as you would like based on the default values you defined. | 1 Click the Compliance badge. 2 Check the template scores and take appropriate action.
The object is seriously out of compliance. | 1 Click the Compliance badge. 2 Check the template scores and take appropriate action as soon as possible.
The object is non-compliant. | 1 Click the Compliance badge. 2 Act immediately to correct non-compliant states based on template results.
No data is available for any of the templates. |
The object is offline. |

Defining Efficiency to Optimize the Environment

The vCenter Operations Manager Efficiency badge identifies the potential opportunities to improve the performance or cost of your virtual environment.
Efficiency accounts for the waste and infrastructure density in your environment. A large amount of wasted resources combined with a low density ratio generates a poor efficiency score.
The Efficiency score ranges between 0 (bad) and 100 (good). The badge changes its color based on the badge score thresholds that are set by the vCenter Operations Manager administrator.
Table 2-11. Object Efficiency States
Description | User Action
The resource use on the selected object is optimal. | No attention required.
The efficiency is good, but can be improved. Some resources are not fully used. | 1 On the Planning tab, click Views. 2 Use the views that are available for the selected object to identify underused resources.
The resources on the selected object are not used in the most optimal way. | 1 On the Planning tab, click Views. 2 Use the views that are available for the selected object to identify underused resources.
The efficiency is bad. Many resources are wasted. | 1 On the Planning tab, click Views. 2 Try optimizing the resource use to avoid resource waste.
No data is available. |
The object is offline. |
Using the Reclaimable Waste Badge Under the Efficiency Badge
The vCenter Operations Manager Reclaimable Waste badge accounts for resource types such as CPU, memory, or disk, and measures the extent of excessive provisioning for an object. It also identifies the amount of resources that you can reclaim and provision to other objects in your virtual environment.
The Reclaimable Waste score ranges between 0 (good) and 100 (bad). The badge changes its color based on the badge score thresholds that are set by the vCenter Operations Manager administrator.
Table 2-12. Reclaimable Waste States
Description | User Action
No resources are wasted on the selected object. | No attention required.
Some resources can be used better. | Select Planning > Views to identify underused resources.
Many resources are underused. | Select Planning > Views to identify underused resources.
Most of the resources on the selected object are wasted. | Select Planning > Views to identify underused resources.
No data is available for any of the metrics for the time period. |
The object is offline. |
A vCenter Operations Manager administrator can change the badge score thresholds. For example, a green badge can indicate a score below 50 instead of 75, as set by default.
Using the Density Badge Under the Efficiency Badge
The vCenter Operations Manager Density badge measures consolidation ratios to assess cost savings. You can assess the behavior and performance of a virtual machine and related applications to maximize the consolidation ratio without affecting the performance or service level agreements.
The density score is the ratio of the actual density to an ideal density based on the demand, the amount of virtual capacity, and the amount of physical usable capacity. Density calculates the amount of resources that you can provision before contention or conflict for a resource occurs between objects. The ratios account for the number of virtual machines to host, the number of virtual CPUs to physical CPU, and the amount of virtual memory to physical memory.
The Density score ranges between 0 (bad) and 100 (good). The badge changes its color based on the badge score thresholds that are set by the vCenter Operations Manager administrator.
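The three consolidation ratios mentioned above are simple quotients. The following Python sketch computes them from hypothetical inventory totals. It does not attempt to reproduce the Density score itself, because the ideal density that the product compares against is not defined in this guide.

```python
def consolidation_ratios(vm_count: int, host_count: int,
                         virtual_cpus: int, physical_cpus: int,
                         virtual_mem_gb: float, physical_mem_gb: float) -> dict:
    """Compute the ratios that the Density badge takes into account.

    Hypothetical helper; the argument names are not product identifiers.
    """
    if min(host_count, physical_cpus, physical_mem_gb) <= 0:
        raise ValueError("physical quantities must be positive")
    return {
        "vms_per_host": vm_count / host_count,
        "vcpus_per_pcpu": virtual_cpus / physical_cpus,
        "vmem_per_pmem": virtual_mem_gb / physical_mem_gb,
    }


# Example: 40 virtual machines on 4 hosts with 64 physical CPUs and 1024 GB of memory.
ratios = consolidation_ratios(40, 4, 120, 64, 512, 1024)
print(ratios)
# {'vms_per_host': 10.0, 'vcpus_per_pcpu': 1.875, 'vmem_per_pmem': 0.5}
```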
Table 2-13. Object Density States
Description | User Action
The resource consolidation is good. | No attention required.
Some resources are not fully consolidated. | Select Planning > Views to identify resource consolidation opportunities.
The consolidation for many resources is low. | Select Planning > Views to identify resource consolidation opportunities.
The resource consolidation is extremely low. | Select Planning > Views to identify resource consolidation opportunities.
No data is available for any of the metrics for the time period. |
The object is offline. |
A vCenter Operations Manager administrator can change the badge score thresholds. For example, a green Density badge can indicate a score above 40 instead of 25, as set by default.
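Because every badge maps a score from 0 to 100 onto a small set of states through administrator-defined thresholds, the mapping can be pictured as a lookup against cut points. The sketch below is one possible way to express that idea in Python. The threshold values are placeholders rather than product defaults, and the function is hypothetical.

```python
from bisect import bisect_right

# User actions from the badge state tables, ordered from worst state to best.
ACTIONS = [
    "Act immediately.",
    "Check and take appropriate action as soon as possible.",
    "Check and take appropriate action.",
    "No attention required.",
]


def user_action(score, thresholds, higher_is_better=True):
    """Map a 0-100 badge score to one of four user actions.

    thresholds: three ascending cut points. The values used below are
    placeholders; in the product they are set by the administrator.
    """
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    band = bisect_right(thresholds, score)  # 0..3, counted from the low end
    return ACTIONS[band] if higher_is_better else ACTIONS[3 - band]


# Density: a higher score is better.
print(user_action(30, [10, 25, 50]))
# Stress: a lower score is better, so the bands are reversed.
print(user_action(80, [25, 50, 75], higher_is_better=False))
```

The `higher_is_better` flag reflects that some badges, such as Density, treat 100 as good, while others, such as Stress and Reclaimable Waste, treat 100 as bad.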

Working with Metrics and Charts on the All Metrics Tab

You can check the location of an object in the hierarchy and select metrics to view graphs of their historic values for a period you define.
You can use the panes on the All Metrics tab under the Operations tab to search metrics and view metric graphs.
The Health Tree Pane
The Health Tree pane displays the location of the currently selected object in the hierarchy of your virtual infrastructure. You can check all parent and child objects related to the currently selected object.
For example, the sample hierarchy shows the parent and child objects in the virtual infrastructure. The red icon indicates the presence of a potential problem in the vCenter Server object. You can investigate the probable cause of the problem from the Dashboard tab.
The Metric Selector Pane
The Metric Selector pane contains a list of all metric groups that are applicable to the currently selected object. Metric groups contain all the metrics that are applicable to the currently selected object. The list of available metrics is updated depending on the object you selected in the Health Tree pane.
The Search text box allows you to find metrics by using part of their names and to filter the search results by metric groups, instances, or metric types.
For example, if you type % in the Search text box and select Metric from the drop-down menu, the search result contains only metrics that are calculated as percentage.
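The behavior in this example resembles a case-insensitive substring filter. The following Python sketch imitates it on a small, invented list of metrics. The metric names, groups, and the `search_metrics` function are hypothetical and only illustrate the idea of matching on part of a name.

```python
def search_metrics(metrics, text, field="name"):
    """Return the metrics whose chosen field contains the search text.

    The match is a case-insensitive substring match.
    """
    needle = text.lower()
    return [m for m in metrics if needle in m[field].lower()]


# Invented sample data; real metric names depend on the selected object.
metrics = [
    {"group": "CPU", "name": "Usage (%)"},
    {"group": "CPU", "name": "Demand (MHz)"},
    {"group": "Memory", "name": "Usage (%)"},
    {"group": "Memory", "name": "Consumed (KB)"},
]

hits = search_metrics(metrics, "%")
print([(m["group"], m["name"]) for m in hits])
# [('CPU', 'Usage (%)'), ('Memory', 'Usage (%)')]
```

Passing `field="group"` narrows the same list by metric group instead of by metric name.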
The Metric Chart Pane
The Metric Chart pane displays the graphs of the metrics you select from the Metric Selector pane. You can view as many graphs as you want. You can control the appearance of metric graphs and create screenshots by using the buttons in the Metric Chart pane.

Health Tree Pane Buttons

In the All Metrics tab under the Operations tab, you can use the buttons of the Health Tree pane to control the appearance of monitored objects in the health tree.
Button Tooltip | Description
Zoom to fit | Resizes the view so all related objects fit in the health tree area. All previous zoom operations are discarded.
Enter pan mode | Allows you to pan the health tree.
Show values on point | Enables metric value tooltips so that they appear when you point to the graph with the mouse pointer.
Enter zoom mode | Allows you to enlarge sections of the health tree by drawing rectangles to enclose the area to enlarge.
Zoom in | Enlarges the middle of the health tree by one level.
Zoom out | Reduces the middle of the health tree by one level.
Reset to initial object | Resets the Health Tree pane to the original view for the selected object.

Metric Chart Pane Buttons

On the All Metrics page, you can use the buttons of the Metric Chart pane to customize the appearance of charts, and add or delete charts.
Global Control Buttons
These buttons control the appearance of all graphs that you open in the Metric Chart pane.
Button Tooltip | Description
Separate graphs by period | Splits the current metrics graph into separate graphs by periods based on your selection in the time and date widget.
Show/hide Y-axis | Displays or hides the Y axis of the graph to display metric values.
Show/hide metric line | Displays or hides the line that connects the data points in the metric graph.
Show/hide trend line | Displays or hides the line that represents the trend of the currently selected metric in the graph.
Show 24-hour dynamic thresholds | Displays or hides the calculated dynamic threshold values for a 24-hour period in the graph.
Show entire period dynamic thresholds | Displays or hides the calculated dynamic threshold values for the entire monitoring period in the graph.
Show/hide anomalies | Displays or hides the anomalies that occurred during the selected period in the graph.
Retrieve complete metric values | In zoom mode, displays the values of the selected metric when you move the mouse pointer over the graph.
Enable X-axis zoom | Allows you to enlarge the selected area of the graph only on the X axis while the Y axis remains static.
Enable Y-axis zoom | Allows you to enlarge the selected area of the graph only on the Y axis while the X axis remains static.
Zoom to fit | Resizes the charts so the entire graphs for all selected periods fit in the chart area. All previous zooms are discarded.
Zoom Y-axis to dynamic thresholds | Resizes the Y axis of the metric chart so that the highest and the lowest values on the axis are the highest and the lowest values of the dynamic threshold calculated for this metric.
Compress Y-axis | Shortens the graph.
Zoom all graphs together | Resizes all metric graphs that are open in the Metric Chart pane.
Enter zoom mode | Enables resizing of the metric graphs on both the X axis and the Y axis.
Enter pan mode | When in zoom mode, allows you to drag the enlarged section of the graph around to view higher, lower, earlier, or later values of the metric.
Show value on point | Enables metric value tooltips so that they appear when you point to the graph with the mouse pointer.
Refresh | Reloads all graphs in the Metric Chart.
Turn auto refresh on/off | Activates or deactivates the auto refresh option for metric charts.
Open date/time controls | Opens the date and time widget for you to select the period to display on the metric graph.
Remove all graphs | Deletes all graphs from the Metric Chart pane.
Chart-Specific Buttons
These buttons control the specific chart to which they are attached. Some chart-specific buttons are available only when you view graphs split by period.
Button Tooltip | Description
Move up | When multiple graphs are open in the Metric Chart pane, this button moves the selected graph one place up. Available only for split graphs view.
Move down | When multiple graphs are open in the Metric Chart pane, this button moves the selected graph one place down. Available only for split graphs view.
Save a snapshot | Creates a real-size snapshot of the selected graph and opens a File Download window for you to open or save the PNG file.
Save a full screen snapshot | Creates an enlarged snapshot of the selected graph and opens a File Download window for you to open or save the PNG file.
Download comma separated data | Creates a comma separated values file with the metric data for the selected graph and opens a File Download window for you to open or save the CSV file. Available only for split graphs view.
Close | Deletes the selected graph from the Metric Chart pane. Available only for split graphs view.
Chapter 3 Viewing Members and Relationships in the Monitored Environment
The Environment tab allows you to look at the objects in your virtual environment from different perspectives.
The Overview Tab
The Overview tab provides a visual representation of the population of your virtual environment by object types. You can click objects to highlight their related parent and child objects. You can compare the scores of related objects to narrow down the possible causes for a bad badge score.
For example, if the Health badge of an ESX host is red, but the health badge of its parent cluster is green, the problem is either with the host itself, or with the virtual machines that run on that host. If the Health badges of one or two virtual machines are red, these virtual machines might be causing the problem on the ESX host. Therefore, you must check these virtual machines first. If the Health badges of all virtual machines are red, the host system is experiencing hardware problems or cannot meet the demands of the virtual machines that run on it.
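The reasoning in this example can be written down as a small decision function. The Python sketch below is a hypothetical aid for following the logic, not a product feature. It generalizes "one or two virtual machines" to "some but not all", and its suggestion for a red parent cluster is an assumption, because the example covers only the case of a green cluster.

```python
from typing import List


def triage_red_host(cluster_is_red: bool, vm_is_red: List[bool]) -> str:
    """Suggest where to look first when the Health badge of an ESX host is red."""
    if cluster_is_red:
        # Assumption: this case is not covered by the example in the text.
        return "Start with the parent cluster."
    red_count = sum(vm_is_red)
    if vm_is_red and red_count == len(vm_is_red):
        return "Check the host: hardware problems or insufficient resources."
    if red_count > 0:
        return "Check the red virtual machines first."
    return "Check the host itself."


# Host is red, cluster is green, one of three virtual machines is red.
print(triage_red_host(False, [True, False, False]))
# Check the red virtual machines first.
```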
The Scoreboard Tab
On the Scoreboard tab, you can view a population of data across multiple metric dimensions to get an at-a-glance overview of where and why problems exist. Scoreboards are useful for any object that has a population, including group objects.
On the Scoreboard tab, you can compare the badge values of child objects of the currently selected object. For example, you can use the Scoreboard tab to compare objects and analyze the current distribution and availability of resources in your virtual environment.
The Members Tab
On the Members tab, you can view a list of all objects under the currently selected object or group. In addition to the names of objects and their parents, the list displays information about the policy that is assigned to each object.
The Relationships Tab
The Relationships tab displays application-related information provided by VMware vCenter Infrastructure Navigator. You can view the relationship graph and object properties of the selected object and its related objects.
This chapter includes the following topics:
• Check the Performance of Your Virtual Environment
• Balancing the Resources in Your Virtual Environment
• Find an ESX Host that Has Resources for More Virtual Machines
• Find a Cluster that Has Resources Available for More Virtual Machines
• Ranking the Health, Risk, and Efficiency Scores
• View the Compliance Details
• View a List of Members
• Overview of Relationships

Check the Performance of Your Virtual Environment

On the Overview tab under the Environment tab, you can check the performance of your virtual infrastructure to identify objects with potential performance problems.
You can check the performance of all objects on the vCenter Server that you added for monitoring in vCenter Operations Manager.
Prerequisites
Verify that you are logged in to a vSphere Client, and that vCenter Operations Manager is open.
Procedure
1 On the Environment tab, click Overview, and click the badge that you want to inspect.
Option | Description
Health (default) | Click to check the Health scores of all monitored objects.
Workload | Click to check the Workload scores of all monitored objects.
Anomalies | Click to check the Anomalies scores for all monitored objects.
Faults | Click to check the Faults scores for all monitored objects.
The states of all objects for the metric you selected appear in the Environment tab as colored badges.
2 (Optional) To filter objects by state, use the Status Filter buttons in the upper right of the Overview tab.
What to do next
How you proceed depends on what you find about the performance of your virtual environment.

Balancing the Resources in Your Virtual Environment

You can get an overview of where and why problems exist in your virtual environment, and how the resources are distributed among objects.
On the Scoreboard tab under the Environment tab, you can compare the performance and available resources of all child objects of the currently selected object, and look for answers to the following questions.
• What is the current distribution and availability of resources in my virtual environment?
• Which hosts have the resources to accommodate new virtual machines?
• Which hosts require load balancing? Can I move virtual machines from an overloaded host to a less loaded host?
• Which child objects have the highest and the lowest scores for health, workload, anomalies, and faults?
The Custom Overview Chart
The Custom Overview chart is a bubble chart that allows you to compare objects by their badge values. By using the Custom Overview chart, you can draw better inferences from the data that vCenter Operations Manager collects.
The chart presents data by four dimensions, using the following variables: color, size, x-axis, and y-axis. The value that is represented by each variable depends on the selection in the View drop-down menu. When Health, Risk, or Efficiency is selected in the View drop-down menu, the values for color, size, x-axis, and y-axis are populated automatically and cannot be modified. You can specify your own values only when Custom is selected in the View drop-down menu.
You can use the buttons above the Custom Overview chart to filter objects by type and badge status. Filtering allows you to slice the problem and narrow down the objects that are experiencing the problem to a certain degree.
The Members List
The Members list displays details for all objects that are visualized in the Custom Overview chart. This list is filtered based on the criteria that you specify for the Custom Overview chart. The columns contain object name, type, and spark lines for all badges related to the object. At the end of each spark line, the latest known value for the badge is displayed as a numeric label.
You can click objects in the Members list to highlight them in the Custom Overview chart, and you can click objects in the Custom Overview chart to highlight them in the Members list.
The procedure that you must follow when searching for a host that can accommodate a new virtual machine depends on the use of Distributed Resource Scheduler (DRS) on the vCenter Server.

Find an ESX Host that Has Resources for More Virtual Machines

If a vCenter Server host does not use Distributed Resource Scheduler (DRS), you can use the Scoreboard tab to check the available resources on the ESX hosts in a cluster and make decisions on moving virtual machines in your virtual infrastructure.
Prerequisites
Verify that you are logged in to a vSphere Client, and that vCenter Operations Manager is open.
Procedure
1 In the inventory view, click the datacenter or cluster that contains the ESX host that you want to assess.
2 On the Scoreboard tab, select Health from the View drop-down menu.
The colored bubbles in the Custom Overview chart represent the health scores for all objects in the datacenter that are online. The workload is represented by the X-axis.
The objects with highest workload appear to the right on the X-axis.
3 (Optional) To filter object types out of the Custom Overview chart, click their icons.
4 In the Custom Overview chart, click the bubble for a host that you think might accommodate more virtual machines.
Usually, this should be the ESX host that is situated closer to the Y-axis.
The name of the host becomes highlighted in the Members List pane.
5 In the Members List pane, click the host name to open its Details tab.
6 On the Details tab, review the Resources pane and the Workload graphs to assess the potential capacity for new virtual machines.
If one or more resources of the host are approaching their limits, you might not want to add a virtual machine to this ESX host.
What to do next
If the selected ESX host has enough resources, you can add the new virtual machines.
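The visual search in this procedure, looking for the host bubble closest to the Y-axis, amounts to picking the online host with the lowest workload. The following Python sketch shows that selection on invented data. The object names, field names, and the `candidate_hosts` function are hypothetical, and a low workload score is only a starting point: the Details tab review in the final step still applies.

```python
def candidate_hosts(objects):
    """Return online hosts ordered from lowest to highest workload.

    The first entry corresponds to the host bubble closest to the Y-axis.
    """
    hosts = [o for o in objects if o["type"] == "host" and o["online"]]
    return sorted(hosts, key=lambda h: h["workload"])


# Invented sample data for one cluster.
objects = [
    {"name": "esx-01", "type": "host", "online": True, "workload": 72.0},
    {"name": "esx-02", "type": "host", "online": True, "workload": 35.0},
    {"name": "esx-03", "type": "host", "online": False, "workload": 5.0},
    {"name": "vm-17", "type": "vm", "online": True, "workload": 10.0},
]

ordered = [h["name"] for h in candidate_hosts(objects)]
print(ordered)  # ['esx-02', 'esx-01']
```

Offline objects are excluded because the chart shows only objects that are online, and the type filter plays the role of the object type icons in step 3.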

Find a Cluster that Has Resources Available for More Virtual Machines

If the vCenter Server host uses Distributed Resource Scheduler (DRS), you can use the Scoreboard tab to check the available resources in each cluster and make decisions on moving virtual machines in your virtual infrastructure.
Prerequisites
Verify that you are logged in to a vSphere Client, and that vCenter Operations Manager is open.
Procedure
1 In the inventory view, click the datacenter that contains the cluster that you want to assess, and click the Environment tab.
2 On the Scoreboard tab, select Health from the View drop-down menu.
The colored bubbles in the Custom Overview chart represent the health scores for all objects in the datacenter that are online. The workload is represented by the X-axis.
The objects with highest workload appear to the right on the X-axis.
3 (Optional) To filter object types out of the Custom Overview chart, click their icons.
4 In the Custom Overview chart, click the bubble of a cluster that you think might accommodate more virtual machines.
Usually, this is a cluster that is situated closer to the Y-axis.
The name of the cluster becomes highlighted in the Members List pane.
5 In the Members List pane, click the object name to open its Details tab.
6 On the Details tab, review the Resources pane and the Workload graphs to assess the potential capacity for new virtual machines.
If one or more resources are approaching their limits, you might not want to add a virtual machine to this cluster.
What to do next
If the selected cluster has enough resources, you can add the new virtual machines.

Ranking the Health, Risk, and Efficiency Scores

On the Scoreboard tab, you can compare the scores of the Health, Risk, and Efficiency badges and their sub-badges for child objects that are directly related to the object selected in the inventory pane.
Each child object is displayed as a circle in the Custom Overview chart. A larger circle means a higher score for the badge that is selected in the Size drop-down menu.