The information contained in this document is subject to change without notice.
Hewlett-Packard makes no warranty of any kind with regard to this manual,
including, but not limited to, the implied warranties of merchantability and fitness
for a particular purpose. Hewlett-Packard shall not be liable for errors contained
herein or direct, indirect, special, incidental or consequential damages in connection
with the furnishing, performance, or use of this material.
All rights reserved. Reproduction, adaptation, or translation without prior written permission
is prohibited, except as allowed under the copyright laws.
Corporate Offices:
Hewlett-Packard Co.
3000 Hanover St.
Palo Alto, CA 94304
Use, duplication or disclosure by the U.S. Government Department of Defense is
subject to restrictions as set forth in paragraph (b)(3)(ii) of the Rights in Technical
Data and Software clause in FAR 52.227-7013.
Rights for non-DOD U.S. Government Departments and Agencies are as set forth in
FAR 52.227-19(c)(1,2).
Use of this manual and flexible disc(s), compact disc(s), or tape cartridge(s)
supplied for this pack is restricted to this product only. Additional copies of the
programs may be made for security and back-up purposes only. Resale of the
programs in their present form or with alterations, is expressly prohibited.
A copy of the specific warranty terms applicable to your Hewlett-Packard product
and replacement parts can be obtained from your local Sales and Service Office.
This edition documents material related to installing and configuring the Event
Monitoring Service (EMS).
The printing date and part number indicate the current edition. The printing date
changes when a new edition is printed. (Minor corrections and updates which are
incorporated at reprint do not cause the date to change.) The part number changes
when extensive technical changes are incorporated.
New editions of this manual will incorporate all material updated since the previous
edition.
HP Printing Division:
Enterprise Systems Division
Hewlett-Packard Co.
19111 Pruneridge Ave.
Cupertino, CA 95014
Preface
This guide describes how to install and configure the Event Monitoring Service to
monitor system health, and how to use EMS in conjunction with availability
software such as MC/ServiceGuard and IT/O:
•Chapter 1, “Installing and Using EMS” presents the exact steps required to
install and use the software on your system or cluster.
•Chapter 2, “Monitoring Disk Resources”, gives guidelines on using the disk
monitor, including using it with MC/ServiceGuard.
•Chapter 3, “Monitoring Cluster Resources”, gives guidelines on using the
cluster monitor.
•Chapter 4, “Monitoring Network Interfaces”, gives guidelines on using the
network interface monitor.
•Chapter 5, “Monitoring System Resources”, gives guidelines on using the
system resource monitor for monitoring users, job queues and available
filesystem space.
•Chapter 6, “Troubleshooting”, gives guidelines on reading log files, and testing
monitor requests.
Related Publications
The following documents contain additional related information:
•Clusters for High Availability: A Primer of HP-UX Solutions (ISBN
0-13-494758-4). HP Press: Prentice Hall, Inc., 1996.
•Disk and File Management Tasks on HP-UX (ISBN 0-13-518861-X). HP Press:
Prentice Hall, Inc., 1997.
•Managing MC/ServiceGuard (HP Part Number B3936-90019).
•Configuring OPS Clusters with MC/LockManager (HP Part Number
B5158-90001).
•Managing Highly Available NFS (HP Part Number B5125-90001).
•http://www.hp.com/go/ha, the external web site for information about
Hewlett-Packard’s high-availability technologies, where you can find documents
such as Writing Monitors for the Event Monitoring Service (EMS).
Problem Reporting
If you have any problems with the software or documentation, please contact your
local Hewlett-Packard Sales Office or Customer Service Center.
1 Installing and Using EMS
EMS HA Monitors (Event Monitoring Service High Availability Monitors) aid in
providing high availability in an HP-UX environment by monitoring particular
system resources and then informing target applications (e.g. MC/ServiceGuard)
when the resources they monitor are at critical user-defined values.
What are EMS HA Monitors?
EMS HA Monitors (Event Monitoring Service High Availability Monitors) are a set
of monitors and a monitoring service that polls a local system or application
resource and sends messages when events occur. An event can simply be defined as
something you want to know about. For example, you may want to be alerted when
a disk fails or when available filesystem space falls below a certain level. EMS
allows you to configure what you consider an event for any monitored system
resource.
The advantage EMS has over built-in monitors is that requests can be made to send
events to a wide variety of software using multiple protocols (opcmsg, SNMP, TCP,
UDP). For example, you can configure EMS so that when a disk fails a message is
sent to MC/ServiceGuard and IT/Operations. These applications can then use that
message to trigger package failover and to send a message to an administrator to fix
the disk.
EMS HA Monitors consist of a framework, a collection of monitors, and a
configuration interface that runs under SAM (System Administration Manager). The
framework starts and stops the monitors, stores information used by the monitors,
and directs monitors where to send events. A standard API provides a way to add
new monitors as they become available, or to write your own monitors; see the
document Writing Monitors for the Event Monitoring Service (EMS) available from
the high availability web site: http://www.hp.com/go/ha
Figure 1-1 Event Monitoring Services High Availability Monitors
[Figure components: Client (such as the SAM interface to EMS or MC/ServiceGuard
package configuration); Target (such as IT/Operations or MC/ServiceGuard);
Framework; Registrar; API; Monitors; Resource dictionary; System and application
resources.]
Monitors are applications written to gather and report information about specific
resources on the system. They use system information stored in places like
/etc/lvmtab and the MIB database. When you make a request to a monitor, it
polls the system information and sends a message to the framework, which then
interprets the data to determine if an event has occurred and sends messages in the
appropriate format.
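The division of labor described above can be sketched in a few lines of Python. This is only an illustration, not EMS code: the names poll_resource and framework_handle, the in-memory SYSTEM_STATE table, and the instance name lan0 are hypothetical stand-ins for a monitor reading system information and the framework deciding whether an event has occurred.

```python
from typing import Callable, Dict, List

# Hypothetical stand-in for the system information a monitor would read
# (for example from /etc/lvmtab or the MIB database).
SYSTEM_STATE: Dict[str, str] = {
    "/vg/vgDataBase/pv_pvlink/status/c0t1d2": "DOWN",
    "/net/interfaces/lan/status/lan0": "UP",
}


def poll_resource(path: str) -> str:
    """Monitor side: gather and report the current value of one resource."""
    return SYSTEM_STATE[path]


def framework_handle(path: str, value: str,
                     is_event: Callable[[str], bool],
                     send: Callable[[str], None]) -> bool:
    """Framework side: interpret the polled value and notify the target."""
    if is_event(value):
        send(f"{path} is {value}")
        return True
    return False


# One polling pass: here an "event" is simply any resource reported as DOWN.
sent: List[str] = []
for resource in SYSTEM_STATE:
    framework_handle(resource, poll_resource(resource),
                     is_event=lambda v: v == "DOWN",
                     send=sent.append)
```

In this sketch the monitor only reports values; the decision about what counts as an event, and where the message goes, stays with the framework.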
EMS HA Monitors work best in a high availability environment; they aid in quickly
detecting and eliminating single points of failure, and monitor resources that can be
used as MC/ServiceGuard package dependencies. However, EMS HA Monitors can
also be used outside a high availability environment for monitoring the status of
system resources.
A set of monitors is shipped with EMS: disk, cluster, network interface, and system
resource monitors. Other Hewlett-Packard products are bundled with monitors that
fit into the EMS framework, such as ATM and HP OSI Transport Service 9000. You
can also write your own monitor; see Writing Monitors for the Event Monitoring
Service (EMS).
The Role of EMS HA Monitors in a High Availability Environment
The weakest link in a high availability system is the single point of failure. EMS HA
Monitors can be used to report information that helps you detect loss of redundant
resources, thus exposing single points of failure, a threat to data and application
availability.
Because EMS is a monitoring system, and does not do anything to modify the
system, it is best used with additional software that can take action based on the
events sent by EMS. Some examples are:
•EMS HA Monitors and MC/ServiceGuard
MC/ServiceGuard uses the EMS monitors to determine the health of resources,
such as disks, and may fail over packages based on that information.
Configuration of EMS monitoring requests for use with MC/ServiceGuard
packages is done from the Cluster area for Package Configuration in SAM, or by
editing the ASCII package configuration file.
However, if you also want to be alerted to what caused a package to fail over, or
you want to monitor events that affect high availability, you need to create
requests from the SAM interface in the Resource Management area as described
in “Using EMS HA Monitors” on page 17, and in subsequent chapters.
MC/ServiceGuard may already be configured to monitor the health of nodes,
services, and subnets, and to make failover decisions based on resource status.
Configuring EMS monitors provides additional MC/ServiceGuard failover
criteria for certain network links and other resources.
•EMS HA Monitors with IT/Operations or Network Node Manager
EMS HA Monitors can be configured to send events to IT/Operations and
Network Node Manager.
•EMS HA Monitors with your choice of system management software
Because EMS can send events in a number of protocols, it can be used with any
system management software that supports either SNMP traps, or TCP, or UDP
messages.
Installing and Removing EMS HA Monitors
NOTE To make best use of EMS HA Monitors, install and configure them on all systems in
your environment. Because EMS monitors resources for the local system only, you
need to install EMS on every system to monitor all systems.
EMS HA Monitors run on HP 9000 Series 800 systems running HP-UX version
10.20 or later.
Hardware, such as disks and LAN cards, should be configured and tested before
installing EMS HA Monitors.
Installing EMS HA Monitors
The EMS HA Monitor bundle (P/N B5735AA-APZ) and license
(P/N B5736AA-APZ) version A.01.00 contains these products:
EMS-Core            the EMS framework
EMS-Config          the SAM interface to EMS
EMS-Disk Monitor    the disk monitor, associated dictionary and files
EMS-MIB Monitor     the cluster, network, and system resource monitors,
                    associated dictionary and files
To install the EMS product, use swinstall, or the Software Management area in
SAM.
If you have many systems, it may be easier to install over the network from a central
location. Create a network depot according to the instructions in Managing HP-UX
Software with SD-UX, rlogin or telnet to the remote host, and install over the
network from the depot.
When monitors are updated, or you install a monitor on top of an existing monitor,
your requests are retained. This is part of the functionality provided by the
persistence client; see “Making Sure Monitors are Running” in Chapter 6.
Note that updated monitors may have new status values that change the meaning of
your monitoring requests.
Removing EMS HA Monitors
Use swremove or the Software Management tools under SAM to remove EMS.
Note that because the monitors are persistent, that is, they are always automatically
started if they are stopped, it is likely you will have warnings in your removal log
file that say, “Could not shut down process” or errors that say “File
/etc/opt/resmon/lbin/p_client could not be removed.” Even if you see these
warnings, monitors are removed and any dirty files are cleaned up on reboot.
Using EMS HA Monitors
There are two ways to use EMS HA Monitors:
•Configure monitoring requests from the EMS interface in the Resource
Management area of SAM.
•Configure package dependencies in MC/ServiceGuard by using the Package
Configuration interface in the High Availability Clusters subarea of SAM or by
editing the package ASCII configuration file.
The following are prerequisites to using EMS:
•Disks need to be configured using the LVM (Logical Volume Manager).
•Network cards need to be configured.
•Filesystems need to have been created and mounted.
Resource classes are structured hierarchically, similar to a filesystem structure,
although they are not actually files and directories. The classes supplied with this
version of EMS are listed in Figure 1-2. Resource instances are listed in bold, and
instances that are replaced with an actual name are in bold italics.
Figure 1-2 Event Monitoring Service Resource Class Hierarchy
/vg  (contains all logical volume, disk and PVlink, and volume group summary status)
    /vgName
        /lv_summary
        /pv_summary
        /lv
            /status
                /lvName
            /copies
                /lvName
        /pv_pvlink
            /status
                /deviceName
/cluster  (contains all package, node, and cluster status)
    /status
        /clusterName
    /localNode
        /status
            /clusterName
    /package
        /status
            /packageName
/net  (contains network interface status)
    /interfaces
        /lan
            /status
                /LANname
/system  (contains all job queue, user, and filesystem status)
    /jobQueue1Min
    /jobQueue5Min
    /jobQueue15Min
    /numUsers
    /filesystem
        /availMB
            /fsName
The full path of a resource includes the class, subclasses, and instance. An example
of a full resource path for the physical volume status of the device
/dev/dsk/c0t1d2 belonging to volume group vgDataBase, would be
/vg/vgDataBase/pv_pvlink/status/c0t1d2.
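As a small illustration of how a full resource path is assembled from class, subclasses, and instance, the following Python sketch rebuilds the physical volume example. The helper names resource_path and pv_status_path are hypothetical and are not part of EMS.

```python
def resource_path(*parts: str) -> str:
    """Join class, subclasses, and instance into one full resource path."""
    return "/" + "/".join(p.strip("/") for p in parts)


def pv_status_path(volume_group: str, device_file: str) -> str:
    """Build the physical volume status path for a device file such as
    /dev/dsk/c0t1d2 that belongs to the given volume group."""
    instance = device_file.rsplit("/", 1)[-1]  # keep only the device name
    return resource_path("vg", volume_group, "pv_pvlink", "status", instance)


path = pv_status_path("vgDataBase", "/dev/dsk/c0t1d2")
```

Here path holds the same string as the example in the text, /vg/vgDataBase/pv_pvlink/status/c0t1d2.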
Configuring EMS Monitoring Requests
Outside of MC/ServiceGuard
This section describes the steps from the SAM interface to EMS to create
monitoring requests that notify non-MC/ServiceGuard management applications
such as IT/Operations. This information for creating requests is also valid for
monitors sold with other products (ATM or OTS, for example) and for user-written
monitors written according to developer specifications in Writing Monitors for the
Event Monitoring Service (EMS).
To start the EMS configuration, double-click on the Event Monitoring Service icon
in the Resource Management area in SAM. The main screen, shown in Figure 1-3,
shows all requests configured on that system; if you haven’t created requests, the
screen will be empty.
Figure 1-3 Event Monitoring Service Screen
Selecting a Resource to Monitor
All resources are divided into classes. When you double-click on Add Monitoring
Request in the Actions menu, the top-level classes for all installed monitors are
dynamically discovered and then listed.
Figure 1-4 The Top Level of the Resource Hierarchy in the Add a Monitoring Request Screen
Some Hewlett-Packard products, such as ATM or HP OTS 9000, provide EMS
monitors. If those products are installed on the system, then their top-level classes
will also appear here. Similarly, top-level classes belonging to user-written
monitors, created using the EMS Developer’s Kit, will be discovered and displayed
here.
Traverse the hierarchy in the upper part of the screen in Figure 1-4 and select a
resource instance to monitor in the lower part of the screen as in Figure 1-5.
Figure 1-5 Choosing a Resource Instance in the Add a Monitoring Request Screen
Using Wildcards
The * wildcard is a convenient way to create many requests at once. Most systems
have more than one disk or network card, and many have several disks. To avoid
having to create a monitor request for each disk, select * (All Instances) in the
Resource Instance box. See Figure 1-5.
Wildcards are available only when all instances of a subclass are the same resource
type.
Wildcards are not available for resource classes. So, for example, a wildcard is
available for the status instances in the /vg/vgName/pv_pvlink/status subclass, but
no wildcard appears for the volume group subclasses under the /vg resource class.
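The wildcard rule can be pictured with a short Python sketch. It is an illustration under stated assumptions, not the SAM implementation: the function expand_wildcard, the KNOWN_INSTANCES list, and the names vg01, c1t1d2, and lvol1 are all hypothetical. The sketch accepts * only in the instance position and rejects it in any class or subclass position.

```python
from typing import List

# Hypothetical instances that installed monitors might report.
KNOWN_INSTANCES = [
    "/vg/vg01/pv_pvlink/status/c0t1d2",
    "/vg/vg01/pv_pvlink/status/c1t1d2",
    "/vg/vg01/lv/status/lvol1",
]


def expand_wildcard(request_path: str, instances: List[str]) -> List[str]:
    """Expand a trailing * into one path per instance of that subclass."""
    subclass, _, last = request_path.rpartition("/")
    if "*" in subclass:
        raise ValueError("wildcards are not available for resource classes")
    if last != "*":
        return [request_path]  # no wildcard: a single request
    return [p for p in instances if p.rpartition("/")[0] == subclass]


requests = expand_wildcard("/vg/vg01/pv_pvlink/status/*", KNOWN_INSTANCES)
```

With the sample data, requests contains the two physical volume paths, one per disk, which corresponds to selecting * (All Instances) once instead of creating a request for each disk.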
Creating a Monitoring Request
The screen in Figure 1-6 shows where you specify when and how to send events.
The following sections describe the monitoring parameters and some common
applications of them.
Figure 1-6 Monitoring Request Parameters
How Do I Tell EMS When to Send Events?
While the monitor may be polling disks every 5 minutes, for example, you may only
want to be alerted when something happens that requires your attention. When you
create a request, you specify the conditions under which you receive an alert. Here
are the terms under which you can be notified:
When value is...
    You define the conditions under which you wish to be notified for a
    particular resource using an operator (e.g. =, not equal, >, >=, <, <=) and a
    value returned by the monitor (e.g. UP, DOWN, INACTIVE). Text values are
    mapped to numerical values. Specific values are in the chapters describing
    the individual monitors.
When value changes
    This notification might be used for a resource that does not change
    frequently, but you need to know each time it does. For example, you would
    want notification each time the number of mirrored copies of data changes
    from 2 to 1 and back to 2.
At each interval
    This sends notification at each polling interval. It would most commonly be
    used for reminders or gathering data for system analysis. Use this for only a
    small number of resources at a time, and with long polling intervals of
    several minutes or hours; there is a risk of affecting system performance.
If you select conditional notification, you may select one or more of these options:
Initial
    Use this option as a baseline when monitoring resources such as available
    filesystem space or system load. It can also be used to test that events are
    being sent for a new request.
Repeat
    Use this option for urgent alerts. The Repeat option sends an alert at each
    polling interval as long as the notify condition is met. Use this option with
    caution; there is a risk of high CPU use or filling log files and alert windows.
Return
    Use this option to track when emergency situations return to normal.
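One way to picture how a conditional request and its options might interact is the Python sketch below. It is not EMS code. The class ConditionalRequest and the function notifications are hypothetical, and the exact interplay of the options is an assumption made for illustration: Initial reports the first polled value, an alert without Repeat is sent only when the condition first becomes true, and Return reports the first value that no longer meets the condition.

```python
import operator
from dataclasses import dataclass
from typing import List

# Operators offered for "When value is..." requests.
OPERATORS = {"=": operator.eq, "!=": operator.ne, ">": operator.gt,
             ">=": operator.ge, "<": operator.lt, "<=": operator.le}


@dataclass
class ConditionalRequest:
    op: str                      # one of the keys in OPERATORS
    threshold: int               # text values are mapped to numbers
    initial: bool = False
    repeat: bool = False
    notify_return: bool = False


def notifications(request: ConditionalRequest, samples: List[int]) -> List[str]:
    """Return the notifications produced for one polled value per interval."""
    condition = OPERATORS[request.op]
    events: List[str] = []
    was_met = False
    for index, value in enumerate(samples):
        now_met = condition(value, request.threshold)
        if index == 0 and request.initial:
            events.append(f"initial:{value}")       # baseline at first poll
        elif now_met and (request.repeat or not was_met):
            events.append(f"alert:{value}")         # condition is met
        elif was_met and not now_met and request.notify_return:
            events.append(f"return:{value}")        # back to normal
        was_met = now_met
    return events


# Available filesystem space in MB, polled four times (made-up numbers).
space = [500, 80, 60, 300]
baseline = notifications(
    ConditionalRequest("<", 100, initial=True, notify_return=True), space)
urgent = notifications(ConditionalRequest("<", 100, repeat=True), space)
```

With these assumptions, baseline is ["initial:500", "alert:80", "return:300"] and urgent is ["alert:80", "alert:60"], which shows why Repeat can fill log files when a condition persists.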
NOTE Updated monitors may have new status values that change the meaning of your
monitoring requests, or generate new alerts.
For example, assume you have a request for notification if status > 3 for a resource
with a value range of 1-7. You would get alerts each time the value equaled 4, 5, 6,
or 7. If the updated version of the monitor has a new status value of 8, you would see
new alerts when the resource equaled 8.
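The effect described in this note can be checked with a few lines of Python. The function alerting_values is a hypothetical helper used only for this illustration.

```python
from typing import Iterable, List


def alerting_values(threshold: int, possible_values: Iterable[int]) -> List[int]:
    """List the status values that satisfy a 'status > threshold' request."""
    return [value for value in possible_values if value > threshold]


before_update = alerting_values(3, range(1, 8))   # status values 1-7
after_update = alerting_values(3, range(1, 9))    # monitor now also reports 8
```

The request itself is unchanged, yet after_update contains one more alerting value than before_update, so it is worth reviewing existing requests whenever a monitor is updated.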
What is a Polling Interval?
The polling interval determines the maximum amount of elapsed time before a
monitor knows about a change in status for a particular resource. The shorter the
polling interval, the more likely you are to have recent data. However, depending on
the monitor, a short polling interval may use more CPU and system resources. You
need to weigh the advantages and disadvantages between being able to quickly
respond to events and maintaining good system performance.
The minimum polling interval depends on the monitor’s ability to process quickly.
For most resource monitors the minimum is 30 seconds. Disk monitor requests can
be as short as 1 second.
MC/ServiceGuard monitors resources every few seconds. You may want to use a
short polling interval (30 seconds or less) when it is critical that you make a quick
failover decision.
You may want a polling interval of 5 minutes or so for monitoring less critical
resources.
You may want to set a very long polling interval (4 hours) to monitor failed disks
that are not essential to the system, but which should be replaced in the next few
days.
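To make the trade-off concrete, the Python sketch below checks a requested interval against the minimums given above and counts how many polls per day it implies. The monitor keys ("disk", "network") and the function names are hypothetical and chosen only for this example.

```python
# Minimum polling intervals from the text: 1 second for disk monitor
# requests, 30 seconds for most other resource monitors.
MIN_INTERVAL_SECONDS = {"disk": 1}
DEFAULT_MIN_SECONDS = 30


def check_polling_interval(monitor: str, seconds: int) -> int:
    """Return the interval if the monitor can honor it, else raise ValueError."""
    minimum = MIN_INTERVAL_SECONDS.get(monitor, DEFAULT_MIN_SECONDS)
    if seconds < minimum:
        raise ValueError(f"{monitor}: minimum polling interval is {minimum} s")
    return seconds


def polls_per_day(seconds: int) -> int:
    """Number of polls in 24 hours; a rough measure of monitoring load."""
    return 86_400 // seconds


failover_critical = polls_per_day(check_polling_interval("disk", 30))
less_critical = polls_per_day(check_polling_interval("network", 300))
replace_later = polls_per_day(check_polling_interval("disk", 4 * 3600))
```

A 30-second interval means 2880 polls a day, a 5-minute interval 288, and a 4-hour interval only 6, while the worst-case delay before a change is noticed grows with the interval.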
Which Protocols Can I Use to Send Events?
You specify the protocol the EMS framework uses to send events in the Notify via:
section of the screen in Figure 1-6. The options are:
•opcmsg ITO
This sends messages to ITO applications via the opcmsg daemon. EMS
defines normal and abnormal differently for each notification type:
•Conditional notification defines all events that meet the condition as
abnormal, and all others as normal.
•Change notification defines all events as abnormal.
•Notification at each polling interval defines all events as normal.
You may specify the ITO message severity for both normal and abnormal events:
•Normal
•Warning
•Critical
•Minor
•Major
The ITO application group is EMS(HP), the message group, HA, and the object
is the full path of the resource being monitored.
See HP OpenView IT/Operations Administrators Task Guide (P/N
B4249-90003) for more information on configuring notification severity.
•SNMP traps
This sends messages to applications using SNMP traps, such as Network Node
Manager. See HP OpenView Using Network Node Manager (P/N J1169-90002)
for more information on configuring SNMP traps. The following traps are used
by EMS:
•TCP and UDP
This sends TCP or UDP encoded events to the target host name and port
indicated for that request. Thus the message can be directed to a user-written
socket program.
Templates for configuring IT/Operations and Network Node Manager to display
EMS events can be found on the Hewlett-Packard High Availability public web page
at http://www.hp.com/go/ha.
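A user-written socket program for the UDP option could start from something like the Python sketch below. It is a minimal sketch under assumptions: the function names open_listener and receive_events are hypothetical, the layout of an EMS event message is not described in this section, so the payload is treated as opaque bytes, and the self-test sends its own datagram rather than a real EMS event.

```python
import socket
from typing import List


def open_listener(host: str = "127.0.0.1", port: int = 0) -> socket.socket:
    """Bind a UDP socket; port 0 lets the system pick a free port."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    return sock


def receive_events(sock: socket.socket, count: int,
                   timeout: float = 5.0) -> List[bytes]:
    """Collect `count` datagrams and return their raw payloads."""
    sock.settimeout(timeout)  # raises socket.timeout if nothing arrives
    return [sock.recvfrom(65535)[0] for _ in range(count)]


# Self-test on the loopback interface with a made-up payload.
listener = open_listener()
target_port = listener.getsockname()[1]
sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sender.sendto(b"test event", ("127.0.0.1", target_port))
received = receive_events(listener, 1)
sender.close()
listener.close()
```

In practice the host name and port the listener binds to would be the ones entered as the target for the request.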
What is a Notification Comment?
The notification comment is useful for sending task reminders to the recipients of an
event. For example, if you have a disk monitor request that reports an alert that an
entire mirror has failed, you may want the event that shows up in IT/Operations
to include the name of the person to contact if disks fail. If you have
configured MC/ServiceGuard package dependencies, you may want to enter the
package name as a comment in the corresponding pv_summary request.
Copying Monitoring Requests
There are two ways to use the copy function:
•To create requests for many resources using the same monitoring parameters,
select the monitoring request in the main screen and choose Actions: Copy
Monitoring Request. You need to have configured at least one similar request for
a similar instance. Choose a different resource instance in the Add a Monitoring
Request screen, and click <OK> in the Monitoring Request Parameters screen.
•To create many different requests for the same resource, select the monitoring
request in the main screen and choose Actions: Copy Monitoring Request. You
need to have configured at least one request for that resource. Click <OK> in the
Add a Monitoring Request screen, and modify the parameters in the Monitoring
Request Parameters screen. You may want to do this to create requests that send
events using multiple protocols.
Modifying Monitoring Requests
To change the monitoring parameters of a request, select the monitoring request
from the main screen and select Actions: Modify Monitoring Request.
Removing Monitoring Requests
Select one or more monitoring requests from the main screen and choose Actions:
Remove Monitoring Request. To start monitoring the resource again you must
recreate the request, either by copying a similar request for a similar resource or by
re-entering the data.
Configuring MC/ServiceGuard Package Dependencies
This section describes how to use SAM to create package dependencies on EMS
resources. This creates an EMS request to monitor that resource and to notify
MC/ServiceGuard when that resource reaches a critical user-defined level.
MC/ServiceGuard will then fail over the package. Here are some examples of how
EMS might be used:
•In a cluster where one copy of data is shared between all nodes in a cluster, you
may want to fail over a package if the host adapter has failed on the node
running the package. Because busses, controllers, and disks are shared, failing
over to another node because of a bus, controller, or disk failure would not
let the package run successfully. To make sure you have proper failover in a shared
data environment, you must create identical package dependencies on all nodes
in the cluster. MC/ServiceGuard can then compare the resource “UP” values on
all nodes and fail over to the node that has the correct resources available.
•In a cluster where each node has its own copy of data, you may want to fail over
a package to another node for any number of reasons:
•host adapter, bus, controller, or disk failure
•unprotected data (the number of copies is reduced to one)
•performance has degraded because one of the PV links has failed
In this sort of cluster of web servers, where each node has a copy of the data and
users are distributed for load balancing, you can fail over a package to another
node with the correct resources available. Again, the package resource
dependencies should be configured the same on all nodes.
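The idea of comparing resource "UP" values across nodes can be illustrated with the Python sketch below. It does not describe how MC/ServiceGuard is implemented; the function eligible_nodes, the node names, and the resource values are hypothetical and exist only to show why identical dependencies on every node make the comparison possible.

```python
from typing import Dict, List


def eligible_nodes(dependencies: Dict[str, str],
                   node_status: Dict[str, Dict[str, str]],
                   current_node: str) -> List[str]:
    """Return the other nodes on which every dependency has its UP value."""
    return [node for node, status in node_status.items()
            if node != current_node
            and all(status.get(resource) == up_value
                    for resource, up_value in dependencies.items())]


# The same package dependency is configured on all nodes (made-up data).
package_dependencies = {"/vg/vg01/pv_summary": "UP"}
status_by_node = {
    "node1": {"/vg/vg01/pv_summary": "DOWN"},   # node running the package
    "node2": {"/vg/vg01/pv_summary": "UP"},
    "node3": {"/vg/vg01/pv_summary": "DOWN"},
}
candidates = eligible_nodes(package_dependencies, status_by_node, "node1")
```

In this example only node2 has the required resource available, so it is the only failover candidate. A node with no status for a dependency is treated as not eligible, which is one reason to configure the same dependencies everywhere.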
This information for creating requests is also valid for EMS monitors sold with
other products (ATM or OTS, for example) and for user-written monitors written
according to developer specifications in Writing Monitors for the Event Monitoring
Service (EMS).
NOTE You should create the same requests on all nodes in an MC/ServiceGuard cluster.
A package can depend on any resource monitored by an EMS monitor. To create
package dependencies, choose to create or modify a package from the Package
Configuration interface under the High Availability Clusters subarea of SAM
(Figure 1-7). You see a new option called “Specify Package Resource Dependencies.”
Figure 1-7 Package Configuration Screen
Click on “Specify Package Resource Dependencies...” to add EMS resources as
package dependencies; you see a screen similar to Figure 1-8. If you click “Add
Resource”, you get a screen similar to Figure 1-7 on page 27.