AOM
User Guide
Issue 01
Date 2020-08-27
HUAWEI TECHNOLOGIES CO., LTD.
Copyright © Huawei Technologies Co., Ltd. 2020. All rights reserved.
No part of this document may be reproduced or transmitted in any form or by any means without prior written consent of Huawei Technologies Co., Ltd.
Trademarks and Permissions
and other Huawei trademarks are trademarks of Huawei Technologies Co., Ltd.
All other trademarks and trade names mentioned in this document are the property of their respective holders.
Notice
The purchased products, services and features are stipulated by the contract made between Huawei and the customer. All or part of the products, services and features described in this document may not be within the purchase scope or the usage scope. Unless otherwise specified in the contract, all statements, information, and recommendations in this document are provided "AS IS" without warranties, guarantees or representations of any kind, either express or implied.
The information in this document is subject to change without notice. Every effort has been made in the preparation of this document to ensure accuracy of the contents, but all statements, information, and recommendations in this document do not constitute a warranty of any kind, express or implied.
Issue 01 (2020-08-27) |
Copyright © Huawei Technologies Co., Ltd. |
Contents
1 Introduction
2 Subscribing to AOM
3 Permissions Management
3.1 Creating a User and Granting Permissions
3.2 Creating a Custom Policy
4 Connecting Resources to AOM
4.1 Installing the ICAgent (HUAWEI CLOUD Host)
4.2 Configuring Application Discovery Rules
4.3 Configuring Log Collection Paths
4.3.1 Configuring Container Log Collection Paths
4.3.2 Configuring VM Log Collection Paths
5 Overview
5.1 O&M
5.2 Dashboard
6 Alarm Management
6.1 Usage Description
6.2 Static Threshold Rules
6.2.1 Creating Static Threshold Rules
6.3 Creating Notification Rules
6.4 Viewing Alarms
6.5 Viewing Events
7 Resource Monitoring
7.1 Usage Description
7.2 Application Monitoring
7.3 Component Monitoring
7.4 Host Monitoring
7.5 Container Monitoring
7.6 Metric Monitoring
7.7 Cloud Service Monitoring
8 Log Management
8.1 Usage Description
8.2 Searching for Logs
8.3 Viewing Log Files
8.4 Adding Log Buckets
8.5 Structuring Logs
8.6 Viewing Bucket Logs
8.7 Adding Log Dumps
8.8 Creating Statistical Rules
9 Configuration Management
9.1 Agent Management (HUAWEI CLOUD Host)
9.1.1 Installing the ICAgent
9.1.2 Upgrading the ICAgent
9.1.3 Uninstalling the ICAgent
9.1.4 ICAgent Management (Non-HUAWEI CLOUD Host)
9.1.4.1 Installing the ICAgent
9.1.4.2 Upgrading the ICAgent
9.1.4.3 Uninstalling the ICAgent
9.2 Log Configuration
9.2.1 Setting the Log Quota
9.2.2 Configuring Delimiters
9.2.3 Log Collection
9.3 Quota Configuration
9.4 Metric Configuration
1 Introduction
Application Operations Management (AOM) is a one-stop and multi-dimensional O&M management platform for cloud applications. It monitors applications and related cloud resources in real time, collects and associates resource metrics, logs, and events to analyze application health status, and supports alarm reporting and data visualization, helping you detect faults in a timely manner and monitor the running status of applications, resources, and services in real time.
Specifically, AOM monitors and uniformly manages servers, storage devices, networks, web containers, and applications hosted in Docker and Kubernetes, effectively preventing problems, facilitating fault locating, and reducing O&M costs. Unlike traditional monitoring systems, AOM monitors services by applications. It meets enterprises' requirements for high efficiency and fast iteration, provides effective IT support for their services, and protects and optimizes their IT assets, enabling enterprises to achieve strategic goals.
Console Description
Table 1-1 AOM console description
Item | Description
Overview | Both the O&M overview and dashboard are provided.
● O&M overview
The O&M page supports full-link, multi-layer, and one-stop O&M for resources, applications, and user experience.
● Dashboard
With a dashboard, different graphs such as line graphs and digit graphs are displayed on the same screen, enabling you to understand monitoring data comprehensively.
Alarm center | Alarm center includes the alarm list, event list, threshold rules, and notification rules.
● Alarm list
Alarms are the information which is reported when AOM or an external service is abnormal or may cause exceptions. You need to take measures accordingly. Otherwise, service exceptions may occur.
The alarm list displays the alarms generated within a specified time range.
● Event list
Events generally carry some important information, informing you of the changes of AOM or an external service. Such changes do not necessarily cause exceptions.
The event list displays the events generated within a specified time range.
● Threshold rules
You can set threshold conditions for metrics by using threshold rules. When metric values meet conditions, AOM will generate threshold alarms. When no metric data is reported, AOM will report insufficient data events. In this way, you can identify and handle exceptions at the earliest time.
● Notification rules
AOM supports alarm notification. You can use this function by creating notification rules. When alarms are reported due to an exception in AOM or an external service, alarm information can be sent to specified personnel by email or Short Message Service (SMS) message. In this way, these personnel can rectify faults in time to avoid service loss.
● Intelligent thresholds
When a metric value meets the preset threshold condition, the system generates a threshold-crossing alarm. If the notification function is enabled, alarm information will be sent to specified users by SMS message or email.
Monitoring | Functions such as application monitoring, component monitoring, host monitoring, container monitoring, and metric monitoring are provided.
● Application monitoring
An application is a group of the same or similar components divided based on service requirements. AOM supports monitoring by application.
● Component monitoring
Components refer to the services that you deploy, including containers and common processes. The Component Monitoring page displays information such as type, CPU usage, memory usage, and status of each component. AOM supports drill-down from components to instances, and then to containers, enabling multi-dimensional monitoring.
● Host monitoring
The Host Monitoring page enables you to monitor common system devices such as disks and file systems, and resource usage and health status of hosts and service processes or instances running on them.
● Container monitoring
Only workloads deployed by using Cloud Container Engine (CCE) and applications created by using ServiceStage are monitored.
● Metric monitoring
The Metric Monitoring page displays metric data of each resource. You can monitor metric values and trends in real time, add desired metrics to dashboards, create threshold rules, and export monitoring reports. In this way, you can monitor services in real time and perform data correlation analysis.
● Cloud service monitoring
The Cloud Service Monitoring page displays historical performance curves of each cloud service instance. You can view cloud service data in the last six months.
Log | Functions such as log search, log files, log dump, and path configuration are provided.
● Log search
AOM enables you to quickly query logs, and locate faults based on log sources and contexts.
● Log files
You can quickly view log files of component instances to locate faults.
● Log dumps
AOM enables you to dump logs to Object Storage Service (OBS) buckets for long-term storage.
● Path configuration
AOM can collect and display VM logs. VM refers to an Elastic Cloud Server (ECS) or a Bare Metal Server (BMS) running Linux. Before collecting logs, ensure that you have configured a log collection path.
● Log buckets
A log bucket is a logical group of log files. You dump log files, create statistical rules, and view logs by log bucket.
● Statistical rules
A statistical rule takes effect by log bucket. You can configure keywords in statistical rules. Then, AOM periodically counts the number of such keywords in log buckets and generates log metrics.
● Log structuring
In log structuring, original logs can be separated by regular expressions or special characters so that structured logs can be queried and analyzed based on the SQL syntax.
Configuration management | Functions such as agent management, application discovery, and log configuration are provided.
● Agent management
The ICAgent collects metrics, logs, and application performance data in real time. For hosts purchased from the Elastic Cloud Server (ECS) or Bare Metal Server (BMS) console, you need to manually install the ICAgent. For hosts purchased from the Cloud Container Engine (CCE) console, the ICAgent is automatically installed.
● Application discovery
AOM can discover applications and collect their metrics based on configured rules.
● Log configuration
Log quotas and delimiters can be configured.
● Quota configuration
Earlier metrics will be deleted when the metric quota is exceeded. You can change the metric quota by switching between the basic edition (free) and pay-per-use edition.
● Metric configuration
You can enable the metric collection function to collect metrics (excluding SLA and custom metrics).
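The threshold-rule behavior described under Alarm center in Table 1-1 can be illustrated with a minimal Python sketch. It is not AOM's implementation: the names ThresholdRule and evaluate, the comparison operators, and the choice to test only the most recent sample are assumptions made for illustration. The sketch returns a threshold alarm when the latest metric value meets the condition and an insufficient data event when no metric data is reported.

```python
import operator
from dataclasses import dataclass
from typing import Optional, Sequence

# Hypothetical comparison operators; the options AOM really offers may differ.
_OPERATORS = {
    ">": operator.gt,
    ">=": operator.ge,
    "<": operator.lt,
    "<=": operator.le,
}


@dataclass
class ThresholdRule:
    """Illustrative stand-in for a threshold rule (not an AOM API object)."""
    metric: str
    op: str
    threshold: float


def evaluate(rule: ThresholdRule, samples: Sequence[float]) -> Optional[str]:
    """Return 'insufficient data event', 'threshold alarm', or None."""
    if not samples:
        # No metric data was reported in the evaluation period.
        return "insufficient data event"
    if _OPERATORS[rule.op](samples[-1], rule.threshold):
        # The most recent value meets the threshold condition.
        return "threshold alarm"
    return None


rule = ThresholdRule(metric="cpu_usage", op=">", threshold=80.0)
print(evaluate(rule, [42.0, 91.5]))  # threshold alarm
print(evaluate(rule, []))            # insufficient data event
print(evaluate(rule, [42.0]))        # None
```

The metric name cpu_usage and the 80.0 threshold are placeholder values, not defaults taken from AOM.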
Process for Using AOM
The following figure shows the process of using AOM.
1. (Mandatory) Subscribe to AOM.
2. (Optional) Create a sub-account and set permissions.
3. (Mandatory) Create a cloud host.
4. (Mandatory) Install the ICAgent.
The ICAgent is a collector used to collect metric, log, and application performance data in real time.
If an ECS is purchased through CCE, the ICAgent is automatically installed on the ECS.
5. (Optional) Configure application discovery rules.
Applications that meet built-in application discovery rules are automatically discovered after the ICAgent is installed. For applications that cannot be discovered using built-in application discovery rules, you need to configure custom application discovery rules.
6. (Optional) Configure log collection paths.
To use AOM to monitor host logs, you need to configure log collection paths.
7. (Optional) Implement O&M.
You can use AOM functions such as Overview, Alarm Management, Resource Monitoring, and Log Management to perform routine O&M.
2 Subscribing to AOM
Before subscription, ensure that you have registered an account and implemented real-name authentication.
Registering an Account
Step 1 Log in to the cloud at https://www.huaweicloud.com/intl/en-us/.
Step 2 Click Register in the upper right corner of the page.
Complete the registration as prompted.
----End
Implementing Real-Name Authentication
You can use Application Operations Management (AOM) only after real-name authentication is complete.
Step 1 After logging in to the cloud, click the username in the upper right corner on the page and select My Account from the drop-down list.
Step 2 On the Basic Information page, click Authenticate to the right of Authentication Status.
Complete real-name authentication as prompted.
----End
Subscribing to AOM
AOM resources are region-specific and cannot be used across regions. When different regions (such as AP-Hong Kong and AP-Bangkok) exist, select a region before subscribing to AOM.
Click Console in the upper right corner and then use the region selector in the upper left corner to select a region. Then, click Service List and choose Application > Application Operations Management. In the dialog box that is displayed, click Subscribe for Free to enable AOM for free.
NOTE
AOM provides both basic and pay-per-use editions. The basic edition is used by default. You can click Switch Edition as required.
Switching Edition
To switch between the basic and pay-per-use editions, perform the following steps:
Step 1 Log in to the AOM console, choose Overview > O&M in the navigation pane, and click Switch Edition in the upper right corner of the page.
Step 2 Select an edition, select the prompt information, and click Switch Now.
----End
3 Permissions Management
3.1 Creating a User and Granting Permissions
This section describes the fine-grained permissions management provided by Identity and Access Management (IAM) for your Application Operations Management (AOM) resources. With IAM, you can:
● Create IAM users for employees based on the organizational structure of your enterprise. Each IAM user has their own security credentials, providing access to AOM resources.
● Grant only the permissions required for users to perform a task.
● Entrust a HUAWEI CLOUD account or cloud service to perform professional and efficient O&M on your AOM resources.
If your HUAWEI CLOUD account does not need individual IAM users, then you may skip over this chapter.
This section describes the procedure for granting permissions (see Figure 3-1).
Prerequisites
Before assigning permissions to user groups, you should learn about the AOM permissions listed in Permissions Management. For the system permissions of other services, see System Permissions.
Process
Figure 3-1 Process for granting AOM permissions
1. Creating a User Group and Assigning Permissions
Create a user group on the IAM console, and assign the AOM ReadOnlyAccess policy to the group.
2. Creating an IAM User
Create a user on the IAM console and add the user to the group created in step 1.
3. Logging In Using an IAM User and Verifying Permissions
Log in to the AOM console as the created user, and verify that the user has only read permissions for AOM.
3.2 Creating a Custom Policy
Custom policies can be created as a supplement to the system policies of Application Operations Management (AOM). For the actions supported for custom policies, see Permissions Policies and Supported Actions.
You can create custom policies in either of the following two ways:
● Visual editor: Select cloud services, actions, resources, and request conditions without the need to know policy syntax.
● JSON: Edit JSON policies from scratch or based on an existing policy.
For details, see Creating a Custom Policy. The following section contains examples of common AOM custom policies.
Example Custom Policies
● Example 1: Allowing a user to create threshold rules
{
    "Version": "1.1",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "aom:alarmRule:create"
            ]
        }
    ]
}
● Example 2: Forbidding a user to delete application discovery rules
A deny policy must be used in conjunction with other policies to take effect. If the permissions assigned to a user contain both Allow and Deny actions, the Deny actions take precedence over the Allow actions.
To grant a user the AOM FullAccess system policy but forbid the user to delete application discovery rules, create a custom policy that denies the deletion of application discovery rules, and grant both the AOM FullAccess and deny policies to the user. Because the Deny action takes precedence, the user can perform all operations except deleting application discovery rules. The following is an example deny policy:
{
    "Version": "1.1",
    "Statement": [
        {
            "Effect": "Deny",
            "Action": [
                "aom:discoveryRule:delete"
            ]
        }
    ]
}
● Example 3: Defining permissions for multiple services in a policy
A custom policy can contain actions of multiple services that are all of the project-level type. The following is an example policy containing actions of multiple services:
{
    "Version": "1.1",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "aom:*:list",
                "aom:*:get",
                "apm:*:list",
                "apm:*:get"
            ]
        },
        {
            "Effect": "Allow",
            "Action": [
                "cce:cluster:get",
                "cce:cluster:list",
                "cce:node:get",
                "cce:node:list"
            ]
        }
    ]
}
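The rule that Deny actions take precedence over Allow actions can be illustrated with a short Python sketch. This is not how IAM evaluates policies internally: the function is_allowed, the shell-style wildcard matching, and the broad_allow policy (a hypothetical stand-in for AOM FullAccess, whose real content is not shown in this document) are all assumptions made for illustration.

```python
from fnmatch import fnmatchcase
from typing import Dict, List


def is_allowed(action: str, policies: List[Dict]) -> bool:
    """Return True if some statement allows `action` and none denies it."""
    allowed = False
    for policy in policies:
        for statement in policy.get("Statement", []):
            # Assumption: "*" in an action pattern matches any characters.
            matches = any(
                fnmatchcase(action, pattern)
                for pattern in statement.get("Action", [])
            )
            if not matches:
                continue
            if statement.get("Effect") == "Deny":
                return False  # an explicit Deny always wins
            if statement.get("Effect") == "Allow":
                allowed = True
    return allowed


# Hypothetical broad allow policy used in place of AOM FullAccess.
broad_allow = {
    "Version": "1.1",
    "Statement": [{"Effect": "Allow", "Action": ["aom:*:*"]}],
}
# The deny policy from Example 2.
deny_delete = {
    "Version": "1.1",
    "Statement": [{"Effect": "Deny", "Action": ["aom:discoveryRule:delete"]}],
}

policies = [broad_allow, deny_delete]
print(is_allowed("aom:discoveryRule:delete", policies))  # False
print(is_allowed("aom:discoveryRule:list", policies))    # True
print(is_allowed("cce:cluster:get", policies))           # False
```

An action that no statement mentions is treated as not allowed in this sketch, which is why the cce:cluster:get check returns False.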
4 Connecting Resources to AOM
4.1 Installing the ICAgent (HUAWEI CLOUD Host)
The ICAgent collects metrics, logs, and application performance data in real time. For hosts purchased from the Elastic Cloud Server (ECS) or Bare Metal Server (BMS) console, you need to manually install the ICAgent. For hosts purchased from the Cloud Container Engine (CCE) console, the ICAgent is automatically installed.
Prerequisites
● Before installing the ICAgent, ensure that the time and time zone of the local browser are consistent with those of the server. If multiple servers are deployed, ensure that the local browser and all servers use the same time zone and time. Otherwise, metric data of applications and servers displayed on the UI may be incorrect.
● The ICAgent must be installed and run by the root user.
Installation Methods
There are two methods to install the ICAgent. Note that the two methods are not applicable to container nodes (that is, nodes created using ServiceStage or CCE). For container nodes, you do not need to manually install the ICAgent. Instead, you only need to perform certain operations when creating clusters or deploying applications.
For details, see Table 4-1.
Table 4-1 Installation methods
Method | Scenario
Initial installation | This method is used when the following conditions are met:
1. An Elastic IP Address (EIP) has been bound to the server.
2. The ICAgent has never been installed on the server.
Inherited installation | This method is used when the following conditions are met:
You have multiple servers on which the ICAgent is to be installed. One server is bound to an EIP, but others are not. The ICAgent has been installed on the server bound to an EIP by using the initial installation method. You can use the inherited method to install the ICAgent on the remaining servers.
See Inherited Installation.
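The conditions in Table 4-1, together with the note that container nodes need no manual installation, can be summarized as a small decision function. This is only a reading aid written in Python; the function choose_install_method and its parameter names are hypothetical and do not correspond to any AOM API.

```python
def choose_install_method(
    is_container_node: bool,
    has_eip: bool,
    icagent_installed_before: bool,
    peer_with_eip_has_icagent: bool,
) -> str:
    """Map the conditions of Table 4-1 to an installation method name."""
    if is_container_node:
        # Nodes created using ServiceStage or CCE need no manual installation.
        return "no manual installation"
    if has_eip and not icagent_installed_before:
        return "initial installation"
    if not has_eip and peer_with_eip_has_icagent:
        # Another server bound to an EIP already has the ICAgent installed.
        return "inherited installation"
    return "not covered by Table 4-1"


print(choose_install_method(False, True, False, False))  # initial installation
print(choose_install_method(False, False, False, True))  # inherited installation
print(choose_install_method(True, False, False, False))  # no manual installation
```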
Initial Installation
To install the ICAgent for the first time after you apply for a server, perform the following operations:
Step 1 Obtain an Access Key ID/Secret Access Key (AK/SK).
● If you have obtained the AK/SK, skip this step.
● If you have not obtained the AK/SK, obtain them first.
Step 2 In the navigation pane, choose Configuration Management > Agent Management.
Step 3 Click Install ICAgent.
Step 4 Generate the ICAgent installation command and copy it.
1. Enter the obtained AK/SK in the text box to generate the ICAgent installation command.
NOTE
Ensure that the AK/SK are correct. Otherwise, the ICAgent cannot be installed.
2. Click Copy Command.
Step 5 Use a remote login tool, such as PuTTY, to log in to the server where the ICAgent is to be installed as the root user and run the command copied in Step 4.2 to install the ICAgent.
NOTE
● If the message ICAgent install success is displayed, the ICAgent is successfully installed in the /opt/oss/servicemgr/ directory. After the ICAgent is successfully installed, choose Configuration Management > Agent Management to view the ICAgent status.
● If the ICAgent fails to be installed, uninstall the ICAgent according to Uninstalling the ICAgent Through Logging In to the Server and then install it again. If the problem persists, contact technical support.
----End
Follow-up Operations
For more information about how to install, upgrade, and uninstall the ICAgent, see Agent Management (HUAWEI CLOUD Host).
4.2 Configuring Application Discovery Rules
Application Operations Management (AOM) can discover applications and collect their metrics based on configured rules. Application discovery can be configured in two modes: automatic and manual. This section mainly describes the manual mode.
● Automatic configuration
After you install the ICAgent on a host according to Installing the ICAgent, the ICAgent automatically discovers applications on the host based on Built-in Service Discovery Rules and displays them on the Application Monitoring page.
● Manual configuration
After you add a custom service discovery rule on the application discovery page and apply it to the host where the ICAgent is installed (for details, see Installing the ICAgent), the ICAgent discovers applications on the host based on the configured service discovery rule and displays them on the Application Monitoring page.
Filtering Rules
The ICAgent periodically scans the target host to find all its processes. The effect is similar to that of running the ps -e -o pid,comm,lstart,cmd | grep -v defunct command on the target host. Then, the ICAgent checks whether processes match the filtering rules in Table 4-2. If a process meets a filtering rule, the process is filtered out and is not discovered by AOM. If a process does not meet any filtering rule, the process is not filtered out and is discovered by AOM.
ICAgent detection results may be as follows:
 PID COMMAND         STARTED                  CMD
   1 systemd         Tue Oct  2 21:12:06 2018 /usr/lib/systemd/systemd --switched-root --system --deserialize 20
   2 kthreadd        Tue Oct  2 21:12:06 2018 [kthreadd]
   3 ksoftirqd/0     Tue Oct  2 21:12:06 2018 (ksoftirqd/0)
1140 tuned           Tue Oct  2 21:12:27 2018 /usr/bin/python -Es /usr/sbin/tuned -l -P
1144 sshd            Tue Oct  2 21:12:27 2018 /usr/sbin/sshd -D
1148 agetty          Tue Oct  2 21:12:27 2018 /sbin/agetty --keep-baud 115200 38400 9600 hvc0 vt220
1154 docker-containe Tue Oct  2 21:12:29 2018 docker-containerd -l unix:///var/run/docker/libcontainerd/docker-containerd.sock --shim docker-containerd-shim --start-timeout 2m --state-dir /var/run/docker/libcontainerd/containerd --runtime docker-runc --metrics-interval=0
Table 4-2 Filtering rules
Filtering Rule | Example
If the COMMAND value of a process is docker-containe, vi, vim, pause, sshd, ps, sleep, grep, tailf, tail, or systemd-udevd, and the process is not running in the container, the process is filtered out and is not discovered by AOM. | In the preceding information, the process whose PID is 1154 is not discovered by AOM because its COMMAND value is docker-containe.
If the CMD value of a process starts with [ and ends with ], the process is filtered out and is not discovered by AOM. | In the preceding information, the process whose PID is 2 is not discovered by AOM because its CMD value is [kthreadd].
If the CMD value of a process starts with ( and ends with ), the process is filtered out and is not discovered by AOM. | In the preceding information, the process whose PID is 3 is not discovered by AOM because its CMD value is (ksoftirqd/0).
If the CMD value of a process starts with /sbin/, the process is filtered out and is not discovered by AOM. | In the preceding information, the process whose PID is 1148 is not discovered by AOM because its CMD value starts with /sbin/.
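The filtering rules in Table 4-2 can be read as a single predicate over the COMMAND and CMD columns. The following Python sketch is illustrative only and is not ICAgent code. The function name is_filtered_out, the in_container flag, and the shortened CMD string used for PID 1154 are assumptions made for this example.

```python
# Illustrative sketch of the filtering rules in Table 4-2 (not ICAgent code).
EXCLUDED_COMMANDS = {
    "docker-containe", "vi", "vim", "pause", "sshd", "ps",
    "sleep", "grep", "tailf", "tail", "systemd-udevd",
}


def is_filtered_out(command: str, cmd: str, in_container: bool = False) -> bool:
    """Return True if the process is filtered out and therefore not discovered."""
    # Rule 1: listed COMMAND values, only when the process is not in a container.
    if command in EXCLUDED_COMMANDS and not in_container:
        return True
    # Rule 2: CMD enclosed in square brackets, for example [kthreadd].
    if cmd.startswith("[") and cmd.endswith("]"):
        return True
    # Rule 3: CMD enclosed in parentheses, for example (ksoftirqd/0).
    if cmd.startswith("(") and cmd.endswith(")"):
        return True
    # Rule 4: CMD starting with /sbin/.
    return cmd.startswith("/sbin/")


# (PID, COMMAND, CMD) taken from the sample detection results above.
# The CMD of PID 1154 is shortened here; only its COMMAND value matters.
processes = [
    (1, "systemd", "/usr/lib/systemd/systemd --switched-root --system --deserialize 20"),
    (2, "kthreadd", "[kthreadd]"),
    (3, "ksoftirqd/0", "(ksoftirqd/0)"),
    (1140, "tuned", "/usr/bin/python -Es /usr/sbin/tuned -l -P"),
    (1144, "sshd", "/usr/sbin/sshd -D"),
    (1148, "agetty", "/sbin/agetty --keep-baud 115200 38400 9600 hvc0 vt220"),
    (1154, "docker-containe", "docker-containerd -l unix:///var/run/docker/libcontainerd/docker-containerd.sock"),
]

discovered = [pid for pid, command, cmd in processes if not is_filtered_out(command, cmd)]
print(discovered)
```

Assuming none of the sample processes runs in a container, only PIDs 1 and 1140 pass all four rules and remain candidates for discovery.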
Built-in Service Discovery Rules
AOM provides two built-in discovery rules: Sys_Rule and Default_Rule. These rules are executed on all hosts, including hosts added later. The priority of Sys_Rule is higher than that of Default_Rule. That is, Sys_Rule is executed on the host first. If Sys_Rule is met, Default_Rule is not executed. Otherwise, Default_Rule is executed. Rule details are as follows:
Sys_Rule (cannot be disabled)
●For the component name, AOM uses the first available value among the following, in descending order of priority: the value of -Dapm_tier in the command, the value of the environment variable PAAS_APP_NAME, and the value of -Dapm_tier in the environment variable JAVA_TOOL_OPTIONS.
●For the application name, AOM uses the first available value among the following, in descending order of priority: the value of -Dapm_application in the command, the value of the environment variable PAAS_MONITORING_GROUP, and the value of -Dapm_application in the environment variable JAVA_TOOL_OPTIONS.
In the following example, the component name is atps-demo and the application name is atpd-test.
PAAS_MONITORING_GROUP=atpd-test PAAS_APP_NAME=atps-demo
JAVA_TOOL_OPTIONS=-javaagent:/opt/oss/servicemgr/ICAgent/pinpoint/pinpoint-bootstrap.jar -Dapm_application=atpd-test -Dapm_tier=atps-demo
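The priority order used by Sys_Rule can be sketched as a chain of fallbacks. The Python below is a minimal illustration under stated assumptions, not the ICAgent implementation: the helper names jvm_property and resolve_names are hypothetical, the sample command java -jar app.jar is invented for the example, and returning None for a missing name is simply how this sketch signals that no source supplied a value.

```python
import re
from typing import Mapping, Optional, Tuple


def jvm_property(text: str, name: str) -> Optional[str]:
    """Return the value of -D<name>=<value> found in a command-line string."""
    match = re.search(r"(?:^|\s)-D" + re.escape(name) + r"=(\S+)", text)
    return match.group(1) if match else None


def resolve_names(cmd: str, env: Mapping[str, str]) -> Tuple[Optional[str], Optional[str]]:
    """Return (component name, application name) using the Sys_Rule priorities."""
    tool_options = env.get("JAVA_TOOL_OPTIONS", "")
    # Each chain lists the sources from highest to lowest priority.
    component = (
        jvm_property(cmd, "apm_tier")
        or env.get("PAAS_APP_NAME")
        or jvm_property(tool_options, "apm_tier")
    )
    application = (
        jvm_property(cmd, "apm_application")
        or env.get("PAAS_MONITORING_GROUP")
        or jvm_property(tool_options, "apm_application")
    )
    return component, application


# Environment from the example above.
env = {
    "PAAS_MONITORING_GROUP": "atpd-test",
    "PAAS_APP_NAME": "atps-demo",
    "JAVA_TOOL_OPTIONS": (
        "-javaagent:/opt/oss/servicemgr/ICAgent/pinpoint/pinpoint-bootstrap.jar "
        "-Dapm_application=atpd-test -Dapm_tier=atps-demo"
    ),
}
print(resolve_names("java -jar app.jar", env))
```

With this environment, the sketch yields the component name atps-demo and the application name atpd-test, matching the example. A -Dapm_tier option placed directly in the command would take precedence over both environment variables.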
Default_Rule (can be disabled)
●If the COMMAND value of a process is java, obtain the name of the JAR package in the command, the main class name in the command, and the first keyword that does not start with a hyphen (-) in the command based on the priorities in descending order as the component name, and use the default value unknownapplicationname as the application name.
●If the COMMAND value of a process is python, obtain the name of the first .py/.pyc script in the command as the component name, and use the default value unknownapplicationname as the application name.
●If the COMMAND value of a process is node, obtain the name of the first .js script in the command as the component name, and use the default value unknownapplicationname as the application name.
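The Default_Rule naming logic can be approximated as follows. This is a rough Python sketch rather than the ICAgent's actual parser, and it rests on several assumptions: the function name default_rule is hypothetical, the JAR package is taken to be the argument that follows -jar, the main-class and first-keyword fallbacks are collapsed into one check (the first token that is neither an option nor a classpath value), and script names are reduced to their base names. The sample commands in the usage lines are invented.

```python
import os
import shlex
from typing import Optional, Tuple

DEFAULT_APPLICATION = "unknownapplicationname"


def default_rule(command: str, cmd: str) -> Optional[Tuple[str, str]]:
    """Return (component name, application name), or None if no name is found."""
    args = shlex.split(cmd)[1:]  # arguments after the executable
    component = None
    if command == "java":
        if "-jar" in args[:-1]:
            # Assumption: the JAR package is the argument that follows -jar.
            component = os.path.basename(args[args.index("-jar") + 1])
        else:
            # Simplification: take the first token that is not an option
            # and not the value of -cp or -classpath.
            skip_next = False
            for arg in args:
                if skip_next:
                    skip_next = False
                elif arg in ("-cp", "-classpath"):
                    skip_next = True
                elif not arg.startswith("-"):
                    component = arg
                    break
    elif command == "python":
        scripts = [a for a in args if a.endswith((".py", ".pyc"))]
        component = os.path.basename(scripts[0]) if scripts else None
    elif command == "node":
        scripts = [a for a in args if a.endswith(".js")]
        component = os.path.basename(scripts[0]) if scripts else None
    if component is None:
        return None
    return component, DEFAULT_APPLICATION


print(default_rule("java", "java -Xmx1g -jar /opt/app/demo.jar --port 8080"))
print(default_rule("python", "python -Es /opt/tools/run.py -l"))
print(default_rule("node", "node /srv/server.js"))
```

In every case the application name falls back to unknownapplicationname, which is why processes discovered only by Default_Rule appear under that application.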
Custom Discovery Rules
Step 1 In the navigation pane, choose Configuration Management > Service Discovery.
Step 2 Click Add Custom Application Discovery Rule and configure an application discovery rule.
Step 3 Select a host for pre-detection.
1.Customize a rule name, for example, ruletest.
2.Select a typical host, for example, hhhhhh-27465, to check whether the application discovery rule is valid. The hosts that execute the rule will be configured in Step 6. Then, click Next.
Step 4 Set an application discovery rule.
1.Click Add Check Items. AOM can discover processes that meet the conditions of check items.
For example, AOM can detect the processes whose command parameters contain ovs-vswitchd unix: and whose environment variables contain SUDO_USER=paas.
NOTE
–To precisely detect processes, you are advised to add check items about unique features of the processes.
– You need to add at least one check item and can add at most five check items. If there are multiple check items, AOM only discovers the processes that meet the conditions of all check items.
2.After adding check items, click Detect to search for the processes that meet the conditions.
If no process is detected within 20s, modify the discovery rule and detect processes again. Go to the next step only after at least one process is detected.
Step 5 Set a component name and log path.
1.Set a component name.
In the Component Name Settings area, click Add Naming Rule to set a component name for the discovered process. For example, add the fixed text app-test as a component name.
NOTE
–If you do not set a component name, the default name unknownapplicationname is used.
–When you add multiple naming rules, all the naming rules are combined as the component name of the process. Metrics with the same component name are aggregated.
2.Set the function of collecting process logs.
Turn on Enable Automatic Log Association to obtain the .log, .trace, and .out files opened by processes. In this way, you can collect the files for log analysis when monitoring processes.
This function is enabled by default. If you do not need it, disable it. In this case, no log list will be displayed in the Preview Component Name table.
3.Preview the component name.
If the name does not meet your requirements, click the icon in the Preview Component Name table to rename it.
4.Configure a log path.
If you have turned on Enable Automatic Log Association, click Configure Log Path in the Operation column of the Preview Component Name table to bind the application name, component name, and log path of the detected process to its command parameters.
Step 6 Set a priority and detection range.
1.Set a priority: When there are multiple rules, set priorities. Enter a value from 1 to 9999. A smaller value indicates a higher priority. For example, 1 indicates the highest priority and 9999 indicates the lowest priority.
2.Set a detection range: Select a host to be detected. That is, select the host to which the configured rule is applied. If no host is selected, this rule will be executed on all hosts, including hosts added later.
Step 7 Click Add to complete the configuration. AOM collects metrics of the process.
Step 8 Wait for about two minutes, choose Monitoring > Component Monitoring in the navigation pane, select the hhhhhh-27465 host from the cluster drop-down list, and find the /openvswitch/ component that has been monitored.
----End
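The check-item logic in Step 4 and the priority and detection range in Step 6 can be modeled in a few lines. The Python sketch below is a conceptual model only, not an AOM API: the class and function names are hypothetical, the sample command string is invented, and the idea that the highest-priority matching rule wins is an assumption drawn by analogy with the built-in rules.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Set


@dataclass
class Process:
    cmd: str
    env: Dict[str, str]


CheckItem = Callable[[Process], bool]


def cmd_contains(text: str) -> CheckItem:
    """Check item: the command parameters contain the given text."""
    return lambda process: text in process.cmd


def env_contains(text: str) -> CheckItem:
    """Check item: some KEY=VALUE environment entry contains the given text."""
    return lambda process: any(text in f"{k}={v}" for k, v in process.env.items())


@dataclass
class Rule:
    name: str
    priority: int                      # 1 (highest) to 9999 (lowest)
    check_items: List[CheckItem]       # one to five items
    hosts: Optional[Set[str]] = None   # None or empty: all hosts

    def __post_init__(self) -> None:
        if not 1 <= len(self.check_items) <= 5:
            raise ValueError("a rule needs one to five check items")
        if not 1 <= self.priority <= 9999:
            raise ValueError("priority must be between 1 and 9999")

    def applies_to(self, host: str) -> bool:
        return not self.hosts or host in self.hosts

    def matches(self, process: Process) -> bool:
        # A process is discovered only if it meets ALL check items.
        return all(item(process) for item in self.check_items)


def first_matching_rule(rules: List[Rule], host: str, process: Process) -> Optional[Rule]:
    """Assumption: rules are tried in priority order and the first match wins."""
    for rule in sorted(rules, key=lambda r: r.priority):
        if rule.applies_to(host) and rule.matches(process):
            return rule
    return None


ruletest = Rule("ruletest", 1, [cmd_contains("ovs-vswitchd unix:"), env_contains("SUDO_USER=paas")])
fallback = Rule("fallback", 9999, [cmd_contains("ovs-vswitchd")])
# Hypothetical process; the socket path is invented for illustration.
proc = Process(cmd="ovs-vswitchd unix:/var/run/openvswitch/db.sock", env={"SUDO_USER": "paas"})
print(first_matching_rule([fallback, ruletest], "hhhhhh-27465", proc).name)
```

Because every check item must hold, adding items narrows a rule. This is why the NOTE in Step 4 recommends check items that describe unique features of the target process.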
More Operations
After creating an application discovery rule, you can also perform the operations described in Table 4-3.
Table 4-3 Related operations
Operation | Description
Viewing rule details | In the Name column, click the name of an application discovery rule.
Enabling or disabling a rule | ● Click Enable in the Operation column.
● Click Disable in the Operation column. After a rule is disabled, AOM does not collect corresponding process metrics.
Deleting a rule | ● To delete an application discovery rule, click More in the Operation column and select Delete.
● To delete one or more application discovery rules, select it or them and click Delete above the rule list.
NOTE
Built-in application discovery rules cannot be deleted.
Modifying a rule | Click More in the Operation column and select Modify from the drop-down list.
NOTE
Built-in application discovery rules cannot be modified.
4.3 Configuring Log Collection Paths
4.3.1 Configuring Container Log Collection Paths
Application Operations Management (AOM) can collect and display container logs. To do so, configure a log collection path according to the following procedure.
Precautions
● The ICAgent only collects *.log, *.trace, and *.out text log files.
● AOM collects standard container output logs by default.
Procedure
Adding a Log Policy on CCE
Step 1 When creating a workload on Cloud Container Engine (CCE), click Log Policies after adding a container.
Step 2 Click Add Log Policy. On the displayed page, configure parameters as required. The following uses Nginx as an example.
Figure 4-1 Adding a log policy
Step 3 Set Storage Type to Host Path or Container Path.
● Host Path: You can mount a host path to a specified container path. Set parameters according to the following table.
Table 4-4 Parameters for adding log policies (host path)
Parameter | Description
Storage Type | Set this parameter to Host Path. You can mount a host path to a specified container path.
Add Container Path
*Host Path | Host path to which a container log file is mounted. Example: /var/paas/sys/log/nginx
Container Path | Container path to which a data volume is mounted. Example: /tmp
NOTICE
– Do not mount a data volume to a system directory such as / or /var/run. Otherwise, the container becomes abnormal. You are advised to mount log files to an empty directory. If the directory is not empty, ensure that there are no files that affect container startup. Otherwise, files will be replaced, causing container startup failures or workload creation failures.
– If the volume is mounted to a high-risk directory, you are advised to use an account with minimum permissions to start the container; otherwise, high-risk files on the host may be damaged.
– AOM collects only the first 20 log files that have been modified recently. It collects files from 2 levels of subdirectories by default.
– AOM only collects .log, .trace, and .out text log files in mounting paths.
Extended Host Path | Level-3 directory added to the original volume directory or subdirectory. This path enables you to obtain output files of a single pod more easily.
– None: No extended path is configured.
– PodUID: Pod ID.
– PodName: Pod name.
– PodUID/ContainerName: Pod ID/container name.
– PodName/ContainerName: Pod name/container name.
Collection Path | Path for collecting logs precisely. Details are as follows:
– If no collection path is specified, log files in .log, .trace, and .out formats will be collected from the current path by default.
– If a collection path contains double asterisks (**), log files in .log, .trace, and .out formats will be collected from 5 levels of subdirectories.
– If a collection path contains an asterisk (*), a fuzzy match is performed.
Example: If the collection path is /tmp/**/test*.log, all .log files prefixed with test will be collected from /tmp and its 5 levels of subdirectories.
CAUTION
To use the collection path function, ensure that the ICAgent version is 5.12.22 or later.
Log Dumping | Log dumping here refers to rolling local log files.
– Enabled: AOM scans log files every minute. When a log file exceeds 50 MB, it is dumped immediately. A new .zip file is generated in the directory where the log file is located. For a log file, AOM stores only the latest 20 .zip files. When the number of .zip files exceeds 20, earlier .zip files will be deleted. After the dump is complete, the log file in AOM will be cleared.
– Disabled: If you select Disabled, AOM does not dump log files.
NOTE
– AOM log file rolling is implemented in the copytruncate mode. During configuration, ensure that log files are written in the append mode. Otherwise, file holes may occur.
– Currently, mainstream log components such as Log4j and Logback support log file rolling. If your log files already support rolling, skip the configuration. Otherwise, conflicts may occur.
– You are advised to configure log file rolling for your own services to flexibly control the size and number of rolled files.
●Container Path: Logs will be stored in a container path. No host path needs to be mounted into the container. Set parameters according to the following table.
NOTE
Ensure that the ICAgent version is 5.10.79 or later.
Table 4-5 Parameters for adding log policies (container path)
Parameter | Description
Storage Type | Set this parameter to Container Path. Logs will be stored in a container path. No host path needs to be mounted into the container. Ensure that the ICAgent version is 5.10.79 or later.
Add Container Path
Container Path | Container path to which a data volume is mounted. Example: /tmp
NOTICE
– Do not mount log files to a system directory such as / or /var/run. Otherwise, the container becomes abnormal. You are advised to mount the volume to an empty directory. If the directory is not empty, ensure that there are no files that affect container startup. Otherwise, files will be replaced, causing container startup failures or workload creation failures.
– If the volume is mounted to a high-risk directory, you are advised to use an account with minimum permissions to start the container; otherwise, high-risk files on the host may be damaged.
– AOM collects only the first 20 log files that have been modified recently. It collects files from 2 levels of subdirectories by default.
– AOM only collects .log, .trace, and .out text log files in mounting paths.
Collection Path | Path for collecting logs precisely. Details are as follows:
– If no collection path is specified, log files in .log, .trace, and .out formats will be collected from the current path by default.
– If a collection path contains double asterisks (**), log files in .log, .trace, and .out formats will be collected from 5 levels of subdirectories.
– If a collection path contains an asterisk (*), a fuzzy match is performed.
Example: If the collection path is /tmp/**/test*.log, all .log files prefixed with test will be collected from /tmp and its 5 levels of subdirectories.
CAUTION
To use the collection path function, ensure that the ICAgent version is 5.12.22 or later.
Log Dumping | Log dumping here refers to rolling local log files.
– Enabled: AOM scans log files every minute. When a log file exceeds 50 MB, it is dumped immediately. A new .zip file is generated in the directory where the log file is located. For a log file, AOM stores only the latest 20 .zip files. When the number of .zip files exceeds 20, earlier .zip files will be deleted. After the dump is complete, the log file in AOM will be cleared.
– Disabled: If you select Disabled, AOM does not dump log files.
NOTE
– AOM log file rolling is implemented in the copytruncate mode. During configuration, ensure that log files are written in the append mode. Otherwise, file holes may occur.
– Currently, mainstream log components such as Log4j and Logback support log file rolling. If your log files already support rolling, skip the configuration. Otherwise, conflicts may occur.
– You are advised to configure log file rolling for your own services to flexibly control the size and number of rolled files.
----End
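The Collection Path semantics described in Table 4-4 and Table 4-5 can be checked against sample file paths before a policy is saved. The Python sketch below is an approximation, not the ICAgent matcher: the function name matches_collection_path is hypothetical, only two pattern shapes are handled (a plain directory/name pattern, and base/**/name), and treating /tmp itself as depth 0 with up to 5 directory levels below it is this sketch's reading of the example.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

LOG_SUFFIXES = {".log", ".trace", ".out"}
MAX_DEPTH = 5  # levels of subdirectories searched when ** is used


def matches_collection_path(file_path: str, collection_path: str) -> bool:
    """Return True if file_path would be selected by collection_path."""
    path = PurePosixPath(file_path)
    if path.suffix not in LOG_SUFFIXES:
        return False  # only .log, .trace, and .out files are collected
    base, has_double_star, name_pattern = collection_path.partition("/**/")
    if not has_double_star:
        # No **: the file must sit directly in the pattern's directory,
        # and * in the file name is a fuzzy (wildcard) match.
        pattern = PurePosixPath(collection_path)
        return path.parent == pattern.parent and fnmatchcase(path.name, pattern.name)
    try:
        relative = path.relative_to(base or "/")
    except ValueError:
        return False  # the file is outside the base directory
    depth = len(relative.parts) - 1  # number of subdirectory levels below base
    return depth <= MAX_DEPTH and fnmatchcase(path.name, name_pattern)


pattern = "/tmp/**/test*.log"
for candidate in ("/tmp/test1.log", "/tmp/a/b/test-x.log", "/tmp/a/other.log", "/var/test.log"):
    print(candidate, matches_collection_path(candidate, pattern))
```

A quick check like this helps confirm that a pattern is neither too broad nor too narrow for the directory layout of the container.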
Adding a Log Policy on ServiceStage
Step 1 When deploying a component on ServiceStage, add an image, click Advanced Settings, and then click the Container Log tab.
Step 2 Add a log policy.
The procedure for adding log policies on ServiceStage is the same as that on CCE. For details, see Step 3.
----End
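The Log Dumping behavior in the parameter tables (dump at 50 MB, keep the latest 20 .zip files, copytruncate mode) can be illustrated with a small rotation routine. The Python sketch below shows the general copytruncate technique and is not AOM's implementation: the function name dump_log and the archive naming scheme <log name>.<n>.zip are hypothetical, and 50 MB is interpreted as 50 * 1024 * 1024 bytes.

```python
import os
import zipfile
from pathlib import Path
from typing import Optional

MAX_BYTES = 50 * 1024 * 1024  # dump threshold (assumed to mean 50 MiB)
MAX_ARCHIVES = 20             # number of .zip files kept per log file


def _index(archive: Path) -> int:
    # Hypothetical naming scheme: <log name>.<n>.zip
    return int(archive.name.rsplit(".", 2)[1])


def dump_log(log_file: Path, max_bytes: int = MAX_BYTES,
             max_archives: int = MAX_ARCHIVES) -> Optional[Path]:
    """Zip and truncate log_file if it exceeds max_bytes; return the new archive."""
    if log_file.stat().st_size <= max_bytes:
        return None
    existing = sorted(log_file.parent.glob(log_file.name + ".*.zip"), key=_index)
    next_index = _index(existing[-1]) + 1 if existing else 1
    archive = log_file.with_name(f"{log_file.name}.{next_index}.zip")
    # Copy: compress the current content into the log file's own directory.
    with zipfile.ZipFile(archive, "w", zipfile.ZIP_DEFLATED) as bundle:
        bundle.write(log_file, arcname=log_file.name)
    # Truncate: empty the file in place, so the writer keeps its open handle.
    os.truncate(log_file, 0)
    # Retention: delete the oldest archives beyond the limit.
    for old in (existing + [archive])[:-max_archives]:
        old.unlink()
    return archive
```

Because the file is truncated in place rather than renamed, a writer that does not open the file in append mode keeps writing at its old offset after truncation, which leaves a sparse region at the start of the file. This is the file hole that the NOTE warns about. Lines written between the copy and the truncate can also be lost, a known trade-off of the copytruncate approach.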
Viewing Container Logs
After the log collection paths are configured, the ICAgent collects log files from such paths. This operation takes about 1 minute to complete. After collecting logs, you can perform the following operations:
●Viewing Container Log Files
In the navigation pane, choose Log > Log Files. On the Component tab, select the corresponding cluster, namespace, and component to view log files, as shown in the following figure. For details, see Viewing Log Files.