HUAWEI AOM User Manual

AOM

User Guide

Issue 01

Date 2020-08-27

HUAWEI TECHNOLOGIES CO., LTD.

Copyright © Huawei Technologies Co., Ltd. 2020. All rights reserved.

No part of this document may be reproduced or transmitted in any form or by any means without prior written consent of Huawei Technologies Co., Ltd.

Trademarks and Permissions

HUAWEI and other Huawei trademarks are trademarks of Huawei Technologies Co., Ltd.

All other trademarks and trade names mentioned in this document are the property of their respective holders.

Notice

The purchased products, services and features are stipulated by the contract made between Huawei and the customer. All or part of the products, services and features described in this document may not be within the purchase scope or the usage scope. Unless otherwise specified in the contract, all statements, information, and recommendations in this document are provided "AS IS" without warranties, guarantees or representations of any kind, either express or implied.

The information in this document is subject to change without notice. Every effort has been made in the preparation of this document to ensure accuracy of the contents, but all statements, information, and recommendations in this document do not constitute a warranty of any kind, express or implied.

Issue 01 (2020-08-27)

Copyright © Huawei Technologies Co., Ltd.


Contents

1 Introduction
2 Subscribing to AOM
3 Permissions Management
  3.1 Creating a User and Granting Permissions
  3.2 Creating a Custom Policy
4 Connecting Resources to AOM
  4.1 Installing the ICAgent (HUAWEI CLOUD Host)
  4.2 Configuring Application Discovery Rules
  4.3 Configuring Log Collection Paths
    4.3.1 Configuring Container Log Collection Paths
    4.3.2 Configuring VM Log Collection Paths
5 Overview
  5.1 O&M
  5.2 Dashboard
6 Alarm Management
  6.1 Usage Description
  6.2 Static Threshold Rules
    6.2.1 Creating Static Threshold Rules
  6.3 Creating Notification Rules
  6.4 Viewing Alarms
  6.5 Viewing Events
7 Resource Monitoring
  7.1 Usage Description
  7.2 Application Monitoring
  7.3 Component Monitoring
  7.4 Host Monitoring
  7.5 Container Monitoring
  7.6 Metric Monitoring
  7.7 Cloud Service Monitoring
8 Log Management
  8.1 Usage Description
  8.2 Searching for Logs
  8.3 Viewing Log Files
  8.4 Adding Log Buckets
  8.5 Structuring Logs
  8.6 Viewing Bucket Logs
  8.7 Adding Log Dumps
  8.8 Creating Statistical Rules
9 Configuration Management
  9.1 Agent Management (HUAWEI CLOUD Host)
    9.1.1 Installing the ICAgent
    9.1.2 Upgrading the ICAgent
    9.1.3 Uninstalling the ICAgent
    9.1.4 ICAgent Management (Non-HUAWEI CLOUD Host)
      9.1.4.1 Installing the ICAgent
      9.1.4.2 Upgrading the ICAgent
      9.1.4.3 Uninstalling the ICAgent
  9.2 Log Configuration
    9.2.1 Setting the Log Quota
    9.2.2 Configuring Delimiters
    9.2.3 Log Collection
  9.3 Quota Configuration
  9.4 Metric Configuration

1 Introduction

Application Operations Management (AOM) is a one-stop and multi-dimensional O&M management platform for cloud applications. It monitors applications and related cloud resources in real time, collects and associates resource metrics, logs, and events to analyze application health status, and supports alarm reporting and data visualization, helping you detect faults in a timely manner and monitor the running status of applications, resources, and services in real time.

Specifically, AOM monitors and uniformly manages servers, storage devices, networks, web containers, and applications hosted in Docker and Kubernetes, effectively preventing problems, facilitating fault locating, and reducing O&M costs. Unlike traditional monitoring systems, AOM monitors services by applications. It meets enterprises' requirements for high efficiency and fast iteration, provides effective IT support for their services, and protects and optimizes their IT assets, enabling enterprises to achieve strategic goals.

Console Description

Table 1-1 AOM console description

Item: Overview

Description: Both the O&M overview and dashboard are provided.

● O&M overview: The O&M page supports full-link, multi-layer, and one-stop O&M for resources, applications, and user experience.

● Dashboard: With a dashboard, different graphs such as line graphs and digit graphs are displayed on the same screen, enabling you to understand monitoring data comprehensively.

Item: Alarm center

Description: Alarm center includes the alarm list, event list, threshold rules, and notification rules.

● Alarm list: Alarms are the information reported when AOM or an external service is abnormal or may cause exceptions. You need to take measures accordingly. Otherwise, service exceptions may occur. The alarm list displays the alarms generated within a specified time range.

● Event list: Events generally carry some important information, informing you of the changes of AOM or an external service. Such changes do not necessarily cause exceptions. The event list displays the events generated within a specified time range.

● Threshold rules: You can set threshold conditions for metrics by using threshold rules. When metric values meet conditions, AOM will generate threshold alarms. When no metric data is reported, AOM will report insufficient data events. In this way, you can identify and handle exceptions at the earliest time.

● Notification rules: AOM supports alarm notification. You can use this function by creating notification rules. When alarms are reported due to an exception in AOM or an external service, alarm information can be sent to specified personnel by email or Short Message Service (SMS) message. In this way, these personnel can rectify faults in time to avoid service loss.

● Intelligent thresholds: When a metric value meets the preset threshold condition, the system generates a threshold-crossing alarm. If the notification function is enabled, alarm information will be sent to specified users by SMS message or email.

Item: Monitoring

Description: Functions such as application monitoring, component monitoring, host monitoring, container monitoring, and metric monitoring are provided.

● Application monitoring: An application is a group of the same or similar components divided based on service requirements. AOM supports monitoring by application.

● Component monitoring: Components refer to the services that you deploy, including containers and common processes. The Component Monitoring page displays information such as type, CPU usage, memory usage, and status of each component. AOM supports drill-down from components to instances, and then to containers, enabling multi-dimensional monitoring.

● Host monitoring: The Host Monitoring page enables you to monitor common system devices such as disks and file systems, and resource usage and health status of hosts and service processes or instances running on them.

● Container monitoring: Only workloads deployed by using Cloud Container Engine (CCE) and applications created by using ServiceStage are monitored.

● Metric monitoring: The Metric Monitoring page displays metric data of each resource. You can monitor metric values and trends in real time, add desired metrics to dashboards, create threshold rules, and export monitoring reports. In this way, you can monitor services in real time and perform data correlation analysis.

● Cloud service monitoring: The Cloud Service Monitoring page displays historical performance curves of each cloud service instance. You can view cloud service data in the last six months.

Item: Log

Description: Functions such as log search, log file, log dump, and path configuration are provided.

● Log search: AOM enables you to quickly query logs, and locate faults based on log sources and contexts.

● Log files: You can quickly view log files of component instances to locate faults.

● Log dumps: AOM enables you to dump logs to Object Storage Service (OBS) buckets for long-term storage.

● Path configuration: AOM can collect and display VM logs. VM refers to an Elastic Cloud Server (ECS) or a Bare Metal Server (BMS) running Linux. Before collecting logs, ensure that you have configured a log collection path.

● Log buckets: A log bucket is a logical group of log files. You dump log files, create statistical rules, and view logs by log bucket.

● Statistical rules: A statistical rule takes effect by log bucket. You can configure keywords in statistical rules. Then, AOM periodically counts the number of such keywords in log buckets and generates log metrics.

● Log structuring: In log structuring, original logs can be separated by regular expressions or special characters so that structured logs can be queried and analyzed based on the SQL syntax.

Item: Configuration management

Description: Functions such as agent management, application discovery, and log configuration are provided.

● Agent management: The ICAgent collects metrics, logs, and application performance data in real time. For hosts purchased from the Elastic Cloud Server (ECS) or Bare Metal Server (BMS) console, you need to manually install the ICAgent. For hosts purchased from the Cloud Container Engine (CCE) console, the ICAgent is automatically installed.

● Application discovery: AOM can discover applications and collect their metrics based on configured rules.

● Log configuration: Log quotas and delimiters can be configured.

● Quota configuration: Earlier metrics will be deleted when the metric quota is exceeded. You can change the metric quota by switching between the basic edition (free) and pay-per-use edition.

● Metric configuration: You can enable the metric collection function to collect metrics (excluding SLA and custom metrics).

Process for Using AOM

The following figure shows the process of using AOM.

1. (Mandatory) Subscribe to AOM.

2. (Optional) Create a sub-account and set permissions.

3. (Mandatory) Create a cloud host.

4. (Mandatory) Install the ICAgent.

The ICAgent is a collector used to collect metric, log, and application performance data in real time. If an ECS is purchased through CCE, the ICAgent is automatically installed on the ECS.

5. (Optional) Configure application discovery rules.

Applications that meet the built-in application discovery rules are automatically discovered after the ICAgent is installed. For applications that cannot be discovered using the built-in application discovery rules, you need to configure custom application discovery rules.

6. (Optional) Configure log collection paths.

To use AOM to monitor host logs, you need to configure log collection paths.

7. (Optional) Implement O&M.

You can use AOM functions such as Overview, Alarm Management, Resource Monitoring, and Log Management to perform routine O&M.

2 Subscribing to AOM

Before subscription, ensure that you have registered an account and implemented real-name authentication.

Registering an Account

Step 1 Log in to the cloud at https://www.huaweicloud.com/intl/en-us/.

Step 2 Click Register in the upper right corner of the page.

Complete the registration as prompted.

----End

Implementing Real-Name Authentication

You can use Application Operations Management (AOM) only after real-name authentication is complete.

Step 1 After logging in to the cloud, click the username in the upper right corner on the page and select My Account from the drop-down list.

Step 2 On the Basic Information page, click Authenticate to the right of Authentication Status.

Complete real-name authentication as prompted.

----End


Subscribing to AOM

AOM resources are region-specific and cannot be used across regions. When different regions (such as AP-Hong Kong and AP-Bangkok) exist, select a region before subscribing to AOM.

Click Console in the upper right corner and then click in the upper left corner to select a region. Then, click Service List and choose Application > Application Operations Management. In the dialog box that is displayed, click Subscribe for Free to enable AOM for free.

NOTE

AOM provides both basic and pay-per-use editions. The basic edition is used by default. You can click Switch Edition as required.

Switching Edition

AOM provides both basic and pay-per-use editions. The basic edition is used by default. You can click Switch Edition as required.

Step 1 Log in to the AOM console, choose Overview > O&M in the navigation pane, and click Switch Edition in the upper right corner of the page.

Step 2 Select an edition, select the prompt information, and click Switch Now.

----End

3 Permissions Management

3.1 Creating a User and Granting Permissions

This section describes the fine-grained permissions management provided by Identity and Access Management (IAM) for your Application Operations Management (AOM) resources. With IAM, you can:

● Create IAM users for employees based on the organizational structure of your enterprise. Each IAM user has their own security credentials, providing access to AOM resources.

● Grant only the permissions required for users to perform a task.

● Entrust a HUAWEI CLOUD account or cloud service to perform professional and efficient O&M on your AOM resources.

If your HUAWEI CLOUD account does not need individual IAM users, then you may skip over this chapter.

This section describes the procedure for granting permissions (see Figure 3-1).

Prerequisites

Before assigning permissions to user groups, you should learn about the AOM permissions listed in Permissions Management. For the system permissions of other services, see System Permissions.


Process

Figure 3-1 Process for granting AOM permissions

1. Creating a User Group and Assigning Permissions

Create a user group on the IAM console, and assign the AOM ReadOnlyAccess policy to the group.

2. Creating an IAM User

Create a user on the IAM console and add the user to the group created in 1.

3. Logging In Using an IAM User and Verifying Permissions

Log in to the AOM console as the created user, and verify that the user has only read permissions for AOM.

3.2 Creating a Custom Policy

Custom policies can be created as a supplement to the system policies of Application Operations Management (AOM). For the actions supported for custom policies, see Permissions Policies and Supported Actions.

You can create custom policies in either of the following two ways:

● Visual editor: Select cloud services, actions, resources, and request conditions without the need to know policy syntax.

● JSON: Edit JSON policies from scratch or based on an existing policy.

For details, see Creating a Custom Policy. The following section contains examples of common AOM custom policies.

Example Custom Policies

● Example 1: Allowing a user to create threshold rules

{
    "Version": "1.1",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "aom:alarmRule:create"
            ]
        }
    ]
}

● Example 2: Forbidding a user to delete application discovery rules

A deny policy must be used in conjunction with other policies to take effect. If the permissions assigned to a user contain both Allow and Deny actions, the Deny actions take precedence over the Allow actions.

To grant a user the AOM FullAccess system policy but forbid the user to delete application discovery rules, create a custom policy that denies the deletion of application discovery rules, and grant both the AOM FullAccess and deny policies to the user. Because the Deny action takes precedence, the user can perform all operations except deleting application discovery rules. The following is an example deny policy:

{
    "Version": "1.1",
    "Statement": [
        {
            "Effect": "Deny",
            "Action": [
                "aom:discoveryRule:delete"
            ]
        }
    ]
}

● Example 3: Defining permissions for multiple services in a policy

A custom policy can contain actions of multiple services that are all of the project-level type. The following is an example policy containing actions of multiple services:

{
    "Version": "1.1",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "aom:*:list",
                "aom:*:get",
                "apm:*:list",
                "apm:*:get"
            ]
        },
        {
            "Effect": "Allow",
            "Action": [
                "cce:cluster:get",
                "cce:cluster:list",
                "cce:node:get",
                "cce:node:list"
            ]
        }
    ]
}
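The precedence rule described in Example 2 (a Deny action overrides any Allow action) can be illustrated with a minimal Python sketch. This is not how IAM is implemented; it only mirrors the rule as stated in this section. The function name is_allowed, the use of shell-style wildcard matching for "*", and the "aom:*:*" statement standing in for the AOM FullAccess system policy are all assumptions made for illustration.

```python
from fnmatch import fnmatchcase


def is_allowed(action, policies):
    """Return True if `action` is allowed by `policies`.

    Illustrative rule only: any matching Deny statement wins; otherwise
    at least one matching Allow statement is required.
    """
    allowed = False
    for policy in policies:
        for statement in policy.get("Statement", []):
            patterns = statement.get("Action", [])
            # "*" in an action pattern is treated as a wildcard (assumption).
            if not any(fnmatchcase(action, p) for p in patterns):
                continue
            if statement.get("Effect") == "Deny":
                return False  # Deny takes precedence over Allow
            if statement.get("Effect") == "Allow":
                allowed = True
    return allowed


# Hypothetical stand-in for the AOM FullAccess system policy.
full_access = {
    "Version": "1.1",
    "Statement": [{"Effect": "Allow", "Action": ["aom:*:*"]}],
}
# The deny policy from Example 2.
deny_delete = {
    "Version": "1.1",
    "Statement": [{"Effect": "Deny", "Action": ["aom:discoveryRule:delete"]}],
}
policies = [full_access, deny_delete]

for action in ("aom:alarmRule:create", "aom:discoveryRule:delete"):
    print(action, "->", "allowed" if is_allowed(action, policies) else "denied")
```

With both policies attached, the sketch allows every aom action except deleting application discovery rules, which matches the outcome described in Example 2.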

4 Connecting Resources to AOM

4.1 Installing the ICAgent (HUAWEI CLOUD Host)

The ICAgent collects metrics, logs, and application performance data in real time. For hosts purchased from the Elastic Cloud Server (ECS) or Bare Metal Server (BMS) console, you need to manually install the ICAgent. For hosts purchased from the Cloud Container Engine (CCE) console, the ICAgent is automatically installed.

Prerequisites

Before installing the ICAgent, ensure that the time and time zone of the local browser are consistent with those of the server. If multiple servers are deployed, ensure that the local browser and multiple servers use the same time zone and time. Otherwise, metric data of applications and servers displayed on the UI may be incorrect.

The ICAgent process needs to be installed and run by the root user.

Installation Methods

There are two methods to install the ICAgent. Note that the two methods are not applicable to container nodes (that is, nodes created using ServiceStage or CCE). For container nodes, you do not need to manually install the ICAgent. Instead, you only need to perform certain operations when creating clusters or deploying applications.

For details, see Table 4-1.

Table 4-1 Installation methods

Method: Initial installation

Scenario: This method is used when the following conditions are met:

1. An Elastic IP Address (EIP) has been bound to the server.

2. The ICAgent has never been installed on the server.

Method: Inherited installation

Scenario: This method is used when the following conditions are met:

You have multiple servers on which to install the ICAgent. One server is bound to an EIP, but others are not. The ICAgent has been installed on the server bound to an EIP by using the initial installation method. You can use the inherited method to install the ICAgent on the remaining servers.

See Inherited Installation.

Initial Installation

After you apply for a server and install the ICAgent for the first time, perform the following operations:

Step 1 Obtain an Access Key ID/Secret Access Key (AK/SK).

If you have obtained the AK/SK, skip this step.

If you have not obtained the AK/SK, obtain them first.

Step 2 In the navigation pane, choose Configuration Management > Agent Management.

 

Step 3 Click Install ICAgent.

Step 4 Generate the ICAgent installation command and copy it.

1. Enter the obtained AK/SK in the text box to generate the ICAgent installation command.

NOTE

Ensure that the AK/SK are correct. Otherwise, the ICAgent cannot be installed.

2. Click Copy Command.

Step 5 Use a remote login tool, such as PuTTY, to log in to the server where the ICAgent is to be installed as the root user and run the command copied in Step 4.2 to install the ICAgent.

NOTE

If the message ICAgent install success is displayed, the ICAgent is successfully installed in the /opt/oss/servicemgr/ directory. After the ICAgent is successfully installed, choose Configuration Management > Agent Management to view the ICAgent status.

If the ICAgent fails to be installed, uninstall the ICAgent according to Uninstalling the ICAgent Through Logging In to the Server and then install it again. If the problem persists, contact technical support.

----End
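Two conditions from this section can be checked locally before or after running the installation command: the installing user must be root, and a successful installation places the ICAgent in the /opt/oss/servicemgr/ directory. The following is a minimal Python sketch under those assumptions. The function name icagent_preflight is hypothetical, and the presence of the directory only suggests that installation completed; the ICAgent status shown on the Agent Management page remains the authoritative check.

```python
import os


def icagent_preflight(install_dir="/opt/oss/servicemgr", euid=None):
    """Return (is_root, install_dir_exists).

    euid defaults to the effective user ID of the current process
    (os.geteuid is available on Unix-like systems only).
    """
    if euid is None:
        euid = os.geteuid()
    is_root = euid == 0
    install_dir_exists = os.path.isdir(install_dir)
    return is_root, install_dir_exists


if __name__ == "__main__":
    is_root, installed = icagent_preflight()
    print("running as root:", is_root)
    print("install directory present:", installed)
```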


Follow-up Operations

For more information about how to install, upgrade, and uninstall the ICAgent, see Agent Management (HUAWEI CLOUD Host).

4.2 Configuring Application Discovery Rules

Application Operations Management (AOM) can discover applications and collect their metrics based on configured rules. There are two modes to configure application discovery: auto mode and manual mode. This section mainly describes the manual mode.

Automatic configuration

After you install the ICAgent on a host according to Installing the ICAgent, the ICAgent automatically discovers applications on the host based on Built-in Service Discovery Rules and displays them on the Application Monitoring page.

Manual configuration

After you add a custom service discovery rule on the application discovery page and apply it to the host where the ICAgent is installed (for details, see Installing the ICAgent), the ICAgent discovers applications on the host based on the configured service discovery rule and displays them on the application monitoring page.

Filtering Rules

The ICAgent periodically scans the target host to find all its processes. The effect is similar to that of running the ps -e -o pid,comm,lstart,cmd | grep -v defunct command on the target host. Then, the ICAgent checks whether the processes match the filtering rules in Table 4-2. If a process meets a filtering rule, the process is filtered out and is not discovered by AOM. If a process does not meet any filtering rule, the process is not filtered out and is discovered by AOM.

 

 

 

ICAgent detection results may be as follows:

PID COMMAND STARTED CMD
1 systemd Tue Oct 2 21:12:06 2018 /usr/lib/systemd/systemd --switched-root --system --deserialize 20
2 kthreadd Tue Oct 2 21:12:06 2018 [kthreadd]
3 ksoftirqd/0 Tue Oct 2 21:12:06 2018 (ksoftirqd/0)
1140 tuned Tue Oct 2 21:12:27 2018 /usr/bin/python -Es /usr/sbin/tuned -l -P
1144 sshd Tue Oct 2 21:12:27 2018 /usr/sbin/sshd -D
1148 agetty Tue Oct 2 21:12:27 2018 /sbin/agetty --keep-baud 115200 38400 9600 hvc0 vt220
1154 docker-containe Tue Oct 2 21:12:29 2018 docker-containerd -l unix:///var/run/docker/libcontainerd/docker-containerd.sock --shim docker-containerd-shim --start-timeout 2m --state-dir /var/run/docker/libcontainerd/containerd --runtime docker-runc --metrics-interval=0


 

Table 4-2 Filtering rules

 

 

 

 

 

 

 

Filtering Rule | Example

If the COMMAND value of a process is docker-containe, vi, vim, pause, sshd, ps, sleep, grep, tailf, tail, or systemd-udevd, and the process is not running in the container, the process is filtered out and is not discovered by AOM. | In the preceding information, the process whose PID is 1154 is not discovered by AOM because its COMMAND value is docker-containe.

If the CMD value of a process starts with [ and ends with ], the process is filtered out and is not discovered by AOM. | In the preceding information, the process whose PID is 2 is not discovered by AOM because its CMD value is [kthreadd].

If the CMD value of a process starts with ( and ends with ), the process is filtered out and is not discovered by AOM. | In the preceding information, the process whose PID is 3 is not discovered by AOM because its CMD value is (ksoftirqd/0).

If the CMD value of a process starts with /sbin/, the process is filtered out and is not discovered by AOM. | In the preceding information, the process whose PID is 1148 is not discovered by AOM because its CMD value starts with /sbin/.
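The filtering rules in Table 4-2 can be sketched in Python as follows. This is only an illustration of the logic described in the table, not ICAgent code: the function name is_filtered_out and the in_container flag are assumptions introduced for this example, and the sample rows are abbreviated from the detection results shown earlier.

```python
# Illustrative sketch of the Table 4-2 filtering rules (not ICAgent source code).
EXCLUDED_COMMANDS = {
    "docker-containe", "vi", "vim", "pause", "sshd", "ps",
    "sleep", "grep", "tailf", "tail", "systemd-udevd",
}


def is_filtered_out(command: str, cmd: str, in_container: bool = False) -> bool:
    """Return True if a process would be filtered out (not discovered)."""
    # Rule 1: excluded COMMAND values, only for processes outside a container.
    if command in EXCLUDED_COMMANDS and not in_container:
        return True
    # Rule 2: CMD enclosed in square brackets, for example [kthreadd].
    if cmd.startswith("[") and cmd.endswith("]"):
        return True
    # Rule 3: CMD enclosed in parentheses, for example (ksoftirqd/0).
    if cmd.startswith("(") and cmd.endswith(")"):
        return True
    # Rule 4: CMD starting with /sbin/.
    return cmd.startswith("/sbin/")


# (PID, COMMAND, CMD) rows abbreviated from the sample detection results.
SAMPLE = [
    (1, "systemd", "/usr/lib/systemd/systemd --switched-root --system --deserialize 20"),
    (2, "kthreadd", "[kthreadd]"),
    (3, "ksoftirqd/0", "(ksoftirqd/0)"),
    (1140, "tuned", "/usr/bin/python -Es /usr/sbin/tuned -l -P"),
    (1144, "sshd", "/usr/sbin/sshd -D"),
    (1148, "agetty", "/sbin/agetty --keep-baud 115200 38400 9600 hvc0 vt220"),
    (1154, "docker-containe", "docker-containerd -l unix:///var/run/docker/libcontainerd/docker-containerd.sock"),
]

discovered = [pid for pid, command, cmd in SAMPLE if not is_filtered_out(command, cmd)]
print(discovered)
```

Under these assumptions, only the processes with PIDs 1 and 1140 remain candidates for discovery.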

 

 

 

 

 

Built-in Service Discovery Rules

AOM provides two built-in discovery rules: Sys_Rule and Default_Rule. These rules are executed on all hosts, including hosts added later. Sys_Rule has a higher priority than Default_Rule. That is, Sys_Rule is executed on the host first. If Sys_Rule is met, Default_Rule is not executed. Otherwise, Default_Rule is executed. Rule details are as follows:

● Sys_Rule (cannot be disabled)

– For the component name, the first available value is used from the following sources, listed in descending order of priority: the value of -Dapm_tier in the command, the value of the environment variable PAAS_APP_NAME, and the value of -Dapm_tier in the environment variable JAVA_TOOL_OPTIONS.

– For the application name, the first available value is used from the following sources, listed in descending order of priority: the value of -Dapm_application in the command, the value of the environment variable PAAS_MONITORING_GROUP, and the value of -Dapm_application in the environment variable JAVA_TOOL_OPTIONS.

In the following example, the component name is atps-demo and the application name is atpd-test.

PAAS_MONITORING_GROUP=atpd-test
PAAS_APP_NAME=atps-demo
JAVA_TOOL_OPTIONS=-javaagent:/opt/oss/servicemgr/ICAgent/pinpoint/pinpoint-bootstrap.jar -Dapm_application=atpd-test -Dapm_tier=atps-demo
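The priority order that Sys_Rule applies to these sources can be sketched as follows. This is a minimal illustration, assuming that options are written as -Dkey=value and separated by whitespace. The helper names jvm_option, first_set, and sys_rule_names are hypothetical and are not part of AOM.

```python
import re
from typing import Mapping, Optional, Tuple


def jvm_option(text: str, key: str) -> Optional[str]:
    """Return the value of -D<key>=<value> found in text, or None."""
    match = re.search(r"(?:^|\s)-D" + re.escape(key) + r"=(\S+)", text)
    return match.group(1) if match else None


def first_set(*candidates: Optional[str]) -> Optional[str]:
    """Return the first non-empty candidate (highest priority first)."""
    for value in candidates:
        if value:
            return value
    return None


def sys_rule_names(cmd: str, env: Mapping[str, str]) -> Tuple[Optional[str], Optional[str]]:
    """Resolve (component name, application name) in the Sys_Rule order."""
    tool_options = env.get("JAVA_TOOL_OPTIONS", "")
    component = first_set(
        jvm_option(cmd, "apm_tier"),           # 1. option in the command
        env.get("PAAS_APP_NAME"),              # 2. environment variable
        jvm_option(tool_options, "apm_tier"),  # 3. option in JAVA_TOOL_OPTIONS
    )
    application = first_set(
        jvm_option(cmd, "apm_application"),
        env.get("PAAS_MONITORING_GROUP"),
        jvm_option(tool_options, "apm_application"),
    )
    return component, application


# Environment taken from the example above; the command line is hypothetical.
example_env = {
    "PAAS_MONITORING_GROUP": "atpd-test",
    "PAAS_APP_NAME": "atps-demo",
    "JAVA_TOOL_OPTIONS": (
        "-javaagent:/opt/oss/servicemgr/ICAgent/pinpoint/pinpoint-bootstrap.jar "
        "-Dapm_application=atpd-test -Dapm_tier=atps-demo"
    ),
}
names = sys_rule_names("java -jar demo.jar", example_env)
print(names)
```

If none of the three sources is set, the function returns None for that name, which corresponds to the case where Sys_Rule is not met and Default_Rule is evaluated instead.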

● Default_Rule (can be disabled)

– If the COMMAND value of a process is java, the component name is the first available value from the following, listed in descending order of priority: the name of the JAR package in the command, the main class name in the command, and the first keyword in the command that does not start with a hyphen (-). The default value unknownapplicationname is used as the application name.

– If the COMMAND value of a process is python, the name of the first .py/.pyc script in the command is used as the component name, and the default value unknownapplicationname is used as the application name.

– If the COMMAND value of a process is node, the name of the first .js script in the command is used as the component name, and the default value unknownapplicationname is used as the application name.
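The python and node branches of Default_Rule can be sketched as follows. This is a rough illustration only. It assumes that the component name is the file name of the script without its directory, which the text above does not state. The java branch is omitted because the text does not specify how the JAR package and main class are parsed. The function name default_rule_names and the sample commands are hypothetical.

```python
import os
import shlex
from typing import Optional, Tuple

DEFAULT_APPLICATION = "unknownapplicationname"

# COMMAND value -> script suffixes searched for in the command line.
SCRIPT_SUFFIXES = {
    "python": (".py", ".pyc"),
    "node": (".js",),
}


def default_rule_names(command: str, cmd: str) -> Optional[Tuple[str, str]]:
    """Return (component name, application name), or None if nothing matches."""
    suffixes = SCRIPT_SUFFIXES.get(command)
    if suffixes is None:
        return None  # COMMAND is not handled by this sketch
    # Skip the executable itself and look for the first script argument.
    for token in shlex.split(cmd)[1:]:
        if token.endswith(suffixes):
            # Assumption: the file name without the directory is the component name.
            return os.path.basename(token), DEFAULT_APPLICATION
    return None


py_names = default_rule_names("python", "/usr/bin/python -Es /opt/app/worker.py --port 80")
js_names = default_rule_names("node", "node --max-old-space-size=512 /srv/web/server.js")
print(py_names, js_names)
```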

Custom Discovery Rules

Step 1 In the navigation pane, choose Configuration Management > Service Discovery.

Step 2 Click Add Custom Application Discovery Rule and configure an application discovery rule.

Step 3 Select a host for pre-detection.

1. Customize a rule name, for example, ruletest.

2. Select a typical host, for example, hhhhhh-27465, to check whether the application discovery rule is valid. The hosts that execute the rule will be configured in Step 6. Then, click Next.

Step 4 Set an application discovery rule.

1. Click Add Check Items. AOM can discover processes that meet the conditions of check items.

For example, AOM can detect the processes whose command parameters contain ovs-vswitchd unix: and whose environment variables contain SUDO_USER=paas.

NOTE

– To precisely detect processes, you are advised to add check items about unique features of the processes.

– You must add at least one check item and can add at most five. If there are multiple check items, AOM discovers only the processes that meet the conditions of all check items.

2. After adding check items, click Detect to search for the processes that meet the conditions.

If no process is detected within 20s, modify the discovery rule and detect processes again. Go to the next step only after at least one process is detected.
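The way check items combine can be sketched as follows: a process matches only if every check item matches, and a rule holds between one and five check items. This is an illustration of the described behavior, not the AOM implementation. The CheckItem type, its target values, and the sample command line are assumptions made for this example.

```python
from dataclasses import dataclass
from typing import List, Mapping

MAX_CHECK_ITEMS = 5


@dataclass(frozen=True)
class CheckItem:
    target: str    # "cmd" (command parameters) or "env" (environment variables)
    contains: str  # substring that must be present


def process_matches(cmd: str, env: Mapping[str, str], items: List[CheckItem]) -> bool:
    """Return True only if the process meets the conditions of all check items."""
    if not 1 <= len(items) <= MAX_CHECK_ITEMS:
        raise ValueError("a rule needs between 1 and 5 check items")
    env_entries = [f"{key}={value}" for key, value in env.items()]
    for item in items:
        if item.target == "cmd":
            matched = item.contains in cmd
        elif item.target == "env":
            matched = any(item.contains in entry for entry in env_entries)
        else:
            raise ValueError(f"unknown check item target: {item.target}")
        if not matched:
            return False
    return True


# Check items from the example in Step 4; the command line itself is hypothetical.
rule_items = [
    CheckItem("cmd", "ovs-vswitchd unix:"),
    CheckItem("env", "SUDO_USER=paas"),
]
hit = process_matches("ovs-vswitchd unix:/var/run/openvswitch/db.sock", {"SUDO_USER": "paas"}, rule_items)
miss = process_matches("ovs-vswitchd unix:/var/run/openvswitch/db.sock", {"SUDO_USER": "root"}, rule_items)
print(hit, miss)
```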

Step 5 Set a component name and log path.

1. Set a component name.

In the Component Name Settings area, click Add Naming Rule to set a component name for the discovered process. For example, add the fixed text app-test as a component name.


NOTE

– If you do not set a component name, the default name unknownapplicationname is used.

– When you add multiple naming rules, all the naming rules are combined as the component name of the process. Metrics with the same component name are aggregated.

2. Configure process log collection.

Turn on Enable Automatic Log Association to obtain the .log, .trace, and .out files opened by processes. In this way, you can collect the files for log analysis when monitoring processes.

This function is enabled by default. If you do not need it, disable it. In this case, no log list will be displayed in the Preview Component Name table.

3. Preview the component name.

If the name does not meet your requirements, click the icon in the Preview Component Name table to rename it.

4. Configure a log path.

If you have turned on Enable Automatic Log Association, click Configure Log Path in the Operation column of the Preview Component Name table to bind the application name, component name, and log path of the detected process to its command parameters.

Step 6 Set a priority and detection range.

1. Set a priority: When there are multiple rules, set priorities. Enter a value from 1 to 9999. A smaller value indicates a higher priority. For example, 1 indicates the highest priority and 9999 indicates the lowest priority.

2. Set a detection range: Select a host to be detected. That is, select the host to which the configured rule is applied. If no host is selected, this rule will be executed on all hosts, including hosts added later.
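The priority and detection range settings can be sketched as follows. This is a minimal model of the behavior described in this step, not AOM code: the DiscoveryRule type and the rule names other than ruletest are hypothetical, and how AOM resolves two rules with the same priority is not covered.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class DiscoveryRule:
    name: str
    priority: int                                 # 1 (highest) to 9999 (lowest)
    hosts: Set[str] = field(default_factory=set)  # empty set: all hosts

    def __post_init__(self) -> None:
        if not 1 <= self.priority <= 9999:
            raise ValueError("priority must be between 1 and 9999")


def rules_for_host(rules: List[DiscoveryRule], host: str) -> List[DiscoveryRule]:
    """Return the rules that apply to host, highest priority first."""
    applicable = [rule for rule in rules if not rule.hosts or host in rule.hosts]
    return sorted(applicable, key=lambda rule: rule.priority)


rules = [
    DiscoveryRule("ruletest", 10, {"hhhhhh-27465"}),
    DiscoveryRule("all-hosts-rule", 1),           # no host selected: applies everywhere
    DiscoveryRule("other-rule", 5, {"host-b"}),
]
ordered = [rule.name for rule in rules_for_host(rules, "hhhhhh-27465")]
print(ordered)
```

A host that is added later and is not listed in any rule still receives all-hosts-rule in this model, because a rule without selected hosts applies to every host.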

Step 7 Click Add to complete the configuration. AOM collects metrics of the process.

Step 8 Wait for about two minutes, choose Monitoring > Component Monitoring in the navigation pane, select the hhhhhh-27465 host from the cluster drop-down list, and find the /openvswitch/ component that has been monitored.

----End

More Operations

After creating an application discovery rule, you can also perform the operations described in Table 4-3.

Table 4-3 Related operations

Operation | Description

Viewing rule details | In the Name column, click the name of an application discovery rule.

Enabling or disabling a rule | Click Enable or Disable in the Operation column. After a rule is disabled, AOM does not collect the corresponding process metrics.

Deleting a rule | To delete a single application discovery rule, click More in the Operation column and select Delete. To delete one or more application discovery rules in a batch, select them and click Delete above the rule list. NOTE: Built-in application discovery rules cannot be deleted.

Modifying a rule | Click More in the Operation column and select Modify from the drop-down list. NOTE: Built-in application discovery rules cannot be modified.

 

 

 

4.3 Configuring Log Collection Paths

4.3.1 Configuring Container Log Collection Paths

Application Operations Management (AOM) can collect and display container logs. To do so, configure a log collection path according to the following procedure.

Precautions

● The ICAgent only collects *.log, *.trace, and *.out text log files.

● AOM collects standard container output logs by default.

Procedure

 

Adding a Log Policy on CCE

 

Step 1 When creating a workload on Cloud Container Engine (CCE), click Log Policies after adding a container.

Step 2 Click Add Log Policy. On the displayed page, configure parameters as required. The following uses Nginx as an example.

 


Figure 4-1 Adding a log policy

Step 3 Set Storage Type to Host Path or Container Path.

● Host Path: You can mount a host path to a specified container path. Set parameters according to the following table.

 

Table 4-4 Parameters for adding log policies (host path)

Parameter | Description

Storage Type | Set this parameter to Host Path. You can mount a host path to a specified container path.

Add Container Path |

*Host Path | Host path to which a container log file is mounted. Example: /var/paas/sys/log/nginx

 

 

 


 

 

 

 

 

 

 

Container Path | Container path to which a data volume is mounted. Example: /tmp

NOTICE

– Do not mount a data volume to a system directory such as / or /var/run. Otherwise, the container becomes abnormal. You are advised to mount log files to an empty directory. If the directory is not empty, ensure that there are no files that affect container startup. Otherwise, files will be replaced, causing container startup failures or workload creation failures.

– If the volume is mounted to a high-risk directory, you are advised to use an account with minimum permissions to start the container; otherwise, high-risk files on the host may be damaged.

– AOM collects only the first 20 log files that have been modified recently. It collects files from 2 levels of subdirectories by default.

– AOM only collects .log, .trace, and .out text log files in mounting paths.

Extended Host Path | Level-3 directory added to the original volume directory or subdirectory. This path enables you to obtain output files of a single pod more easily.

– None: No extended path is configured.

– PodUID: Pod ID.

– PodName: Pod name.

– PodUID/ContainerName: Pod ID/container name.

– PodName/ContainerName: Pod name/container name.

Collection Path | Path for collecting logs precisely. Details are as follows:

– If no collection path is specified, log files in .log, .trace, and .out formats will be collected from the current path by default.

– If a collection path contains double asterisks (**), log files in .log, .trace, and .out formats will be collected from 5 levels of subdirectories.

– If a collection path contains an asterisk (*), a fuzzy match is performed.

Example: If the collection path is /tmp/**/test*.log, all .log files prefixed with test will be collected from /tmp and its 5 levels of subdirectories.

CAUTION

To use the collection path function, ensure that the ICAgent version is 5.12.22 or later.

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 


 

 

 

 

 

 

 

 

 

 

Log Dumping | Log dumping here refers to rolling local log files.

– Enabled: AOM scans log files every minute. When a log file exceeds 50 MB, it is dumped immediately. A new .zip file is generated in the directory where the log file is located. For a log file, AOM stores only the latest 20 .zip files. When the number of .zip files exceeds 20, earlier .zip files will be deleted. After the dump is complete, the log file in AOM will be cleared.

– Disabled: If you select Disabled, AOM does not dump log files.

NOTE

– AOM log file rolling is implemented in the copytruncate mode. During configuration, ensure that log files are written in the append mode. Otherwise, file holes may occur.

– Currently, mainstream log components such as Log4j and Logback support log file rolling. If your log files already support rolling, skip the configuration. Otherwise, conflicts may occur.

– You are advised to configure log file rolling for your own services to flexibly control the size and number of rolled files.
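The Collection Path matching described in this table can be sketched as follows. This is an approximation of the documented behavior, not the ICAgent implementation. It assumes POSIX-style paths and handles only patterns of the form <directory>/**/<file pattern> or <directory>/<file pattern>. The function name collect_log_files is hypothetical.

```python
import fnmatch
import os
from typing import List

MAX_DEPTH = 5                             # levels of subdirectories searched for **
LOG_SUFFIXES = (".log", ".trace", ".out")


def collect_log_files(pattern: str) -> List[str]:
    """Return log files that match a collection path such as /tmp/**/test*.log."""
    if "/**/" in pattern:
        base, name_pattern = pattern.split("/**/", 1)
        max_depth = MAX_DEPTH
    else:
        base, name_pattern = os.path.split(pattern)
        max_depth = 0                     # no **: search the directory itself only
    matches = []
    for root, dirs, files in os.walk(base):
        relative = os.path.relpath(root, base)
        depth = 0 if relative == "." else relative.count(os.sep) + 1
        if depth >= max_depth:
            dirs[:] = []                  # stop descending below the depth limit
        for name in files:
            # Only .log, .trace, and .out files are collected; * is a fuzzy match.
            if name.endswith(LOG_SUFFIXES) and fnmatch.fnmatch(name, name_pattern):
                matches.append(os.path.join(root, name))
    return sorted(matches)
```

For example, calling collect_log_files("/tmp/**/test*.log") returns matching files in /tmp and in subdirectories up to five levels below it, while files at a sixth level are ignored.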

 

 

 

 

 

 

 

 

 

Container Path: Logs will be stored in a container path. No host path needs to be mounted into the container. Set parameters according to the following table.

NOTE

Ensure that the ICAgent version is 5.10.79 or later.

Table 4-5 Parameters for adding log policies (container path)

Parameter | Description

Storage Type | Set this parameter to Container Path. Logs will be stored in a container path. No host path needs to be mounted into the container. Ensure that the ICAgent version is 5.10.79 or later.

Add Container Path |


 

 

 

 

 

 

Container Path | Container path to which a data volume is mounted. Example: /tmp

NOTICE

– Do not mount log files to a system directory such as / or /var/run. Otherwise, the container becomes abnormal. You are advised to mount the volume to an empty directory. If the directory is not empty, ensure that there are no files that affect container startup. Otherwise, files will be replaced, causing container startup failures or workload creation failures.

– If the volume is mounted to a high-risk directory, you are advised to use an account with minimum permissions to start the container; otherwise, high-risk files on the host may be damaged.

– AOM collects only the first 20 log files that have been modified recently. It collects files from 2 levels of subdirectories by default.

– AOM only collects .log, .trace, and .out text log files in mounting paths.

Collection Path | Path for collecting logs precisely. Details are as follows:

– If no collection path is specified, log files in .log, .trace, and .out formats will be collected from the current path by default.

– If a collection path contains double asterisks (**), log files in .log, .trace, and .out formats will be collected from 5 levels of subdirectories.

– If a collection path contains an asterisk (*), a fuzzy match is performed.

Example: If the collection path is /tmp/**/test*.log, all .log files prefixed with test will be collected from /tmp and its 5 levels of subdirectories.

CAUTION

To use the collection path function, ensure that the ICAgent version is 5.12.22 or later.

 

 

 

 

 

 

 

 

 

 

 

 

 


 

 

 

 

 

 

 

 

 

 

Log Dumping | Log dumping here refers to rolling local log files.

– Enabled: AOM scans log files every minute. When a log file exceeds 50 MB, it is dumped immediately. A new .zip file is generated in the directory where the log file is located. For a log file, AOM stores only the latest 20 .zip files. When the number of .zip files exceeds 20, earlier .zip files will be deleted. After the dump is complete, the log file in AOM will be cleared.

– Disabled: If you select Disabled, AOM does not dump log files.

NOTE

– AOM log file rolling is implemented in the copytruncate mode. During configuration, ensure that log files are written in the append mode. Otherwise, file holes may occur.

– Currently, mainstream log components such as Log4j and Logback support log file rolling. If your log files already support rolling, skip the configuration. Otherwise, conflicts may occur.

– You are advised to configure log file rolling for your own services to flexibly control the size and number of rolled files.

 

 

 

 

 

 

 

 

 

----End
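The Log Dumping behavior described in the preceding tables (dump a log file that exceeds 50 MB to a .zip file, clear the log file in place, and keep only the latest 20 .zip files) can be sketched as follows. This is a simplified illustration of copytruncate-style rolling, not the AOM implementation. The function name dump_if_needed and the archive naming scheme are assumptions.

```python
import glob
import os
import time
import zipfile
from typing import Optional

MAX_SIZE_BYTES = 50 * 1024 * 1024  # dump threshold: 50 MB
MAX_ARCHIVES = 20                  # number of .zip files kept per log file


def dump_if_needed(log_path: str,
                   max_size: int = MAX_SIZE_BYTES,
                   max_archives: int = MAX_ARCHIVES) -> Optional[str]:
    """Roll log_path in copytruncate style; return the new archive path, if any."""
    if os.path.getsize(log_path) <= max_size:
        return None
    # Copy: compress the current content into a .zip file next to the log file.
    # The timestamp-based archive name is an assumption of this sketch.
    archive = f"{log_path}.{time.time_ns()}.zip"
    with zipfile.ZipFile(archive, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.write(log_path, arcname=os.path.basename(log_path))
    # Truncate: empty the original file in place, so the writer keeps its handle.
    with open(log_path, "r+b") as fh:
        fh.truncate(0)
    # Retention: delete the earliest archives beyond the limit.
    archives = sorted(glob.glob(glob.escape(log_path) + ".*.zip"))
    for old in archives[:-max_archives]:
        os.remove(old)
    return archive
```

Because the original file is truncated rather than renamed, a writer that does not open the file in append mode keeps writing at its old offset, which produces the file holes mentioned in the NOTE. Data written between the copy and the truncation can also be lost in this simple form.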

Adding a Log Policy on ServiceStage

Step 1 When deploying a component on ServiceStage, add an image, click Advanced Settings, and then click the Container Log tab.

Step 2 Add a log policy.

The procedure for adding log policies on ServiceStage is the same as that on CCE. For details, see Step 3.

----End

Viewing Container Logs

After the log collection paths are configured, the ICAgent collects log files from such paths. This operation takes about 1 minute to complete. After the logs are collected, you can perform the following operations:

Viewing Container Log Files

In the navigation pane, choose Log > Log Files. On the Component tab, select the corresponding cluster, namespace, and component to view log files, as shown in the following figure. For details, see Viewing Log Files.
