CANN
V100R020C20
AI CPU Custom Operator
Development Guide (Inference)
Issue: 01
Date: 2021-03-11
HUAWEI TECHNOLOGIES CO., LTD.
Copyright © Huawei Technologies Co., Ltd. 2021. All rights reserved.
No part of this document may be reproduced or transmitted in any form or by any means without prior written consent of Huawei Technologies Co., Ltd.
Trademarks and Permissions
HUAWEI and other Huawei trademarks are trademarks of Huawei Technologies Co., Ltd.
All other trademarks and trade names mentioned in this document are the property of their respective holders.
Notice
The purchased products, services and features are stipulated by the contract made between Huawei and the customer. All or part of the products, services and features described in this document may not be within the purchase scope or the usage scope. Unless otherwise specified in the contract, all statements, information, and recommendations in this document are provided "AS IS" without warranties, guarantees or representations of any kind, either express or implied.
The information in this document is subject to change without notice. Every effort has been made in the preparation of this document to ensure accuracy of the contents, but all statements, information, and recommendations in this document do not constitute a warranty of any kind, express or implied.
About This Document
Overview
An AI CPU operator is an operation of complete compute logic that runs on AI CPU, one of the compute engines in the Ascend AI Processor. You might need to develop a custom AI CPU operator in the following cases:
● During neural network (NN) training or inference, you may find an unsupported operator when converting a third-party open-source network to adapt it to the Ascend AI Processor. A custom AI CPU operator can then help you streamline the model execution process and improve the efficiency of functionality commissioning. After functionality commissioning passes, convert the custom AI CPU operator into a TBE operator for performance commissioning.
● In certain scenarios, it is impossible to implement custom operators that run on AI Core. For example, some operators require int64 data, which is incompatible with AI Core instructions. When such an operator is not the performance bottleneck of your network, you can develop a custom AI CPU operator instead so that the Ascend AI Processor supports it.
NOTICE
Currently, custom AI CPU operators can run only in EP standard form.
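To make the int64 case above concrete, the following is a minimal sketch in plain C++ of the kind of element-wise compute logic such an operator would carry. It is an illustration only: it uses no CANN or AI CPU API, and the function name `AddInt64` and its return codes are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper, not part of any CANN API: adds two int64 buffers
// element by element and writes the result to `out`.
// Returns 0 on success and 1 if the buffer lengths differ.
std::uint32_t AddInt64(const std::vector<std::int64_t> &x,
                       const std::vector<std::int64_t> &y,
                       std::vector<std::int64_t> &out) {
  if (x.size() != y.size()) {
    return 1;  // length mismatch: refuse to compute
  }
  out.resize(x.size());
  for (std::size_t i = 0; i < x.size(); ++i) {
    out[i] = x[i] + y[i];  // 64-bit arithmetic, no narrowing to 32 bits
  }
  return 0;
}
```

Logic of this shape needs 64-bit integer arithmetic throughout, which is why it is a candidate for the AI CPU rather than AI Core. Inputs whose sums exceed the int64 range are not handled in this sketch.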
Intended Audience
This document is intended for developers who develop custom AI CPU operators. After reading this document, you will be able to:
● Describe the principles and workflow of AI CPU operator implementation.
● Develop custom AI CPU operators based on the samples provided in this document.
To better understand this document, you need to have:
● Proficiency in C++ programming.
● Knowledge of mathematical expressions.
● Knowledge of machine learning and deep learning.
● Knowledge of the workflow and principles of the Ascend platform.
● Knowledge of the TBE custom operator development workflow on the Ascend platform.
Contents
About This Document
1 Quick Start
1.1 Neural Network Introduction
1.2 Operator Basics
2 AI CPU Introduction
2.1 Overview
2.2 Building and Running an Operator
3 Operator Development Workflow
4 Operator Development Preparations
4.1 Environment Setup
4.2 Operator Analysis
4.3 Project Creation
5 Operator Development Workflow
5.1 Operator Code Implementation
5.2 Operator Prototype Definition
5.2.1 Principle
5.2.2 Implementation
5.3 Operator Information Library Definition
5.4 Operator Adaptation
5.4.1 Adaptation Plug-in Development (TensorFlow)
5.4.2 Adaptation Plug-in Development (Caffe)
5.5 Operator Project Building and Deployment
5.5.1 Operator Project Building
5.5.2 OPP Deployment
6 Operator ST
6.1 Introduction
6.2 Implementation
7 Operator Verification on Network
8 Sample Reference
8.1 Custom Operator Development
8.2 Operator Verification on Network
9 API Reference
9.1 Overview
9.2 AI CPU APIs
9.2.1 Introduction
9.2.2 Class CpuKernelContext
9.2.2.1 CpuKernelContext Constructor and Destructor
9.2.2.2 GetOpType
9.2.2.3 Input
9.2.2.4 Output
9.2.2.5 GetAttr
9.2.2.6 GetInputsSize
9.2.2.7 GetOutputsSize
9.2.2.8 Init
9.2.3 Class TensorShape
9.2.3.1 SetFormat
9.2.3.2 GetFormat
9.2.3.3 GetUnknownRank
9.2.3.4 SetUnknownRank
9.2.3.5 GetDimSizes
9.2.3.6 SetDimSizes
9.2.3.7 GetDimSize
9.2.3.8 GetDims
9.2.3.9 NumElements
9.2.4 Class Tensor
9.2.4.1 GetTensorShape
9.2.4.2 SetTensorShape
9.2.4.3 SetDataType
9.2.4.4 GetDataType
9.2.4.5 SetData
9.2.4.6 GetData
9.2.4.7 SetDataSize
9.2.4.8 GetDataSize
9.2.4.9 CalcDataSizeByShape
9.2.4.10 NumElements
9.2.5 Class AttrValue
9.2.5.1 GetString
9.2.5.2 GetListString
9.2.5.3 AddListString
9.2.5.4 ListStringSize
9.2.5.5 SetString
9.2.5.6 SetListString
9.2.5.7 GetInt
9.2.5.8 GetListInt
9.2.5.9 AddListInt
9.2.5.10 ListIntSize
9.2.5.11 SetInt
9.2.5.12 SetListInt
9.2.5.13 GetFloat
9.2.5.14 GetListFloat
9.2.5.15 AddListFloat
9.2.5.16 ListFloatSize
9.2.5.17 SetFloat
9.2.5.18 SetListFloat
9.2.5.19 GetBool
9.2.5.20 GetListBool
9.2.5.21 AddListBool
9.2.5.22 ListBoolSize
9.2.5.23 SetBool
9.2.5.24 SetListBool
9.2.5.25 GetDataType
9.2.5.26 GetListDataType
9.2.5.27 AddListDataType
9.2.5.28 ListDataTypeSize
9.2.5.29 SetDataType
9.2.5.30 SetListDataType
9.2.5.31 SetTensorShape
9.2.5.32 SetListTensorShape
9.2.5.33 AddListTensorShape
9.2.5.34 GetTensorShape
9.2.5.35 GetListTensorShape
9.2.5.36 ListTensorShapeSize
9.2.5.37 SetTensor
9.2.5.38 SetListTensor
9.2.5.39 AddListTensor
9.2.5.40 GetTensor
9.2.5.41 GetListTensor
9.2.5.42 ListTensorSize
9.2.6 GetSizeByDataType
9.2.7 Data Types
9.2.7.1 DataType
9.2.7.2 Format
9.2.7.3 DeviceType
9.2.8 Macro Definition
9.2.8.1 REGISTER_CPU_KERNEL
9.3 Operator Prototype Definition APIs
9.3.1 Introduction
9.3.2 Operator Prototype Definition APIs
9.3.2.1 Prototype Definition (REG_OP)
9.3.2.2 Derivative APIs
9.3.3 Operator Prototype InferShape APIs
9.3.3.1 IMPLEMT_INFERFUNC
9.3.3.2 IMPLEMT_COMMON_INFERFUNC
9.3.3.3 INFER_FUNC_REG
9.3.3.4 COMMON_INFER_FUNC_REG
9.3.3.5 ELMTWISE_INFER_SHAPEANDTYPE
9.3.3.6 BROADCAST_INFER
9.3.4 Operator Prototype Verify APIs
9.3.4.1 IMPLEMT_VERIFIER
9.3.4.2 VERIFY_FUNC_REG
9.3.5 Operator InferFormat APIs
9.3.5.1 IMPLEMT_INFERFORMAT_FUNC
9.3.5.2 INFER_FORMAT_FUNC_REG
9.3.6 Internal Associated APIs
9.3.6.1 Class OperatorFactory
9.3.6.1.1 CreateOperator
9.3.6.1.2 GetOpsTypeList
9.3.6.1.3 IsExistOp
9.3.6.2 Class OperatorCreatorRegister
9.3.6.2.1 Constructor and Destructor
9.3.6.3 Class InferShapeFuncRegister
9.3.6.3.1 Constructor and Destructor
9.3.6.4 Class InferFormatFuncRegister
9.3.6.4.1 Constructor and Destructor
9.3.6.5 Class VerifyFuncRegister
9.3.6.5.1 Constructor and Destructor
9.3.6.6 Class InferenceContext
9.3.6.6.1 InferenceContext Constructor and Destructor
9.3.6.6.2 SetInputHandleShapesAndTypes
9.3.6.6.3 SetOutputHandleShapesAndTypes
9.3.6.6.4 GetInputHandleShapesAndTypes
9.3.6.6.5 GetOutputHandleShapesAndTypes
9.3.6.6.6 SetMarks
9.3.6.6.7 GetMarks
9.3.6.6.8 Create
9.3.6.7 Class ShapeAndType
9.3.6.7.1 Constructor and Destructor
9.3.6.7.2 SetShape
9.3.6.7.3 GetShape
9.3.6.7.4 SetType
9.3.6.7.5 GetDataType
9.4 operator API
9.4.1 Overview
9.4.2 Class Operator
9.4.2.1 Constructor and Destructor
9.4.2.2 AddControlInput
9.4.2.3 BreakConnect
9.4.2.4 IsEmpty
9.4.2.5 InferShapeAndType
9.4.2.6 GetAttr
9.4.2.7 GetAllAttrNamesAndTypes
9.4.2.8 GetDynamicInputNum
9.4.2.9 GetDynamicInputDesc
9.4.2.10 GetDynamicOutputNum
9.4.2.11 GetDynamicOutputDesc
9.4.2.12 GetDynamicSubgraph
9.4.2.13 GetDynamicSubgraphBuilder
9.4.2.14 GetInferenceContext
9.4.2.15 GetInputConstData
9.4.2.16 GetInputsSize
9.4.2.17 GetInputDesc
9.4.2.18 GetName
9.4.2.19 GetSubgraph
9.4.2.20 GetSubgraphBuilder
9.4.2.21 GetSubgraphNamesCount
9.4.2.22 GetSubgraphNames
9.4.2.23 GetOpType
9.4.2.24 GetOutputDesc
9.4.2.25 GetOutputsSize
9.4.2.26 SetAttr
9.4.2.27 SetInput
9.4.2.28 SetInferenceContext
9.4.2.29 TryGetInputDesc
9.4.2.30 UpdateInputDesc
9.4.2.31 UpdateOutputDesc
9.4.2.32 UpdateDynamicInputDesc
9.4.2.33 UpdateDynamicOutputDesc
9.4.2.34 VerifyAllAttr
9.4.3 Class Tensor
9.4.3.1 Constructor and Destructor
9.4.3.2 Clone
9.4.3.3 IsValid
9.4.3.4 GetData
9.4.3.5 GetTensorDesc
9.4.3.6 GetSize
9.4.3.7 SetData
9.4.3.8 SetTensorDesc
9.4.4 Class TensorDesc
9.4.4.1 Constructor and Destructor
9.4.4.2 GetDataType
9.4.4.3 GetFormat
9.4.4.4 GetName
9.4.4.5 GetOriginFormat
9.4.4.6 GetOriginShape
9.4.4.7 GetRealDimCnt
9.4.4.8 GetShape
9.4.4.9 GetShapeRange
9.4.4.10 GetSize
9.4.4.11 SetDataType
9.4.4.12 SetFormat
9.4.4.13 SetName
9.4.4.14 SetOriginFormat
9.4.4.15 SetOriginShape
9.4.4.16 SetSize
9.4.4.17 SetRealDimCnt
9.4.4.18 SetShape
9.4.4.19 SetShapeRange
9.4.4.20 SetUnknownDimNumShape
9.4.4.21 Update
9.4.5 Class Shape
9.4.5.1 Constructor and Destructor
9.4.5.2 GetDim
9.4.5.3 GetDims
9.4.5.4 GetDimNum
9.4.5.5 GetShapeSize
9.4.5.6 SetDim
9.4.6 Class AttrValue
9.4.6.1 Constructor and Destructor
9.4.6.2 CreateFrom
9.4.6.3 GetValue
9.4.7 Data Type and Enumerated Value
9.4.7.1 Format
9.4.7.2 DataType
9.4.7.3 Struct UsrQuantizeFactor
9.4.7.4 Struct TensorDescInfo
9.4.7.5 GetSizeByDataType
9.5 Operator Plug-in APIs
9.5.1 Overview
9.5.2 Class OpRegistrationData
9.5.2.1 Overview
9.5.2.2 Constructor and Destructor
9.5.2.3 REGISTER_CUSTOM_OP
9.5.2.4 FrameworkType
9.5.2.5 OriginOpType
9.5.2.6 ParseParamsFn
9.5.2.7 ParseParamsByOperatorFn
9.5.2.8 FusionParseParamsFn
9.5.2.9 FusionParseParamsFn (Overload)
9.5.2.10 ParseSubgraphPostFn
9.5.2.11 ParseOpToGraphFn
9.5.2.12 ImplyType
9.5.2.13 DelInputWithCond
9.5.2.14 DelInputWithOriginalType
9.5.2.15 GetImplyType
9.5.2.16 GetOmOptype
9.5.2.17 GetOriginOpTypeSet
9.5.2.18 GetFrameworkType
9.5.2.19 GetParseParamFn
9.5.2.20 GetParseParamByOperatorFn
9.5.2.21 GetFusionParseParamFn
9.5.2.22 GetFusionParseParamByOpFn
9.5.2.23 GetParseSubgraphPostFn
9.5.2.24 GetParseOpToGraphFn
9.5.2.25 AutoMappingFn
9.5.2.26 AutoMappingByOpFn
9.5.2.27 AutoMappingFnDynamic
9.5.2.28 AutoMappingByOpFnDynamic
9.5.2.29 AutoMappingSubgraphIndex
9.5.2.30 InputReorderVector
9.5.3 Class OpReceiver
9.5.3.1 Constructor and Destructor
9.5.4 DECLARE_ERRORNO
1 Quick Start
1.1 Neural Network Introduction
To enable computers to master knowledge like human beings, a multi-layer connection network needs to be constructed to define a complex object. After iterative computing and training, the network can extract object features. This method is generally called deep learning (DL). Deep learning has continued to develop, has shown tremendous application value, and is receiving increasing attention from industry and academia. It has achieved remarkable progress in image, voice, natural language processing, big data feature extraction, and ad click-through rate estimation. As a result, multiple frameworks, such as Caffe, MXNet, and TensorFlow, have been developed to promote deep learning across fields.
Deep neural network research fuels the rapid development of neural network models, enabling them to complete increasingly complex processing tasks in a wider range of fields. After decades of rapid development in semiconductor chips and computer technologies, ever faster and more energy-efficient computing resources are available for neural network models and data, such as CPUs, GPUs, TPUs, and the latest Ascend AI Processor launched by Huawei.
An artificial neural network (ANN), or neural network (NN) for short, is an important branch of machine learning (ML). Scientists perform mathematical modeling on the most basic neurons and build artificial neural networks based on a certain hierarchical relationship of neurons, enabling artificial neural networks to learn knowledge, adjust their internal structures through learning and training, and thereby achieve various complex computations.
1.2 Operator Basics
A deep learning algorithm consists of multiple compute units referred to as operators (Ops). In network models, an operator describes the compute logic of a layer, for example, the convolution layer that performs convolution and the fully-connected (FC) layer that multiplies the input by a weight matrix.
The following introduces some basic terms about operators.
Operator Name
The name of an operator identifies the operator on a network, and therefore must be unique on that network. The example network has operators Conv1, Pool1, and Conv2. Conv1 and Conv2 are of the same type, convolution, and each represents one convolution operation.
Figure 1-1 Example network topology
Operator Type
Each operator is of a specific type. For example, the convolution operator is of the convolution type. A network can have different operators of the same type.
Tensor
Tensors are used to represent the input data and output data in operator computations. TensorDesc (the tensor descriptor) describes the input data and output data. Table 1-1 describes the attributes of the TensorDesc struct.
Table 1-1 Description of the TensorDesc attributes
Attribute | Definition
name | Indexes a tensor and must be unique.
shape | Specifies the shape of a tensor, for example, (10,), (1024, 1024), or (2, 3, 4). For details, see Shape. Default: none. Format: (i1, i2, ..., in), where i1 to in are positive integers.
dtype | Specifies the data type of a tensor object. Default: none. Value range: float16, float32, int8, int16, int32, uint8, uint16, bool
Format | Specifies the data layout format. For details, see Format.
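The four attributes in Table 1-1 can be pictured as a small record. The following is a minimal sketch only: TensorDescSketch, DTypeSketch, FormatSketch, and HasValidShape are hypothetical names invented for illustration and are not the CANN TensorDesc class or its API. The sketch also checks the rule that every shape dimension is a positive integer.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-ins for illustration only; these are NOT the CANN types.
enum class DTypeSketch { kFloat16, kFloat32, kInt8, kInt16, kInt32, kUint8, kUint16, kBool };
enum class FormatSketch { kNCHW, kNHWC };

// One field per attribute listed in Table 1-1.
struct TensorDescSketch {
  std::string name;            // indexes the tensor, must be unique
  std::vector<int64_t> shape;  // for example {2, 3, 4}
  DTypeSketch dtype;           // element data type
  FormatSketch format;         // data layout format
};

// Returns true if every dimension of the shape is a positive integer.
bool HasValidShape(const TensorDescSketch& desc) {
  for (int64_t dim : desc.shape) {
    if (dim <= 0) return false;
  }
  return true;
}
```

In a real operator, the descriptor would come from the framework rather than being built by hand; the record above only mirrors the attribute list.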
Format
In a deep learning framework, n-dimensional data is stored in an n-dimensional array. For example, a feature map of a convolutional neural network is stored in a four-dimensional array. The four dimensions are batch size (N), height (H), width (W), and channels (C).
Data can be stored only linearly, so the dimensions are laid out in a fixed order. Different deep learning frameworks store feature maps in different layouts. For example, Caffe uses the layout [Batch, Channels, Height, Width], that is, NCHW, while TensorFlow uses the layout [Batch, Height, Width, Channels], that is, NHWC.
As shown in Figure 1-2, for an RGB image, the pixel values of each channel are clustered in sequence as RRRGGGBBB with the NCHW layout. However, with the NHWC layout, the pixel values are interleaved as RGBRGBRGB.
Figure 1-2 NCHW and NHWC
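The difference between the two layouts comes down to how a 4D index maps to a linear offset. The following is a minimal sketch, assuming row-major storage; OffsetNCHW, OffsetNHWC, and LayoutString are hypothetical helper names, not CANN APIs. It fills a flat buffer for one 1 x 3 RGB image and reproduces the RRRGGGBBB and RGBRGBRGB patterns described above.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Linear offset of element (n, c, h, w) when the tensor is stored as NCHW.
std::size_t OffsetNCHW(std::size_t n, std::size_t c, std::size_t h, std::size_t w,
                       std::size_t C, std::size_t H, std::size_t W) {
  return ((n * C + c) * H + h) * W + w;
}

// Linear offset of the same element when the tensor is stored as NHWC.
std::size_t OffsetNHWC(std::size_t n, std::size_t c, std::size_t h, std::size_t w,
                       std::size_t C, std::size_t H, std::size_t W) {
  return ((n * H + h) * W + w) * C + c;
}

// Writes the channel letter of every pixel of a single 1 x 3 RGB image into a
// flat buffer using the chosen layout, and returns the buffer as a string.
std::string LayoutString(bool nhwc) {
  const std::size_t C = 3, H = 1, W = 3;
  const char channel_names[] = {'R', 'G', 'B'};
  std::string flat(C * H * W, '?');
  for (std::size_t c = 0; c < C; ++c) {
    for (std::size_t h = 0; h < H; ++h) {
      for (std::size_t w = 0; w < W; ++w) {
        const std::size_t offset = nhwc ? OffsetNHWC(0, c, h, w, C, H, W)
                                        : OffsetNCHW(0, c, h, w, C, H, W);
        assert(offset < flat.size());
        flat[offset] = channel_names[c];
      }
    }
  }
  return flat;
}
```

Calling LayoutString(false) uses the NCHW mapping and LayoutString(true) uses the NHWC mapping; the element values are identical in both cases and only their positions in memory change.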
Shape
The shape of a tensor is described in the format of (D0, D1, ..., Dn-1), where D0 to Dn-1 are positive integers.
For example, the shape (3, 4) indicates a 3 x 4 matrix, where the first dimension has three elements, and the second dimension has four elements.
The number of entries in the parentheses equals the number of dimensions of the tensor. The first entry is the number of elements inside the outermost square brackets, the second entry is the number of elements inside the second-level square brackets, and so on. See the following examples.
Table 1-2 Tensor shape examples
Tensor | Shape
1 | (0,)
[1, 2, 3] | (3,)
[[1, 2],[3, 4]] | (2, 2)
[[[1, 2],[3, 4]], [[5, 6],[7, 8]]] | (2, 2, 2)
A tensor shape has a physical meaning. For example, a tensor with shape (4, 20, 20, 3) represents four pictures (the 4 in the shape), each 20 x 20 pixels (the two 20s in the shape), where each pixel contains the red, green, and blue color components (the 3 in the shape).
Figure 1-3 Physical meanings of tensor shape
In programming, each dimension of the shape can be understood as one level of a nested loop over the tensor. For example, operating on tensor A with shape (4, 20, 20, 3) gives the following loop statement.
produce A {
  for (i, 0, 4) {
    for (j, 0, 20) {
      for (p, 0, 20) {
        for (q, 0, 3) {
          A[((((((i*20) + j)*20) + p)*3) + q)] = a_tensor[((((((i*20) + j)*20) + p)*3) + q)]
        }
      }
    }
  }
}
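The index expression in the loop statement is the row-major offset of element (i, j, p, q), and the total number of loop iterations is the product of the dimensions. The following is a minimal sketch of both computations for an arbitrary shape; ElementCount and FlatOffset are hypothetical helper names chosen for this example and are not part of the CANN API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Number of elements in a tensor: the product of all dimensions.
int64_t ElementCount(const std::vector<int64_t>& shape) {
  int64_t count = 1;
  for (int64_t dim : shape) count *= dim;
  return count;
}

// Row-major flat offset of a multi-dimensional index. For shape (4, 20, 20, 3)
// and index (i, j, p, q) this evaluates to ((i*20 + j)*20 + p)*3 + q.
int64_t FlatOffset(const std::vector<int64_t>& shape, const std::vector<int64_t>& index) {
  assert(shape.size() == index.size());
  int64_t offset = 0;
  for (std::size_t d = 0; d < shape.size(); ++d) {
    assert(index[d] >= 0 && index[d] < shape[d]);
    offset = offset * shape[d] + index[d];
  }
  return offset;
}
```

For the shape (4, 20, 20, 3), ElementCount gives 4 x 20 x 20 x 3 = 4800, and the last valid index (3, 19, 19, 2) maps to offset 4799.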
Axis
An axis is denoted by the index of a dimension of a tensor. For a 2D tensor with five rows and six columns, that is, with shape (5, 6), axis 0 represents the first dimension of the tensor, that is, the rows; axis 1 represents the second dimension of the tensor, that is, the columns.
For example, for tensor [[[1, 2],[3, 4]], [[5, 6],[7, 8]]] with shape (2, 2, 2), axis 0 represents data in the first dimension, that is, matrices [[1, 2],[3, 4]] and [[5, 6],[7, 8]]; axis 1 represents data in the second dimension, that is, arrays [1, 2], [3, 4], [5, 6], and [7, 8]; and axis 2 represents data in the third dimension, that is, numbers 1, 2, 3, 4, 5, 6, 7, and 8.
The axes of an n-dimensional tensor are 0, 1, 2, ..., n-1.
A negative axis is interpreted as indexing from the end, so axis -1 refers to the last dimension.
Figure 1-4 Axis diagram
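The axis rules above can be expressed as a small helper. The following is a minimal sketch, assuming the common convention that axis -1 refers to the last dimension, so that valid axes of an n-dimensional tensor lie in [-n, n - 1]. The function name NormalizeAxis and the -1 error return are illustrative and are not part of any CANN API.

```cpp
#include <cassert>
#include <cstdint>

// Maps an axis in [-rank, rank - 1] to its non-negative form in [0, rank - 1].
// Returns -1 when the axis is out of range (an assumed error convention).
int64_t NormalizeAxis(int64_t axis, int64_t rank) {
  assert(rank > 0);
  if (axis < -rank || axis >= rank) {
    return -1;
  }
  return axis < 0 ? axis + rank : axis;
}
```

For the tensor with shape (2, 2, 2) in the preceding example, NormalizeAxis(-1, 3) yields axis 2, the dimension that holds the individual numbers.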
Weight
The input data is multiplied by a weight value in the compute unit. For example, for a two-input operator, an associated weight value is allocated to each of the inputs. Generally, data of more importance is assigned a greater weight value, so a feature whose data has zero weight can be ignored.
As shown in Figure 1-5, in the compute unit, input X1 is multiplied by its associated weight W1, that is, X1 * W1.
Figure 1-5 Weight computation example
Bias
A bias is another linear component to be applied to the input data, in addition to a weight. The bias is added to the product of the input and its weight.
As shown in Figure 1-6, in the compute unit, input X1 is multiplied by its associated weight W1, and its associated bias B1 is then added, that is, X1 * W1 + B1.
Figure 1-6 Bias computation example
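The weight and bias computation can be illustrated in a few lines of C++. The following is a minimal sketch, assuming each input has exactly one associated weight and one associated bias. The function name ApplyWeightAndBias is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Computes x[i] * w[i] + b[i] for every input, matching X1 * W1 + B1.
std::vector<float> ApplyWeightAndBias(const std::vector<float>& x,
                                      const std::vector<float>& w,
                                      const std::vector<float>& b) {
  assert(x.size() == w.size() && x.size() == b.size());
  std::vector<float> y(x.size());
  for (std::size_t i = 0; i < x.size(); ++i) {
    y[i] = x[i] * w[i] + b[i];
  }
  return y;
}
```

With a weight of zero, the output for that input equals its bias alone, which is why a feature with zero weight can be ignored.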
2 AI CPU Introduction
2.2 Building and Running an Operator
2.1 Overview
AI CPU executes CPU operators (including control, scalar, and vector operators) on Ascend AI Processor. The following figure shows its context in the Ascend solution architecture.
Figure 2-1 System architecture
The following components are involved in building and executing AI CPU operators:
● Graph Engine (GE): a unified IR interface provided by Huawei based on the Ascend AI Software Stack for interfacing with different machine learning frameworks, such as TensorFlow and PyTorch. GE implements the preparation, partition, optimization, compilation, loading, execution, and management of the network topology, or the graph.
● AI CPU Engine: interfaces with GE, provides the AI CPU operator information library, and implements operator registration, operator memory allocation calculation, subgraph optimization, and task generation.
● AI CPU Schedule: works with the Task Schedule to schedule and execute NN models.
● AI CPU Processor: completes operator computations and provides the operator implementation library for implementing the execution of AI CPU operators.
● Data Processor: preprocesses data of training samples in training scenarios.
Logical Architecture for Building and Running an Operator
A complete AI CPU operator consists of four parts: operator prototype definition, operator adaptation plug-in of the corresponding open-source framework, operator information library definition, and operator implementation.
Figure 2-2 shows the logical architecture of building and running a developed operator on the Ascend AI Processor hardware platform.
Figure 2-2 Logical architecture for building and running an operator
TFAdapter is used only for training based on the TensorFlow framework.
The columns in the preceding figure are the deliverables to be implemented during custom operator development.
Deliverable | Description
Operator implementation | The operator class implementation includes the operator definition and operator computation implementation.
Operator plug-in | In the custom operator development scenario based on a third-party framework (such as TensorFlow and Caffe), after developing implementation code of the custom operator, you need to develop an adaptation plug-in to map the third-party operator to an operator supported by the Ascend AI Processor and register the operator information with GE. To run a network trained on a third-party framework, the operator plug-in information in GE is loaded and called to parse and map the operators on the network to operators supported by the Ascend AI Processor.
Operator prototype library | The operator prototype definition specifies the constraints on an operator that runs on the Ascend AI Processor, mainly reflecting the mathematical meanings of the operator. The constraints include defining the operator inputs, outputs, and other attributes, verifying arguments, and inferring the shape. During network execution, GE calls the verification API of the operator prototype library to verify operator arguments. If the verification passes, GE infers the output shape and dtype of each node by calling the inference function of the operator prototype library and allocates static memory for the result tensor.
Operator information library | The operator information library mainly reflects the restrictions on the physical implementation of operators on the Ascend AI Processor, including the input and output names and data types. During network execution, AI CPU Engine performs basic verification and operator matching based on the operator information in the operator information library.
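The verify-then-infer sequence described for the operator prototype library can be pictured with a standalone sketch. The code below does not use the GE API. The types DType and TensorDesc and the functions VerifyBinaryOp, InferBinaryOp, and ElementCount are hypothetical stand-ins, and the rule that both inputs of a two-input element-wise operator share the same shape and dtype is an assumption made for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-ins for illustration; these are not GE or CANN types.
enum class DType { kFloat16, kFloat32, kInt32 };

struct TensorDesc {
  std::vector<int64_t> shape;
  DType dtype;
};

// Verification step: checks the arguments before any inference is attempted.
// Assumption: both inputs must share dtype and shape (no broadcasting).
bool VerifyBinaryOp(const TensorDesc& x, const TensorDesc& y) {
  return x.dtype == y.dtype && x.shape == y.shape;
}

// Inference step: derives the output shape and dtype from the inputs.
TensorDesc InferBinaryOp(const TensorDesc& x, const TensorDesc& y) {
  assert(VerifyBinaryOp(x, y));  // inference only runs after verification passes
  return TensorDesc{x.shape, x.dtype};
}

// Number of elements in the result tensor, which determines how much
// memory must be reserved for it.
int64_t ElementCount(const TensorDesc& desc) {
  int64_t count = 1;
  for (int64_t dim : desc.shape) {
    count *= dim;
  }
  return count;
}
```

Because the output description is known before the operator runs, memory for the result tensor can be reserved statically, as the table states.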
Building an Operator
Figure 2-3 shows the workflow of building an AI CPU operator.
Figure 2-3 Building an AI CPU operator
1. Deliver a third-party network model to GE.
For TensorFlow-based online training, TF Adapter is called to generate the source TensorFlow model, which is then delivered to GE. For AscendCL-based model inference, the source model is directly delivered to GE.
The topology of a network model is referred to as a graph.
2. GE calls the operator plug-in to map operators in the source network model to operators supported by the Ascend AI Processor, so that the original TensorFlow/Caffe graph can be parsed into a graph supported by the Ascend AI Processor.
3. GE calls the verification API of the operator prototype library to verify operator arguments. If the verification passes, GE infers the output shape and dtype of each node by calling the inference function of the operator prototype library and allocates memory for the result tensor.
4. GE delivers the entire graph to AI CPU Engine. AI CPU Engine reads the operator information library, looks up an appropriate format for the operator, and returns the format to GE.
5. GE partitions the graph into subgraphs and delivers the subgraphs to AI CPU Engine. AI CPU Engine optimizes the subgraphs and returns the optimized subgraphs to GE.
6. GE builds the graph (including memory and stream allocation) and sends a genTask request to AI CPU Engine. Then, AI CPU Engine returns the taskinfo of the operator to GE. After the graph build process is complete, a model file that adapts to the Ascend AI Processor is generated.
Running an Operator
Figure 2-4 shows the workflow of running an AI CPU operator.
Figure 2-4 Running an AI CPU operator
1. GE delivers an operator execution request.
2. Runtime delivers the corresponding tasks to AI CPU Schedule.
3. AI CPU Schedule schedules tasks and calls the operator compute API.
4. The AI CPU Operator Package (OPP) parses and instantiates the operator implementation, and executes the Compute function to run the operator.
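The last step above, in which an implementation is looked up, instantiated, and its Compute function executed, follows a common registry pattern. The sketch below illustrates that pattern only. ToyContext, ToyKernel, ToyAddKernel, Registry, and RunOperator are hypothetical names, and the code is not the CANN kernel interface.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <map>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-ins for illustration only; these are not CANN types.
struct ToyContext {
  std::vector<float> x, y, z;
};

class ToyKernel {
 public:
  virtual ~ToyKernel() = default;
  // Returns 0 on success and nonzero on failure (an assumed convention).
  virtual uint32_t Compute(ToyContext& ctx) = 0;
};

class ToyAddKernel : public ToyKernel {
 public:
  uint32_t Compute(ToyContext& ctx) override {
    if (ctx.x.size() != ctx.y.size()) {
      return 1;
    }
    ctx.z.resize(ctx.x.size());
    for (std::size_t i = 0; i < ctx.x.size(); ++i) {
      ctx.z[i] = ctx.x[i] + ctx.y[i];
    }
    return 0;
  }
};

using KernelFactory = std::function<std::unique_ptr<ToyKernel>()>;

// Maps an OpType string to a factory that instantiates its kernel.
std::map<std::string, KernelFactory>& Registry() {
  static std::map<std::string, KernelFactory> registry = {
      {"Add", [] { return std::unique_ptr<ToyKernel>(new ToyAddKernel()); }}};
  return registry;
}

// Looks up the kernel by OpType, instantiates it, and executes Compute.
uint32_t RunOperator(const std::string& op_type, ToyContext& ctx) {
  auto it = Registry().find(op_type);
  if (it == Registry().end()) {
    return 1;  // no implementation registered for this OpType
  }
  std::unique_ptr<ToyKernel> kernel = it->second();
  assert(kernel != nullptr);
  return kernel->Compute(ctx);
}
```

Keying the registry by OpType lets the scheduler remain unaware of individual operators; adding an operator means adding one registry entry.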
3 Operator Development Workflow
The following figure shows the AI CPU custom operator development workflow.
Figure 3-1 Operator development workflow
Table 3-1 describes the development workflow.
Table 3-1 AI CPU operator command-line development workflow
Action | Description | See Also
Environment setup | Set up the development and operating environment required for operator development, execution, and verification. | 4 Operator Development Preparations
Operator analysis | Analyze the operator, specify its functionality, input, and output, and determine the operator type and the name of the OPP file generated after the operator is built. | 4.2 Operator Analysis
Project creation | Create a custom operator project. | 4.3 Project Creation
Operator code implementation | Implement the compute logic of the operator. | 5.1 Operator Code Implementation
Operator prototype definition | Implement the operator prototype definition file, which specifies the constraints on an operator that runs on the Ascend AI Processor, mainly reflecting the mathematical meanings of the operator. The constraints include defining the operator inputs, outputs, attributes, and value ranges, verifying arguments, and inferring the shape. The information defined by the prototype is registered with the operator prototype library of GE. During offline model conversion, GE calls the verification API of the operator prototype library to verify operator arguments. If the verification passes, GE infers the output shape and dtype of each node by calling the inference function of the operator prototype library and allocates static memory for the result tensor. | 5.2 Operator Prototype Definition
Operator information definition | The operator information configuration file is used to register the operator information with the operator information library, including the OpType and input/output dtype and name. During network execution, AI CPU Engine performs basic verification and operator matching based on the operator information in the operator information library. | 5.3 Operator Information Library Definition
Operator plug-in implementation | If your custom operator is developed based on a third-party framework (such as TensorFlow or Caffe), you need to develop a plug-in to map the operator to one that adapts to the Ascend AI Processor. | 5.4 Operator Adaptation
Operator project building and deployment | ● Operator build: builds the operator plug-in implementation file, prototype definition file, and information definition file into the operator plug-in library, operator prototype library, and operator information library, respectively. ● Operator deployment: deploys the operator implementation file, plug-in library, prototype library, and information library to the system OPP, that is, a corresponding directory in the opp directory. In the command line, you can use the build script of the sample project for one-click compilation. A custom OPP will be generated. Specify the opp directory and execute the OPP to deploy your custom operator. | 5.5.1 Operator Project Building
Operator ST | System Testing (ST) verifies the operator correctness in an actual hardware environment. | 6 Operator ST
Operator verification on network | Load the custom operator to a model for verification. | 7 Operator Verification on Network
4 Operator Development Preparations
4.3 Project Creation
4.1 Environment Setup
● Before custom operator development, you need to set up the development environment and operating environment.
Set up the development environment and operating environment by referring to CANN Software Installation Guide.
a. Select an installation scheme and install the required hardware for the development environment and operating environment.
b. Deploy and install Toolkit and configure environment variables in the development environment.
c. Install the inference software and configure environment variables in the operating environment.
Once the development environment is set up, you can obtain the API header files and the library files required for building and running operators.
Once the operating environment is set up, you can run the executable files generated after build.
AI CPU operator development depends on the AI CPU OPP. During environment setup, make sure to install the AI CPU OPP.
● If you intend to develop custom operators in MindStudio, install MindStudio by referring to MindStudio User Guide.
4.2 Operator Analysis
Before developing an AI CPU operator, you need to determine the operator function, input, output, development mode, operator type (OpType), implementation function name, and more.
Step 1 Specify the operator function and mathematical expression.
Take the Add operator as an example. The mathematical expression of the Add operator is as follows:
z=x+y
The Add operator adds two inputs and returns a result.
Step 2 Specify the inputs and output.
● The Add operator has two inputs, x and y, and outputs the result z.
● The supported input data types include float16, float32, and int32. The output has the same data type as the inputs.
● The operator inputs support all shapes. The output has the same shape as the inputs.
● The operator input supports the following formats: NCHW, NC1HWC0, NHWC, and ND.
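The analysis in Step 1 and Step 2 can be condensed into a short reference computation. The following is a minimal sketch, assuming both inputs have already been flattened to buffers of equal length. The function name AddElementwise is hypothetical. Standard C++ has no built-in float16 type, so the sketch covers only float (float32) and int32_t.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Computes z = x + y element by element. T stands for float or int32_t.
// The output has the same length and element type as the inputs.
template <typename T>
std::vector<T> AddElementwise(const std::vector<T>& x, const std::vector<T>& y) {
  assert(x.size() == y.size());
  std::vector<T> z(x.size());
  for (std::size_t i = 0; i < x.size(); ++i) {
    z[i] = x[i] + y[i];
  }
  return z;
}
```

A reference of this kind is useful later for checking the results of the operator on the device against expected values.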
Step 3 Specify the operator implementation file name and operator type (OpType).
● Name OpType in upper camel case and indicate the separation of words with a single capitalized letter.
● Name the operator file in either of the following ways:
Name the operator file after OpType as follows:
– Convert the first uppercase letter to a lowercase letter.
Example: Abc -> abc
– Replace each uppercase letter following lowercase letters with an underscore (_) and a lowercase letter.
Example: AbcDef -> abc_def
– Uppercase letters following a digit or an uppercase letter are regarded as a semantic string. If there is a lowercase letter after this string, replace the last uppercase letter in this string with an underscore (_) and a lowercase letter, and convert the other uppercase letters into lowercase letters. If there is no lowercase letter after the string, directly convert the string into lowercase letters.
Examples: ABCDef -> abc_def; Abc2DEf -> abc2d_ef; Abc2DEF -> abc2def; ABC2dEF -> abc2d_ef
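The naming rules above are mechanical enough to check in code. The following is a minimal sketch of one possible implementation, assuming OpType contains only ASCII letters and digits. The function name OpTypeToFileName is hypothetical and is not a CANN tool or API.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Converts an OpType in upper camel case to a file name according to the
// three rules listed above.
std::string OpTypeToFileName(const std::string& op_type) {
  std::string out;
  for (std::size_t i = 0; i < op_type.size(); ++i) {
    const unsigned char c = static_cast<unsigned char>(op_type[i]);
    if (!std::isupper(c)) {
      out.push_back(static_cast<char>(c));  // lowercase letters and digits are kept
      continue;
    }
    bool underscore = false;
    if (i > 0) {
      const unsigned char prev = static_cast<unsigned char>(op_type[i - 1]);
      const bool next_is_lower =
          i + 1 < op_type.size() &&
          std::islower(static_cast<unsigned char>(op_type[i + 1]));
      if (std::islower(prev)) {
        // Rule 2: an uppercase letter that follows a lowercase letter.
        underscore = true;
      } else if (std::isupper(prev) || std::isdigit(prev)) {
        // Rule 3: inside a semantic string, only the last uppercase letter
        // gets an underscore, and only if a lowercase letter follows it.
        underscore = next_is_lower;
      }
    }
    // Rule 1 is the i == 0 case: lowercase the letter without an underscore.
    if (underscore) {
      out.push_back('_');
    }
    out.push_back(static_cast<char>(std::tolower(c)));
  }
  assert(out.size() >= op_type.size());  // the conversion never drops characters
  return out;
}
```

Applied to the OpType Add, the function returns add.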
In this example, OpType of the operator is defined as Add. You are advised to name the deliverables as follows:
● Code implementation (or kernel implementation) files of the operator: add_kernel.h and add_kernel.cc
● Plug-in implementation file: add_kernel_plugin.cpp
● Prototype definition files: add.h and add.cpp
● Information definition file: add.ini
Based on the preceding analysis, the design specifications of the Add operator are as follows.
Table 4-1 Add operator design specifications
OpType | Add
Operator Input | Name: x | Shape: all | Data type: float16, float32, int32
Operator Input | Name: y | Shape: all | Data type: float16, float32, int32
Operator Output | Name: z | Shape: all | Data type: float16, float32, int32
Operator Implementation File Name | add
----End
Overview
Before developing an operator, you need to create an operator project.
MindStudio Mode
For details about how to create an operator project in MindStudio, see "Custom Operator Development > Project Creation" in MindStudio User Guide.
Command Line Mode
Download the sample package that matches the CANN version in use. Find the sample in the samples/cplusplus/level1_single_api/4_op_dev/1_custom_op directory.
Append your own custom operator to the sample project. The sample project provides some AI CPU and TBE custom operator samples developed from their Caffe and TensorFlow counterparts.
NOTE
If you do not have enough permission to obtain the code, contact Huawei technical support to apply for joining the Ascend community.
The directory structure of the operator project is as follows. Develop the operator deliverables in the corresponding directory accordingly.
├── cpukernel                     // Directory of the operator implementation files (.h and .cc)
│   ├── impl
│   ├── op_info_cfg
│   │   ├── aicpu_kernel
│   │   │   ├── xx.ini            // Operator information library file (.ini)
├── framework
│   ├── xx_plugin                 // Directory of the operator adaptation plug-in files (.cpp)
├── op_proto                      // Directory of operator prototype definition files (.h and .cpp)
NOTICE
● If you need to develop multiple custom AI CPU operators, implement them in the same operator project. Store the implementation files according to the preceding directory structure.
● If you need to develop custom AI CPU operators and custom TBE operators at the same time, you also need to implement them in the same operator project. For details about TBE operator development, see TBE Custom Operator Development Guide.