From 91b61d1b25588f536f5647a320bc84c66947a837 Mon Sep 17 00:00:00 2001
From: wuhongfa <1660398197@qq.com>
Date: Wed, 24 Jul 2024 15:42:37 +0000
Subject: [PATCH 1/2] =?UTF-8?q?=E3=80=90FEAT=E3=80=91=E6=96=B0=E5=A2=9Eatt?=
 =?UTF-8?q?ention=20grad=E8=9E=8D=E5=90=88=E7=AE=97=E5=AD=90=20PART2?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

---
 cust_op/attention_fusion_grad/README.md      | 175 +++++++++++++++++
 .../aclnn_attention_fusion_grad/inc/common.h |  49 +++++
 .../inc/op_runner.h                          | 184 ++++++++++++++++++
 .../inc/operator_desc.h                      |  59 ++++++
 .../aclnn_attention_fusion_grad/run.sh       |  91 +++++++++
 .../attention_fusion_grad.json               |  90 +++++++++
 cust_op/attention_fusion_grad/creat.sh       |  56 ++++++
 7 files changed, 704 insertions(+)
 create mode 100644 cust_op/attention_fusion_grad/README.md
 create mode 100644 cust_op/attention_fusion_grad/aclnn_attention_fusion_grad/inc/common.h
 create mode 100644 cust_op/attention_fusion_grad/aclnn_attention_fusion_grad/inc/op_runner.h
 create mode 100644 cust_op/attention_fusion_grad/aclnn_attention_fusion_grad/inc/operator_desc.h
 create mode 100755 cust_op/attention_fusion_grad/aclnn_attention_fusion_grad/run.sh
 create mode 100644 cust_op/attention_fusion_grad/attention_fusion_grad.json
 create mode 100755 cust_op/attention_fusion_grad/creat.sh

diff --git a/cust_op/attention_fusion_grad/README.md b/cust_op/attention_fusion_grad/README.md
new file mode 100644
index 00000000..69f4b57e
--- /dev/null
+++ b/cust_op/attention_fusion_grad/README.md
@@ -0,0 +1,175 @@
# AttentionGrad Fusion Operator and Sample Description

## AttentionGrad Fusion Operator File Structure

```shell
├── aclnn_attention_fusion_grad # Single-operator test case
├── attention_fusion_grad.json  # Operator prototype configuration
├── op_host                     # Host-side implementation of the AttentionGrad fusion operator
├── op_kernel                   # Kernel-side implementation of the AttentionGrad fusion operator
├── README.md                   # Documentation for the AttentionGrad fusion operator
└── creat.sh                    # Installation script for the AttentionGrad fusion operator
```

## Ascend C Reference Design

For more details, see the official CANN guide [Ascend C operator development](https://www.hiascend.com/document/detail/zh/canncommercial/70RC1/operatordev/Ascendcopdevg/atlas_ascendc_10_0001.html).

## Using the AttentionGrad Fusion Operator

1. Upload the attention_fusion_grad folder to the target environment, enter the directory, and run the following command to compile and deploy the fusion operator:

```shell
bash creat.sh
```

Note: set the CANN environment variables before compiling and installing the operator. With CANN installed in the default path, the command is:

```shell
source /usr/local/Ascend/ascend-toolkit/set_env.sh
```

2. Create the lazy_adam optimizer in the model script and request the fused implementation. Example:

```python
from mx_rec.optimizers.lazy_adam import create_hash_optimizer

# Pass "use_fusion_optim=True" when creating the lazy_adam optimizer to use the fused
# implementation; use_fusion_optim defaults to False.
# See the mxRec user guide for detailed usage of the lazy_adam optimizer.
sparse_optimizer = create_hash_optimizer(learning_rate=0.001, use_fusion_optim=True)
```

## AttentionGrad Fusion Operator Introduction

1. Operator analysis

a) The operator implements the backward computation of Attention.

b) Operator inputs:

* dout: gradient propagated back to the forward operator;
* softmax_out: output of the forward softmax;
* query: query matrix;
* key: key matrix;
* value: value matrix.

c) Operator outputs:

* grad_query: gradient of query;
* grad_key: gradient of key;
* grad_value: gradient of value.

d) Operator constraints:

* Supported products: Atlas A2 series;
* Supported CANN versions: 8.0.RC1 and later;
* Supported input data type: float32;
* All inputs share the same batch size, whose value lies in (0, 2000);
* The inputs satisfy the attention formula, and their shapes must be compatible with the corresponding matmul computations;
* Apart from the batch size, every input dimension lies in (0, 1000);
* The fusion operator performs best when the second dimension of key is a multiple of 8 and relatively large, e.g. greater than 500;
* The performance gain of the fusion operator applies when the free time between the small operators it replaces is long.

2. Host-side operator implementation

The host-side implementation is in the op_host directory.

a) Tiling

The tiling function in the `optiling` namespace reads the external arguments (input pointers and shape information) from the context and validates them. It then computes the data-partitioning parameters needed on the kernel side, including the softmax, matmul, UB-size, and batch parameters (see the comments in the tiling file for details), sets BlockDim, and finally passes the attributes through TilingData (a sketch of this flow follows at the end of this section).

b) Shape inference

Implements the function bodies that infer the output shape and DataType.

c) Prototype registration

Defines the operator prototype and registers the operator with GE.
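The host-side flow above follows the shape of the msopgen-generated template. The sketch below is illustrative only: the `AttentionFusionGradTilingData` field (`batchSize`), the input index, and the block-dim value are assumptions, and the real definitions live in the op_host tiling files.

```cpp
// Illustrative sketch of the host-side tiling flow (not the project's actual
// implementation); headers and most parameter computations are omitted.
namespace optiling {
static ge::graphStatus TilingFunc(gert::TilingContext* context)
{
    AttentionFusionGradTilingData tiling;  // assumed field set, for illustration
    // Fetch and validate an external argument from the context.
    const gert::StorageShape* queryShape = context->GetInputShape(2);  // input 2: query
    if (queryShape == nullptr) {
        return ge::GRAPH_FAILED;
    }
    tiling.set_batchSize(queryShape->GetStorageShape().GetDim(0));
    // ... compute the softmax/matmul/UB partitioning parameters here ...
    context->SetBlockDim(8);  // illustrative block dim
    // Serialize the parameters into TilingData for the kernel side.
    tiling.SaveToBuffer(context->GetRawTilingData()->GetData(),
                        context->GetRawTilingData()->GetCapacity());
    context->GetRawTilingData()->SetDataSize(tiling.GetDataSize());
    return ge::GRAPH_SUCCESS;
}
} // namespace optiling
```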
3. Kernel-side operator implementation

The kernel-side implementation is in the op_kernel directory and consists of attention_fusion_grad.cpp.

a) Kernel entry point: extern "C" __global__ __aicore__ void attention_fusion_grad

b) Tiling parsing: GET_TILING_DATA(tilingData, tiling) retrieves the data passed in from the host side via TilingData

c) The kernel calls AttentionFusionGradKernel to perform the computation; a skeleton of the entry point follows below.
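A minimal skeleton of that entry point is sketched below. The GM_ADDR parameter list is inferred from the operator prototype (five inputs, three outputs) plus the workspace and tiling pointers the framework appends; the Init/Process interface of AttentionFusionGradKernel is an assumption based on the usual Ascend C kernel pattern.

```cpp
// Skeleton of the kernel entry point (sketch). The exact parameter list and
// the AttentionFusionGradKernel interface are assumptions, not the project's
// verified code.
extern "C" __global__ __aicore__ void attention_fusion_grad(
    GM_ADDR dout, GM_ADDR softmax_out, GM_ADDR query, GM_ADDR key, GM_ADDR value,
    GM_ADDR grad_query, GM_ADDR grad_key, GM_ADDR grad_value,
    GM_ADDR workspace, GM_ADDR tiling)
{
    GET_TILING_DATA(tilingData, tiling);  // unpack the host-side TilingData
    AttentionFusionGradKernel op;
    op.Init(dout, softmax_out, query, key, value,
            grad_query, grad_key, grad_value, tilingData);
    op.Process();
}
```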
## AclNN Single-Operator Test Reference Design

For more details, see the official CANN guide [Ascend C single-operator invocation overview](https://www.hiascend.com/document/detail/zh/canncommercial/70RC1/operatordev/Ascendcopdevg/atlas_ascendc_10_0036.html).

There are two ways to invoke a single operator: single-operator API execution and model execution. mxRec provides a single-operator API execution sample for reference.

The single-operator test case is in the aclnn_attention_fusion_grad directory, where:

* inc holds the header files
* scripts holds the Python scripts that generate and verify data
* input holds the bin files with the operator's input arguments
* output holds the generated executable execute_attention_fusion_grad_op, the operator's output bin files, and the golden bin files used for verification
* src holds the common utilities (common), the operator input/output description class (operator_desc), the single-operator invocation flow (op_runner), and the entry point (main)

Run the single-operator test:

```shell
bash run.sh
```

### Prerequisites

1. Create the operator project by following [Creating an operator project with msopgen](https://www.hiascend.com/document/detail/zh/canncommercial/70RC1/operatordev/Ascendcopdevg/atlas_ascendc_10_0023.html), prepare the kernel-side implementation by following [Kernel-side operator implementation](https://www.hiascend.com/document/detail/zh/canncommercial/70RC1/operatordev/Ascendcopdevg/atlas_ascendc_10_0024.html), and prepare the host-side implementation by following [Host-side operator implementation](https://www.hiascend.com/document/detail/zh/canncommercial/70RC1/operatordev/Ascendcopdevg/atlas_ascendc_10_0026.html).

2. Compile and deploy the operator by following [Operator compilation and deployment](https://www.hiascend.com/document/detail/zh/canncommercial/70RC1/operatordev/Ascendcopdevg/atlas_ascendc_10_0031.html). Enable binary compilation during deployment: in the project's build configuration file CMakePresets.json, set ENABLE_BINARY_PACKAGE to True. Deploying the operator binary to the current environment makes it available for subsequent invocation.

### AclNN Invocation of the Attention Fusion Operator

The entry point is src/main.cpp:

1. InitResource: initializes AscendCL and acquires runtime resources; no modification needed.
2. RunLookupOp runs the operator:

a) CreateOpDesc creates the operator input/output description. The OperatorDesc class (inc/operator_desc.h) stores the operator arguments as member variables for later use in op_runner;

b) An OpRunner object is created and executes, in order:

* opRunner.Init(): allocates memory for the operator's input and output data
* SetInputData(): loads the input bin files and transfers them to OpRunner's buffers for the operator execution
* opRunner.RunOp(): runs the operator; the main steps are copying input data in, creating a Stream, executing the Stream, copying output data out, and releasing the Stream resources
* ProcessOutputData(): processes the operator's output data and writes it to disk for later comparison against the golden data

3. DestroyResource: releases memory; no modification needed. The sketch after this list shows the overall flow.
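A condensed sketch of how main.cpp chains these steps is shown below. CreateOpDesc, SetInputData, and ProcessOutputData are the sample's own helpers as described above; their exact signatures here are assumptions.

```cpp
// Sketch of the main.cpp flow (signatures of the helpers are assumed).
bool RunAttentionFusionGradOp()
{
    OperatorDesc opDesc = CreateOpDesc();  // describe the 5 inputs / 3 outputs
    OpRunner opRunner(&opDesc);
    if (!opRunner.Init()) {                // allocate host/device buffers
        ERROR_LOG("Init OpRunner failed");
        return false;
    }
    if (!SetInputData(opRunner)) {         // load input/*.bin into the buffers
        ERROR_LOG("Set input data failed");
        return false;
    }
    if (!opRunner.RunOp()) {               // copy in, run the stream, copy out
        ERROR_LOG("Run op failed");
        return false;
    }
    return ProcessOutputData(opRunner);    // write outputs for golden compare
}
```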
### Run Script

run.sh executes, in order:

1. Remove leftover generated files and log files
2. Generate the input data and the golden data
3. Build the acl executable
4. Run the executable
5. Compare the outputs against the golden files

### scripts Directory

* gen_data.py: generates the input data for the AttentionGrad fusion operator and the golden data used for precision verification; the test dim parameters can be modified as needed.
* verify_result.py: compares the operator outputs against the golden data generated by the script and reports the result. The comparison uses an error tolerance loss of 1e-4 applied to:

a) the absolute error
b) the relative error
c) the number of out-of-tolerance elements

The operator fails the precision check only when all of the following hold at once: not all absolute errors are below loss, not all relative errors are below loss, and the counts of elements whose absolute error and relative error exceed loss each surpass 1/loss of the total, i.e. 1/10000 (one in ten thousand). In every other case the operator passes.

Users can adjust the allowed error tolerance loss. The sketch below restates the rule in code.
\ No newline at end of file
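The following sketch restates that pass/fail rule, assuming equal-sized output and golden arrays. It is a restatement for clarity, not the actual verify_result.py implementation.

```cpp
// Restatement of the verify_result.py rule (sketch only). The operator fails
// only when all three criteria are breached at once.
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

bool PrecisionPass(const std::vector<float>& out, const std::vector<float>& golden)
{
    const double loss = 1e-4;   // allowed error tolerance, user adjustable
    std::size_t absExceed = 0;  // elements whose absolute error exceeds loss
    std::size_t relExceed = 0;  // elements whose relative error exceeds loss
    for (std::size_t i = 0; i < out.size(); ++i) {
        const double absErr = std::fabs(static_cast<double>(out[i]) - golden[i]);
        const double relErr = absErr /
            std::max(std::fabs(static_cast<double>(golden[i])), 1e-9);
        if (absErr > loss) { ++absExceed; }
        if (relErr > loss) { ++relExceed; }
    }
    // "1/loss of the total" means size / 10000, i.e. size * loss.
    const double countLimit = out.size() * loss;
    const bool fail = absExceed > 0 && relExceed > 0 &&
                      absExceed > countLimit && relExceed > countLimit;
    return !fail;
}
```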
diff --git a/cust_op/attention_fusion_grad/aclnn_attention_fusion_grad/inc/common.h b/cust_op/attention_fusion_grad/aclnn_attention_fusion_grad/inc/common.h
new file mode 100644
index 00000000..225d5994
--- /dev/null
+++ b/cust_op/attention_fusion_grad/aclnn_attention_fusion_grad/inc/common.h
@@ -0,0 +1,49 @@
/**
 * @file common.h
 *
 * Copyright (C) 2024. Huawei Technologies Co., Ltd. All rights reserved.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
 */
#ifndef COMMON_H
#define COMMON_H

#include <cstdint>
#include <cstdio>
#include <fstream>
#include <iostream>
#include <string>

#include "acl/acl.h"

#define SUCCESSED 0
#define FAILED 1

#define INFO_LOG(fmt, args...) fprintf(stdout, "[INFO] " fmt "\n", ##args)
#define WARN_LOG(fmt, args...) fprintf(stdout, "[WARN] " fmt "\n", ##args)
#define ERROR_LOG(fmt, args...) fprintf(stderr, "[ERROR] " fmt "\n", ##args)

namespace AttentionFusionGrad {
constexpr int NUM_TEST_EXEC = 100;
constexpr int TIME_OUT = 5000;
/**
 * @brief Read data from file
 * @param [in] filePath: file path
 * @param [in] fileSize: file size
 * @param [out] buffer: buffer to receive the file content
 * @param [in] bufferSize: buffer size
 * @return read result
 */
bool ReadFile(const std::string& filePath, size_t fileSize, void* buffer, size_t bufferSize);

/**
 * @brief Write data to file
 * @param [in] filePath: file path
 * @param [in] buffer: data to write to file
 * @param [in] size: size to write
 * @return write result
 */
bool WriteFile(const std::string& filePath, const void* buffer, size_t size);
} // namespace AttentionFusionGrad

#endif // COMMON_H
diff --git a/cust_op/attention_fusion_grad/aclnn_attention_fusion_grad/inc/op_runner.h b/cust_op/attention_fusion_grad/aclnn_attention_fusion_grad/inc/op_runner.h
new file mode 100644
index 00000000..8005fa64
--- /dev/null
+++ b/cust_op/attention_fusion_grad/aclnn_attention_fusion_grad/inc/op_runner.h
@@ -0,0 +1,184 @@
/**
* @file op_runner.h
*
* Copyright (C) 2024. Huawei Technologies Co., Ltd. All rights reserved.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
*/
#ifndef OP_RUNNER_H
#define OP_RUNNER_H

#include "aclnn/acl_meta.h"
#include "acl/acl.h"
#include "common.h"
#include "operator_desc.h"

namespace AttentionFusionGrad {
/**
 * Op Runner
 */
class OpRunner {
public:
    /**
     * @brief Constructor
     * @param [in] opDesc: op description
     */
    explicit OpRunner(OperatorDesc* opDesc);

    /**
     * @brief Destructor
     */
    virtual ~OpRunner();

    /**
     * @brief Init op runner
     */
    bool Init();

    /**
     * @brief Get number of inputs
     * @return number of inputs
     */
    const size_t NumInputs();

    /**
     * @brief Get number of outputs
     * @return number of outputs
     */
    const size_t NumOutputs();

    /**
     * @brief Get input size by index
     * @param [in] index: input index
     * @return size of the input
     */
    const size_t GetInputSize(size_t index) const;
    const size_t GetInputNumDims(size_t index) const;
    aclDataType GetInputDataType(size_t index) const;
    aclFormat GetInputFormat(size_t index) const;

    /**
     * @brief Get output size by index
     * @param [in] index: output index
     * @return size of the output
     */
    size_t GetOutputSize(size_t index) const;
    const size_t GetOutputNumDims(size_t index) const;
    aclDataType GetOutputDataType(size_t index) const;
    aclFormat GetOutputFormat(size_t index) const;

    /**
     * @brief Get input element count by index
     * @param [in] index: input index
     * @return element count of the input
     */
    size_t GetInputElementCount(size_t index) const;

    /**
     * @brief Get output element count by index
     * @param [in] index: output index
     * @return element count of the output
     */
    size_t GetOutputElementCount(size_t index) const;

    /**
     * @brief Get input shape by index
     * @param [in] index: input index
     * @return shape of the input
     */
    std::vector<int64_t> GetInputShape(size_t index) const;

    /**
     * @brief Get output shape by index
     * @param [in] index: output index
     * @return shape of the output
     */
    std::vector<int64_t> GetOutputShape(size_t index) const;

    /**
     * @brief Get input buffer (host memory) by index
     * @tparam T: data type
     * @param [in] index: input index
     * @return host address of the input
     */
    template<typename T>
    T *GetInputBuffer(size_t index)
    {
        if (index >= numInputs_) {
            ERROR_LOG("Index out of range. index = %zu, numInputs = %zu", index, numInputs_);
            return nullptr;
        }
        return reinterpret_cast<T *>(hostInputs_[index]);
    }

    /**
     * @brief Get output buffer (host memory) by index
     * @tparam T: data type
     * @param [in] index: output index
     * @return host address of the output
     */
    template<typename T>
    const T *GetOutputBuffer(size_t index)
    {
        if (index >= numOutputs_) {
            ERROR_LOG("Index out of range. index = %zu, numOutputs = %zu", index, numOutputs_);
            return nullptr;
        }

        return reinterpret_cast<const T *>(hostOutputs_[index]);
    }

    /**
     * @brief Print readable input by index
     * @param [in] index: input index
     * @param [in] elementsPerRow: number of elements per row
     */
    void PrintInput(size_t index, size_t elementsPerRow = 16);

    /**
     * @brief Print readable output by index
     * @param [in] index: output index
     * @param [in] elementsPerRow: number of elements per row
     */
    void PrintOutput(size_t index, size_t elementsPerRow = 16);

    /**
     * @brief Compile static op
     * @return compile result
     */
    bool CompileStaticOp();

    /**
     * @brief Compile dynamic op
     * @return compile result
     */
    bool CompileDynamicOp();

    /**
     * @brief Run op
     * @return run result
     */
    bool RunOp();

private:
    size_t numInputs_;
    size_t numOutputs_;

    std::vector<aclDataBuffer *> inputBuffers_;
    std::vector<aclDataBuffer *> outputBuffers_;

    std::vector<void *> devInputs_;
    std::vector<void *> devOutputs_;

    std::vector<void *> hostInputs_;
    std::vector<void *> hostOutputs_;

    std::vector<aclTensor *> inputTensor_;
    std::vector<aclTensor *> outputTensor_;
    OperatorDesc *opDesc_;
};
} // namespace AttentionFusionGrad

#endif // OP_RUNNER_H
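As a usage note for the header above, the sketch below shows how a caller might read an output back through the templated host-buffer accessors after RunOp(); the output index order (grad_query = 0, grad_key = 1, grad_value = 2) follows the operator prototype.

```cpp
// Usage sketch for OpRunner's accessors: read grad_query back after RunOp().
void DumpGradQuery(OpRunner& runner)
{
    const float* gradQuery = runner.GetOutputBuffer<float>(0);
    if (gradQuery == nullptr) {
        ERROR_LOG("grad_query host buffer is null");
        return;
    }
    size_t count = runner.GetOutputElementCount(0);
    INFO_LOG("grad_query: %zu elements, first = %f", count, gradQuery[0]);
    runner.PrintOutput(0);  // readable dump, 16 elements per row by default
}
```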
+ echo "#####################################" + echo "" + fi +} + +main diff --git a/cust_op/attention_fusion_grad/attention_fusion_grad.json b/cust_op/attention_fusion_grad/attention_fusion_grad.json new file mode 100644 index 00000000..704568e4 --- /dev/null +++ b/cust_op/attention_fusion_grad/attention_fusion_grad.json @@ -0,0 +1,90 @@ +[ + { + "op": "AttentionFusionGrad", + "language": "cpp", + "input_desc": [ + { + "name": "dout", + "param_type": "required", + "format": [ + "ND" + ], + "type": [ + "float" + ] + }, + { + "name": "softmax_out", + "param_type": "required", + "format": [ + "ND" + ], + "type": [ + "float" + ] + }, + { + "name": "query", + "param_type": "required", + "format": [ + "ND" + ], + "type": [ + "float" + ] + }, + { + "name": "key", + "param_type": "optional", + "format": [ + "ND" + ], + "type": [ + "float" + ] + }, + { + "name": "value", + "param_type": "optional", + "format": [ + "ND" + ], + "type": [ + "float" + ] + } + ], + "output_desc": [ + { + "name": "grad_query", + "param_type": "required", + "format": [ + "ND" + ], + "type": [ + "float" + ] + }, + { + "name": "grad_key", + "param_type": "required", + "format": [ + "ND" + ], + "type": [ + "float" + ] + }, + { + "name": "grad_value", + "param_type": "required", + "format": [ + "ND" + ], + "type": [ + "float" + ] + } + ] + } +] \ No newline at end of file diff --git a/cust_op/attention_fusion_grad/creat.sh b/cust_op/attention_fusion_grad/creat.sh new file mode 100755 index 00000000..2c186bbc --- /dev/null +++ b/cust_op/attention_fusion_grad/creat.sh @@ -0,0 +1,56 @@ +#!/bin/bash +# Copyright 2024. Huawei Technologies Co.,Ltd. All rights reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +# ============================================================================== + +set -e + + +# 查找msopgen的路径,加入到环境变量PATH中 +msopgen_path=$(find /usr/local/Ascend/ -name msopgen | grep bin) +parent_dir=$(dirname "$msopgen_path") +export PATH=$parent_dir:$PATH + +# 利用msopgen生成可编译文件 +rm -rf ./attention_fusion_grad +python3 /usr/local/Ascend/ascend-toolkit/latest/python/site-packages/bin/msopgen gen -i attention_fusion_grad.json -f tf -c ai_core-Ascend910B1 -lan cpp -out ./attention_fusion_grad -m 0 -op AttentionFusionGrad +rm -rf attention_fusion_grad/op_kernel +rm -rf attention_fusion_grad/host +cp -rf op_kernel attention_fusion_grad/ +cp -rf op_host attention_fusion_grad/ + +cd attention_fusion_grad + +# 判断当前目录下是否存在CMakePresets.json文件 +if [ ! -f "CMakePresets.json" ]; then + echo "ERROR, CMakePresets.json file not exist." 
diff --git a/cust_op/attention_fusion_grad/aclnn_attention_fusion_grad/run.sh b/cust_op/attention_fusion_grad/aclnn_attention_fusion_grad/run.sh
new file mode 100755
index 00000000..6793de82
--- /dev/null
+++ b/cust_op/attention_fusion_grad/aclnn_attention_fusion_grad/run.sh
@@ -0,0 +1,91 @@
#!/bin/bash
export ASCEND_SLOG_PRINT_TO_STDOUT=0
export ASCEND_GLOBAL_LOG_LEVEL=0

CURRENT_DIR=$(
    cd $(dirname ${BASH_SOURCE:-$0})
    pwd
)
cd $CURRENT_DIR

SHORT=v:,
LONG=dtype:,
OPTS=$(getopt -a --options $SHORT --longoptions $LONG -- "$@")
eval set -- "$OPTS"
while :
do
    case "$1" in
        # float16, float, int32
        (-v | --dtype)
            DTYPE="$2"
            shift 2;;
        (--)
            shift;
            break;;
        (*)
            echo "[ERROR] Unexpected option: $1";
            break;;
    esac
done

if [ ! $ASCEND_HOME_DIR ]; then
    if [ -d "$HOME/Ascend/ascend-toolkit/latest" ]; then
        export ASCEND_HOME_DIR=$HOME/Ascend/ascend-toolkit/latest
    else
        export ASCEND_HOME_DIR=/usr/local/Ascend/ascend-toolkit/latest
    fi
fi

export DDK_PATH=$ASCEND_HOME_DIR
arch=$(uname -m)
export NPU_HOST_LIB=$ASCEND_HOME_DIR/${arch}-linux/lib64

function main {
    rm -rf $HOME/ascend/log/*
    rm -f ./input/*.bin
    rm -f ./output/*.bin

    cd $CURRENT_DIR
    python3 scripts/gen_data.py
    if [ $? -ne 0 ]; then
        echo "ERROR: generate input data failed!"
        return 1
    fi
    echo "INFO: generate input data success!"

    cd $CURRENT_DIR; rm -rf build; mkdir -p build; cd build
    cmake ../src
    if [ $? -ne 0 ]; then
        echo "ERROR: cmake failed!"
        return 1
    fi
    echo "INFO: cmake success!"
    make
    if [ $? -ne 0 ]; then
        echo "ERROR: make failed!"
        return 1
    fi
    echo "INFO: make success!"

    cd $CURRENT_DIR/output
    echo "INFO: execute op!"
    ./execute_attention_fusion_grad_op

    if [ $? -ne 0 ]; then
        echo "ERROR: acl executable run failed! please check your project!"
        return 1
    fi
    echo "INFO: acl executable run success!"
    cd $CURRENT_DIR
    ret=`python3 scripts/verify_result.py output/grad_query.bin output/grad_key.bin output/grad_value.bin output/golden_grad_query.bin output/golden_grad_key.bin output/golden_grad_value.bin`
    echo $ret
    if [ "x$ret" == "xtest pass" ]; then
        echo ""
        echo "#####################################"
        echo "INFO: you have passed the Precision!"
        echo "#####################################"
        echo ""
    fi
}

main
diff --git a/cust_op/attention_fusion_grad/attention_fusion_grad.json b/cust_op/attention_fusion_grad/attention_fusion_grad.json
new file mode 100644
index 00000000..704568e4
--- /dev/null
+++ b/cust_op/attention_fusion_grad/attention_fusion_grad.json
@@ -0,0 +1,90 @@
[
    {
        "op": "AttentionFusionGrad",
        "language": "cpp",
        "input_desc": [
            {
                "name": "dout",
                "param_type": "required",
                "format": [
                    "ND"
                ],
                "type": [
                    "float"
                ]
            },
            {
                "name": "softmax_out",
                "param_type": "required",
                "format": [
                    "ND"
                ],
                "type": [
                    "float"
                ]
            },
            {
                "name": "query",
                "param_type": "required",
                "format": [
                    "ND"
                ],
                "type": [
                    "float"
                ]
            },
            {
                "name": "key",
                "param_type": "optional",
                "format": [
                    "ND"
                ],
                "type": [
                    "float"
                ]
            },
            {
                "name": "value",
                "param_type": "optional",
                "format": [
                    "ND"
                ],
                "type": [
                    "float"
                ]
            }
        ],
        "output_desc": [
            {
                "name": "grad_query",
                "param_type": "required",
                "format": [
                    "ND"
                ],
                "type": [
                    "float"
                ]
            },
            {
                "name": "grad_key",
                "param_type": "required",
                "format": [
                    "ND"
                ],
                "type": [
                    "float"
                ]
            },
            {
                "name": "grad_value",
                "param_type": "required",
                "format": [
                    "ND"
                ],
                "type": [
                    "float"
                ]
            }
        ]
    }
]
\ No newline at end of file
diff --git a/cust_op/attention_fusion_grad/creat.sh b/cust_op/attention_fusion_grad/creat.sh
new file mode 100755
index 00000000..2c186bbc
--- /dev/null
+++ b/cust_op/attention_fusion_grad/creat.sh
@@ -0,0 +1,56 @@
#!/bin/bash
# Copyright 2024. Huawei Technologies Co.,Ltd. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================

set -e


# Locate msopgen and add its directory to PATH
msopgen_path=$(find /usr/local/Ascend/ -name msopgen | grep bin)
parent_dir=$(dirname "$msopgen_path")
export PATH=$parent_dir:$PATH

# Generate the buildable operator project with msopgen
rm -rf ./attention_fusion_grad
python3 /usr/local/Ascend/ascend-toolkit/latest/python/site-packages/bin/msopgen gen -i attention_fusion_grad.json -f tf -c ai_core-Ascend910B1 -lan cpp -out ./attention_fusion_grad -m 0 -op AttentionFusionGrad
rm -rf attention_fusion_grad/op_kernel
rm -rf attention_fusion_grad/op_host
cp -rf op_kernel attention_fusion_grad/
cp -rf op_host attention_fusion_grad/

cd attention_fusion_grad

# Check that CMakePresets.json exists in the current directory
if [ ! -f "CMakePresets.json" ]; then
    echo "ERROR: CMakePresets.json file does not exist."
    exit 1
fi

# Disable CRC checksum generation
sed -i 's/--nomd5/--nomd5 --nocrc/g' ./cmake/makeself.cmake

# Fix the CANN installation path
sed -i 's:"/usr/local/Ascend/latest":"/usr/local/Ascend/ascend-toolkit/latest":g' CMakePresets.json
# Change vendor_name so that previously deployed operators whose vendor_name is "customize" are not overwritten;
# vendor_name must stay in sync with the CUST_PKG_PATH value in the aclnn CMakeLists.txt, otherwise the aclnn invocation fails;
# vendor_name must not contain "customize"; otherwise, with multiple operators deployed, the config.ini content under CANN's vendors path is truncated incorrectly
sed -i 's:"customize":"attention_fusion_grad":g' CMakePresets.json

bash build.sh

# Install the built operator package
bash ./build_out/custom_opp*.run
cd ../aclnn_attention_fusion_grad
bash run.sh
\ No newline at end of file
-- 
Gitee

From 794b3eb889648caddc39fb732e51b9af931387ba Mon Sep 17 00:00:00 2001
From: dev
Date: Thu, 25 Jul 2024 01:36:20 +0800
Subject: [PATCH 2/2] =?UTF-8?q?=E3=80=90FEAT=E3=80=91=E6=96=B0=E5=A2=9Eatt?=
 =?UTF-8?q?ention=20grad=E8=9E=8D=E5=90=88=E7=AE=97=E5=AD=90=20PART2?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

---
 .../aclnn_attention_fusion_grad/inc/op_runner.h | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/cust_op/attention_fusion_grad/aclnn_attention_fusion_grad/inc/op_runner.h b/cust_op/attention_fusion_grad/aclnn_attention_fusion_grad/inc/op_runner.h
index 8005fa64..bd3347ac 100644
--- a/cust_op/attention_fusion_grad/aclnn_attention_fusion_grad/inc/op_runner.h
+++ b/cust_op/attention_fusion_grad/aclnn_attention_fusion_grad/inc/op_runner.h
@@ -37,6 +37,11 @@ public:
      */
     bool Init();
 
+    /**
+     * @brief Init op runner output info
+     */
+    bool InitOutputInfo();
+
     /**
      * @brief Get number of inputs
      * @return number of inputs
-- 
Gitee