diff --git a/tools/infer_tool/README.md b/tools/infer_tool/README.md new file mode 100644 index 0000000000000000000000000000000000000000..3d336c5d3190cbf62a3159eaf391d0439a051cc6 --- /dev/null +++ b/tools/infer_tool/README.md @@ -0,0 +1,514 @@ + + +# ais_bench推理工具使用指南 + +## 简介 +本文介绍ais_bench推理工具,用来针对指定的推理模型运行推理程序,并能够测试推理模型的性能(包括吞吐率、时延)。 + +## 工具安装 + +### 环境和依赖 + +- 目前ais_bench推理工具支持trtexec和aclruntime推理后端,使用本工具时确保安装这两个后端,且这两个后端可以正常运行。 +- 安装Python3、Python包模块numpy、tqdm、wheel。 + +### 工具安装方式 + +ais_bench推理工具的安装方式包括:一键式编译安装和源代码编译安装。 + +**说明**: + +- 安装过程中会自动检查和安装python包依赖,确保安装环境要求网络畅通。 +- centos平台默认为gcc 4.8编译器,可能无法安装本工具,建议更新gcc编译器后再安装。 + +#### 一键式编译安装 + 在安装环境执行如下命令安装ais_bench推理程序包: + + ```bash + pip3 install -v 'git+https://gitee.com/aisbench/inference.git#egg=ais_bench&subdirectory=tools/infer_tool/' + ``` + + 说明:若为覆盖安装,请增加“--force-reinstall”参数强制安装,例如: + + ```bash + pip3 install -v --force-reinstall 'git+https://gitee.com/aisbench/inference.git#egg=ais_bench&subdirectory=tools/infer_tool/' + ``` + + 提示如下示例信息则表示安装成功: + + ```bash + Successfully installed ais_bench-{version} + ``` + + + +#### 源代码编译安装 +1. 从代码开源仓[Gitee](git+https://gitee.com/aisbench/inference.git#egg=ais_bench&subdirectory=tools/infer_tool/)克隆/下载工具压缩包“inference-master.zip”。 + +2. 将工具压缩包上传并解压至安装环境。 + +3. 从工具解压目录下进入tools/infer_tool/目录下,执行如下命令进行编译: + + ```bash + # 进入工具解压目录 + cd ${HOME}/tools/infer_tool/ + # 构建ais_bench推理程序包 + pip3 wheel ./ -v + ``` + + 其中,${HOME}为ais_bench推理工具包所在目录。 + + 分别提示如下信息则表示编译成功: + + ```bash + # 成功编译ais_bench推理程序包 + Successfully built ais-bench + ``` + +4. 执行如下命令,进行安装。 + + ```bash + # 安装ais_bench推理程序 + pip3 install ./ais_bench-{version}-py3-none-any.whl + ``` + + {version}表示软件版本号,{python_version}表示Python版本号,{arch}表示CPU架构。 + + 说明:若为覆盖安装,请增加“--force-reinstall”参数强制安装,例如: + + ```bash + pip3 install ./ais_bench-{version}-py3-none-any.whl --force-reinstall + ``` + + 分别提示如下信息则表示安装成功: + + ```bash + # 成功安装ais_bench推理程序 + Successfully installed ais_bench-{version} + ``` + +## 使用方法(以接入aclruntime后端为例) + +### 工具介绍 +ais_bench推理工具的使用方法主要通过命令行使用。 +#### 使用入口 + +ais_bench推理工具可以通过ais_bench可执行文件方式启动模型测试。启动方式如下: + +```bash +python3 -m ais_bench --model +``` + +#### 参数说明 + +ais_bench推理工具可以通过配置不同的参数,来应对各种测试场景以及实现其他辅助功能。 + +参数按照功能类别分为**基础功能参数**和**高级功能参数**: + +- **基础功能参数**:主要包括输入输入文件及格式、debug、推理次数、预热次数、指定运行设备以及帮助信息等。 +- **高级功能参数**:主要包括动态分档场景和动态Shape场景的ais_bench推理测试参数以及profiler或dump数据获取等。 + +**说明**:以下参数中,参数和取值之间可以用“ ”空格分隔也可以用“=”等号分隔。例如:--debug 1或--debug=0。 + +##### 基础功能参数 + +| 参数名 | 说明 | 是否必选 | +| --------------------- | ------------------------------------------------------------ | -------- | +| --model | 需要进行推理的离线模型文件。 | 是 | +| --input | 模型需要的输入。可指定输入文件所在目录或直接指定输入文件。支持输入文件格式为“NPY”、“BIN”。可输入多个文件或目录,文件或目录之间用“,”隔开。具体输入文件请根据模型要求准备。 若不配置该参数,会自动构造输入数据,输入数据类型由--pure_data_type参数决定。 | 否 | +| --pure_data_type | 纯推理数据类型。取值为:“zero”、“random”,默认值为"zero"。 未配置模型输入文件时,工具自动构造输入数据。设置为zero时,构造全为0的纯推理数据;设置为random时,为每一个输入生成一组随机数据。 | 否 | +| --output | 推理结果保存目录。配置后会创建“日期+时间”的子目录,保存输出结果。如果指定output_dirname参数,输出结果将保存到子目录output_dirname下。不配置输出目录时,仅打印输出结果,不保存输出结果。 | 否 | +| --output_dirname | 推理结果保存子目录。设置该值时输出结果将保存到*output/output_dirname*目录下。 配合output参数使用,单独使用无效。 例如:--output */output* --output_dirname *output_dirname* | 否 | +| --outfmt | 输出数据的格式。取值为:“NPY”、“BIN”、“TXT”,默认为”BIN“。 配合output参数使用,单独使用无效。 例如:--output */output* --outfmt NPY。 | 否 | +| --debug | 调试开关。可打印model的desc信息和其他详细执行信息。1或true(开启)、0或false(关闭),默认关闭。 | 否 | +| --run_mode | 
推理执行前的数据加载方式:可取值:array(将数据转换成host侧的ndarray,再调用推理接口推理),files(将文件直接加载进device内,再调用推理接口推理),tensor(将数据加载进device内,再调用推理接口推理),full(将数据转换成host侧的ndarray,再将ndarray格式数据加载进device内,再调用推理接口推理),默认为array。 | 否 | +| --display_all_summary | 是否显示所有的汇总信息,包含h2d和d2h信息。1或true(开启)、0或false(关闭),默认关闭。 | 否 | +| --loop | 推理次数。默认值为1,取值范围为大于0的正整数。 profiler参数配置为true时,推荐配置为1。 | 否 | +| --warmup_count | 推理预热次数。默认值为1,取值范围为大于等于0的整数。配置为0则表示不预热。 | 否 | +| --device | 指定运行设备。根据设备实际的Device ID指定,默认值为0。多Device场景下,可以同时指定多个Device进行推理测试,例如:--device 0,1,2,3。 | 否 | +| --divide_input | 输入数据集切分开关,1或true(开启)、0或false(关闭),默认关闭。多Device场景下,打开时,工具会将数据集平分给这些Device进行推理。| 否 | +| --help | 工具使用帮助信息。 | 否 | + +##### 高级功能参数 + +| 参数名 | 说明 | 是否必选 | +| ------------------------ | ------------------------------------------------------------ | -------- | +| --dymBatch | 动态Batch参数,指定模型输入的实际Batch。
如模型转换时,设置--input_shape="data:-1,600,600,3;img_info:-1,3" --dynamic_batch_size="1,2,4,8",dymBatch参数可设置为:--dymBatch 2。 | 否 | +| --dymHW | 动态分辨率参数,指定模型输入的实际H、W。
如模型转换时,设置--input_shape="data:8,3,-1,-1;img_info:8,4,-1,-1" --dynamic_image_size="300,500;600,800",dymHW参数可设置为:--dymHW 300,500。 | 否 | +| --dymDims | 动态维度参数,指定模型输入的实际Shape。
如模型转换时,设置 --input_shape="data:1,-1;img_info:1,-1" --dynamic_dims="224,224;600,600",dymDims参数可设置为:--dymDims "data:1,600;img_info:1,600"。 | 否 | +| --dymShape | 动态Shape参数,指定模型输入的实际Shape。
如ATC模型转换时,设置--input_shape_range="input1:\[8\~20,3,5,-1\];input2:\[5,3\~9,10,-1\]",dymShape参数可设置为:--dymShape "input1:8,3,5,10;input2:5,3,10,10"。
动态Shape场景下,获取模型的输出size通常为0(即输出数据占内存大小未知),建议设置--outputSize参数。
例如:--dymShape "input1:8,3,5,10;input2:5,3,10,10" --outputSize "10000,10000" | 否 | +| --dymShape_range | 动态Shape的阈值范围。如果设置该参数,那么将根据参数中所有的Shape列表进行依次推理,得到汇总推理信息。
配置格式为:name1:1,3,200\~224,224-230;name2:1,300。其中,name为模型输入名,“\~”表示范围,“-”表示某一位的取值。
也可以指定动态Shape的阈值范围配置文件*.info,该文件中记录动态Shape的阈值范围。 | 否 | +| --outputSize | 指定模型的输出数据所占内存大小,多个输出时,需要为每个输出设置一个值,多个值之间用“,”隔开。
动态Shape场景下,获取到的模型输出size通常为0(即输出数据占用的内存大小未知),需要根据输入Shape预估一个合适的输出内存大小,并通过该参数进行配置。
例如:--dymShape "input1:8,3,5,10;input2:5,3,10,10" --outputSize "10000,10000" | 否 | +| --auto_set_dymdims_mode | 自动设置动态Dims模式。1或true(开启)、0或false(关闭),默认关闭。
针对动态分档Dims模型,根据输入文件的Shape信息自动设置模型的Shape参数;注意输入数据只能为npy文件,bin文件无法读取Shape信息。
配合input参数使用,单独使用无效。
例如:--input 1.npy --auto_set_dymdims_mode 1 | 否 | +| --auto_set_dymshape_mode | 自动设置动态Shape模式。取值为:1或true(开启)、0或false(关闭),默认关闭。
针对动态Shape模型,根据输入文件的Shape信息自动设置模型的Shape参数;注意输入数据只能为npy文件,bin文件无法读取Shape信息。
配合input参数使用,单独使用无效。
例如:--input 1.npy --auto_set_dymshape_mode 1 | 否 | +| --batchsize | 模型batchsize。不输入该值将自动推导。当前推理模块根据模型输入和文件输出自动进行组Batch。参数传递的batchszie有且只用于结果吞吐率计算。自动推导逻辑为尝试获取模型的batchsize时,首先获取第一个参数的最高维作为batchsize; 如果是动态Batch的话,更新为动态Batch的值;如果是动态dims和动态Shape更新为设置的第一个参数的最高维。如果自动推导逻辑不满足要求,请务必传入准确的batchsize值,以计算出正确的吞吐率。 | 否 | +| --output_batchsize_axis | 输出tensor的batchsize轴,默认值为0。输出结果保存文件时,根据哪个轴进行切割推理结果,比如batchsize为2,表示2个输入文件组batch进行推理,那输出结果的batch维度是在哪个轴。默认为0轴,按照0轴进行切割为2份,但是部分模型的输出batch为1轴,所以要设置该值为1。 | 否 | +| --backend|指定trtexec开关。需要指定为trtexec。配合--perf参数使用,单独使用无效。|否| +| --perf|调用trtexec开关。1或true(开启)、0或false(关闭),默认关闭。配合--backend参数使用,单独使用无效。|否| +| --pipeline |指定pipeline开关,用于开启多线程推理功能。1或true(开启)、0或false(关闭),默认关闭。|否| +| --threads |指定threads开关,用于设置多计算线程推理时计算线程的数量。默认值为1,取值范围为大于0的正整数。需要配合--pipeline 1参数使用,单独使用无效。|否| + +### 使用场景 + + #### 纯推理场景 + +默认情况下,构造全为0的数据送入模型推理。 + +示例命令如下: + +```bash +python3 -m ais_bench --model --output ./ --outfmt BIN --loop 5 +``` + +#### 调试模式 +开启debug调试模式。 + +示例命令如下: + +```bash +python3 -m ais_bench --model --output ./ --debug 1 +``` + +调试模式开启后会增加更多的打印信息,包括: +- 模型的输入输出参数信息 + + ```bash + input: + #0 input_ids (1, 384) int32 1536 1536 + #1 input_mask (1, 384) int32 1536 1536 + #2 segment_ids (1, 384) int32 1536 1536 + output: + #0 logits:0 (1, 384, 2) float32 3072 3072 + ``` + +- 详细的推理耗时信息 + + ```bash + [DEBUG] model exec cost : 2.336000 + ``` +- 模型输入输出等具体操作信息 + + #### 文件输入场景 + +使用--input参数指定模型输入文件,多个文件之间通过“,”进行分隔。 + +本场景会根据文件输入size和模型实际输入size进行对比,若缺少数据则会自动构造数据补全,称为组Batch。 + +示例命令如下: + +```bash +python3 -m ais_bench --model --input "./1.bin,./2.bin,./3.bin,./4.bin,./5.bin" +``` + + #### 文件夹输入场景 + +使用input参数指定模型输入文件所在目录,多个目录之间通过“,”进行分隔。 + +本场景会根据文件输入size和模型实际输入size进行组Batch。 + +```bash +python3 -m ais_bench --model --input "./" +``` + +模型输入需要与传入文件夹的个数一致。 + +例如,bert模型有三个输入,则必须传入3个文件夹,且三个文件夹分别对应模型的三个输入,顺序要对应。 +模型输入参数的信息可以通过开启调试模式查看,bert模型的三个输入依次为input_ids、 input_mask、 segment_ids,所以依次传入三个文件夹: + +- 第一个文件夹“./data/SQuAD1.1/input_ids",对应模型第一个参数"input_ids"的输入 +- 第二个文件夹"./data/SQuAD1.1/input_mask",对应第二个输入"input_mask"的输入 +- 第三个文件夹"./data/SQuAD1.1/segment_ids",对应第三个输入"segment_ids"的输入 + +```bash +python3 -m ais_bench --model --input ./data/SQuAD1.1/input_ids,./data/SQuAD1.1/input_mask,./data/SQuAD1.1/segment_ids +``` + + + +#### 多Device场景 + +多Device场景下,可以同时指定多个Device进行推理测试。 + +示例命令如下: + +```bash +python3 -m ais_bench --model --input ./data/ --device 1,2 +``` + +输出结果依次展示每个Device的推理测试结果,示例如下: + +```bash +[INFO] -----------------Performance Summary------------------ +[INFO] NPU_compute_time (ms): min = 2.4769999980926514, max = 3.937000036239624, mean = 3.5538000106811523, median = 3.7230000495910645, percentile(99%) = 3.936680030822754 +[INFO] throughput 1000*batchsize.mean(1)/NPU_compute_time.mean(3.5538000106811523): 281.38893494131406 +[INFO] ------------------------------------------------------ +[INFO] -----------------Performance Summary------------------ +[INFO] NPU_compute_time (ms): min = 3.3889999389648438, max = 3.9230000972747803, mean = 3.616000032424927, median = 3.555000066757202, percentile(99%) = 3.9134000968933105 +[INFO] throughput 1000*batchsize.mean(1)/NPU_compute_time.mean(3.616000032424927): 276.54867008654026 +[INFO] ------------------------------------------------------ +[INFO] multidevice run end qsize:4 result:1 +i:0 device_1 throughput:281.38893494131406 start_time:1676875630.804429 end_time:1676875630.8303885 +i:1 device_2 throughput:276.54867008654026 start_time:1676875630.8043878 end_time:1676875630.8326817 +[INFO] summary throughput:557.9376050278543 +``` + 
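就上面的多Device示例输出而言,summary throughput(吞吐率汇总)约等于各Device吞吐率之和。下面给出一段仅供理解的Python示意代码(数值取自上面的示例输出,并非工具源码):

```python
# 示意:由各Device的吞吐率推算汇总吞吐率(仅为帮助理解,非工具实现)
device_throughputs = {
    1: 281.38893494131406,  # device_1 的 throughput
    2: 276.54867008654026,  # device_2 的 throughput
}

# 各Device并行推理,汇总吞吐率近似为各Device吞吐率之和
summary_throughput = sum(device_throughputs.values())
print(f"summary throughput: {summary_throughput}")  # 约为 557.94,与示例输出一致
```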
+其中结果最后展示每个Device推理测试的throughput(吞吐率)、start_time(测试启动时间)、end_time(测试结束时间)以及summary throughput(吞吐率汇总)。其他详细字段解释请参见本手册的“输出结果”章节。 + + #### 动态分档场景 + +主要包含动态Batch、动态HW(宽高)、动态Dims三种场景,需要分别传入dymBatch、dymHW、dymDims指定实际档位信息。 + +##### 动态Batch + +以档位1 2 4 8档为例,设置档位为2,本程序将获取实际模型输入组Batch,每2个输入为一组,进行组Batch。 + +```bash +python3 -m ais_bench --model --input=./data/ --dymBatch 2 +``` + +##### 动态HW宽高 + +以档位224,224;448,448档为例,设置档位为224,224,本程序将获取实际模型输入组Batch。 + +```bash +python3 -m ais_bench --model --input=./data/ --dymHW 224,224 +``` + +##### 动态Dims + +以设置档位1,3,224,224为例,本程序将获取实际模型输入组Batch。 + +```bash +python3 -m ais_bench --model --input=./data/ --dymDims actual_input_1:1,3,224,224 +``` + +##### 自动设置Dims模式(动态Dims模型) + +动态Dims模型输入数据的Shape可能是不固定的,比如一个输入文件Shape为1,3,224,224,另一个输入文件Shape为 1,3,300,300。若两个文件同时推理,则需要设置两次动态Shape参数,当前不支持该操作。针对该场景,增加auto_set_dymdims_mode模式,可以根据输入文件的Shape信息,自动设置模型的Shape参数。 + +```bash +python3 -m ais_bench --model --input=./data/ --auto_set_dymdims_mode 1 +``` + + +#### 动态Shape场景 + +##### 动态Shape + +以ATC设置[1\~8,3,200\~300,200\~300],设置档位1,3,224,224为例,本程序将获取实际模型输入组Batch。 + +动态Shape的输出大小通常为0,建议通过outputSize参数设置对应输出的内存大小。 + +```bash +python3 -m ais_bench --model --dymShape actual_input_1:1,3,224,224 --outputSize 10000 +``` + +##### 自动设置Shape模式(动态Shape模型) + +动态Shape模型输入数据的Shape可能是不固定的,比如一个输入文件Shape为1,3,224,224 另一个输入文件Shape为 1,3,300,300。若两个文件同时推理,则需要设置两次动态Shape参数,当前不支持该操作。针对该场景,增加auto_set_dymshape_mode模式,可以根据输入文件的Shape信息,自动设置模型的Shape参数。 + +```bash +python3 -m ais_bench --model --outputSize 100000 --auto_set_dymshape_mode 1 --input ./dymdata +``` + +**注意该场景下的输入文件必须为npy格式,如果是bin文件将获取不到真实的Shape信息。** + +##### 动态Shape模型range测试模式 + +输入动态Shape的range范围。对于该范围内的Shape分别进行推理,得出各自的性能指标。 + +以对1,3,224,224 1,3,224,225 1,3,224,226进行分别推理为例,命令如下: + +```bash +python3 -m ais_bench --model --outputSize 100000 --dymShape_range actual_input_1:1,3,224,224~226 +``` + + +#### trtexec场景 + +ais_bench支持ONNX模型推理(集成trtexec),trtexec为NVIDIA TensorRT自带工具,作为推理后端。用户使用ais_bench拉起trtexec工具进行推理性能测试,测试过程中实时输出trtexec日志,打印在控制台,推理性能测试完成后,将性能数据输出在控制台。 +##### 前置条件 +推理性能测试环境需要配置有GPU,安装 CUDA及TensorRT,并且trtexec可以通过命令行调用到,安装方式可参考[TensorRT](https://github.com/NVIDIA/TensorRT)。 + +示例命令如下: + +```bash +python3 -m ais_bench --model --backend trtexec --perf 1 +``` + +输出结果推理测试结果,示例如下: + +```bash +[INFO] [05/27/2023-12:05:31] [I] === Performance summary === +[INFO] [05/27/2023-12:05:31] [I] Throughput: 120.699 qps +[INFO] [05/27/2023-12:05:31] [I] Latency: min = 9.11414 ms, max = 11.7442 ms, mean = 9.81005 ms, median = 9.76404 ms, percentile(90%) = 10.1075 ms, percentile(95%) = 10.1624 ms, percentile(99%) = 11.4742 ms +[INFO] [05/27/2023-12:05:31] [I] Enqueue Time: min = 0.516296 ms, max = 0.598633 ms, mean = 0.531443 ms, median = 0.5271 ms, percentile(90%) = 0.546875 ms, percentile(95%) = 0.564575 ms, percentile(99%) = 0.580566 ms +[INFO] [05/27/2023-12:05:31] [I] H2D Latency: min = 1.55066 ms, max = 1.57336 ms, mean = 1.55492 ms, median = 1.55444 ms, percentile(90%) = 1.55664 ms, percentile(95%) = 1.55835 ms, percentile(99%) = 1.56458 ms +[INFO] [05/27/2023-12:05:31] [I] GPU Compute Time: min = 7.54407 ms, max = 10.1723 ms, mean = 8.23978 ms, median = 8.19409 ms, percentile(90%) = 8.5354 ms, percentile(95%) = 8.59131 ms, percentile(99%) = 9.90002 ms +[INFO] [05/27/2023-12:05:31] [I] D2H Latency: min = 0.0130615 ms, max = 0.0170898 ms, mean = 0.015342 ms, median = 0.0153809 ms, percentile(90%) = 0.0162354 ms, percentile(95%) = 0.0163574 ms, percentile(99%) = 0.0168457 ms +[INFO] [05/27/2023-12:05:31] [I] Total Host Walltime: 3.02405 s +[INFO] 
[05/27/2023-12:05:31] [I] Total GPU Compute Time: 3.00752 s +``` + +**字段说明** + +| 字段 | 说明 | +| --------------------- | ------------------------------------------------------------ | +| Throughput | 吞吐率。 | +| Latency | H2D延迟、GPU计算时间和D2H延迟的总和。这是推断单个执行的延迟。 | +| min | 推理执行时间最小值。 | +| max | 推理执行时间最大值。 | +| mean | 推理执行时间平均值。 | +| median | 推理执行时间取中位数。 | +| percentile(99%) | 推理执行时间中的百分位数。 | +| H2D Latency | 单个执行的输入张量的主机到设备数据传输的延迟。 | +| GPU Compute Time | 为执行CUDA内核的GPU延迟。 | +| D2H Latency | 单个执行的输出张量的设备到主机数据传输的延迟。 | +| Total Host Walltime | 从第一个执行(预热后)入队到最后一个执行完成的主机时间。 | +| Total GPU Compute Time| 所有执行的GPU计算时间的总和。 | + + #### 输出结果文件保存场景 + +默认情况下,ais_bench推理工具执行后不保存输出结果数据文件,配置相关参数后,可生成的结果数据如下: + +| 文件/目录 | 说明 | +| ---------------------------------------- | ------------------------------------------------------------ | +| {文件名}.bin、{文件名}.npy或{文件名}.txt | 模型推理输出结果文件。
文件命名格式:名称_输出序号.后缀。不指定input时(纯推理),名称固定为“pure_infer_data”;指定input时,名称取自第一个输入对应的输入文件名(不含扩展名);输出序号从0开始,按输出的先后顺序排列;文件名后缀由--outfmt参数控制。
默认情况下,会在--output参数指定的目录下创建“日期+时间”的目录,并将结果文件保存在该目录下;当指定了--output_dirname时,结果文件将直接保存在--output_dirname参数指定的目录下。
指定--output_dirname参数时,多次执行工具推理会导致结果文件因同名而覆盖。 | +| xx_summary.json | 工具输出模型性能结果数据。默认情况下,“xx”以“日期+时间”命名;当指定了--output_dirname时,“xx”以--output_dirname指定的目录名称命名。
指定--output_dirname参数时,多次执行工具推理会导致结果文件因同名而覆盖。 | +| dump | dump数据文件目录。使用--dump开启dump时,在--output参数指定的目录下创建dump目录,保存dump数据文件。 | +| profiler | Profiler采集性能数据文件目录。使用--profiler开启性能数据采集时,在--output参数指定的目录下创建profiler目录,保存性能数据文件。 | + +- 仅设置--output参数。示例命令及结果如下: + + ```bash + python3 -m ais_bench --model ./pth_resnet50_bs1.om --output ./result + ``` + + ```bash + result + |-- 2022_12_17-07_37_18 + │   `-- pure_infer_data_0.bin + `-- 2022_12_17-07_37_18_summary.json + ``` + +- 设置--input和--output参数。示例命令及结果如下: + + ```bash + # 输入的input文件夹内容如下 + ls ./data + 196608-0.bin 196608-1.bin 196608-2.bin 196608-3.bin 196608-4.bin 196608-5.bin 196608-6.bin 196608-7.bin 196608-8.bin 196608-9.bin + ``` + + ```bash + python3 -m ais_bench --model ./pth_resnet50_bs1.om --input ./data --output ./result + ``` + + ```bash + result/ + |-- 2023_01_03-06_35_53 + | |-- 196608-0_0.bin + | |-- 196608-1_0.bin + | |-- 196608-2_0.bin + | |-- 196608-3_0.bin + | |-- 196608-4_0.bin + | |-- 196608-5_0.bin + | |-- 196608-6_0.bin + | |-- 196608-7_0.bin + | |-- 196608-8_0.bin + | `-- 196608-9_0.bin + `-- 2023_01_03-06_35_53_summary.json + ``` + +- 设置--output_dirname参数。示例命令及结果如下: + + ```bash + python3 -m ais_bench --model --output ./result --output_dirname subdir + ``` + + ```bash + result + |-- subdir + │   `-- pure_infer_data_0.bin + `-- subdir_summary.json + ``` + +- 设置--dump参数。示例命令及结果如下: + + ```bash + python3 -m ais_bench --model --output ./result --dump 1 + ``` + + ```bash + result + |-- 2022_12_17-07_37_18 + │   `-- pure_infer_data_0.bin + |-- dump + `-- 2022_12_17-07_37_18_summary.json + ``` + +- 设置--profiler参数。示例命令及结果如下: + + ```bash + python3 -m ais_bench --model --output ./result --profiler 1 + ``` + + ```bash + result + |-- 2022_12_17-07_56_10 + │   `-- pure_infer_data_0.bin + |-- profiler + │   `-- PROF_000001_20221217075609326_GLKQJOGROQGOLIIB + `-- 2022_12_17-07_56_10_summary.json + ``` + +#### 多线程推理场景 + + ```bash + python3 -m ais_bench --model --pipeline 1 + ``` + 在单线程推理的命令行基础上加上--pipeline 1即可开启多线程推理模式,实现计算-搬运的并行,加快端到端推理速度。 + + ```bash + python3 -m ais_bench --model --pipeline 1 --threads 2 + ``` + 在多线程推理的命令行基础上加上--threads {$number of threads},即可开启多计算线程推理模式,实现计算-计算的并行,提高推理吞吐量。 + +### 输出结果 + +ais_bench推理工具执行后,打屏输出结果示例如下: + +- display_all_summary=False时,打印如下: + + ```bash + [INFO] -----------------Performance Summary------------------ + [INFO] NPU_compute_time (ms): min = 0.6610000133514404, max = 0.6610000133514404, mean = 0.6610000133514404, median = 0.6610000133514404, percentile(99%) = 0.6610000133514404 + [INFO] throughput 1000*batchsize.mean(1)/NPU_compute_time.mean(0.6610000133514404): 1512.8592735267011 + [INFO] ------------------------------------------------------ + ``` + +- display_all_summary=True时,打印如下: + + ```bash + [INFO] -----------------Performance Summary------------------ + [INFO] H2D_latency (ms): min = 0.05700000002980232, max = 0.05700000002980232, mean = 0.05700000002980232, median = 0.05700000002980232, percentile(99%) = 0.05700000002980232 + [INFO] NPU_compute_time (ms): min = 0.6650000214576721, max = 0.6650000214576721, mean = 0.6650000214576721, median = 0.6650000214576721, percentile(99%) = 0.6650000214576721 + [INFO] D2H_latency (ms): min = 0.014999999664723873, max = 0.014999999664723873, mean = 0.014999999664723873, median = 0.014999999664723873, percentile(99%) = 0.014999999664723873 + [INFO] throughput 1000*batchsize.mean(1)/NPU_compute_time.mean(0.6650000214576721): 1503.759349974173 + ``` + +通过输出结果可以查看模型执行耗时、吞吐率。耗时越小、吞吐率越高,则表示该模型性能越高。 + +**字段说明** + +| 字段 | 说明 | +| --------------------- | 
------------------------------------------------------------ | +| H2D_latency (ms) | Host to Device的内存拷贝耗时。单位为ms。 | +| min | 推理执行时间最小值。 | +| max | 推理执行时间最大值。 | +| mean | 推理执行时间平均值。 | +| median | 推理执行时间取中位数。 | +| percentile(99%) | 推理执行时间中的百分位数。 | +| NPU_compute_time (ms) | NPU推理计算的时间。单位为ms。 | +| D2H_latency (ms) | Device to Host的内存拷贝耗时。单位为ms。 | +| throughput | 吞吐率。吞吐率计算公式:1000 *batchsize/npu_compute_time.mean | +| batchsize | 批大小。本工具不一定能准确识别当前样本的batchsize,建议通过--batchsize参数进行设置。 | diff --git a/tools/infer_tool/__init__.py b/tools/infer_tool/__init__.py new file mode 100644 index 0000000000000000000000000000000000000000..86c34465080b1a393e796e85b5bc1d0e49f3ffea --- /dev/null +++ b/tools/infer_tool/__init__.py @@ -0,0 +1,16 @@ +# Copyright (c) 2023-2023 Huawei Technologies Co., Ltd. +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +from components.utils.parser import load_command_instance + +benchmark_cmd = load_command_instance('benchmark_sub_task') \ No newline at end of file diff --git a/tools/infer_tool/ais_bench/__init__.py b/tools/infer_tool/ais_bench/__init__.py new file mode 100644 index 0000000000000000000000000000000000000000..e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 diff --git a/tools/infer_tool/ais_bench/__main__.py b/tools/infer_tool/ais_bench/__main__.py new file mode 100644 index 0000000000000000000000000000000000000000..123cffcf5ffa21ddc647eed2e831d3d2d6b267d8 --- /dev/null +++ b/tools/infer_tool/ais_bench/__main__.py @@ -0,0 +1,18 @@ +# Copyright (c) 2023-2023 Huawei Technologies Co., Ltd. All rights reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + + +import os +cur_path = os.path.dirname(os.path.realpath(__file__)) +exec(open(os.path.join(cur_path, "infer/__main__.py")).read()) \ No newline at end of file diff --git a/tools/infer_tool/ais_bench/infer/__init__.py b/tools/infer_tool/ais_bench/infer/__init__.py new file mode 100644 index 0000000000000000000000000000000000000000..e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 diff --git a/tools/infer_tool/ais_bench/infer/__main__.py b/tools/infer_tool/ais_bench/infer/__main__.py new file mode 100644 index 0000000000000000000000000000000000000000..2359a582477e9ec52b8756baab5cad044d90f184 --- /dev/null +++ b/tools/infer_tool/ais_bench/infer/__main__.py @@ -0,0 +1,281 @@ +# Copyright (c) 2023-2023 Huawei Technologies Co., Ltd. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. 
+# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +import argparse +import os +import re +from ais_bench.infer.infer_process import infer_process +from ais_bench.infer.args_adapter import AISBenchInferArgsAdapter +from ais_bench.infer.args_check import ( + check_dym_string, check_dym_range_string, check_number_list, str2bool, check_positive_integer, + check_batchsize_valid, check_nonnegative_integer, check_device_range_valid, check_om_path_legality, + check_input_path_legality, check_output_path_legality, check_acl_json_path_legality, + check_aipp_config_path_legality +) + + +def get_args(): + parser = argparse.ArgumentParser() + parser.add_argument( + "--model", + "-m", + type=check_om_path_legality, + required=True, + help="The path of the om model" + ) + parser.add_argument( + "--input", + "-i", + type=check_input_path_legality, + default=None, + help="Input file or dir" + ) + parser.add_argument( + "--output", + "-o", + type=check_output_path_legality, + default=None, + help="Inference data output path. The inference results are output to \ + the subdirectory named current date under given output path" + ) + parser.add_argument( + "--output_dirname", + type=check_output_path_legality, + default=None, + help="Actual output directory name. \ + Used with parameter output, cannot be used alone. \ + The inference result is output to subdirectory named by output_dirname \ + under output path. such as --output_dirname 'tmp', \ + the final inference results are output to the folder of {$output}/tmp" + ) + parser.add_argument( + "--outfmt", + default="BIN", + choices=["NPY", "BIN", "TXT"], + help="Output file format (NPY or BIN or TXT)" + ) + parser.add_argument( + "--loop", + "-l", + type=check_positive_integer, + default=1, + help="The round of the PureInfer." 
+ ) + parser.add_argument( + "--debug", + type=str2bool, + default=False, + help="Debug switch,print model information" + ) + parser.add_argument( + "--device", + "-d", + type=check_device_range_valid, + default=0, + help="The NPU device ID to use.valid value range is [0, 255]" + ) + parser.add_argument( + "--dymBatch", + dest="dym_batch", + type=check_positive_integer, + default=0, + help="Dynamic batch size param,such as --dymBatch 2" + ) + parser.add_argument( + "--dymHW", + dest="dym_hw", + type=check_dym_string, + default=None, + help="Dynamic image size param, such as --dymHW \"300,500\"" + ) + parser.add_argument( + "--dymDims", + dest="dym_dims", + type=check_dym_string, + default=None, + help="Dynamic dims param, such as --dymDims \"data:1,600;img_info:1,600\"" + ) + parser.add_argument( + "--dymShape", + "--dym-shape", + dest="dym_shape", + type=check_dym_string, + default=None, + help="Dynamic shape param, such as --dymShape \"data:1,600;img_info:1,600\"" + ) + parser.add_argument( + "--outputSize", + dest="output_size", + type=check_number_list, + default=None, + help="Output size for dynamic shape mode" + ) + parser.add_argument( + "--auto_set_dymshape_mode", + type=str2bool, + default=False, + help="Auto_set_dymshape_mode" + ) + parser.add_argument( + "--auto_set_dymdims_mode", + type=str2bool, + default=False, + help="Auto_set_dymdims_mode" + ) + parser.add_argument( + "--batchsize", + type=check_batchsize_valid, + default=None, + help="Batch size of input tensor" + ) + parser.add_argument( + "--pure_data_type", + type=str, + default="zero", + choices=["zero", "random"], + help="Null data type for pure inference(zero or random)" + ) + parser.add_argument( + "--profiler", + type=str2bool, + default=False, + help="Profiler switch" + ) + parser.add_argument( + "--dump", + type=str2bool, + default=False, + help="Dump switch" + ) + parser.add_argument( + "--acl_json_path", + type=check_acl_json_path_legality, + default=None, + help="Acl json path for profiling or dump" + ) + parser.add_argument( + "--output_batchsize_axis", + type=check_nonnegative_integer, + default=0, + help="Splitting axis number when outputing tensor results, such as --output_batchsize_axis 1" + ) + parser.add_argument( + "--run_mode", + type=str, + default="array", + choices=["array", "files", "tensor", "full"], + help="Run mode" + ) + parser.add_argument( + "--display_all_summary", + type=str2bool, + default=False, + help="Display all summary include h2d d2h info" + ) + parser.add_argument( + "--warmup_count", + "--warmup-count", + type=check_nonnegative_integer, + default=1, + help="Warmup count before inference" + ) + parser.add_argument( + "--dymShape_range", + dest="dym_shape_range", + type=check_dym_range_string, + default=None, + help="Dynamic shape range, such as --dymShape_range \"data:1,600~700;img_info:1,600-700\"" + ) + parser.add_argument( + "--aipp_config", + type=check_aipp_config_path_legality, + default=None, + help="File type: .config, to set actual aipp params before infer" + ) + parser.add_argument( + "--energy_consumption", + type=str2bool, + default=False, + help="Obtain power consumption data for model inference" + ) + parser.add_argument( + "--npu_id", + type=check_nonnegative_integer, + default=0, + help="The NPU ID to use.valid value range is [0, 255]" + ) + parser.add_argument( + "--backend", + type=str, + default=None, + choices=["trtexec"], + help="Backend trtexec" + ) + parser.add_argument( + "--perf", + type=str2bool, + default=False, + help="Perf switch" + ) + 
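    # The remaining options cover multi-threaded (pipeline) inference, profiler/dump
    # post-processing and multi-device input splitting.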
parser.add_argument( + "--pipeline", + type=str2bool, + default=False, + help="Pipeline switch" + ) + parser.add_argument( + "--profiler_rename", + type=str2bool, + default=True, + help="Profiler rename switch" + ) + parser.add_argument( + "--dump_npy", + type=str2bool, + default=False, + help="dump data convert to npy" + ) + parser.add_argument( + "--divide_input", + type=str2bool, + default=False, + help="Input datas need to be divided to match multi devices or not, \ + --device should be list, default False" + ) + parser.add_argument( + '--threads', + dest='threads', + type=check_positive_integer, + default=1, + help="Number of threads for computing. \ + need to set --pipeline when setting threads number to be more than one." + ) + benchmark_args = parser.parse_args() + + return benchmark_args + + +if __name__ == "__main__": + args = get_args() + + args = AISBenchInferArgsAdapter(args.model, args.input, args.output, + args.output_dirname, args.outfmt, args.loop, args.debug, args.device, + args.dym_batch, args.dym_hw, args.dym_dims, args.dym_shape, args.output_size, + args.auto_set_dymshape_mode, args.auto_set_dymdims_mode, args.batchsize, args.pure_data_type, + args.profiler, args.dump, args.acl_json_path, args.output_batchsize_axis, args.run_mode, + args.display_all_summary, args.warmup_count, args.dym_shape_range, args.aipp_config, + args.energy_consumption, args.npu_id, args.backend, args.perf, args.pipeline, args.profiler_rename, + args.dump_npy, args.divide_input, args.threads) + ret = infer_process(args) + exit(ret) diff --git a/tools/infer_tool/ais_bench/infer/args_adapter.py b/tools/infer_tool/ais_bench/infer/args_adapter.py new file mode 100644 index 0000000000000000000000000000000000000000..a7c24d9412882103765e141d7a3d93ce7e8d366d --- /dev/null +++ b/tools/infer_tool/ais_bench/infer/args_adapter.py @@ -0,0 +1,96 @@ +# Copyright (c) 2023-2023 Huawei Technologies Co., Ltd. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
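# Plain value object bundling every benchmark command line option. __main__ builds it from
# the parsed argparse namespace and hands it to infer_process(); get_all_args_dict() rebuilds
# the corresponding '--flag: value' mapping from the stored fields.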
+ +class AISBenchInferArgsAdapter(): + def __init__(self, model, input_path, output, output_dirname, outfmt, loop, + debug, device, dym_batch, dym_hw, dym_dims, + dym_shape, output_size, auto_set_dymshape_mode, + auto_set_dymdims_mode, batchsize, pure_data_type, + profiler, dump, acl_json_path, output_batchsize_axis, + run_mode, display_all_summary, warmup_count, dym_shape_range, aipp_config, + energy_consumption, npu_id, backend, perf, pipeline, profiler_rename, + dump_npy, divide_input, threads): + self.model = model + self.input = input_path + self.output = output + self.output_dirname = output_dirname + self.outfmt = outfmt + self.loop = loop + self.debug = debug + self.device = device + self.dym_batch = dym_batch + self.dym_hw = dym_hw + self.dym_dims = dym_dims + self.dym_shape = dym_shape + self.output_size = output_size + self.auto_set_dymshape_mode = auto_set_dymshape_mode + self.auto_set_dymdims_mode = auto_set_dymdims_mode + self.batchsize = batchsize + self.pure_data_type = pure_data_type + self.profiler = profiler + self.dump = dump + self.acl_json_path = acl_json_path + self.output_batchsize_axis = output_batchsize_axis + self.run_mode = run_mode + self.display_all_summary = display_all_summary + self.warmup_count = warmup_count + self.dym_shape_range = dym_shape_range + self.aipp_config = aipp_config + self.energy_consumption = energy_consumption + self.npu_id = npu_id + self.backend = backend + self.perf = perf + self.pipeline = pipeline + self.profiler_rename = profiler_rename + self.dump_npy = dump_npy + self.divide_input = divide_input + self.threads = threads + + def get_all_args_dict(self): + args_dict = {} + args_dict.update({'--model':self.model}) + args_dict.update({'--input':self.input}) + args_dict.update({'--output':self.output}) + args_dict.update({'--output_dirname':self.output_dirname}) + args_dict.update({'--outfmt':self.outfmt}) + args_dict.update({'--loop':self.loop}) + args_dict.update({'--debug':self.debug}) + args_dict.update({'--device':self.device}) + args_dict.update({'--dymBatch':self.dym_batch}) + args_dict.update({'--dymHW':self.dym_hw}) + args_dict.update({'--dymDims':self.dym_dims}) + args_dict.update({'--dymShape':self.dym_shape}) + args_dict.update({'--outputSize':self.output_size}) + args_dict.update({'--auto_set_dymshape_mode':self.auto_set_dymshape_mode}) + args_dict.update({'--auto_set_dymdims_mode':self.auto_set_dymdims_mode}) + args_dict.update({'--batchsize':self.batchsize}) + args_dict.update({'--pure_data_type':self.pure_data_type}) + args_dict.update({'--profiler':self.profiler}) + args_dict.update({'--dump':self.dump}) + args_dict.update({'--acl_json_path':self.acl_json_path}) + args_dict.update({'--output_batchsize_axis':self.output_batchsize_axis}) + args_dict.update({'--run_mode':self.run_mode}) + args_dict.update({'--display_all_summary':self.display_all_summary}) + args_dict.update({'--warmup_count':self.warmup_count}) + args_dict.update({'--dymShape_range':self.dym_shape_range}) + args_dict.update({'--aipp_config':self.aipp_config}) + args_dict.update({'--energy_consumption':self.energy_consumption}) + args_dict.update({'--npu_id':self.npu_id}) + args_dict.update({'--perf':self.perf}) + args_dict.update({'--pipeline':self.pipeline}) + args_dict.update({'--profiler_rename':self.profiler_rename}) + args_dict.update({'--dump_npy':self.dump_npy}) + args_dict.update({'--divide_input':self.divide_input}) + args_dict.update({'--threads':self.threads}) + return args_dict \ No newline at end of file diff --git 
a/tools/infer_tool/ais_bench/infer/args_check.py b/tools/infer_tool/ais_bench/infer/args_check.py new file mode 100644 index 0000000000000000000000000000000000000000..1093fa422fa7eb361f6511c70216b35100396a90 --- /dev/null +++ b/tools/infer_tool/ais_bench/infer/args_check.py @@ -0,0 +1,194 @@ +import os +import re +import argparse +from ais_bench.infer.common.path_security_check import FileStat + +OM_MODEL_MAX_SIZE = 10 * 1024 * 1024 * 1024 # 10GB +ACL_JSON_MAX_SIZE = 8 * 1024 # 8KB +AIPP_CONFIG_MAX_SIZE = 12.5 * 1024 # 12.5KB + + +def check_dym_string(value): + if not value: + return value + dym_string = value + regex = re.compile(r"[^_A-Za-z0-9,;:/.-]") + if regex.search(dym_string): + raise argparse.ArgumentTypeError(f"dym string \"{dym_string}\" is not a legal string") + return dym_string + + +def check_dym_range_string(value): + if not value: + return value + dym_string = value + regex = re.compile(r"[^_A-Za-z0-9,;:/.\-~]") + if regex.search(dym_string): + raise argparse.ArgumentTypeError(f"dym range string \"{dym_string}\" is not a legal string") + return dym_string + + +def check_number_list(value): + if not value: + return value + number_list = value + regex = re.compile(r"[^0-9,;]") + if regex.search(number_list): + raise argparse.ArgumentTypeError(f"number_list \"{number_list}\" is not a legal list") + return number_list + + +def str2bool(v): + if isinstance(v, bool): + return v + if v.lower() in ('yes', 'true', 't', 'y', '1'): + return True + elif v.lower() in ('no', 'false', 'f', 'n', '0'): + return False + else: + raise argparse.ArgumentTypeError('Boolean value expected true, 1, false, 0 with case insensitive.') + + +def check_positive_integer(value): + ivalue = int(value) + if ivalue <= 0: + raise argparse.ArgumentTypeError("%s is an invalid positive int value" % value) + return ivalue + + +def check_batchsize_valid(value): + # default value is None + if value is None: + return value + # input value no None + else: + return check_positive_integer(value) + + +def check_nonnegative_integer(value): + ivalue = int(value) + if ivalue < 0: + raise argparse.ArgumentTypeError("%s is an invalid nonnegative int value" % value) + return ivalue + + +def check_npu_id_range_vaild(value): + # if contain , split to int list + min_value = 0 + max_value = 2048 + if ',' in value: + ilist = [int(v) for v in value.split(',')] + for ivalue in ilist: + if ivalue < min_value or ivalue > max_value: + raise argparse.ArgumentTypeError("{} of npu_id:{} is invalid. valid value range is [{}, {}]".format( + ivalue, value, min_value, max_value)) + return ilist + else: + # default as single int value + ivalue = int(value) + if ivalue < min_value or ivalue > max_value: + raise argparse.ArgumentTypeError("npu_id:{} is invalid. valid value range is [{}, {}]".format( + ivalue, min_value, max_value)) + return ivalue + + +def check_device_range_valid(value): + # if contain , split to int list + min_value = 0 + max_value = 255 + try: + # Check if the value contains a comma; if so, split into a list of integers + if ',' in value: + ilist = [int(v) for v in value.split(',')] + for ivalue in ilist: + if ivalue < min_value or ivalue > max_value: + raise argparse.ArgumentTypeError("{} of device:{} is invalid. valid value range is [{}, {}]".format( + ivalue, value, min_value, max_value)) + return ilist + else: + # default as single int value + ivalue = int(value) + if ivalue < min_value or ivalue > max_value: + raise argparse.ArgumentTypeError("device:{} is invalid. 
valid value range is [{}, {}]".format( + ivalue, min_value, max_value)) + return ivalue + except ValueError: + raise argparse.ArgumentTypeError("Argument npu-id invalid input value: {}. " + "Please provide a valid integer or a comma-separated list of integers.".format(value)) + + + +def check_om_path_legality(value): + path_value = value + try: + file_stat = FileStat(path_value) + except Exception as err: + raise argparse.ArgumentTypeError(f"om path:{path_value} is illegal. Please check.") from err + if not file_stat.is_basically_legal('read'): + raise argparse.ArgumentTypeError(f"om path:{path_value} is illegal. Please check.") + if not file_stat.is_legal_file_type(["om"]): + raise argparse.ArgumentTypeError(f"om path:{path_value} is illegal. Please check.") + if not file_stat.is_legal_file_size(OM_MODEL_MAX_SIZE): + raise argparse.ArgumentTypeError(f"om path:{path_value} is illegal. Please check.") + return path_value + + +def check_input_path_legality(value): + if not value: + return value + inputs_list = value.split(',') + for input_path in inputs_list: + try: + file_stat = FileStat(input_path) + except Exception as err: + raise argparse.ArgumentTypeError(f"input path:{input_path} is illegal. Please check.") from err + if not file_stat.is_basically_legal('read'): + raise argparse.ArgumentTypeError(f"input path:{input_path} is illegal. Please check.") + return value + + +def check_output_path_legality(value): + if not value: + return value + path_value = value + try: + file_stat = FileStat(path_value) + except Exception as err: + raise argparse.ArgumentTypeError(f"weight path:{path_value} is illegal. Please check.") from err + if not file_stat.is_basically_legal("write"): + raise argparse.ArgumentTypeError(f"output path:{path_value} is illegal. Please check.") + return path_value + + +def check_acl_json_path_legality(value): + if not value: + return value + path_value = value + try: + file_stat = FileStat(path_value) + except Exception as err: + raise argparse.ArgumentTypeError(f"acl json path:{path_value} is illegal. Please check.") from err + if not file_stat.is_basically_legal('read'): + raise argparse.ArgumentTypeError(f"acl json path:{path_value} is illegal. Please check.") + if not file_stat.is_legal_file_type(["json"]): + raise argparse.ArgumentTypeError(f"acl json path:{path_value} is illegal. Please check.") + if not file_stat.is_legal_file_size(ACL_JSON_MAX_SIZE): + raise argparse.ArgumentTypeError(f"acl json path:{path_value} is illegal. Please check.") + return path_value + + +def check_aipp_config_path_legality(value): + if not value: + return value + path_value = value + try: + file_stat = FileStat(path_value) + except Exception as err: + raise argparse.ArgumentTypeError(f"aipp config path:{path_value} is illegal. Please check.") from err + if not file_stat.is_basically_legal('read'): + raise argparse.ArgumentTypeError(f"aipp config path:{path_value} is illegal. Please check.") + if not file_stat.is_legal_file_type(["config"]): + raise argparse.ArgumentTypeError(f"aipp config path:{path_value} is illegal. Please check.") + if not file_stat.is_legal_file_size(AIPP_CONFIG_MAX_SIZE): + raise argparse.ArgumentTypeError(f"aipp config path:{path_value} is illegal. 
Please check.") + return path_value \ No newline at end of file diff --git a/tools/infer_tool/ais_bench/infer/backends/__init__.py b/tools/infer_tool/ais_bench/infer/backends/__init__.py new file mode 100644 index 0000000000000000000000000000000000000000..e78cd1be1f30b4826f6e2da993ddaee1372f06cb --- /dev/null +++ b/tools/infer_tool/ais_bench/infer/backends/__init__.py @@ -0,0 +1,30 @@ +# Copyright (c) 2023-2023 Huawei Technologies Co., Ltd. All rights reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + + +import os + +from ais_bench.infer import registry + +BACKEND_REGISTRY = registry.Registry("BACKEND_REGISTRY") + +registry.import_all_modules_for_register( + os.path.dirname(os.path.abspath(__file__)), "ais_bench.infer.backends" +) + + +class BackendFactory: + @staticmethod + def create_backend(name): + return BACKEND_REGISTRY[name] \ No newline at end of file diff --git a/tools/infer_tool/ais_bench/infer/backends/backend.py b/tools/infer_tool/ais_bench/infer/backends/backend.py new file mode 100644 index 0000000000000000000000000000000000000000..88476f05d171ca5e5468b2ee715d4b54274855a3 --- /dev/null +++ b/tools/infer_tool/ais_bench/infer/backends/backend.py @@ -0,0 +1,123 @@ +# Copyright (c) 2023-2023 Huawei Technologies Co., Ltd. All rights reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + + +from __future__ import annotations + +from abc import ABC, abstractmethod +from typing import List, Any, Iterable, Union + +import attrs + + +@attrs.define +class AccuracyResult: + output: Any = None + label: Any = None + prediction: Any = None + + +@attrs.define +class PerformanceStats: + min: float = None + max: float = None + mean: float = None + median: float = None + percentile: float = None + + +@attrs.define +class PerformanceResult: + h2d_latency: PerformanceStats = None + compute_time: PerformanceStats = None + d2h_latency: PerformanceStats = None + host_wall_time: float = None + throughput: float = None + + +@attrs.define +class InferenceTrace: + h2d_start: float = None + h2d_end: float = None + compute_start: float = None + compute_end: float = None + d2h_start: float = None + d2h_end: float = None + + +class Backend(ABC): + """ + Backend interface + """ + + @property + @abstractmethod + def name(self) -> str: + """ + Each of the subclasses must implement this. + This is called to return the name of backend. 
+ """ + + @property + def model_extension(self) -> str: + return "model" + + def initialize(self) -> bool: + """ + init the resource of backend + """ + return True + + def finalize(self) -> None: + """ + release the resource of backend + """ + pass + + @abstractmethod + def load(self, model_path: str) -> Backend: + """ + Each of the subclases must implement this. + This is called to load a model. + """ + + @abstractmethod + def warm_up(self, dataloader: Iterable, iterations: int = 100) -> None: + """ + Each of the subclases must implement this. + This is called to warmup. + """ + + @abstractmethod + def predict( + self, dataloader: Iterable + ) -> Union[List[AccuracyResult], None]: + """ + Each of the subclasses must implement this. + This is called to inference a model + """ + + @abstractmethod + def build(self) -> None: + """ + Each of the subclasses must implement this. + This is called to build a model + """ + + @abstractmethod + def get_perf(self) -> PerformanceResult: + """ + Each of the subclasses must implement this. + This is called to get the performance of the model inference. + """ \ No newline at end of file diff --git a/tools/infer_tool/ais_bench/infer/backends/backend_trtexec.py b/tools/infer_tool/ais_bench/infer/backends/backend_trtexec.py new file mode 100644 index 0000000000000000000000000000000000000000..d1d92367c0753a9687befe66c7f52845051fae43 --- /dev/null +++ b/tools/infer_tool/ais_bench/infer/backends/backend_trtexec.py @@ -0,0 +1,154 @@ +# Copyright (c) 2023-2023 Huawei Technologies Co., Ltd. All rights reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
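# Backend wrapping the NVIDIA trtexec CLI: run() launches trtexec in a subprocess and streams
# its log to the console; get_perf() then parses the performance summary (Throughput, H2D/D2H
# Latency, GPU Compute Time, Total Host Walltime) from the captured log.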
+ + +from __future__ import annotations + +import os +import sys +import logging +import subprocess +import re +from typing import Iterable, List, Dict, Any + +from ais_bench.infer.backends import backend, BACKEND_REGISTRY +from ais_bench.infer.backends.backend import AccuracyResult, PerformanceStats, PerformanceResult, InferenceTrace +from ais_bench.infer.common.utils import logger + + +class TrtexecConfig(object): + def __init__(self): + self.iterations = None + self.warmup = None + self.duration = None + self.batch = None + self.device = None + + +logging.basicConfig(stream=sys.stdout, level=logging.INFO, format='[%(levelname)s] %(message)s') +logger = logging.getLogger(__name__) + + +@BACKEND_REGISTRY.register("trtexec") +class BackendTRTExec(backend.Backend): + def __init__(self, config: Any = None) -> None: + super(BackendTRTExec, self).__init__() + self.config = TrtexecConfig() + self.convert_config(config) + self.model_path = "" + self.output_log = "" + self.trace = InferenceTrace() + + @property + def name(self) -> str: + return "trtexec" + + @property + def model_extension(self) -> str: + return "plan" + + def convert_config(self, config): + if config.loop is not None: + self.config.iterations = config.loop + if config.warmup_count is not None: + self.config.warmup_count = config.warmup_count + if config.batchsize is not None: + self.config.batch = config.batchsize + if config.device is not None: + self.config.device = config.device + + def load( + self, model_path: str, inputs: list = None, outputs: list = None + ) -> BackendTRTExec: + if os.path.exists(model_path): + logger.info("Load engine from file {}".format(model_path)) + self.model_path = model_path + else: + raise Exception("{} not exit".format(model_path)) + return self + + def parse_perf(self, data: List) -> PerformanceStats: + stats = PerformanceStats() + stats.min = float(data[0]) + stats.max = float(data[1]) + stats.mean = float(data[2]) + stats.median = float(data[3]) + stats.percentile = float(data[4]) + return stats + + def parse_log(self, log: str) -> PerformanceResult: + performance = PerformanceResult() + log_list = log.splitlines() + pattern_1 = re.compile(r"(?<=: )\d+\.?\d*") + pattern_2 = re.compile(r"(?<== )\d+\.?\d*") + for line in log_list: + if "Throughput" in line: + throughput = pattern_1.findall(line) + performance.throughput = float(throughput[0]) + elif "H2D Latency" in line: + h2d_latency = pattern_2.findall(line) + performance.h2d_latency = self.parse_perf(h2d_latency) + elif "GPU Compute Time: min" in line: + compute_time = pattern_2.findall(line) + performance.compute_time = self.parse_perf(compute_time) + elif "D2H Latency" in line: + d2h_latency = pattern_2.findall(line) + performance.d2h_latency = self.parse_perf(d2h_latency) + elif "Total Host Walltime" in line: + total_host_time = pattern_1.findall(line) + performance.host_wall_time = float(total_host_time[0]) + return performance + + def warm_up(self, dataloader: Iterable, iterations: int = 100) -> None: + pass + + def predict(self, dataloader: Iterable) -> List[AccuracyResult]: + pass + + def build(self) -> None: + pass + + def get_perf(self) -> PerformanceResult: + return self.parse_log(self.output_log) + + def run(self): + command = [ + "trtexec", + f"--onnx={self.model_path}", + f"--fp16", + ] + if self.config.duration is not None: + command.append(f"--duration={self.config.duration}") + if self.config.device is not None: + command.append(f"--device={self.config.device}") + if self.config.iterations is not None: + 
command.append(f"--iterations={self.config.iterations}") + if self.config.warmup is not None: + command.append(f"--warmUp={self.config.warmup}") + if self.config.batch is not None: + command.append(f"--batch={self.config.batch}") + + logger.info("Trtexec Build command: " + " ".join(command)) + process = subprocess.Popen( + command, stdout=subprocess.PIPE, shell=False + ) + + while process.poll() is None: + line = process.stdout.readline() + self.output_log += line.decode() + line = line.strip() + if line: + logger.info(line.decode()) + + return [] \ No newline at end of file diff --git a/tools/infer_tool/ais_bench/infer/common/__init__.py b/tools/infer_tool/ais_bench/infer/common/__init__.py new file mode 100644 index 0000000000000000000000000000000000000000..e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 diff --git a/tools/infer_tool/ais_bench/infer/common/io_operations.py b/tools/infer_tool/ais_bench/infer/common/io_operations.py new file mode 100644 index 0000000000000000000000000000000000000000..f0e2be562b2619aa558bd53b5ba73927777ee3be --- /dev/null +++ b/tools/infer_tool/ais_bench/infer/common/io_operations.py @@ -0,0 +1,339 @@ +# Copyright (c) 2023-2023 Huawei Technologies Co., Ltd. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +import math +import os +import random +import time +import numpy as np + +from ais_bench.infer.summary import summary +from ais_bench.infer.common.utils import ( + get_file_content, + get_file_datasize, + get_fileslist_from_dir, + list_split, + logger, + save_data_to_files, +) + +PURE_INFER_FAKE_FILE = "pure_infer_data" +PURE_INFER_FAKE_FILE_ZERO = "pure_infer_data_zero" +PURE_INFER_FAKE_FILE_RANDOM = "pure_infer_data_random" +PADDING_INFER_FAKE_FILE = "padding_infer_fake_file" + + +def convert_real_files(files): + real_files = [] + for file in files: + if file == PURE_INFER_FAKE_FILE: + raise RuntimeError("not support pure infer") + elif file.endswith(".npy") or file.endswith(".NPY"): + raise RuntimeError("not support npy file:{}".format(file)) + elif file == PADDING_INFER_FAKE_FILE: + real_files.append(files[0]) + else: + real_files.append(file) + return real_files + + +def get_pure_infer_data(size, pure_data_type): + lst = [] + if pure_data_type == "random": + # random value from [0, 255] + lst = [random.randrange(0, 256) for _ in range(size)] + else: + # zero value, default + lst = [0 for _ in range(size)] + + barray = bytearray(lst) + ndata = np.frombuffer(barray, dtype=np.uint8) + return ndata + + +# get numpy array from files list combile all files +def get_narray_from_files_list(files_list, size, pure_data_type, no_combine_tensor_mode=False): + ndatalist = [] + file_path_switch = { + PURE_INFER_FAKE_FILE: pure_data_type, + PURE_INFER_FAKE_FILE_ZERO: "zero", + PURE_INFER_FAKE_FILE_RANDOM: "random", + } + for i, file_path in enumerate(files_list): + logger.debug("get tensor from filepath:{} i:{} of all:{}".format(file_path, i, len(files_list))) + if file_path_switch.get(file_path) is not None: + ndata = get_pure_infer_data(size, 
file_path_switch.get(file_path)) + elif file_path == PADDING_INFER_FAKE_FILE: + logger.debug("padding file use fileslist[0]:{}".format(files_list[0])) + ndata = get_file_content(files_list[0]) + elif file_path is None or not os.path.exists(file_path): + logger.error('filepath:{} not valid'.format(file_path)) + raise RuntimeError() + else: + ndata = get_file_content(file_path) + ndatalist.append(ndata) + if len(ndatalist) == 1: + return ndatalist[0] + else: + ndata = np.concatenate(ndatalist) + if not no_combine_tensor_mode and ndata.nbytes != size: + logger.error('ndata size:{} not match {}'.format(ndata.nbytes, size)) + raise RuntimeError() + return ndata + + +# get tensors from files list combile all files +def get_tensor_from_files_list(files_list, session, size, pure_data_type, no_combine_tensor_mode=False): + ndata = get_narray_from_files_list(files_list, size, pure_data_type, no_combine_tensor_mode) + tensor = session.create_tensor_from_arrays_to_device(ndata) + return tensor + + +# Obtain filesperbatch runcount information according to file information and input description information +# The strategy is as follows: Judge according to the realsize and file size of input 0. If the judgment fails, +# you need to force the desired value to be set +def get_files_count_per_batch(intensors_desc, fileslist, no_combine_tensor_mode=False): + # get filesperbatch + filesize = get_file_datasize(fileslist[0][0]) + tensorsize = intensors_desc[0].realsize + if no_combine_tensor_mode: + files_count_per_batch = 1 + else: + if filesize == 0 or tensorsize % filesize != 0: + logger.error('arg0 tensorsize: {} filesize: {} not match'.format(tensorsize, filesize)) + raise RuntimeError() + else: + files_count_per_batch = (int)(tensorsize / filesize) + if files_count_per_batch == 0: + logger.error('files count per batch is zero') + raise RuntimeError() + runcount = math.ceil(len(fileslist[0]) / files_count_per_batch) + + logger.info( + "get filesperbatch files0 size:{} tensor0size:{} filesperbatch:{} runcount:{}".format( + filesize, tensorsize, files_count_per_batch, runcount + ) + ) + return files_count_per_batch, runcount + + +# Obtain tensor information and files information according to the input filelist. Create intensor form files list +# len(files_list) should equal len(intensors_desc) +def create_infileslist_from_fileslist(fileslist, intensors_desc, no_combine_tensor_mode=False): + if len(intensors_desc) != len(fileslist): + logger.error('fileslist:{} intensor:{} not match'.format(len(fileslist), len(intensors_desc))) + raise RuntimeError() + files_count_per_batch, runcount = get_files_count_per_batch(intensors_desc, fileslist, no_combine_tensor_mode) + + files_perbatch_list = [ + list(list_split(fileslist[j], files_count_per_batch, PADDING_INFER_FAKE_FILE)) + for j in range(len(intensors_desc)) + ] + + infileslist = [] + for i in range(runcount): + infiles = [] + for j in range(len(intensors_desc)): + logger.debug( + "create infileslist i:{} j:{} runcount:{} lists:{} filesPerPatch:{}".format( + i, j, runcount, files_perbatch_list[j][i], files_count_per_batch + ) + ) + infiles.append(files_perbatch_list[j][i]) + infileslist.append(infiles) + return infileslist + + +# outapi. Obtain tensor information and files information according to the input filelist. 
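# Each element of the returned list corresponds to one inference run and holds one device
# tensor per model input.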
+# Create intensor form files list +def create_intensors_from_infileslist( + infileslist, intensors_desc, session, pure_data_type, no_combine_tensor_mode=False +): + intensorslist = [] + for infiles in infileslist: + intensors = [] + for files, intensor_desc in zip(infiles, intensors_desc): + tensor = get_tensor_from_files_list( + files, session, intensor_desc.realsize, pure_data_type, no_combine_tensor_mode + ) + intensors.append(tensor) + intensorslist.append(intensors) + return intensorslist + + +def check_input_parameter(inputs_list, intensors_desc): + if len(inputs_list) == 0: + logger.error("Invalid args. Input args are empty") + raise RuntimeError() + if os.path.isfile(inputs_list[0]): + for index, file_path in enumerate(inputs_list): + realpath = os.readlink(file_path) if os.path.islink(file_path) else file_path + if not os.path.isfile(realpath): + logger.error( + "Invalid input args.--input:{} input[{}]:{} {} not exist".format( + inputs_list, index, file_path, realpath + ) + ) + raise RuntimeError() + elif os.path.isdir(inputs_list[0]): + if len(inputs_list) != len(intensors_desc): + logger.error( + "Invalid args. args input dir num:{0} not equal to model inputs num:{1}".format( + len(inputs_list), len(intensors_desc) + ) + ) + raise RuntimeError() + + for dir_path in inputs_list: + real_dir_path = os.readlink(dir_path) if os.path.islink(dir_path) else dir_path + if not os.path.isdir(real_dir_path): + logger.error("Invalid args. {} of input args is not a real dir path".format(real_dir_path)) + raise RuntimeError() + else: + logger.error("Invalid args. {} of --input is invalid".format(inputs_list[0])) + raise RuntimeError() + + +# outapi. get by input parameters of inputs_List. +def create_infileslist_from_inputs_list(inputs_list, intensors_desc, no_combine_tensor_mode=False): + check_input_parameter(inputs_list, intensors_desc) + fileslist = [] + inputlistcount = len(inputs_list) + intensorcount = len(intensors_desc) + if os.path.isfile(inputs_list[0]): + chunks = inputlistcount // intensorcount + fileslist = list(list_split(inputs_list, chunks, PADDING_INFER_FAKE_FILE)) + logger.debug( + "create intensors list file type inlistcount:{} intensorcont:{} chunks:{} files_size:{}".format( + inputlistcount, intensorcount, chunks, len(fileslist) + ) + ) + elif os.path.isdir(inputs_list[0]) and inputlistcount == intensorcount: + fileslist = [get_fileslist_from_dir(dir) for dir in inputs_list] + logger.debug( + "create intensors list dictionary type inlistcount:{} intensorcont:{} files_size:{}".format( + inputlistcount, intensorcount, len(fileslist) + ) + ) + else: + logger.error( + 'create intensors list filelists:{} intensorcont:{} error create'.format(inputlistcount, intensorcount) + ) + raise RuntimeError() + + infileslist = create_infileslist_from_fileslist(fileslist, intensors_desc, no_combine_tensor_mode) + if len(infileslist) == 0: + logger.error('create_infileslist_from_fileslist return infileslist size: {}'.format(len(infileslist))) + raise RuntimeError() + + return infileslist + + +def check_pipeline_fileslist_match_intensors(fileslist, intensors_desc): + # check intensor amount matched + if len(intensors_desc) != len(fileslist): + logger.error('fileslist:{} intensor:{} not match'.format(len(fileslist), len(intensors_desc))) + raise RuntimeError() + # check intensor size matched + for i, files in enumerate(fileslist): + filesize = get_file_datasize(files[0]) + tensorsize = intensors_desc[i].realsize + auto_mode = False + # auto_dim_mode & auto_shape_mode are exceptional cases + 
if intensors_desc[i].realsize == intensors_desc[i].size:
+            if any(dim <= 0 for dim in intensors_desc[i].shape):
+                auto_mode = True
+        if filesize != tensorsize and not auto_mode:
+            logger.error(f'tensor_num:{i} tensorsize:{tensorsize} filesize:{filesize} not match')
+            raise RuntimeError()
+
+
+# case where inputs are not combined into batches
+def create_pipeline_fileslist_from_inputs_list(inputs_list, intensors_desc):
+    check_input_parameter(inputs_list, intensors_desc)
+    fileslist = []
+    inputlistcount = len(inputs_list)
+    intensorcount = len(intensors_desc)
+    if os.path.isfile(inputs_list[0]):
+        chunks = inputlistcount // intensorcount
+        fileslist = list(list_split(inputs_list, chunks, PADDING_INFER_FAKE_FILE))
+        logger.debug(
+            f"create intensors list file type inlistcount:{inputlistcount} "
+            f"intensorcount:{intensorcount} chunks:{chunks} files_size:{len(fileslist)}"
+        )
+    elif os.path.isdir(inputs_list[0]) and inputlistcount == intensorcount:
+        fileslist = [get_fileslist_from_dir(dir_) for dir_ in inputs_list]
+        logger.debug(
+            f"create intensors list directory type inlistcount:{inputlistcount} "
+            f"intensorcount:{intensorcount} files_size:{len(fileslist)}"
+        )
+    else:
+        logger.error(f'create intensors list failed, filelists:{inputlistcount} intensorcount:{intensorcount}')
+        raise RuntimeError()
+    try:
+        check_pipeline_fileslist_match_intensors(fileslist, intensors_desc)
+    except Exception as err:
+        logger.error("fileslist and intensors not matched")
+        raise RuntimeError from err
+    infileslist = list(zip(*fileslist))
+    return infileslist
+
+
+def save_tensors_to_file(outputs, output_prefix, infiles_paths, outfmt, index, output_batchsize_axis):
+    files_count_perbatch = len(infiles_paths[0])
+    infiles_perbatch = np.transpose(infiles_paths)
+    for i, out in enumerate(outputs):
+        ndata = np.array(out)
+        if output_batchsize_axis >= len(ndata.shape):
+            logger.error(
+                "error i:{0} ndata.shape:{1} len:{2} <= output_batchsize_axis:{3} is invalid".format(
+                    i, ndata.shape, len(ndata.shape), output_batchsize_axis
+                )
+            )
+            raise RuntimeError()
+        if files_count_perbatch == 1 or ndata.shape[output_batchsize_axis] % files_count_perbatch == 0:
+            subdata = np.array_split(ndata, files_count_perbatch, output_batchsize_axis)
+            for j in range(files_count_perbatch):
+                sample_id = index * files_count_perbatch + j
+                if infiles_perbatch[j][0] == PADDING_INFER_FAKE_FILE:
+                    logger.debug(
+                        "sampleid:{} i:{} infiles:{} is padding fake file so continue".format(
+                            sample_id, i, infiles_perbatch[j]
+                        )
+                    )
+                    continue
+                file_path = os.path.join(
+                    output_prefix,
+                    "{}_{}.{}".format(os.path.basename(infiles_perbatch[j][0]).split('.')[0], i, outfmt.lower()),
+                )
+                summary.add_sample_id_infiles(sample_id, infiles_perbatch[j])
+                logger.debug(
+                    "save func: sampleid:{} i:{} infiles:{} outfile:{} fmt:{} axis:{}".format(
+                        sample_id, i, infiles_perbatch[j], file_path, outfmt, output_batchsize_axis
+                    )
+                )
+                summary.append_sample_id_outfile(sample_id, file_path)
+                save_data_to_files(file_path, subdata[j])
+        else:
+            logger.error(
+                'save out files error array shape:{} filesinfo:{} files_count_perbatch:{} '
+                'ndata.shape{}:{}'.format(
+                    ndata.shape,
+                    infiles_paths,
+                    files_count_perbatch,
+                    output_batchsize_axis,
+                    ndata.shape[output_batchsize_axis],
+                )
+            )
+            raise RuntimeError()
diff --git a/tools/infer_tool/ais_bench/infer/common/miscellaneous.py b/tools/infer_tool/ais_bench/infer/common/miscellaneous.py
new file mode 100644
index 0000000000000000000000000000000000000000..21ce01c35eb9c3a02042c889a84d942eba083691
--- /dev/null
+++ 
b/tools/infer_tool/ais_bench/infer/common/miscellaneous.py
@@ -0,0 +1,276 @@
+# Copyright (c) 2023-2023 Huawei Technologies Co., Ltd.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+import os
+import sys
+import stat
+import subprocess
+import json
+import itertools
+import numpy as np
+
+from ais_bench.infer.common.utils import logger
+from ais_bench.infer.common.path_security_check import ms_open, MAX_SIZE_LIMITE_CONFIG_FILE, MAX_SIZE_LIMITE_NORMAL_FILE
+from ais_bench.infer.args_adapter import AISBenchInferArgsAdapter
+
+PERMISSION_DIR = 0o750
+
+ACL_JSON_CMD_LIST = [
+    "output",
+    "storage_limit",
+    "ascendcl",
+    "runtime_api",
+    "hccl",
+    "task_time",
+    "aicpu",
+    "aic_metrics",
+    "l2",
+    "sys_hardware_mem_freq",
+    "lcc_profiling",
+    "dvpp_freq",
+    "host_sys",
+    "host_sys_usage",
+    "host_sys_usage_freq",
+    "sys_interconnection_freq",
+    "msproftx",
+]
+
+
+def get_modules_version(name):
+    try:
+        import pkg_resources
+    except ImportError as err:
+        raise Exception("importerror") from err
+    pkg = pkg_resources.get_distribution(name)
+    return pkg.version
+
+
+def version_check(args):
+    try:
+        aclruntime_version = get_modules_version('aclruntime')
+    except Exception:
+        url = 'https://gitee.com/ascend/tools/tree/master/ais-bench_workload/tool/ais_bench'
+        logger.warning(f"can't find aclruntime, please visit {url} to install ais_bench(benchmark)")
+        # fall back to the old run mode so inference can still run without aclruntime
+        args.run_mode = "tensor"
+        return
+    if aclruntime_version != "0.0.2":
+        logger.warning(
+            f"aclruntime {aclruntime_version} version is lower, please update aclruntime by any one of the methods"
+        )
+        # set old run mode to run ok
+        args.run_mode = "tensor"
+
+
+def get_model_name(model):
+    path_list = model.split('/')
+    return path_list[-1][:-3]
+
+
+def check_valid_acl_json_for_dump(acl_json_path, model):
+    with ms_open(acl_json_path, mode="r", max_size=MAX_SIZE_LIMITE_CONFIG_FILE) as f:
+        acl_json_dict = json.load(f)
+    model_name_correct = get_model_name(model)
+    if acl_json_dict.get("dump") is not None:
+        # check validity of dump_list (model_name)
+        dump_list_val = acl_json_dict["dump"].get("dump_list")
+        if dump_list_val is not None:
+            if dump_list_val == [] or dump_list_val[0].get("model_name") != model_name_correct:
+                logger.warning(
+                    "dump failed, 'model_name' is not set or set incorrectly. 
correct" + "'model_name' should be {}".format(model_name_correct) + ) + else: + logger.warning("dump failed, acl.json need to set 'dump_list' attribute") + # check validity of dump_path + dump_path_val = acl_json_dict["dump"].get("dump_path") + if dump_path_val is not None: + if os.path.isdir(dump_path_val) and os.access(dump_path_val, os.R_OK) and os.access(dump_path_val, os.W_OK): + pass + else: + logger.warning("dump failed, 'dump_path' not exists or has no read/write permission") + else: + logger.warning("dump failed, acl.json need to set 'dump_path' attribute") + # check validity of dump_op_switch + dump_op_switch_val = acl_json_dict["dump"].get("dump_op_switch") + if dump_op_switch_val is not None and dump_op_switch_val not in {"on", "off"}: + logger.warning("dump failed, 'dump_op_switch' need to be set as 'on' or 'off'") + # check validity of dump_mode + dump_mode_val = acl_json_dict["dump"].get("dump_mode") + if dump_mode_val is not None and dump_mode_val not in {"input", "output", "all"}: + logger.warning("dump failed, 'dump_mode' need to be set as 'input', 'output' or 'all'") + return + + +def get_acl_json_path(args): + """ + get acl json path. when args.profiler is true or args.dump is True, create relative acl.json , + default current folder + """ + if args.acl_json_path is not None: + check_valid_acl_json_for_dump(args.acl_json_path, args.model) + return args.acl_json_path + if not args.profiler and not args.dump: + return None + + output_json_dict = {} + if args.profiler: + out_profiler_path = os.path.join(args.output, "profiler") + + if not os.path.exists(out_profiler_path): + os.makedirs(out_profiler_path, PERMISSION_DIR) + output_json_dict = {"profiler": {"switch": "on", "aicpu": "on", "output": out_profiler_path, "aic_metrics": ""}} + elif args.dump: + out_dump_path = os.path.join(args.output, "dump") + + if not os.path.exists(out_dump_path): + os.makedirs(out_dump_path, PERMISSION_DIR) + + model_name = args.model.split("/")[-1] + output_json_dict = { + "dump": { + "dump_path": out_dump_path, + "dump_mode": "all", + "dump_list": [{"model_name": model_name.split('.')[0]}], + } + } + + out_json_file_path = os.path.join(args.output, "acl.json") + + OPEN_FLAGS = os.O_WRONLY | os.O_CREAT | os.O_TRUNC + OPEN_MODES = stat.S_IWUSR | stat.S_IRUSR + with ms_open(out_json_file_path, mode="w") as f: + json.dump(output_json_dict, f, indent=4, separators=(", ", ": "), sort_keys=True) + return out_json_file_path + + +def get_batchsize(session, args): + intensors_desc = session.get_inputs() + batchsize = intensors_desc[0].shape[0] + if args.dym_batch != 0: + batchsize = int(args.dym_batch) + elif args.dym_dims is not None or args.dym_shape is not None: + instr = args.dym_dims if args.dym_dims is not None else args.dym_shape + elems = instr.split(';') + for elem in elems: + tmp_idx = elem.rfind(':') + name = elem[:tmp_idx] + shapestr = elem[tmp_idx + 1 :] + if name == intensors_desc[0].name: + batchsize = int(shapestr.split(',')[0]) + return batchsize + + +def get_range_list(ranges): + elems = ranges.split(';') + info_list = [] + for elem in elems: + shapes = [] + tmp_idx = elem.rfind(':') + name = elem[:tmp_idx] + shapestr = elem[tmp_idx + 1 :] + for content in shapestr.split(','): + step = 1 + if '~' in content: + start = int(content.split('~')[0]) + end = int(content.split('~')[1]) + step = int(content.split('~')[2]) if len(content.split('~')) == 3 else 1 + ranges = [str(i) for i in range(start, end + 1, step)] + elif '-' in content: + ranges = content.split('-') + else: + start = 
int(content) + ranges = [str(start)] + shapes.append(ranges) + logger.debug("content:{} get range{}".format(content, ranges)) + shape_list = [','.join(s) for s in list(itertools.product(*shapes))] + info = ["{}:{}".format(name, s) for s in shape_list] + info_list.append(info) + logger.debug("name:{} shapes:{} info:{}".format(name, shapes, info)) + + res = [';'.join(s) for s in list(itertools.product(*info_list))] + logger.debug("range list:{}".format(res)) + return res + + +# get dymshape list from input_ranges +# input_ranges can be a string like "name1:1,3,224,224;name2:1,600" or file +def get_dymshape_list(input_ranges): + ranges_list = [] + if os.path.isfile(input_ranges): + with ms_open(input_ranges, mode="rt", max_size=MAX_SIZE_LIMITE_NORMAL_FILE, encoding='utf-8') as finfo: + line = finfo.readline() + while line: + line = line.rstrip('\n') + ranges_list.append(line) + line = finfo.readline() + else: + ranges_list.append(input_ranges) + + dymshape_list = [] + for ranges in ranges_list: + dymshape_list.extend(get_range_list(ranges)) + return dymshape_list + + +# get throughput from out log +def get_throughtput_from_log(out_log): + log_list = out_log.split('\n') + for log_txt in log_list: + if "throughput" in log_txt: + throughput = float(log_txt.split(' ')[-1]) + return "OK", throughput + return "Failed", 0 + + +def regenerate_dymshape_cmd(args: AISBenchInferArgsAdapter, dym_shape): + args_dict = args.get_all_args_dict() + cmd = sys.executable + " -m ais_bench" + for key, value in args_dict.items(): + if key == '--dymShape_range': + continue + if key == '--dymShape': + cmd = cmd + " " + f"{key}={dym_shape}" + continue + if value: + cmd = cmd + " " + f"{key}={value}" + cmd_list = cmd.split(' ') + return cmd_list + + +def dymshape_range_run(args: AISBenchInferArgsAdapter): + dymshape_list = get_dymshape_list(args.dym_shape_range) + results = [] + for dymshape in dymshape_list: + cmd = regenerate_dymshape_cmd(args, dymshape) + result = {"dymshape": dymshape, "cmd": cmd, "result": "Failed", "throughput": 0} + logger.debug("cmd:{}".format(cmd)) + p = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE) + stdout, _ = p.communicate(timeout=10) + out_log = stdout.decode('utf-8') + print(out_log) # show original log of cmd + result["result"], result["throughput"] = get_throughtput_from_log(out_log) + logger.info("dymshape:{} end run result:{}".format(dymshape, result["result"])) + results.append(result) + + tlist = [result["throughput"] for result in results if result["result"] == "OK"] + logger.info("-----------------dyshape_range Performance Summary------------------") + logger.info("run_count:{} success_count:{} avg_throughput:{}".format(len(results), len(tlist), np.mean(tlist))) + results.sort(key=lambda x: x['throughput'], reverse=True) + for i, result in enumerate(results): + logger.info( + "{} dymshape:{} result:{} throughput:{}".format( + i, result["dymshape"], result["result"], result["throughput"] + ) + ) + logger.info("------------------------------------------------------") diff --git a/tools/infer_tool/ais_bench/infer/common/path_security_check.py b/tools/infer_tool/ais_bench/infer/common/path_security_check.py new file mode 100644 index 0000000000000000000000000000000000000000..81ef7f4c98260f3be7f2bbcf05098e34df07f19a --- /dev/null +++ b/tools/infer_tool/ais_bench/infer/common/path_security_check.py @@ -0,0 +1,293 @@ +# Copyright (c) 2023-2023 Huawei Technologies Co., Ltd. 
+# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at + +# http://www.apache.org/licenses/LICENSE-2.0 + +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +# this file is as same as components/utils/file_opem_check.py, because benchmark might be install without ait + +import os +import sys +import stat +import re +import logging + + +MAX_SIZE_UNLIMITE = -1 # 不限制,必须显式表示不限制,读取必须传入 +MAX_SIZE_LIMITE_CONFIG_FILE = 10 * 1024 * 1024 # 10M 普通配置文件,可以根据实际要求变更 +MAX_SIZE_LIMITE_NORMAL_FILE = 4 * 1024 * 1024 * 1024 # 4G 普通模型文件,可以根据实际要求变更 +MAX_SIZE_LIMITE_MODEL_FILE = 100 * 1024 * 1024 * 1024 # 100G 超大模型文件,需要确定能处理大文件,可以根据实际要求变更 + +PATH_WHITE_LIST_REGEX_WIN = re.compile(r"[^_:\\A-Za-z0-9/.-]") +PATH_WHITE_LIST_REGEX = re.compile(r"[^_A-Za-z0-9/.-]") + +PERMISSION_NORMAL = 0o640 # 普通文件 +PERMISSION_KEY = 0o600 # 密钥文件 +READ_FILE_NOT_PERMITTED_STAT = stat.S_IWGRP | stat.S_IWOTH +WRITE_FILE_NOT_PERMITTED_STAT = stat.S_IWGRP | stat.S_IWOTH | stat.S_IROTH | stat.S_IXOTH + +SOLUTION_LEVEL = 35 +SOLUTION_LEVEL_WIN = 45 +logging.addLevelName(SOLUTION_LEVEL, "\033[1;32m" + "SOLUTION" + "\033[0m") # green [SOLUTION] +logging.addLevelName(SOLUTION_LEVEL_WIN, "SOLUTION_WIN") +logging.basicConfig(stream=sys.stdout, level=logging.INFO, format='[%(levelname)s] %(message)s') +logger = logging.getLogger(__name__) + + +SOLUTION_BASE_URL = 'https://gitee.com/ascend/ait/wikis/ait_security_error_log_solution' +SOFT_LINK_SUB_URL = '/soft_link_error_log_solution' +PATH_LENGTH_SUB_URL = '/path_length_overflow_error_log_solution' +OWNER_SUB_URL = '/owner_or_ownergroup_error_log_solution' +PERMISSION_SUB_URL = '/path_permission_error_log_solution' +ILLEGAL_CHAR_SUB_URL = '/path_contain_illegal_char_error_log_solution' + + +def solution_log(content): + logger.log(SOLUTION_LEVEL, f"visit \033[1;32m {content} \033[0m for detailed solution") # green content + + +def solution_log_win(content): + logger.log(SOLUTION_LEVEL_WIN, f"visit {content} for detailed solution") + + +def is_legal_path_length(path): + if len(path) > 4096 and not sys.platform.startswith("win"): # linux total path length limit + logger.error(f"file total path{path} length out of range (4096), please check the file(or directory) path") + solution_log(SOLUTION_BASE_URL + PATH_LENGTH_SUB_URL) + return False + + if len(path) > 260 and sys.platform.startswith("win"): # windows total path length limit + logger.error(f"file total path{path} length out of range (260), please check the file(or directory) path") + solution_log_win(SOLUTION_BASE_URL + PATH_LENGTH_SUB_URL) + return False + + dirnames = path.split("/") + for dirname in dirnames: + if len(dirname) > 255: # linux single file path length limit + logger.error(f"file name{dirname} length out of range (255), please check the file(or directory) path") + solution_log(SOLUTION_BASE_URL + PATH_LENGTH_SUB_URL) + return False + return True + + +def is_match_path_white_list(path): + if PATH_WHITE_LIST_REGEX.search(path) and not sys.platform.startswith("win"): + logger.error(f"path:{path} contains illegal char, legal chars include A-Z a-z 0-9 _ - / .") + solution_log(SOLUTION_BASE_URL + ILLEGAL_CHAR_SUB_URL) + return False + if 
PATH_WHITE_LIST_REGEX_WIN.search(path) and sys.platform.startswith("win"): + logger.error(f"path:{path} contains illegal char, legal chars include A-Z a-z 0-9 _ - / . : \\") + solution_log_win(SOLUTION_BASE_URL + ILLEGAL_CHAR_SUB_URL) + return False + return True + + +def is_legal_args_path_string(path): + # only check path string + if not path: + return True + if not is_legal_path_length(path): + return False + if not is_match_path_white_list(path): + return False + return True + + +class OpenException(Exception): + pass + + +class FileStat: + def __init__(self, file) -> None: + if not is_legal_path_length(file) or not is_match_path_white_list(file): + raise OpenException(f"create FileStat failed") + self.file = file + self.is_file_exist = os.path.exists(file) + if self.is_file_exist: + self.file_stat = os.stat(file) + self.realpath = os.path.realpath(file) + else: + self.file_stat = None + + @property + def is_exists(self): + return self.is_file_exist + + @property + def is_softlink(self): + return os.path.islink(self.file) if self.file_stat else False + + @property + def is_file(self): + return stat.S_ISREG(self.file_stat.st_mode) if self.file_stat else False + + @property + def is_dir(self): + return stat.S_ISDIR(self.file_stat.st_mode) if self.file_stat else False + + @property + def file_size(self): + return self.file_stat.st_size if self.file_stat else 0 + + @property + def permission(self): + return stat.S_IMODE(self.file_stat.st_mode) if self.file_stat else 0o777 + + @property + def owner(self): + return self.file_stat.st_uid if self.file_stat else -1 + + @property + def group_owner(self): + return self.file_stat.st_gid if self.file_stat else -1 + + @property + def is_owner(self): + return self.owner == (os.geteuid() if hasattr(os, "geteuid") else 0) + + @property + def is_group_owner(self): + return self.group_owner in (os.getgroups() if hasattr(os, "getgroups") else [0]) + + @property + def is_user_or_group_owner(self): + return self.is_owner or self.is_group_owner + + @property + def is_user_and_group_owner(self): + return self.is_owner and self.is_group_owner + + def is_basically_legal(self, perm='none'): + if sys.platform.startswith("win"): + return self.check_windows_permission(perm) + else: + return self.check_linux_permission(perm) + + def check_linux_permission(self, perm='none'): + if not self.is_exists and perm != 'write': + logger.error(f"path: {self.file} not exist, please check if file or dir is exist") + return False + if self.is_softlink: + logger.error(f"path :{self.file} is a soft link, not supported, please import file(or directory) directly") + solution_log(SOLUTION_BASE_URL + SOFT_LINK_SUB_URL) + return False + if not self.is_user_or_group_owner and self.is_exists: + logger.error( + f"current user isn't path:{self.file}'s owner or ownergroup, make sure current user belong to file(or directory)'s owner or ownergroup" + ) + solution_log(SOLUTION_BASE_URL + OWNER_SUB_URL) + return False + if perm == 'read': + if self.permission & READ_FILE_NOT_PERMITTED_STAT > 0: + logger.error( + f"The file {self.file} is group writable, or is others writable, as import file(or directory), " + "permission should not be over 0o755(rwxr-xr-x)" + ) + solution_log(SOLUTION_BASE_URL + PERMISSION_SUB_URL) + return False + if not os.access(self.realpath, os.R_OK) or self.permission & stat.S_IRUSR == 0: + logger.error( + f"Current user doesn't have read permission to the file {self.file}, as import file(or directory), " + "permission should be at least 0o400(r--------) " + ) + 
solution_log(SOLUTION_BASE_URL + PERMISSION_SUB_URL) + return False + elif perm == 'write' and self.is_exists: + if self.permission & WRITE_FILE_NOT_PERMITTED_STAT > 0: + logger.error( + f"The file {self.file} is group writable, or is others writable, as export file(or directory), " + "permission should not be over 0o750(rwxr-x---)" + ) + solution_log(SOLUTION_BASE_URL + PERMISSION_SUB_URL) + return False + if not os.access(self.realpath, os.W_OK): + logger.error( + f"Current user doesn't have write permission to the file {self.file}, as export file(or directory), " + "permission should be at least 0o200(-w-------) " + ) + solution_log(SOLUTION_BASE_URL + PERMISSION_SUB_URL) + return False + return True + + def check_windows_permission(self, perm='none'): + if not self.is_exists and perm != 'write': + logger.error(f"path: {self.file} not exist, please check if file or dir is exist") + return False + if self.is_softlink: + logger.error(f"path :{self.file} is a soft link, not supported, please import file(or directory) directly") + solution_log(SOLUTION_BASE_URL + SOFT_LINK_SUB_URL) + return False + return True + + def is_legal_file_size(self, max_size): + if not self.is_file: + logger.error(f"path: {self.file} is not a file") + return False + if self.file_size > max_size: + logger.error(f"file_size:{self.file_size} byte out of max limit {max_size} byte") + return False + else: + return True + + def is_legal_file_type(self, file_types: list): + if not self.is_file and self.is_exists: + logger.error(f"path: {self.file} is not a file") + return False + for file_type in file_types: + if os.path.splitext(self.file)[1] == f".{file_type}": + return True + logger.error(f"path:{self.file}, file type not in {file_types}") + return False + + +def ms_open(file, mode="r", max_size=None, softlink=False, write_permission=PERMISSION_NORMAL, **kwargs): + file_stat = FileStat(file) + + if file_stat.is_exists and file_stat.is_dir: + raise OpenException(f"Expecting a file, but it's a folder. {file}") + + if "r" in mode: + if not file_stat.is_exists: + raise OpenException(f"No such file or directory {file}") + if max_size is None: + raise OpenException(f"Reading files must have a size limit control. {file}") + if max_size != MAX_SIZE_UNLIMITE and max_size < file_stat.file_size: + raise OpenException(f"The file size has exceeded the specifications and cannot be read. {file}") + + if "w" in mode: + if file_stat.is_exists and not file_stat.is_owner: + raise OpenException( + f"The file owner is inconsistent with the current process user and is not allowed to write. {file}" + ) + if file_stat.is_exists: + os.remove(file) + + if not softlink and file_stat.is_softlink: + raise OpenException(f"Softlink is not allowed to be opened. {file}") + + if "a" in mode: + if not file_stat.is_owner: + raise OpenException( + f"The file owner is inconsistent with the current process user and is not allowed to write. 
{file}" + ) + if file_stat.permission != (file_stat.permission & write_permission): + os.chmod(file, file_stat.permission & write_permission) + + flags = os.O_RDONLY + if "+" in mode: + flags = flags | os.O_RDWR + elif "w" in mode or "a" in mode or "x" in mode: + flags = flags | os.O_WRONLY + + if "w" in mode or "x" in mode: + flags = flags | os.O_TRUNC | os.O_CREAT + if "a" in mode: + flags = flags | os.O_APPEND | os.O_CREAT + return os.fdopen(os.open(file, flags, mode=write_permission), mode, **kwargs) diff --git a/tools/infer_tool/ais_bench/infer/common/utils.py b/tools/infer_tool/ais_bench/infer/common/utils.py new file mode 100644 index 0000000000000000000000000000000000000000..a3a88f116ddb96bbcd3a2ac414a410c129e55e4b --- /dev/null +++ b/tools/infer_tool/ais_bench/infer/common/utils.py @@ -0,0 +1,274 @@ +# Copyright (c) 2023-2023 Huawei Technologies Co., Ltd. All rights reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + + +import os +import sys +import stat +import re +import uuid +from pickle import NONE +import logging +from random import sample +from string import digits, ascii_uppercase, ascii_lowercase +import json +import shutil +import shlex +import subprocess +import numpy as np +from ais_bench.infer.common.path_security_check import ( + ms_open, + MAX_SIZE_LIMITE_NORMAL_FILE, + MAX_SIZE_LIMITE_CONFIG_FILE, + FileStat, + is_legal_args_path_string, +) + +logging.basicConfig(stream=sys.stdout, level=logging.INFO, format='[%(levelname)s] %(message)s') +logger = logging.getLogger(__name__) + +PERMISSION_DIR = 0o750 +READ_WRITE_FLAGS = os.O_RDWR | os.O_CREAT +WRITE_FLAGS = os.O_WRONLY | os.O_CREAT | os.O_TRUNC +WRITE_MODES = stat.S_IWUSR | stat.S_IRUSR +MSACCUCMP_FILE_PATH = "tools/operator_cmp/compare/msaccucmp.py" +CANN_PATH = "/usr/local/Ascend/ascend-toolkit/latest" + + +# Split a List Into Even Chunks of N Elements +def list_split(list_a, n, padding_file): + for x in range(0, len(list_a), n): + every_chunk = list_a[x : n + x] + + if len(every_chunk) < n: + every_chunk = every_chunk + [padding_file for _ in range(n - len(every_chunk))] + yield every_chunk + + +def list_share(list_a, count, num, left): + head = 0 + for i in range(count): + if i < left: + every_chunk = list_a[head : head + num + 1] + head = head + num + 1 + else: + every_chunk = list_a[head : head + num] + head = head + num + yield every_chunk + + +def natural_sort(lst): + convert = lambda text: int(text) if text.isdigit() else text.lower() + alphanum_key = lambda key: [convert(c) for c in re.split('([0-9]+)', key)] + return sorted(lst, key=alphanum_key) + + +def get_fileslist_from_dir(dir_): + files_list = [] + + for f in os.listdir(dir_): + f_true_path = os.path.join(dir_, f) + f_stat = FileStat(f_true_path) + if not f_stat.is_basically_legal('read'): + raise RuntimeError(f'input data:{f_true_path} is illegal') + if f_stat.is_dir: + continue + if f.endswith(".npy") or f.endswith(".NPY") or f.endswith(".bin") or f.endswith(".BIN"): + files_list.append(os.path.join(dir_, f)) + + if len(files_list) == 0: + 
logger.error('{} of input args not find valid file,valid file format:[*.npy *.NPY *.bin *.BIN]'.format(dir_)) + raise RuntimeError() + files_list.sort() + return natural_sort(files_list) + + +def get_file_datasize(file_path): + if file_path.endswith(".NPY") or file_path.endswith(".npy"): + ndata = np.load(file_path) + return ndata.nbytes + else: + return os.path.getsize(file_path) + + +def get_file_content(file_path): + if file_path.endswith(".NPY") or file_path.endswith(".npy"): + return np.load(file_path) + else: + with ms_open(file_path, mode="rb", max_size=MAX_SIZE_LIMITE_NORMAL_FILE) as fd: + barray = fd.read() + return np.frombuffer(barray, dtype=np.int8) + + +def get_ndata_fmt(ndata): + if ndata.dtype == np.float32 or ndata.dtype == np.float16 or ndata.dtype == np.float64: + fmt = "%f" + else: + fmt = "%d" + return fmt + + +def save_data_to_files(file_path, ndata): + if file_path.endswith(".NPY") or file_path.endswith(".npy"): + with ms_open(file_path, mode="wb") as f: + np.save(f, ndata) + elif file_path.endswith(".TXT") or file_path.endswith(".txt"): + outdata = ndata.reshape(-1, ndata.shape[-1]) + fmt = get_ndata_fmt(outdata) + with ms_open(file_path, mode="wb") as f: + for i in range(outdata.shape[0]): + np.savetxt(f, np.c_[outdata[i]], fmt=fmt, newline=" ") + f.write(b"\n") + else: + with ms_open(file_path, mode="wb") as f: + ndata.tofile(f) + + +def create_fake_file_name(pure_data_type, index): + suffix = "_" + pure_data_type + "_" + str(index) + loop_max = 1000 + for _ in range(loop_max): + fname = os.path.join(os.getcwd(), "tmp-" + "".join(str(uuid.uuid4())) + suffix) + if not os.path.exists(fname): + return fname + raise RuntimeError(f'create_fake_file_name failed: inner error') + + +def get_dump_relative_paths(output_dir, timestamp): + if output_dir is None or timestamp is None: + return [] + dump_dir = os.path.join(output_dir, timestamp) + dump_relative_paths = [] + for subdir, _, files in os.walk(dump_dir): + if len(files) > 0: + dump_relative_paths.append(os.path.relpath(subdir, dump_dir)) + return dump_relative_paths + + +def get_msaccucmp_path(): + ascend_toolkit_path = os.environ.get("ASCEND_TOOLKIT_HOME") + if not is_legal_args_path_string(ascend_toolkit_path): + raise TypeError(f"ASCEND_TOOLKIT_HOME:{ascend_toolkit_path} is illegal") + if ascend_toolkit_path is None: + ascend_toolkit_path = CANN_PATH + msaccucmp_path = os.path.join(ascend_toolkit_path, MSACCUCMP_FILE_PATH) + return msaccucmp_path if os.path.exists(msaccucmp_path) else None + + +def make_dirs(path): + ret = 0 + if not os.path.exists(path): + try: + os.makedirs(path, PERMISSION_DIR) + except Exception as e: + logger.warning(f"make dir {path} failed") + ret = -1 + return ret + + +def create_tmp_acl_json(acl_json_path): + with ms_open(acl_json_path, mode="r", max_size=MAX_SIZE_LIMITE_CONFIG_FILE) as f: + acl_json_dict = json.load(f) + tmp_acl_json_path, real_dump_path, tmp_dump_path = None, None, None + + # create tmp acl.json path + acl_json_path_list = acl_json_path.split("/") + acl_json_path_list[-1] = str(uuid.uuid4()) + "_" + acl_json_path_list[-1] + tmp_acl_json_path = "/".join(acl_json_path_list) + + # change acl_json_dict + if acl_json_dict.get("dump") is not None and acl_json_dict["dump"].get("dump_path") is not None: + real_dump_path = acl_json_dict["dump"]["dump_path"] + dump_path_list = real_dump_path.split("/") + if dump_path_list[-1] == "": + dump_path_list.pop() + dump_path_list.append(str(uuid.uuid4())) + tmp_dump_path = "/".join(dump_path_list) + acl_json_dict["dump"]["dump_path"] = 
tmp_dump_path + if make_dirs(tmp_dump_path) != 0: + tmp_dump_path = None + os.remove(tmp_acl_json_path) + tmp_acl_json_path = None + + if tmp_acl_json_path is not None: + with ms_open(tmp_acl_json_path, mode="w") as f: + json.dump(acl_json_dict, f) + + return tmp_acl_json_path, real_dump_path, tmp_dump_path + + +def convert_helper(output_dir, timestamp): # convert bin file in src path and output the npy file in dest path + ''' + before: + output_dir--|--2023***2--... (原来可能存在的时间戳路径) + |--2023***3--... (原来可能存在的时间戳路径) + |--timestamp--... (移动过的bin file目录) + + after: + output_dir--|--2023***2--... (原来可能存在的时间戳路径) + |--2023***3--... (原来可能存在的时间戳路径) + |--timestamp--... (移动过的bin file目录) + |--timestamp_npy--... (转换后npy保存的目录) + ''' + dump_relative_paths = get_dump_relative_paths(output_dir, timestamp) + msaccucmp_path = get_msaccucmp_path() + python_path = sys.executable + if python_path is None: + logger.error("convert_helper failed: python executable is not found. NPY file transfer failed.") + return + if msaccucmp_path is None: + logger.error("convert_helper failed: msaccucmp.py is not found. NPY file transfer failed.") + return + if dump_relative_paths == []: + logger.error("convert_helper failed: dump_relative_paths is empty. NPY file transfer failed.") + return + for dump_relative_path in dump_relative_paths: + dump_npy_path = os.path.join(output_dir, timestamp + "_npy", dump_relative_path) + real_dump_path = os.path.join(output_dir, timestamp, dump_relative_path) + convert_cmd = f"{python_path} {msaccucmp_path} convert -d {real_dump_path} -out {dump_npy_path}" + convert_cmd_list = shlex.split(convert_cmd) + ret = subprocess.call(convert_cmd_list, shell=False) + if ret != 0: + logger.error(f"convert_helper failed: cmd {convert_cmd} execute failed") + + +def move_subdir(src_dir, dest_dir): + # move the subdir in src_dir to dest_dir return dest_dir/subdir + # and remove the src_dir + ''' + before: + src_dir--2023***1--... (bin file存在的路径) + + dest_dir--|--2023***2--... (原来可能存在的时间戳路径) + |--2023***3--... (原来可能存在的时间戳路径) + + after: + dest_dir--|--2023***2--... (原来可能存在的时间戳路径) + |--2023***3--... (原来可能存在的时间戳路径) + |--2023***1--... (bin file移动到新的目录下) + ''' + res_dest, res_subdir = None, None + subdirs = os.listdir(src_dir) + if len(subdirs) != 1: + logger.error( + "move_subdir failed: multiple or none directory under src dir %s. " "The reason might be dump failed.", + src_dir, + ) + else: + if os.path.exists(os.path.join(dest_dir, subdirs[0])): + logger.error("move_subdir failed: dest dir %s exists" % os.path.join(dest_dir, subdirs[0])) + else: + shutil.move(os.path.join(src_dir, subdirs[0]), os.path.join(dest_dir, subdirs[0])) + res_dest, res_subdir = dest_dir, subdirs[0] + return res_dest, res_subdir diff --git a/tools/infer_tool/ais_bench/infer/infer_process.py b/tools/infer_tool/ais_bench/infer/infer_process.py new file mode 100644 index 0000000000000000000000000000000000000000..3ae667657a7009be09165ad781310c682488527a --- /dev/null +++ b/tools/infer_tool/ais_bench/infer/infer_process.py @@ -0,0 +1,753 @@ +# Copyright (c) 2023-2023 Huawei Technologies Co., Ltd. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 
+# See the License for the specific language governing permissions and +# limitations under the License. + +import logging +import math +import os +import sys +import time +import json +import shutil +import copy +import shlex +import re +import subprocess +import fcntl +from multiprocessing import Pool +from multiprocessing import Manager +import numpy as np + +from tqdm import tqdm + +from ais_bench.infer.interface import InferSession, MemorySummary +from ais_bench.infer.common.io_operations import (create_infileslist_from_inputs_list, + create_pipeline_fileslist_from_inputs_list, + create_intensors_from_infileslist, + get_narray_from_files_list, + get_tensor_from_files_list, + convert_real_files, + PURE_INFER_FAKE_FILE_ZERO, + PURE_INFER_FAKE_FILE_RANDOM, + PURE_INFER_FAKE_FILE, save_tensors_to_file, + get_pure_infer_data) +from ais_bench.infer.summary import summary +from ais_bench.infer.common.miscellaneous import (dymshape_range_run, get_acl_json_path, version_check, + get_batchsize, ACL_JSON_CMD_LIST) +from ais_bench.infer.common.utils import (get_file_content, get_file_datasize, + get_fileslist_from_dir, list_split, list_share, + save_data_to_files, create_fake_file_name, logger, + create_tmp_acl_json, move_subdir, convert_helper) +from ais_bench.infer.common.path_security_check import is_legal_args_path_string +from ais_bench.infer.args_adapter import AISBenchInferArgsAdapter +from ais_bench.infer.backends import BackendFactory +from ais_bench.infer.common.path_security_check import ms_open, MAX_SIZE_LIMITE_CONFIG_FILE + +PERMISSION_DIR = 0o750 +logging.basicConfig(stream=sys.stdout, level=logging.INFO, format='[%(levelname)s] %(message)s') +logger = logging.getLogger(__name__) + + +def set_session_options(session, args): + # 增加校验 + aipp_batchsize = -1 + if args.dym_batch != 0: + session.set_dynamic_batchsize(args.dym_batch) + aipp_batchsize = session.get_max_dym_batchsize() + elif args.dym_hw is not None: + hwstr = args.dym_hw.split(",") + session.set_dynamic_hw((int)(hwstr[0]), (int)(hwstr[1])) + elif args.dym_dims is not None: + session.set_dynamic_dims(args.dym_dims) + elif args.dym_shape is not None: + session.set_dynamic_shape(args.dym_shape) + else: + session.set_staticbatch() + + if args.batchsize is None: + args.batchsize = get_batchsize(session, args) + logger.info(f"try get model batchsize:{args.batchsize}") + + if not args.auto_set_dymshape_mode and not args.auto_set_dymdims_mode: + if args.batchsize < 0 and not args.dym_batch and not args.dym_dims and not args.dym_shape: + raise RuntimeError('dynamic batch om model detected, but dymbatch, dymdims or dymshape not set!') + + if aipp_batchsize < 0: + aipp_batchsize = args.batchsize + + # 确认模型只有一个动态 aipp input + if args.dym_shape is not None or args.auto_set_dymshape_mode: + aipp_input_exist = 0 + else: + aipp_input_exist = session.get_dym_aipp_input_exist() + logger.debug(f"aipp_input_exist: {aipp_input_exist}") + if (args.aipp_config is not None) and (aipp_input_exist == 1): + session.load_aipp_config_file(args.aipp_config, aipp_batchsize) + session.check_dym_aipp_input_exist() + elif (args.aipp_config is None) and (aipp_input_exist == 1): + logger.error("can't find aipp config file for model with dym aipp input , please check it!") + raise RuntimeError('aipp model without aipp config!') + elif (aipp_input_exist > 1): + logger.error(f"don't support more than one dynamic aipp input in model, \ + amount of aipp input is {aipp_input_exist}") + raise RuntimeError('aipp model has more than 1 aipp input!') + elif 
(aipp_input_exist == -1): + raise RuntimeError('aclmdlGetAippType failed!') + + # 设置custom out tensors size + if args.output_size is not None: + customsizes = [int(n) for n in args.output_size.split(',')] + logger.debug(f"set customsize:{customsizes}") + session.set_custom_outsize(customsizes) + + +def init_inference_session(args, acl_json_path): + session = InferSession(args.device, args.model, acl_json_path, args.debug, args.loop) + + set_session_options(session, args) + logger.debug(f"session info:{session.session}") + return session + + +def set_dymshape_shape(session, inputs): + shape_list = [] + intensors_desc = session.get_inputs() + for i, input_ in enumerate(inputs): + str_shape = [str(shape) for shape in input_.shape] + shapes = ",".join(str_shape) + dyshape = f"{intensors_desc[i].name}:{shapes}" + shape_list.append(dyshape) + dyshapes = ';'.join(shape_list) + logger.debug(f"set dymshape shape:{dyshapes}") + session.set_dynamic_shape(dyshapes) + summary.add_batchsize(inputs[0].shape[0]) + + +def set_dymdims_shape(session, inputs): + shape_list = [] + intensors_desc = session.get_inputs() + for i, input_ in enumerate(inputs): + str_shape = [str(shape) for shape in input_.shape] + shapes = ",".join(str_shape) + dydim = f"{intensors_desc[i].name}:{shapes}" + shape_list.append(dydim) + dydims = ';'.join(shape_list) + logger.debug(f"set dymdims shape:{dydims}") + session.set_dynamic_dims(dydims) + summary.add_batchsize(inputs[0].shape[0]) + + +def warmup(session, args, intensors_desc, infiles): + # prepare input data + infeeds = [] + for j, files in enumerate(infiles): + if args.run_mode == "tensor": + tensor = get_tensor_from_files_list(files, session, intensors_desc[j].realsize, + args.pure_data_type, args.no_combine_tensor_mode) + infeeds.append(tensor) + else: + narray = get_narray_from_files_list(files, intensors_desc[j].realsize, + args.pure_data_type, args.no_combine_tensor_mode) + infeeds.append(narray) + session.set_loop_count(1) + # warmup + for _ in range(args.warmup_count): + outputs = run_inference(session, args, infeeds, out_array=True) + + session.set_loop_count(args.loop) + + # reset summary info + summary.reset() + session.reset_summaryinfo() + MemorySummary.reset() + logger.info(f"warm up {args.warmup_count} done") + + +def run_inference(session, args, inputs, out_array=False): + if args.auto_set_dymshape_mode: + set_dymshape_shape(session, inputs) + elif args.auto_set_dymdims_mode: + set_dymdims_shape(session, inputs) + outputs = session.run(inputs, out_array) + return outputs + + +def run_pipeline_inference(session, args, infileslist, output_prefix, extra_session): + out = output_prefix if output_prefix is not None else "" + pure_infer_mode = False + if args.input is None: + pure_infer_mode = True + session.run_pipeline(infileslist, + out, + args.auto_set_dymshape_mode, + args.auto_set_dymdims_mode, + args.outfmt, + pure_infer_mode, + [s.session for s in extra_session]) + + +# tensor to loop infer +def infer_loop_tensor_run(session, args, intensors_desc, infileslist, output_prefix): + for i, infiles in enumerate(tqdm(infileslist, file=sys.stdout, desc='Inference tensor Processing')): + intensors = [] + for j, files in enumerate(infiles): + tensor = get_tensor_from_files_list(files, session, intensors_desc[j].realsize, + args.pure_data_type, args.no_combine_tensor_mode) + intensors.append(tensor) + outputs = run_inference(session, args, intensors) + session.convert_tensors_to_host(outputs) + if output_prefix is not None: + save_tensors_to_file( + outputs, 
output_prefix, infiles, + args.outfmt, i, args.output_batchsize_axis + ) + + +# files to loop iner +def infer_loop_files_run(session, args, intensors_desc, infileslist, output_prefix): + for i, infiles in enumerate(tqdm(infileslist, file=sys.stdout, desc='Inference files Processing')): + intensors = [] + for j, files in enumerate(infiles): + real_files = convert_real_files(files) + tensor = session.create_tensor_from_fileslist(intensors_desc[j], real_files) + intensors.append(tensor) + outputs = run_inference(session, args, intensors) + session.convert_tensors_to_host(outputs) + if output_prefix is not None: + save_tensors_to_file( + outputs, output_prefix, infiles, + args.outfmt, i, args.output_batchsize_axis + ) + + +# First prepare the data, then execute the reference, and then write the file uniformly +def infer_fulltensors_run(session, args, intensors_desc, infileslist, output_prefix): + outtensors = [] + intensorslist = create_intensors_from_infileslist(infileslist, intensors_desc, session, + args.pure_data_type, args.no_combine_tensor_mode) + + for inputs in tqdm(intensorslist, file=sys.stdout, desc='Inference Processing full'): + outputs = run_inference(session, args, inputs) + outtensors.append(outputs) + + for i, outputs in enumerate(outtensors): + session.convert_tensors_to_host(outputs) + if output_prefix is not None: + save_tensors_to_file( + outputs, output_prefix, infileslist[i], + args.outfmt, i, args.output_batchsize_axis + ) + + +# loop numpy array to infer +def infer_loop_array_run(session, args, intensors_desc, infileslist, output_prefix): + for i, infiles in enumerate(tqdm(infileslist, file=sys.stdout, desc='Inference array Processing')): + innarrays = [] + for j, files in enumerate(infiles): + narray = get_narray_from_files_list(files, intensors_desc[j].realsize, args.pure_data_type) + innarrays.append(narray) + outputs = run_inference(session, args, innarrays) + session.convert_tensors_to_host(outputs) + if args.output is not None: + save_tensors_to_file( + outputs, output_prefix, infiles, + args.outfmt, i, args.output_batchsize_axis + ) + + +def infer_pipeline_run(session, args, infileslist, output_prefix, extra_session): + logger.info(f"run in pipeline mode with computing threadsnumber:{args.threads}") + run_pipeline_inference(session, args, infileslist, output_prefix, extra_session) + + +def get_file_name(file_path: str, suffix: str, res_file_path: list) -> list: + """获取路径下的指定文件类型后缀的文件 + Args: + file_path: 文件夹的路径 + suffix: 要提取的文件类型的后缀 + res_file_path: 保存返回结果的列表 + Returns: 文件路径 + """ + for file in os.listdir(file_path): + + if os.path.isdir(os.path.join(file_path, file)): + get_file_name(os.path.join(file_path, file), suffix, res_file_path) + else: + res_file_path.append(os.path.join(file_path, file)) + # endswith:表示以suffix结尾。可根据需要自行修改;如:startswith:表示以suffix开头,__contains__:包含suffix字符串 + if suffix == '' or suffix is None: + return res_file_path + else: + return list(filter(lambda x: x.endswith(suffix), res_file_path)) + + +def get_legal_json_content(acl_json_path): + cmd_dict = {} + with ms_open(acl_json_path, mode="r", max_size=MAX_SIZE_LIMITE_CONFIG_FILE) as f: + json_dict = json.load(f) + profile_dict = json_dict.get("profiler") + for option_cmd in ACL_JSON_CMD_LIST: + if profile_dict.get(option_cmd): + if option_cmd == "output" and not is_legal_args_path_string(profile_dict.get(option_cmd)): + raise Exception(f"output path in acl_json is illegal!") + cmd_dict.update({"--" + option_cmd.replace('_', '-'): profile_dict.get(option_cmd)}) + if (option_cmd == 
"sys_hardware_mem_freq"): + cmd_dict.update({"--sys-hardware-mem": "on"}) + if (option_cmd == "sys_interconnection_freq"): + cmd_dict.update({"--sys-interconnection-profiling": "on"}) + if (option_cmd == "dvpp_freq"): + cmd_dict.update({"--dvpp-profiling": "on"}) + return cmd_dict + + +def json_to_msprof_cmd(acl_json_path): + json_dict = get_legal_json_content(acl_json_path) + msprof_option_cmd = " ".join([f"{key}={value}" for key, value in json_dict.items()]) + return msprof_option_cmd + + +def regenerate_cmd(args:AISBenchInferArgsAdapter): + args_dict = args.get_all_args_dict() + cmd = sys.executable + " -m ais_bench" + for key, value in args_dict.items(): + if key == '--acl_json_path': + continue + if key == '--warmup_count': + cmd = cmd + " " + f"{key}={0}" + continue + if key == '--profiler': + cmd = cmd + " " + f"{key}={0}" + continue + if value: + cmd = cmd + " " + f"{key}={value}" + return cmd + + +def msprof_run_profiling(args, msprof_bin): + if args.acl_json_path is not None: + # acl.json to msprof cmd + args.profiler_rename = False + cmd = regenerate_cmd(args) + msprof_cmd = f"{msprof_bin} --application=\"{cmd}\" " + json_to_msprof_cmd(args.acl_json_path) + else: + # default msprof cmd + cmd = regenerate_cmd(args) + msprof_cmd = f"{msprof_bin} --output={args.output}/profiler --application=\"{cmd}\" --model-execution=on \ + --sys-hardware-mem=on --sys-cpu-profiling=off --sys-profiling=off --sys-pid-profiling=off \ + --dvpp-profiling=on --runtime-api=on --task-time=on --aicpu=on" \ + + ret = -1 + msprof_cmd_list = shlex.split(msprof_cmd) + logger.info(f"msprof cmd:{msprof_cmd} begin run") + if (args.profiler_rename): + p = subprocess.Popen(msprof_cmd_list, stdout=subprocess.PIPE, shell=False, bufsize=0) + flags = fcntl.fcntl(p.stdout, fcntl.F_GETFL) + fcntl.fcntl(p.stdout, fcntl.F_SETFL, flags | os.O_NONBLOCK) + + get_path_flag = True + sub_str = "" + for line in iter(p.stdout.read, b''): + if not line: + continue + line = line.decode() + if (get_path_flag and line.find("PROF_") != -1): + get_path_flag = False + start_index = line.find("PROF_") + sub_str = line[start_index:(start_index + 46)] # PROF_XXXX的目录长度为46 + print(f'{line}', flush=True, end="") + p.stdout.close() + p.wait() + + output_prefix = os.path.join(args.output, "profiler") + output_prefix = os.path.join(output_prefix, sub_str) + hash_str = sub_str.rsplit('_')[-1] + file_name = get_file_name(output_prefix, ".csv", []) + file_name_json = get_file_name(output_prefix, ".json", []) + + model_name = os.path.basename(args.model).split(".")[0] + for file in file_name: + real_file = os.path.splitext(file)[0] + os.rename(file, real_file + "_" + model_name + "_" + hash_str + ".csv") + for file in file_name_json: + real_file = os.path.splitext(file)[0] + os.rename(file, real_file + "_" + model_name + "_" + hash_str + ".json") + ret = 0 + else: + ret = subprocess.call(msprof_cmd_list, shell=False) + logger.info(f"msprof cmd:{msprof_cmd} end run ret:{ret}") + return ret + + +def get_energy_consumption(npu_id): + cmd = f"npu-smi info -t power -i {npu_id}" + get_npu_id = subprocess.run(cmd.split(), shell=False, stdout=subprocess.PIPE, stderr=subprocess.PIPE) + npu_id = get_npu_id.stdout.decode('gb2312') + power = [] + npu_id = npu_id.split("\n") + for key in npu_id: + if key.find("Power Dissipation(W)", 0, len(key)) != -1: + power = key[34:len(key)] + break + + return power + + +def convert(tmp_acl_json_path, real_dump_path, tmp_dump_path): + if real_dump_path is not None and tmp_dump_path is not None: + output_dir, timestamp = 
move_subdir(tmp_dump_path, real_dump_path) + convert_helper(output_dir, timestamp) + if tmp_dump_path is not None: + shutil.rmtree(tmp_dump_path) + if tmp_acl_json_path is not None: + os.remove(tmp_acl_json_path) + + +def main(args, index=0, msgq=None, device_list=None): + # if msgq is not None,as subproces run + if msgq is not None: + logger.info(f"subprocess_{index} main run") + + if args.debug: + logger.setLevel(logging.DEBUG) + + acl_json_path = get_acl_json_path(args) + tmp_acl_json_path = None + if args.dump_npy and acl_json_path is not None: + tmp_acl_json_path, real_dump_path, tmp_dump_path = create_tmp_acl_json(acl_json_path) + + session = init_inference_session(args, tmp_acl_json_path if tmp_acl_json_path is not None else acl_json_path) + # if pipeline is set and threads number is > 1, create a session pool for extra computing + extra_session = [] + if args.pipeline: + extra_session = [init_inference_session(args, tmp_acl_json_path if tmp_acl_json_path is not None\ + else acl_json_path) for _ in range(args.threads - 1)] + + intensors_desc = session.get_inputs() + if device_list is not None and len(device_list) > 1: + if args.output is not None: + if args.output_dirname is None: + timestr = time.strftime("%Y_%m_%d-%H_%M_%S") + output_prefix = os.path.join(args.output, timestr) + output_prefix = os.path.join(output_prefix, "device" + str(device_list[index]) + "_" + str(index)) + else: + output_prefix = os.path.join(args.output, args.output_dirname) + output_prefix = os.path.join(output_prefix, "device" + str(device_list[index]) + "_" + str(index)) + if not os.path.exists(output_prefix): + os.makedirs(output_prefix, PERMISSION_DIR) + os.chmod(args.output, PERMISSION_DIR) + logger.info(f"output path:{output_prefix}") + else: + output_prefix = None + else: + if args.output is not None: + if args.output_dirname is None: + timestr = time.strftime("%Y_%m_%d-%H_%M_%S") + output_prefix = os.path.join(args.output, timestr) + else: + output_prefix = os.path.join(args.output, args.output_dirname) + if not os.path.exists(output_prefix): + os.makedirs(output_prefix, PERMISSION_DIR) + os.chmod(args.output, PERMISSION_DIR) + logger.info(f"output path:{output_prefix}") + else: + output_prefix = None + + inputs_list = [] if args.input is None else args.input.split(',') + + # create infiles list accord inputs list + if len(inputs_list) == 0: + # Pure reference scenario. 
Create input zero data + if not args.pipeline: + infileslist = [[[PURE_INFER_FAKE_FILE] for _ in intensors_desc]] + else: + infileslist = [[]] + pure_file = PURE_INFER_FAKE_FILE_ZERO if args.pure_data_type == "zero" else PURE_INFER_FAKE_FILE_RANDOM + for _ in intensors_desc: + infileslist[0].append(pure_file) + else: + if not args.pipeline: + infileslist = create_infileslist_from_inputs_list(inputs_list, intensors_desc, args.no_combine_tensor_mode) + else: + infileslist = create_pipeline_fileslist_from_inputs_list(inputs_list, intensors_desc) + if not args.pipeline: + warmup(session, args, intensors_desc, infileslist[0]) + else: + # prepare for pipeline case + infiles = [] + for file in infileslist[0]: + infiles.append([file]) + warmup(session, args, intensors_desc, infiles) + for sess in extra_session: + warmup(sess, args, intensors_desc, infiles) + + if args.pipeline and (args.auto_set_dymshape_mode or args.auto_set_dymdims_mode): + for file_list in infileslist: + input_first = np.load(file_list[0]) + summary.add_batchsize(input_first.shape[0]) + + if msgq is not None: + # wait subprocess init ready, if time eplapsed, force ready run + logger.info(f"subprocess_{index} qsize:{msgq.qsize()} now waiting") + msgq.put(index) + time_sec = 0 + while True: + if msgq.qsize() >= args.subprocess_count: + break + time_sec = time_sec + 1 + if time_sec > 10: + logger.warning(f"subprocess_{index} qsize:{msgq.qsize()} time:{time_sec} s elapsed") + break + time.sleep(1) + logger.info(f"subprocess_{index} qsize:{msgq.qsize()} ready to infer run") + + start_time = time.time() + if args.energy_consumption: + start_energy_consumption = get_energy_consumption(args.npu_id) + if args.pipeline: + infer_pipeline_run(session, args, infileslist, output_prefix, extra_session) + else: + run_mode_switch = { + "array": infer_loop_array_run, + "files": infer_loop_files_run, + "full": infer_fulltensors_run, + "tensor": infer_loop_tensor_run + } + if run_mode_switch.get(args.run_mode) is not None: + run_mode_switch.get(args.run_mode)(session, args, intensors_desc, infileslist, output_prefix) + else: + raise RuntimeError(f'wrong run_mode:{args.run_mode}') + if args.energy_consumption: + end_energy_consumption = get_energy_consumption(args.npu_id) + end_time = time.time() + + multi_threads_mode = args.threads > 1 and args.pipeline + summary.add_args(sys.argv) + s = session.summary() + if multi_threads_mode: + summary.npu_compute_time_interval_list = s.exec_time_list + else: + summary.npu_compute_time_list = [end_time - start_time for start_time, end_time in s.exec_time_list] + summary.h2d_latency_list = MemorySummary.get_h2d_time_list() + summary.d2h_latency_list = MemorySummary.get_d2h_time_list() + summary.report(args.batchsize, output_prefix, args.display_all_summary, multi_threads_mode) + try: + if args.energy_consumption: + energy_consumption = ((float(end_energy_consumption) + float(start_energy_consumption)) / 2.0) \ + * (end_time - start_time) + logger.info(f"NPU ID:{args.npu_id} energy consumption(J):{energy_consumption}") + except AttributeError as err: + logger.error(f"Attribute Access Error: {err}") + raise RuntimeError("Error accessing an attribute, please verify if the NPU ID is correct. 
") from err + except Exception as err: + logger.error(f"Unexpected Error: {err}") + raise RuntimeError( + "Energy consumption append an unexpected error occurred, please check the input parameters.") from err + + if msgq is not None: + # put result to msgq + msgq.put([index, summary.infodict['throughput'], start_time, end_time]) + + session.free_resource() + for sess in extra_session: + sess.free_resource() + + InferSession.finalize() + + if args.dump_npy and acl_json_path is not None: + convert(tmp_acl_json_path, real_dump_path, tmp_dump_path) + + +def print_subproces_run_error(value): + logger.error(f"subprocess run failed error_callback:{value}") + + +def seg_input_data_for_multi_process(args, inputs, jobs): + inputs_list = [] if inputs is None else inputs.split(',') + if inputs_list is None: + return inputs_list + + fileslist = [] + if os.path.isfile(inputs_list[0]): + fileslist = inputs_list + elif os.path.isdir(inputs_list[0]): + for dir_path in inputs_list: + fileslist.extend(get_fileslist_from_dir(dir_path)) + else: + logger.error(f'error {inputs_list[0]} not file or dir') + raise RuntimeError() + + args.device = 0 + acl_json_path = get_acl_json_path(args) + session = init_inference_session(args, acl_json_path) + intensors_desc = session.get_inputs() + try: + chunks_elements = math.ceil(len(fileslist) / len(intensors_desc)) + except ZeroDivisionError as err: + logger.error("ZeroDivisionError: intensors_desc is empty") + raise RuntimeError("error zero division") from err + chunks = list(list_split(fileslist, chunks_elements, None)) + fileslist = [[] for _ in range(jobs)] + for _, chunk in enumerate(chunks): + try: + splits_elements = int(len(chunk) / jobs) + except ZeroDivisionError as err: + logger.error("ZeroDivisionError: intensors_desc is empty") + raise RuntimeError("error zero division") from err + splits_left = len(chunk) % jobs + splits = list(list_share(chunk, jobs, splits_elements, splits_left)) + for j, split in enumerate(splits): + fileslist[j].extend(split) + res = [] + for files in fileslist: + res.append(','.join(list(filter(None, files)))) + return res + + +def multidevice_run(args): + logger.info(f"multidevice:{args.device} run begin") + device_list = args.device + npu_id_list = args.npu_id + p = Pool(len(device_list)) + msgq = Manager().Queue() + args.subprocess_count = len(device_list) + splits = None + if (args.input is not None and args.divide_input): + jobs = args.subprocess_count + splits = seg_input_data_for_multi_process(args, args.input, jobs) + + for i, device in enumerate(device_list): + cur_args = copy.deepcopy(args) + cur_args.device = int(device) + if args.energy_consumption: + cur_args.npu_id = int(npu_id_list[i]) + if args.divide_input: + cur_args.input = None if splits is None else list(splits)[i] + p.apply_async(main, args=(cur_args, i, msgq, device_list), error_callback=print_subproces_run_error) + + p.close() + p.join() + result = 0 if 2 * len(device_list) == msgq.qsize() else 1 + logger.info(f"multidevice run end qsize:{msgq.qsize()} result:{result}") + tlist = [] + while msgq.qsize() != 0: + ret = msgq.get() + if type(ret) == list: + logger.info(f"i:{ret[0]} device_{device_list[ret[0]]} throughput:{ret[1]} \ + start_time:{ret[2]} end_time:{ret[3]}") + tlist.append(ret[1]) + logger.info(f'summary throughput:{sum(tlist)}') + return result + + +def args_rules(args): + if args.profiler and args.dump: + logger.error("parameter --profiler cannot be true at the same time as parameter --dump, please check them!\n") + raise RuntimeError('error bad 
parameters --profiler and --dump') + + if (args.profiler or args.dump) and (args.output is None): + logger.error("when dump or profiler, miss output path, please check them!") + raise RuntimeError('miss output parameter!') + + if not args.auto_set_dymshape_mode and not args.auto_set_dymdims_mode: + args.no_combine_tensor_mode = False + else: + args.no_combine_tensor_mode = True + + if args.profiler and args.warmup_count != 0 and args.input is not None: + logger.info("profiler mode with input change warmup_count to 0") + args.warmup_count = 0 + + if args.output is None and args.output_dirname is not None: + logger.error( + "parameter --output_dirname cann't be used alone. Please use it together with the parameter --output!\n") + raise RuntimeError('error bad parameters --output_dirname') + + if args.threads > 1 and not args.pipeline: + logger.info("need to set --pipeline when setting threads number to be more than one.") + args.threads = 1 + + return args + + +def acl_json_base_check(args): + if args.acl_json_path is None: + return args + json_path = args.acl_json_path + try: + with ms_open(json_path, mode="r", max_size=MAX_SIZE_LIMITE_CONFIG_FILE) as f: + json_dict = json.load(f) + except Exception as err: + logger.error(f"can't read acl_json_path:{json_path}") + raise Exception from err + if json_dict.get("profiler") is not None and json_dict.get("profiler").get("switch") == "on": + args.profiler = True + if json_dict.get("dump") is not None: + args.profiler = False + return args + + +def config_check(config_path): + if not config_path: + return + max_config_size = 12800 + if os.path.splitext(config_path)[1] != ".config": + logger.error(f"aipp_config:{config_path} is not a .config file") + raise TypeError(f"aipp_config:{config_path} is not a .config file") + config_size = os.path.getsize(config_path) + if config_size > max_config_size: + logger.error(f"aipp_config_size:{config_size} byte out of max limit {max_config_size} byte") + raise MemoryError(f"aipp_config_size:{config_size} byte out of max limit") + return + + +def backend_run(args): + backend_class = BackendFactory.create_backend(args.backend) + backend = backend_class(args) + backend.load(args.model) + backend.run() + perf = backend.get_perf() + logger.info(f"perf info:{perf}") + + +def infer_process(args:AISBenchInferArgsAdapter): + args = args_rules(args) + version_check(args) + args = acl_json_base_check(args) + + if args.perf: + backend_run(args) + return 0 + + if args.profiler: + # try use msprof to run + msprof_bin = shutil.which('msprof') + if msprof_bin is None: + logger.info("find no msprof continue use acl.json mode, result won't be parsed as csv") + elif os.getenv('AIT_NO_MSPROF_MODE') == '1': + logger.info("find AIT_NO_MSPROF_MODE set, continue use acl.json mode, result won't be parsed as csv") + else: + ret = msprof_run_profiling(args, msprof_bin) + return ret + + if args.dym_shape_range is not None and args.dym_shape is None: + # dymshape range run,according range to run each shape infer get best shape + dymshape_range_run(args) + return 0 + + if type(args.device) == list: + # args has multiple device, run single process for each device + ret = multidevice_run(args) + return ret + + main(args) + return 0 diff --git a/tools/infer_tool/ais_bench/infer/interface.py b/tools/infer_tool/ais_bench/infer/interface.py new file mode 100644 index 0000000000000000000000000000000000000000..7719a8e6b3b7138bb34154497b8f7bcbb4bf9774 --- /dev/null +++ b/tools/infer_tool/ais_bench/infer/interface.py @@ -0,0 +1,889 @@ +# Copyright (c) 
2023-2023 Huawei Technologies Co., Ltd. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +import logging +import time +import sys +from configparser import ConfigParser +from multiprocessing import Pool +from multiprocessing import Manager +import numpy as np +import aclruntime + + +SRC_IMAGE_SIZE_W_MIN = 2 +SRC_IMAGE_SIZE_W_MAX = 4096 +SRC_IMAGE_SIZE_H_MIN = 1 +SRC_IMAGE_SIZE_H_MAX = 4096 +RBUV_SWAP_SWITCH_OFF = 0 +RBUV_SWAP_SWITCH_ON = 1 +AX_SWAP_SWITCH_OFF = 0 +AX_SWAP_SWITCH_ON = 1 +CSC_SWITCH_OFF = 0 +CSC_SWITCH_ON = 0 +CSC_MATRIX_MIN = -32677 +CSC_MATRIX_MAX = 32676 +CROP_SWITCH_OFF = 0 +CROP_SWITCH_ON = 1 +LOAD_START_POS_W_MIN = 0 +LOAD_START_POS_W_MAX = 4095 +LOAD_START_POS_H_MIN = 0 +LOAD_START_POS_H_MAX = 4095 +CROP_POS_W_MIN = 1 +CROP_POS_W_MAX = 4096 +CROP_POS_H_MIN = 1 +CROP_POS_H_MAX = 4096 +PADDING_SWITCH_OFF = 0 +PADDING_SWITCH_ON = 1 +PADDING_SIZE_MIN = 0 +PADDING_SIZE_MAX = 32 +PIXEL_MEAN_CHN_MIN = 0 +PIXEL_MEAN_CHN_MAX = 255 +PIXEL_MIN_CHN_MIN = 0 +PIXEL_MIN_CHN_MAX = 255 +PIXEL_VAR_RECI_CHN_MIN = -65504 +PIXEL_VAR_RECI_CHN_MAX = 65504 + +TORCH_TENSOR_LIST = [ + 'torch.FloatTensor', 'torch.DoubleTensor', 'torch.HalfTensor', 'torch.BFloat16Tensor', + 'torch.ByteTensor', 'torch.CharTensor', 'torch.ShortTensor', 'torch.LongTensor', + 'torch.BoolTensor', 'torch.IntTensor' +] +NP_TYPE_LIST = [ + np.int8, np.int16, np.int32, np.int64, np.uint8, np.uint16, + np.uint32, np.float16, np.float32, np.float64 +] + +logger = logging.getLogger(__name__) + + +class InferSession: + def __init__(self, device_id: int, model_path: str, acl_json_path: str = None, + debug: bool = False, loop: int = 1): + """ + init InferSession + + Args: + device_id: device id for npu device + model_path: om model path to load + acl_json_path: set acl_json_path to enable profiling or dump function + debug: enable debug log. Default: False + loop: loop count for one inference. 
Default: 1 + """ + self.device_id = device_id + self.model_path = model_path + self.loop = loop + self.options = aclruntime.session_options() + self.acl_json_path = acl_json_path + self.debug = debug + if acl_json_path is not None: + self.options.acl_json_path = self.acl_json_path + self.options.log_level = 1 if self.debug else 2 + self.options.loop = self.loop + self.session = aclruntime.InferenceSession(self.model_path, self.device_id, self.options) + self.outputs_names = [meta.name for meta in self.session.get_outputs()] + self.intensors_desc = self.session.get_inputs() + self.outtensors_desc = self.session.get_outputs() + self.infer_mode_switch = { + "static": self._static_prepare, + "dymbatch": self._dymbatch_prepare, + "dymhw": self._dymhw_prepare, + "dymdims": self._dymdims_prepare, + "dymshape": self._dymshape_prepare + } + + @staticmethod + def convert_tensors_to_host(tensors): + for tensor in tensors: + tensor.to_host() + + @staticmethod + def convert_tensors_to_arrays(tensors): + arrays = [] + for tensor in tensors: + # convert acltensor to numpy array + arrays.append(np.array(tensor)) + return arrays + + @staticmethod + def finalize(): + if hasattr(aclruntime.InferenceSession, 'finalize'): + aclruntime.InferenceSession.finalize() + + def get_inputs(self): + """ + get inputs info of model + """ + self.intensors_desc = self.session.get_inputs() + return self.intensors_desc + + def get_outputs(self): + """ + get outputs info of model + """ + self.outtensors_desc = self.session.get_outputs() + return self.outtensors_desc + + def set_loop_count(self, loop): + options = self.session.options() + options.loop = loop + + # 默认设置为静态batch + def set_staticbatch(self): + self.session.set_staticbatch() + + def set_dynamic_batchsize(self, dym_batch: str): + self.session.set_dynamic_batchsize(dym_batch) + + def set_dynamic_hw(self, w: int, h: int): + self.session.set_dynamic_hw(w, h) + + def get_max_dym_batchsize(self): + return self.session.get_max_dym_batchsize() + + def set_dynamic_dims(self, dym_dims: str): + self.session.set_dynamic_dims(dym_dims) + + def set_dynamic_shape(self, dym_shape: str): + self.session.set_dynamic_shape(dym_shape) + + def set_custom_outsize(self, custom_sizes): + self.session.set_custom_outsize(custom_sizes) + + def create_tensor_from_fileslist(self, desc, files): + return self.session.create_tensor_from_fileslist(desc, files) + + def create_tensor_from_arrays_to_device(self, arrays): + tensor = aclruntime.Tensor(arrays) + tensor.to_device(self.device_id) + return tensor + + def get_dym_aipp_input_exist(self): + return self.session.get_dym_aipp_input_exist() + + def check_dym_aipp_input_exist(self): + self.session.check_dym_aipp_input_exist() + + def load_aipp_config_file(self, config_file, batchsize): + cfg = ConfigParser() + cfg.read(config_file, 'UTF-8') + session_list = cfg.sections() + #多个aipp输入不支持 + if (session_list.count('aipp_op') != 1): + logger.error("nums of section aipp_op in .config file is not supported, please check it!") + raise RuntimeError('wrong aipp config file content!') + option_list = cfg.options('aipp_op') + if (option_list.count('input_format') == 1): + self.aipp_set_input_format(cfg) + else: + logger.error("can not find input_format in config file, please check it!") + raise RuntimeError('wrong aipp config file content!') + + if (option_list.count('src_image_size_w') == 1 and option_list.count('src_image_size_h') == 1): + self.aipp_set_src_image_size(cfg) + else: + logger.error("can not find src_image_size in config file, please check 
it!") + raise RuntimeError('wrong aipp config file content!') + self.session.aipp_set_max_batch_size(batchsize) + self.aipp_set_rbuv_swap_switch(cfg, option_list) + self.aipp_set_ax_swap_switch(cfg, option_list) + self.aipp_set_csc_params(cfg, option_list) + self.aipp_set_crop_params(cfg, option_list) + self.aipp_set_padding_params(cfg, option_list) + self.aipp_set_dtc_pixel_mean(cfg, option_list) + self.aipp_set_dtc_pixel_min(cfg, option_list) + self.aipp_set_pixel_var_reci(cfg, option_list) + + ret = self.session.set_dym_aipp_info_set() + return ret + + def aipp_set_input_format(self, cfg): + input_format = cfg.get('aipp_op', 'input_format') + legal_format = ["YUV420SP_U8", "XRGB8888_U8", "RGB888_U8", "YUV400_U8"] + if (legal_format.count(input_format) == 1): + self.session.aipp_set_input_format(input_format) + else: + logger.error("input_format in config file is illegal, please check it!") + raise RuntimeError('wrong aipp config file content!') + + def aipp_set_src_image_size(self, cfg): + src_image_size = list() + tmp_size_w = cfg.getint('aipp_op', 'src_image_size_w') + tmp_size_h = cfg.getint('aipp_op', 'src_image_size_h') + if (SRC_IMAGE_SIZE_W_MIN <= tmp_size_w <= SRC_IMAGE_SIZE_W_MAX): + src_image_size.append(tmp_size_w) + else: + logger.error("src_image_size_w in config file out of range, please check it!") + raise RuntimeError('wrong aipp config file content!') + if (SRC_IMAGE_SIZE_H_MIN <= tmp_size_h <= SRC_IMAGE_SIZE_H_MAX): + src_image_size.append(tmp_size_h) + else: + logger.error("src_image_size_h in config file out of range, please check it!") + raise RuntimeError('wrong aipp config file content!') + + self.session.aipp_set_src_image_size(src_image_size) + + def aipp_set_rbuv_swap_switch(self, cfg, option_list): + if (option_list.count('rbuv_swap_switch') == 0): + self.session.aipp_set_rbuv_swap_switch(RBUV_SWAP_SWITCH_OFF) + return + tmp_rs_switch = cfg.getint('aipp_op', 'rbuv_swap_switch') + if (tmp_rs_switch == RBUV_SWAP_SWITCH_OFF or tmp_rs_switch == RBUV_SWAP_SWITCH_ON): + self.session.aipp_set_rbuv_swap_switch(tmp_rs_switch) + else: + logger.error("rbuv_swap_switch in config file out of range, please check it!") + raise RuntimeError('wrong aipp config file content!') + + def aipp_set_ax_swap_switch(self, cfg, option_list): + if (option_list.count('ax_swap_switch') == 0): + self.session.aipp_set_ax_swap_switch(AX_SWAP_SWITCH_OFF) + return + tmp_as_switch = cfg.getint('aipp_op', 'ax_swap_switch') + if (tmp_as_switch == AX_SWAP_SWITCH_OFF or tmp_as_switch == AX_SWAP_SWITCH_ON): + self.session.aipp_set_ax_swap_switch(tmp_as_switch) + else: + logger.error("ax_swap_switch in config file out of range, please check it!") + raise RuntimeError('wrong aipp config file content!') + + def aipp_set_csc_params(self, cfg, option_list): + if (option_list.count('csc_switch') == 0): + tmp_csc_switch = CSC_SWITCH_OFF + else: + tmp_csc_switch = cfg.getint('aipp_op', 'csc_switch') + + if (tmp_csc_switch == CSC_SWITCH_OFF): + tmp_csc_params = [0] * 16 + elif (tmp_csc_switch == CSC_SWITCH_ON): + tmp_csc_params = list() + tmp_csc_params.append(tmp_csc_switch) + options = [ + 'matrix_r0c0', 'matrix_r0c1', 'matrix_r0c2', 'matrix_r1c0', 'matrix_r1c1', 'matrix_r1c2', + 'matrix_r2c0', 'matrix_r2c1', 'matrix_r2c2', 'output_bias_0', 'output_bias_1', 'output_bias_2', + 'input_bias_0', 'input_bias_1', 'input_bias_2' + ] + for option in options: + tmp_csc_params.append(0 if option_list.count(option) == 0 else cfg.getint('aipp_op', option)) + + range_ok = True + for i in range(1, 9): + range_ok = 
range_ok and (CSC_MATRIX_MIN <= tmp_csc_params[i] <= CSC_MATRIX_MAX) + for i in range(10, 15): + range_ok = range_ok and (0 <= tmp_csc_params[i] <= 255) + if (range_ok is False): + logger.error("csc_params in config file out of range, please check it!") + raise RuntimeError('wrong aipp config file content!') + else: + logger.error("csc_switch in config file out of range, please check it!") + raise RuntimeError('wrong aipp config file content!') + + self.session.aipp_set_csc_params(tmp_csc_params) + + def aipp_set_crop_params(self, cfg, option_list): + if (option_list.count('crop') == 0): + tmp_crop_switch = CROP_SWITCH_OFF + else: + tmp_crop_switch = cfg.getint('aipp_op', 'crop') + + if (tmp_crop_switch == CROP_SWITCH_OFF): + tmp_crop_params = [0, 0, 0, 416, 416] + elif (tmp_crop_switch == CROP_SWITCH_ON): + tmp_crop_params = list() + tmp_crop_params.append(tmp_crop_switch) + tmp_crop_params.append( + 0 if option_list.count('load_start_pos_w') == 0 else cfg.getint('aipp_op', 'load_start_pos_w') + ) + tmp_crop_params.append( + 0 if option_list.count('load_start_pos_h') == 0 else cfg.getint('aipp_op', 'load_start_pos_h') + ) + tmp_crop_params.append( + 0 if option_list.count('crop_size_w') == 0 else cfg.getint('aipp_op', 'crop_size_w') + ) + tmp_crop_params.append( + 0 if option_list.count('crop_size_h') == 0 else cfg.getint('aipp_op', 'crop_size_h') + ) + + range_ok = True + range_ok = range_ok and (LOAD_START_POS_W_MIN <= tmp_crop_params[1] <= LOAD_START_POS_W_MAX) + range_ok = range_ok and (LOAD_START_POS_H_MIN <= tmp_crop_params[2] <= LOAD_START_POS_H_MAX) + range_ok = range_ok and (CROP_POS_W_MIN <= tmp_crop_params[3] <= CROP_POS_W_MAX) + range_ok = range_ok and (CROP_POS_H_MIN <= tmp_crop_params[4] <= CROP_POS_H_MAX) + if (range_ok is False): + logger.error("crop_params in config file out of range, please check it!") + raise RuntimeError('wrong aipp config file content!') + else: + logger.error("crop_switch(crop) in config file out of range, please check it!") + raise RuntimeError('wrong aipp config file content!') + + self.session.aipp_set_crop_params(tmp_crop_params) + + def aipp_set_padding_params(self, cfg, option_list): + if (option_list.count('padding') == 0): + tmp_padding_switch = PADDING_SWITCH_OFF + else: + tmp_padding_switch = cfg.getint('aipp_op', 'padding') + + if (tmp_padding_switch == PADDING_SWITCH_OFF): + tmp_padding_params = [0] * 5 + elif (tmp_padding_switch == PADDING_SWITCH_ON): + tmp_padding_params = list() + tmp_padding_params.append(tmp_padding_switch) + tmp_padding_params.append( + 0 if option_list.count('padding_size_top') == 0 else cfg.getint('aipp_op', 'padding_size_top') + ) + tmp_padding_params.append( + 0 if option_list.count('padding_size_bottom') == 0 else cfg.getint('aipp_op', 'padding_size_bottom') + ) + tmp_padding_params.append( + 0 if option_list.count('padding_size_left') == 0 else cfg.getint('aipp_op', 'padding_size_left') + ) + tmp_padding_params.append( + 0 if option_list.count('padding_size_right') == 0 else cfg.getint('aipp_op', 'padding_size_right') + ) + + range_ok = True + for i in range(1, 5): + range_ok = range_ok and (PADDING_SIZE_MIN <= tmp_padding_params[i] <= PADDING_SIZE_MAX) + if (range_ok is False): + logger.error("padding_params in config file out of range, please check it!") + raise RuntimeError('wrong aipp config file content!') + else: + logger.error("padding_switch in config file out of range, please check it!") + raise RuntimeError('wrong aipp config file content!') + + 
self.session.aipp_set_padding_params(tmp_padding_params) + + def aipp_set_dtc_pixel_mean(self, cfg, option_list): + tmp_mean_params = list() + tmp_mean_params.append( + 0 if option_list.count('mean_chn_0') == 0 else cfg.getint('aipp_op', 'mean_chn_0') + ) + tmp_mean_params.append( + 0 if option_list.count('mean_chn_1') == 0 else cfg.getint('aipp_op', 'mean_chn_1') + ) + tmp_mean_params.append( + 0 if option_list.count('mean_chn_2') == 0 else cfg.getint('aipp_op', 'mean_chn_2') + ) + tmp_mean_params.append( + 0 if option_list.count('mean_chn_3') == 0 else cfg.getint('aipp_op', 'mean_chn_3') + ) + + range_ok = True + for i in range(0, 4): + range_ok = range_ok and (PIXEL_MEAN_CHN_MIN <= tmp_mean_params[i] <= PIXEL_MEAN_CHN_MAX) + if (range_ok is False): + logger.error("mean_chn_params in config file out of range, please check it!") + raise RuntimeError('wrong aipp config file content!') + + self.session.aipp_set_dtc_pixel_mean(tmp_mean_params) + + def aipp_set_dtc_pixel_min(self, cfg, option_list): + tmp_min_params = list() + tmp_min_params.append( + 0 if option_list.count('min_chn_0') == 0 else cfg.getfloat('aipp_op', 'min_chn_0') + ) + tmp_min_params.append( + 0 if option_list.count('min_chn_1') == 0 else cfg.getfloat('aipp_op', 'min_chn_1') + ) + tmp_min_params.append( + 0 if option_list.count('min_chn_2') == 0 else cfg.getfloat('aipp_op', 'min_chn_2') + ) + tmp_min_params.append( + 0 if option_list.count('min_chn_3') == 0 else cfg.getfloat('aipp_op', 'min_chn_3') + ) + + range_ok = True + for i in range(0, 4): + range_ok = range_ok and (PIXEL_MIN_CHN_MIN <= tmp_min_params[i] <= PIXEL_MIN_CHN_MAX) + if (range_ok is False): + logger.error("min_chn_params in config file out of range, please check it!") + raise RuntimeError('wrong aipp config file content!') + + self.session.aipp_set_dtc_pixel_min(tmp_min_params) + + def aipp_set_pixel_var_reci(self, cfg, option_list): + tmp_reci_params = list() + tmp_reci_params.append( + 0 if option_list.count('var_reci_chn_0') == 0 else cfg.getfloat('aipp_op', 'var_reci_chn_0') + ) + tmp_reci_params.append( + 0 if option_list.count('var_reci_chn_1') == 0 else cfg.getfloat('aipp_op', 'var_reci_chn_1') + ) + tmp_reci_params.append( + 0 if option_list.count('var_reci_chn_2') == 0 else cfg.getfloat('aipp_op', 'var_reci_chn_2') + ) + tmp_reci_params.append( + 0 if option_list.count('var_reci_chn_3') == 0 else cfg.getfloat('aipp_op', 'var_reci_chn_3') + ) + + range_ok = True + for i in range(0, 4): + range_ok = range_ok and (PIXEL_VAR_RECI_CHN_MIN <= tmp_reci_params[i] <= PIXEL_VAR_RECI_CHN_MAX) + if (range_ok is False): + logger.error("var_reci_chn_params in config file out of range, please check it!") + raise RuntimeError('wrong aipp config file content!') + + self.session.aipp_set_pixel_var_reci(tmp_reci_params) + + def run(self, feeds, out_array=False): + if len(feeds) > 0 and isinstance(feeds[0], np.ndarray): + # if feeds is ndarray list, convert to baseTensor + inputs = [] + for array in feeds: + basetensor = aclruntime.BaseTensor(array.__array_interface__['data'][0], array.nbytes) + inputs.append(basetensor) + else: + inputs = feeds + outputs = self.session.run(self.outputs_names, inputs) + if out_array: + # convert to host tensor + self.convert_tensors_to_host(outputs) + # convert tensor to narray + return self.convert_tensors_to_arrays(outputs) + else: + return outputs + + def run_pipeline(self, infilelist, output, auto_shape=False, + auto_dims=False, outfmt="BIN", pure_infer_mode=False, extra_session=None): + infer_options = aclruntime.infer_options() 
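+        # pipeline options: where results are written, dynamic shape/dims handling,
+        # output file format, and whether pure-inference (constructed input) mode is used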
+ infer_options.output_dir = output + infer_options.auto_dym_shape = auto_shape + infer_options.auto_dym_dims = auto_dims + infer_options.out_format = outfmt + infer_options.pure_infer_mode = pure_infer_mode + extra_session = [] if extra_session is None else extra_session + self.session.run_pipeline(infilelist, infer_options, extra_session) + + def reset_summaryinfo(self): + self.session.reset_sumaryinfo() + + def infer(self, feeds, mode='static', custom_sizes=100000, out_array=True): + ''' + Parameters: + feeds: input data + mode: static dymdims dymshape... + ''' + inputs = [] + shapes = [] + for feed in feeds: + if type(feed) is np.ndarray: + infer_input = feed + if not infer_input.flags.c_contiguous: + infer_input = np.ascontiguousarray(infer_input) + shapes.append(infer_input.shape) + elif type(feed) in NP_TYPE_LIST: + infer_input = np.array(feed) + if not infer_input.flags.c_contiguous: + infer_input = np.ascontiguousarray(infer_input) + shapes.append([feed.size]) + elif type(feed) is aclruntime.Tensor: + infer_input = feed + shapes.append(infer_input.shape) + elif hasattr(feed, 'type') and feed.type() in TORCH_TENSOR_LIST: + infer_input = feed.numpy() + if not feed.is_contiguous(): + infer_input = np.ascontiguousarray(infer_input) + shapes.append(infer_input.shape) + else: + raise RuntimeError('type:{} invalid'.format(type(feed))) + inputs.append(infer_input) + + if self.infer_mode_switch.get(mode) is not None: + self.infer_mode_switch.get(mode)(shapes, custom_sizes) + else: + raise RuntimeError('wrong infer_mode:{}, only support \"static\",\"dymbatch\",\"dymhw\", \ + \"dymdims\",\"dymshape\"'.format(mode)) + + return self.run(inputs, out_array) + + def free_resource(self): + if hasattr(self.session, "free_resource"): + self.session.free_resource() + + def infer_pipeline(self, feeds_list, mode='static', custom_sizes=100000): + ''' + Parameters: + feeds_list: input data list + mode: static dymdims dymshape... 
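+            custom_sizes: output buffer size(s), only used when mode is "dymshape" (int or list of int)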
+ ''' + inputs_list = [] + shapes_list = [] + for feeds in feeds_list: + inputs = [] + shapes = [] + for feed in feeds: + if type(feed) is np.ndarray: + infer_input = feed + if not infer_input.flags.c_contiguous: + infer_input = np.ascontiguousarray(infer_input) + shape = feed.shape + elif type(feed) in NP_TYPE_LIST: + infer_input = np.array(feed) + if not infer_input.flags.c_contiguous: + infer_input = np.ascontiguousarray(infer_input) + shape = [feed.size] + elif type(feed) is aclruntime.Tensor: + infer_input = np.array(feed) + shape = infer_input.shape + elif hasattr(feed, 'type') and feed.type() in TORCH_TENSOR_LIST: + infer_input = feed.numpy() + infer_input = np.ascontiguousarray(infer_input) if not feed.is_contiguous() else infer_input + shape = infer_input.shape + else: + raise RuntimeError('type:{} invalid'.format(type(feed))) + basetensor = aclruntime.BaseTensor(infer_input.__array_interface__['data'][0], infer_input.nbytes) + inputs.append(basetensor) + shapes.append(shape) + inputs_list.append(inputs) + shapes_list.append(shapes) + if self.infer_mode_switch.get(mode) is not None and mode != "dymshape" and mode != "dymdims": + self.infer_mode_switch.get(mode)(shapes, custom_sizes) + elif mode == "dymshape": + if isinstance(custom_sizes, int): + custom_sizes = [custom_sizes] * len(self.get_outputs()) + elif not isinstance(custom_sizes, list): + raise RuntimeError('custom_sizes:{} type:{} invalid'.format( + custom_sizes, type(custom_sizes))) + self.session.set_custom_outsize(custom_sizes) + elif mode == "dymdims": + pass + else: + raise RuntimeError('wrong infer_mode:{}, only support \"static\",\"dymbatch\",\"dymhw\", \ + \"dymdims\",\"dymshape\"'.format(mode)) + outputs = self.session.run_pipeline(self.outputs_names, inputs_list, shapes_list, + mode == 'dymshape', mode == 'dymdims') + for i, output in enumerate(outputs): + outputs[i] = self.convert_tensors_to_arrays(output) + return outputs + + def inner_run(self, in_out_list, get_outputs=False, mem_copy=True): + ''' + Parameters: + in_out_list: relation between current input datas and last output datas + get_outputs: get outputs from device or not + mem_copy: the way inputs get data from outputs + ''' + if (get_outputs): + outputs = self.session.inner_run(in_out_list, self.outputs_names, get_outputs, mem_copy) + return outputs + else: + self.session.inner_run(in_out_list, self.outputs_names, get_outputs, mem_copy) + outputs = None + return outputs + + def first_inner_run(self, feeds, mode='static', custom_sizes=100000): + ''' + Parameters: + feeds: input data + mode: static dymdims dymshapes ... 
+ custom_sizes: must equal to the realsize of outputs + ''' + inputs = [] + shapes = [] + for feed in feeds: + if type(feed) is np.ndarray: + infer_input = feed + if not infer_input.flags.c_contiguous: + infer_input = np.ascontiguousarray(infer_input) + shapes.append(infer_input.shape) + elif type(feed) in NP_TYPE_LIST: + infer_input = np.array(feed) + if not infer_input.flags.c_contiguous: + infer_input = np.ascontiguousarray(infer_input) + shapes.append([feed.size]) + elif hasattr(feed, 'type') and feed.type() in TORCH_TENSOR_LIST: + infer_input = feed.numpy() + if not feed.is_contiguous(): + infer_input = np.ascontiguousarray(infer_input) + shapes.append(infer_input.shape) + else: + raise RuntimeError('type:{} invalid'.format(type(feed))) + basetensor = aclruntime.BaseTensor(infer_input.__array_interface__['data'][0], infer_input.nbytes) + inputs.append(basetensor) + + if self.infer_mode_switch.get(mode) is not None: + self.infer_mode_switch.get(mode)(shapes, custom_sizes) + else: + raise RuntimeError('wrong infer_mode:{}, only support \"static\",\"dymbatch\",\"dymhw\", \ + \"dymdims\",\"dymshape\"'.format(mode)) + + return self.session.first_inner_run(self.outputs_names, inputs) + + def infer_iteration(self, feeds, in_out_list=None, iteration_times=1, mode='static', + custom_sizes=100000, mem_copy=True): + ''' + Parameters: + feeds: input datas + in_out_list: relation between current input datas and last output datas + iteration_times: inner iteration infer loop times + mode: static dymdims dymshape ... + custom_sizes: only dymshape needs + ''' + if not in_out_list: + in_out_list = [] + if len(in_out_list) != len(self.get_inputs()): + raise RuntimeError(f"inputs' amount and length of in_out_list not matched!") + if (iteration_times == 1): + outputs = self.infer(feeds, mode, custom_sizes) + return outputs + else: + self.first_inner_run(feeds, mode, custom_sizes) + for _ in range(iteration_times - 2): + self.inner_run(in_out_list, False, mem_copy) + outputs = self.inner_run(in_out_list, True, mem_copy) + # convert to host tensor + self.convert_tensors_to_host(outputs) + # convert tensor to narray + return self.convert_tensors_to_arrays(outputs) + + def summary(self): + return self.session.sumary() + + def _static_prepare(self, shapes, custom_sizes): + self.set_staticbatch() + + def _dymbatch_prepare(self, shapes, custom_sizes): + indesc = self.get_inputs() + if (len(shapes) != len(indesc)): + raise RuntimeError("input datas and intensors nums not matched!") + for i, shape in enumerate(shapes): + for j, dim in enumerate(shape): + if (indesc[i].shape[j] < 0): + self.set_dynamic_batchsize(dim) + return + if (indesc[i].shape[j] != dim): + raise RuntimeError("input datas and intensors dim not matched!") + raise RuntimeError("not a dymbatch model!") + + def _dymhw_prepare(self, shapes, custom_sizes): + indesc = self.get_inputs() + if (len(shapes) != len(indesc)): + raise RuntimeError("input datas and intensors nums not matched!") + for i, shape in enumerate(shapes): + if (indesc[i].shape[2] < 0 and indesc[i].shape[3] < 0): + self.set_dynamic_hw(shape[2], shape[3]) + return + raise RuntimeError("not a dymhw model!") + + def _dymdims_prepare(self, shapes, custom_sizes): + dym_list = [] + indesc = self.get_inputs() + if (len(shapes) != len(indesc)): + raise RuntimeError("input datas and intensors nums not matched!") + for i, shape in enumerate(shapes): + str_shape = [str(val) for val in shape] + dyshape = "{}:{}".format(indesc[i].name, ",".join(str_shape)) + dym_list.append(dyshape) + dyshapes = 
';'.join(dym_list) + self.session.set_dynamic_dims(dyshapes) + + def _dymshape_prepare(self, shapes, custom_sizes): + dym_list = [] + indesc = self.get_inputs() + if (len(shapes) != len(indesc)): + raise RuntimeError("input datas and intensors nums not matched!") + outdesc = self.get_outputs() + for i, shape in enumerate(shapes): + str_shape = [str(val) for val in shape] + dyshape = "{}:{}".format(indesc[i].name, ",".join(str_shape)) + dym_list.append(dyshape) + dyshapes = ';'.join(dym_list) + self.session.set_dynamic_shape(dyshapes) + if isinstance(custom_sizes, int): + custom_sizes = [custom_sizes] * len(outdesc) + elif not isinstance(custom_sizes, list): + raise RuntimeError('custom_sizes:{} type:{} invalid'.format( + custom_sizes, type(custom_sizes))) + self.session.set_custom_outsize(custom_sizes) + + +class MultiDeviceSession(): + def __init__(self, model_path: str, acl_json_path: str = None, debug: bool = False, loop: int = 1): + self.model_path = model_path + self.acl_json_path = acl_json_path + self.debug = debug + self.loop = loop + self.summary = {} + + @classmethod + def print_subprocess_run_error(cls, value): + logger.error(f"subprocess run failed error_callback:{value}") + + def summary(self): + return self.summary + + def infer(self, device_feeds:dict, mode='static', custom_sizes=100000): + ''' + Parameters: + device_feeds: device match [input datas1, input datas2...] (Dict) + ''' + subprocess_num = 0 + for _, device in device_feeds.items(): + subprocess_num += len(device) + p = Pool(subprocess_num) + outputs_queue = Manager().Queue() + for device_id, feeds in device_feeds.items(): + for feed in feeds: + p.apply_async( + self.subprocess_infer, + args=(outputs_queue, device_id, feed, mode, custom_sizes), + error_callback=self.print_subprocess_run_error + ) + p.close() + p.join() + result = 0 if 2 * len(device_feeds) == outputs_queue.qsize() else 1 + logger.info(f"multidevice run end qsize:{outputs_queue.qsize()} result:{result}") + outputs_dict = {} + self.summary.clear() + while outputs_queue.qsize() != 0: + ret = outputs_queue.get() + if type(ret) == list: + if (not outputs_dict.get(ret[0])): + outputs_dict.update({ret[0]: []}) + self.summary.update({ret[0]: []}) + outputs_dict.get(ret[0]).append(ret[1]) + self.summary.get(ret[0]).append((ret[3] - ret[2]) * 1000) + logger.info(f"device {ret[0]}, start_time:{ret[2]}, end_time:{ret[3]}") + return outputs_dict + + def infer_pipeline(self, device_feeds_list:dict, mode='static', custom_sizes=100000): + ''' + Parameters: + device_feeds: device match [input datas1, input datas2...] 
(Dict) + ''' + subprocess_num = 0 + for _, device in device_feeds_list.items(): + subprocess_num += len(device) + p = Pool(subprocess_num) + outputs_queue = Manager().Queue() + for device_id, feeds in device_feeds_list.items(): + for feed in feeds: + p.apply_async( + self.subprocess_infer_pipeline, + args=(outputs_queue, device_id, feed, mode, custom_sizes), + error_callback=self.print_subprocess_run_error + ) + p.close() + p.join() + result = 0 if 2 * len(device_feeds_list) == outputs_queue.qsize() else 1 + logger.info(f"multidevice run pipeline end qsize:{outputs_queue.qsize()} result:{result}") + outputs_dict = {} + self.summary.clear() + while outputs_queue.qsize() != 0: + ret = outputs_queue.get() + if type(ret) == list: + if (not outputs_dict.get(ret[0])): + outputs_dict.update({ret[0]: []}) + self.summary.update({ret[0]: []}) + outputs_dict.get(ret[0]).append(ret[1]) + self.summary.get(ret[0]).append((ret[3] - ret[2]) * 1000) + logger.info(f"device {ret[0]}, start_time:{ret[2]}, end_time:{ret[3]}") + return outputs_dict + + def infer_iteration(self, device_feeds:dict, in_out_list=None, iteration_times=1, mode='static', custom_sizes=None, mem_copy=True): + ''' + Parameters: + device_feeds: device match [input datas1, input datas2...] (Dict) + ''' + subprocess_num = 0 + for _, device in device_feeds.items(): + subprocess_num += len(device) + p = Pool(subprocess_num) + outputs_queue = Manager().Queue() + for device_id, feeds in device_feeds.items(): + for feed in feeds: + p.apply_async( + self.subprocess_infer_iteration, + args=(outputs_queue, device_id, feed, in_out_list, iteration_times, mode, custom_sizes, mem_copy), + error_callback=self.print_subprocess_run_error + ) + p.close() + p.join() + result = 0 if 2 * len(device_feeds) == outputs_queue.qsize() else 1 + logger.info(f"multidevice run iteration end qsize:{outputs_queue.qsize()} result:{result}") + outputs_dict = {} + self.summary.clear() + while outputs_queue.qsize() != 0: + ret = outputs_queue.get() + if type(ret) == list: + if (not outputs_dict.get(ret[0])): + outputs_dict.update({ret[0]: []}) + self.summary.update({ret[0]: []}) + outputs_dict.get(ret[0]).append(ret[1]) + self.summary.get(ret[0]).append((ret[3] - ret[2]) * 1000) + logger.info(f"device {ret[0]}, start_time:{ret[2]}, end_time:{ret[3]}") + return outputs_dict + + def subprocess_infer(self, outputs_queue, device_id, feeds, mode='static', custom_sizes=100000): + sub_session = InferSession( + device_id=device_id, + model_path=self.model_path, + acl_json_path=self.acl_json_path, + debug=self.debug, + loop=self.loop + ) + start_time = time.time() + outputs = sub_session.infer(feeds, mode, custom_sizes, out_array=True) + end_time = time.time() + outputs_queue.put([device_id, outputs, start_time, end_time]) + return + + def subprocess_infer_pipeline(self, outputs_queue, device_id, feeds_list, mode='static', custom_sizes=100000): + sub_session = InferSession( + device_id=device_id, + model_path=self.model_path, + acl_json_path=self.acl_json_path, + debug=self.debug, + loop=self.loop + ) + start_time = time.time() + outputs = sub_session.infer_pipeline(feeds_list, mode, custom_sizes) + end_time = time.time() + outputs_queue.put([device_id, outputs, start_time, end_time]) + return + + def subprocess_infer_iteration(self, outputs_queue, device_id, feeds, in_out_list=None, + iteration_times=1, mode='static', custom_sizes=None, mem_copy=True): + sub_session = InferSession( + device_id=device_id, + model_path=self.model_path, + acl_json_path=self.acl_json_path, + 
debug=self.debug, + loop=self.loop + ) + start_time = time.time() + outputs = sub_session.infer_iteration(feeds, in_out_list, iteration_times, mode, custom_sizes, mem_copy) + end_time = time.time() + outputs_queue.put([device_id, outputs, start_time, end_time]) + return + + +class MemorySummary: + @staticmethod + def get_h2d_time_list(): + if hasattr(aclruntime, 'MemorySummary'): + return aclruntime.MemorySummary().H2D_time_list + else: + return [] + + @staticmethod + def get_d2h_time_list(): + if hasattr(aclruntime, 'MemorySummary'): + return aclruntime.MemorySummary().D2H_time_list + else: + return [] + + @staticmethod + def reset(): + if hasattr(aclruntime, 'MemorySummary'): + aclruntime.MemorySummary().reset() diff --git a/tools/infer_tool/ais_bench/infer/registry.py b/tools/infer_tool/ais_bench/infer/registry.py new file mode 100644 index 0000000000000000000000000000000000000000..60f4784c6fb674bd413411154290fbb1dcf774c3 --- /dev/null +++ b/tools/infer_tool/ais_bench/infer/registry.py @@ -0,0 +1,103 @@ +# Copyright (c) 2023-2023 Huawei Technologies Co., Ltd. All rights reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + + +import logging +from typing import Any, Dict, Iterable, Iterator, Tuple +from ais_bench.infer.common.utils import logger + + +class Registry(Iterable[Tuple[str, Any]]): + """ + The registry that provides name -> object mapping, to support third-party + users' custom modules. + """ + def register(self, obj: Any = None) -> Any: + """ + Register the given object under the the name `obj.__name__`. + Can be used as either a decorator or not.See docstring of this class for usage. + """ + if callable(obj): + return add(None, obj) + + def add(name: str, obj: Any) -> Any: + self[name] = obj + return obj + + return lambda x: add(obj, x) + + def __init__(self, name: str) -> None: + """ + Args: + name (str): the name of this registry + """ + self._name: str = name + self._obj_map: Dict[str, Any] = {} + + def __setitem__(self, name: str, obj: Any) -> None: + if not callable(obj): + raise ValueError("Value of a Registry must be a callable!") + + if name is None: + name = obj.__name__ + + if name in self._obj_map: + raise ValueError( + f"An object named '{name}' was already registered in '{self._name}' registry!" 
+ ) + self._obj_map[name] = obj + + def __getitem__(self, name: str) -> Any: + return self._obj_map[name] + + def __call__(self, obj: Any) -> Any: + return self.register(obj) + + def __contains__(self, name: str) -> bool: + return name in self._obj_map + + def __repr__(self) -> str: + from tabulate import tabulate + + table_headers = ["Names", "Objects"] + table = tabulate( + self._obj_map.items(), headers=table_headers, tablefmt="fancy_grid" + ) + return "Registry of {}:\n".format(self._name) + table + + def __iter__(self) -> Iterator[Tuple[str, Any]]: + return iter(self._obj_map.items()) + + +def import_all_modules_for_register(module_paths, base_model_name): + import os + import importlib + + modules = [] + for _, _, files in os.walk(module_paths): + for filename in files: + if not filename.endswith(".py") or filename == "__init__.py": + continue + model_name = base_model_name + "." + filename.rsplit(".", 1)[0] + modules.append(model_name) + + errors = [] + for module in modules: + try: + importlib.import_module(module) + except ImportError as e: + errors.append((module, e)) + logger.info(f"import {module} error: {e}") + + return errors \ No newline at end of file diff --git a/tools/infer_tool/ais_bench/infer/summary.py b/tools/infer_tool/ais_bench/infer/summary.py new file mode 100644 index 0000000000000000000000000000000000000000..65d1cb8e8729e8cad6dcbc0e6ed2e40b406a6a19 --- /dev/null +++ b/tools/infer_tool/ais_bench/infer/summary.py @@ -0,0 +1,229 @@ +# Copyright (c) 2023-2023 Huawei Technologies Co., Ltd. All rights reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
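+# Collects per-inference timing samples (NPU compute, H2D/D2H copies), computes
+# min/max/mean/median/percentile statistics and throughput, and optionally writes
+# a JSON summary report next to the inference outputs.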
+ + +import json +import os +import stat + +import numpy as np +from ais_bench.infer.common.utils import logger +from ais_bench.infer.common.path_security_check import ms_open + + +class ListInfo(object): + def __init__(self): + self.min = 0.0 + self.max = 0.0 + self.mean = 0.0 + self.median = 0.0 + self.percentile = 0.0 + + +class Result(object): + def __init__(self): + self.npu_compute_time = None + self.h2d_latency = None + self.d2h_latency = None + self.throughput = None + self.scale = None + self.batchsize = None + + +class Summary(object): + def __init__(self): + self.reset() + self.infodict = {"filesinfo": {}} + + @staticmethod + def merge_intervals(intervals): + intervals.sort(key=lambda x: x[0]) + merged = [] + for interval in intervals: + if not merged or merged[-1][1] < interval[0]: + merged.append(list(interval)) + else: + merged[-1][1] = max(merged[-1][1], interval[1]) + return merged + + @staticmethod + def get_list_info(work_list, percentile_scale, merge=False): + list_info = ListInfo() + if merge: # work_list is a 2-dim vector each element is a pair containing start and end time + n = len(work_list) + if n == 0: + raise RuntimeError(f'summary.get_list_info failed: inner error') + merged_intervals = Summary.merge_intervals(work_list) + sum_time = sum(end_time - start_time for start_time, end_time in merged_intervals) + list_info.mean = sum_time / n + + elif len(work_list) != 0: + list_info.min = np.min(work_list) + list_info.max = np.max(work_list) + list_info.mean = np.mean(work_list) + list_info.median = np.median(work_list) + list_info.percentile = np.percentile(work_list, percentile_scale) + + return list_info + + def reset(self): + self.h2d_latency_list = [] + self.d2h_latency_list = [] + self.npu_compute_time_list = [] + self.npu_compute_time_interval_list = [] + self._batchsizes = [] + + def add_batchsize(self, n: int): + self._batchsizes.append(n) + + def add_sample_id_infiles(self, sample_id, infiles): + if self.infodict["filesinfo"].get(sample_id) is None: + self.infodict["filesinfo"][sample_id] = {"infiles": [], "outfiles": []} + if len(self.infodict["filesinfo"][sample_id]["infiles"]) == 0: + for files in infiles: + self.infodict["filesinfo"][sample_id]["infiles"].append(files) + + def append_sample_id_outfile(self, sample_id, outfile): + if self.infodict["filesinfo"].get(sample_id) is None: + self.infodict["filesinfo"][sample_id] = {"infiles": [], "outfiles": []} + self.infodict["filesinfo"][sample_id]["outfiles"].append(outfile) + + def add_args(self, args): + self.infodict["args"] = args + + def record(self, result, multi_threads=False): + if multi_threads: + self.infodict['NPU_compute_time'] = { + "mean": result.npu_compute_time.mean, + "count": len(self.npu_compute_time_interval_list), + } + self.infodict['H2D_latency'] = {"mean": result.h2d_latency.mean, "count": len(self.h2d_latency_list)} + self.infodict['D2H_latency'] = {"mean": result.d2h_latency.mean, "count": len(self.d2h_latency_list)} + self.infodict['npu_compute_time_list'] = self.npu_compute_time_interval_list + else: + self.infodict['NPU_compute_time'] = { + "min": result.npu_compute_time.min, + "max": result.npu_compute_time.max, + "mean": result.npu_compute_time.mean, + "median": result.npu_compute_time.median, + "percentile({}%)".format(result.scale): result.npu_compute_time.percentile, + "count": len(self.npu_compute_time_list), + } + self.infodict['H2D_latency'] = { + "min": result.h2d_latency.min, + "max": result.h2d_latency.max, + "mean": result.h2d_latency.mean, + "median": 
result.h2d_latency.median, + "percentile({}%)".format(result.scale): result.h2d_latency.percentile, + "count": len(self.h2d_latency_list), + } + self.infodict['D2H_latency'] = { + "min": result.d2h_latency.min, + "max": result.d2h_latency.max, + "mean": result.d2h_latency.mean, + "median": result.d2h_latency.median, + "percentile({}%)".format(result.scale): result.d2h_latency.percentile, + "count": len(self.d2h_latency_list), + } + self.infodict['npu_compute_time_list'] = self.npu_compute_time_list + self.infodict['throughput'] = result.throughput + self.infodict['pid'] = os.getpid() + + def display(self, result, display_all_summary, multi_threads): + logger.info("-----------------Performance Summary------------------") + if multi_threads: + if display_all_summary is True: + logger.info("H2D_latency (ms): mean = {0}".format(result.h2d_latency.mean)) + logger.info("NPU_compute_time (ms): mean = {0}".format(result.npu_compute_time.mean)) + if display_all_summary is True: + logger.info("D2H_latency (ms): mean = {0}".format(result.d2h_latency.mean)) + else: + if display_all_summary is True: + logger.info( + "H2D_latency (ms): min = {0}, max = {1}, mean = {2}, median = {3}, percentile({4}%) = {5}".format( + result.h2d_latency.min, + result.h2d_latency.max, + result.h2d_latency.mean, + result.h2d_latency.median, + result.scale, + result.h2d_latency.percentile, + ) + ) + + logger.info( + "NPU_compute_time (ms): min = {0}, max = {1}, mean = {2}, median = {3}, percentile({4}%) = {5}".format( + result.npu_compute_time.min, + result.npu_compute_time.max, + result.npu_compute_time.mean, + result.npu_compute_time.median, + result.scale, + result.npu_compute_time.percentile, + ) + ) + if display_all_summary is True: + logger.info( + "D2H_latency (ms): min = {0}, max = {1}, mean = {2}, median = {3}, percentile({4}%) = {5}".format( + result.d2h_latency.min, + result.d2h_latency.max, + result.d2h_latency.mean, + result.d2h_latency.median, + result.scale, + result.d2h_latency.percentile, + ) + ) + logger.info( + "throughput 1000*batchsize.mean({})/NPU_compute_time.mean({}): {}".format( + result.batchsize, result.npu_compute_time.mean, result.throughput + ) + ) + logger.info("------------------------------------------------------") + + def report(self, batchsize, output_prefix, display_all_summary=False, multi_threads=False): + scale = 99 + + if self.npu_compute_time_list and self.npu_compute_time_interval_list: + logger.error("npu_compute_time_list and npu_compute_time_interval_list exits at the same time") + raise Exception + if self.npu_compute_time_list: + npu_compute_time = Summary.get_list_info(self.npu_compute_time_list, scale) + else: + npu_compute_time = Summary.get_list_info(self.npu_compute_time_interval_list, scale, True) + h2d_latency = Summary.get_list_info(self.h2d_latency_list, scale) + d2h_latency = Summary.get_list_info(self.d2h_latency_list, scale) + if self._batchsizes: + batchsize = sum(self._batchsizes) / len(self._batchsizes) + else: + pass + if npu_compute_time.mean == 0: + throughput = 0 + else: + throughput = 1000 * batchsize / npu_compute_time.mean + + result = Result() + result.npu_compute_time = npu_compute_time + result.d2h_latency = d2h_latency + result.h2d_latency = h2d_latency + result.throughput = throughput + result.scale = scale + result.batchsize = batchsize + + self.record(result, multi_threads) + self.display(result, display_all_summary, multi_threads) + + if output_prefix is not None: + with ms_open(output_prefix + "_summary.json", mode="w") as f: + 
json.dump(self.infodict, f) + + +summary = Summary() diff --git a/tools/infer_tool/requirements.txt b/tools/infer_tool/requirements.txt new file mode 100644 index 0000000000000000000000000000000000000000..e6094bfabeeab6ceaaa636f38b2f618ee081d560 --- /dev/null +++ b/tools/infer_tool/requirements.txt @@ -0,0 +1,3 @@ +numpy +tqdm +attrs >= 21.3.0 \ No newline at end of file diff --git a/tools/infer_tool/setup.py b/tools/infer_tool/setup.py new file mode 100644 index 0000000000000000000000000000000000000000..df023b7ee2daf2f4c72e96d58311024111f40618 --- /dev/null +++ b/tools/infer_tool/setup.py @@ -0,0 +1,51 @@ +# Copyright (c) 2023-2023 Huawei Technologies Co., Ltd. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +import subprocess +from setuptools import setup, find_packages # type: ignore + + +with open('requirements.txt', encoding='utf-8') as f: + required = f.read().splitlines() + +with open('README.md', encoding='utf-8') as f: + long_description = f.read() + +# 使用Git命令获取最新的提交哈希 +try: + git_hash = subprocess.check_output(['git', 'rev-parse', 'HEAD']).decode('utf-8').strip() +except Exception: + git_hash = "" +# 使用Git命令获取最新的提交日期和时间 +try: + git_date = subprocess.check_output(['git', 'show', '-s', '--format=%cd', 'HEAD']).decode('utf-8').strip() +except Exception: + git_date = "" + +setup( + name='ais_bench', + version='0.0.2', + description='ais_bench tool', + long_description=long_description, + url=f"https://gitee.com/ascend/tools/, commit id: {git_hash}, release_date: {git_date}", + release_date = git_date, + packages=find_packages(), + include_package_data=True, + keywords='ais_bench tool', + install_requires=required, + python_requires='>=3.7', + entry_points={ + 'benchmark_sub_task': ['benchmark=ais_bench.infer.main_cli:get_cmd_instance'], + }, + +) \ No newline at end of file