简体中文 | English

Introduction

nndeploy is a cross-platform, high-performance, and easy-to-use AI model deployment framework. We strive to deliver a consistent and user-friendly experience across different inference framework backends in complex deployment environments, with a strong focus on performance.

Architecture

(Architecture diagram)

Features

1. Cross-platform and Consistent

As long as the environment is supported, code that deploys a model through nndeploy can be used across platforms without modification, regardless of the operating system or inference framework.

The currently supported environments are listed below, and the list will continue to grow:

| Inference/OS | Linux | Windows | Android | MacOS | IOS | Developer | Remarks |
| :----------- | :---: | :-----: | :-----: | :---: | :-: | :-------- | :------ |
| TensorRT     | √     | -       | -       | -     | -   | Always    |         |
| OpenVINO     | √     | √       | -       | -     | -   | Always    |         |
| ONNXRuntime  | √     | √       | -       | -     | -   | Always    |         |
| MNN          | √     | √       | √       | -     | -   | Always    |         |
| TNN          | √     | √       | √       | -     | -   | 02200059Z |         |
| ncnn         | -     | -       | √       | -     | -   | Always    |         |
| coreML       | -     | -       | -       | √     | -   | JoDio-zd  |         |
| paddle-lite  | -     | -       | -       | -     | -   | qixuxiang |         |
| AscendCL     | √     | -       | -       | -     | -   | CYYAI     |         |

Notice: TFLite, TVM, OpenPPL, Tengine, AITemplate, RKNN, sophgo, MindSpore-lite, and Horizon are also on the agenda as we work to cover the mainstream inference frameworks.
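
To make the consistency claim concrete, here is a minimal sketch of what backend-independent deployment code can look like: one abstract interface, a factory keyed by a backend tag, and user code that never mentions a concrete framework. All identifiers below (`Inference`, `InferenceType`, `createInference`, the stub backends) are invented for illustration and are not nndeploy's actual API.

```cpp
#include <iostream>
#include <memory>
#include <stdexcept>
#include <string>

// Illustrative backend tags; the real framework's names will differ.
enum class InferenceType { kTensorRt, kOpenVino, kOnnxRuntime, kMnn };

// One abstract interface for every backend (sketch of the idea).
class Inference {
 public:
  virtual ~Inference() = default;
  virtual void init(const std::string& model_path) = 0;
  virtual void run() = 0;
};

// Stub backends standing in for real wrapper implementations.
class OnnxRuntimeInference : public Inference {
 public:
  void init(const std::string& model_path) override {
    std::cout << "[onnxruntime] loading " << model_path << "\n";
  }
  void run() override { std::cout << "[onnxruntime] run\n"; }
};

class MnnInference : public Inference {
 public:
  void init(const std::string& model_path) override {
    std::cout << "[mnn] loading " << model_path << "\n";
  }
  void run() override { std::cout << "[mnn] run\n"; }
};

// Factory that hides which backend library is linked in.
std::unique_ptr<Inference> createInference(InferenceType type) {
  switch (type) {
    case InferenceType::kOnnxRuntime:
      return std::make_unique<OnnxRuntimeInference>();
    case InferenceType::kMnn:
      return std::make_unique<MnnInference>();
    default:
      throw std::runtime_error("backend not compiled in");
  }
}

int main() {
  // The only line that changes per platform is the backend tag:
  // e.g. kOnnxRuntime on a Linux server, kMnn on an Android phone.
  auto infer = createInference(InferenceType::kOnnxRuntime);
  infer->init("yolov5.onnx");
  infer->run();
}
```

Under this pattern, porting a deployment from one platform to another is a one-line change of the backend tag.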

2. High Performance

Differences in model structure, inference framework, and hardware resources lead to different inference performance. nndeploy understands and preserves the features of each backend inference framework as deeply as possible, so the consistent coding experience comes without compromising the computational efficiency of the native framework. In addition, a carefully designed zero-copy memory scheme connects pre/post-processing with model inference efficiently, which keeps the end-to-end latency of model inference low.
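
The zero-copy idea can be shown in plain standard C++, independent of nndeploy's actual tensor types: pre-processing and inference share one buffer, and only a lightweight non-owning view crosses the stage boundary.

```cpp
#include <cstddef>
#include <iostream>
#include <vector>

// A non-owning tensor view: shape metadata over memory someone else owns.
// Handing this across a stage boundary costs a pointer copy, not a memcpy.
struct TensorView {
  float* data;
  size_t size;
};

// Pre-processing writes into a caller-provided buffer...
void preprocess(std::vector<float>& buffer) {
  for (size_t i = 0; i < buffer.size(); ++i) buffer[i] = i / 255.0f;
}

// ...and inference consumes the SAME memory through a view.
void infer(TensorView input) {
  std::cout << "inference reads " << input.size
            << " floats at " << static_cast<void*>(input.data) << "\n";
}

int main() {
  std::vector<float> buffer(640 * 640 * 3);          // one shared staging buffer
  preprocess(buffer);
  infer(TensorView{buffer.data(), buffer.size()});   // zero-copy hand-off
}
```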

What's more, we are developing and refining the following:

  • Thread Pool
  • Memory Pool: more efficient memory allocation and release
  • HPC Operators: optimize pre/post-processing efficiency

3. Built-in Models

Out-of-the-box AI models are our goal, but we are focusing on developing the core system at this time. Nevertheless, YOLOv5, YOLOv6, and YOLOv8 are already supported, and we believe the list will expand soon.

| Model  | Inference                         | Developer          | Remarks |
| :----- | :-------------------------------- | :----------------- | :------ |
| YOLOv5 | TensorRT/OpenVINO/ONNXRuntime/MNN | 02200059Z / Always |         |
| YOLOv6 | TensorRT/OpenVINO/ONNXRuntime     | 02200059Z / Always |         |
| YOLOv8 | TensorRT/OpenVINO/ONNXRuntime/MNN | 02200059Z / Always |         |

4. User-friendly

nndeploy's primary goals are user-friendliness and high performance. We have built-in support for the major inference frameworks and provide a unified interface abstraction on top of them, so you can write platform- and framework-independent inference code without worrying about performance loss. We also provide templates for the pre/post-processing of common AI algorithms, which simplify the end-to-end deployment of a model; the built-in models mentioned above are part of this ease of use as well.
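
As a rough sketch of what such a pre/post-processing template buys you (the class names below are invented for illustration and are not nndeploy's classes): the pipeline wiring is written once, and an algorithm plugs in only its three stages.

```cpp
#include <algorithm>
#include <iostream>
#include <vector>

// Sketch of a deployment "template": the pipeline shape is fixed,
// and an algorithm supplies its pre/infer/post stages.
template <typename Pre, typename Infer, typename Post>
class Pipeline {
 public:
  auto operator()(const std::vector<float>& image) {
    return post_(infer_(pre_(image)));   // end-to-end in one call
  }

 private:
  Pre pre_;
  Infer infer_;
  Post post_;
};

// Stage stubs for a classification-style model.
struct Normalize {
  std::vector<float> operator()(std::vector<float> img) {
    for (auto& v : img) v /= 255.0f;     // scale pixels to [0, 1]
    return img;
  }
};
struct FakeModel {
  std::vector<float> operator()(const std::vector<float>&) {
    return {0.1f, 0.9f, 0.4f};           // pretend these are class scores
  }
};
struct ArgMax {
  int operator()(const std::vector<float>& scores) {
    return static_cast<int>(std::max_element(scores.begin(), scores.end()) -
                            scores.begin());
  }
};

int main() {
  Pipeline<Normalize, FakeModel, ArgMax> classify;
  std::vector<float> image(640 * 640 * 3, 128.0f);
  std::cout << "best class = " << classify(image) << "\n";
}
```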

If you have any related questions, feel free to contact us. 😁

5. Parallel

  • Task parallelism: independent tasks within one deployment run concurrently
  • Pipeline parallelism: successive inputs overlap across the pre-processing, inference, and post-processing stages (see the sketch below)
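
A toy illustration of the pipeline mode in plain C++, with no nndeploy types involved: two stages connected by a blocking queue, so the producer is already preparing frame N+1 while the consumer works on frame N.

```cpp
#include <condition_variable>
#include <iostream>
#include <mutex>
#include <optional>
#include <queue>
#include <thread>

// Minimal hand-off channel between two pipeline stages.
class Channel {
 public:
  void push(int v) {
    { std::lock_guard<std::mutex> lk(m_); q_.push(v); }
    cv_.notify_one();
  }
  std::optional<int> pop() {  // empty optional signals shutdown
    std::unique_lock<std::mutex> lk(m_);
    cv_.wait(lk, [&] { return !q_.empty(); });
    int v = q_.front();
    q_.pop();
    return v < 0 ? std::nullopt : std::optional<int>(v);
  }

 private:
  std::queue<int> q_;
  std::mutex m_;
  std::condition_variable cv_;
};

int main() {
  Channel ch;
  std::thread infer([&] {                  // stage 2: "inference"
    while (auto frame = ch.pop())
      std::cout << "infer frame " << *frame << "\n";
  });
  for (int f = 0; f < 4; ++f) ch.push(f);  // stage 1: "pre-processing"
  ch.push(-1);                             // shutdown sentinel
  infer.join();
}
```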

Documentation

Roadmap

  • Parallelism
  • More models
  • More inference backends
  • Operators

Reference

Contact Us

nndeploy is still in its infancy; you are welcome to join us.

  • WeChat: titian5566