Add Ascend backend support #11477

Open
wangxiyuan opened this issue May 10, 2022 · 5 comments
Labels
ep:ACL (issues related to ACL execution provider), ep:CUDA (issues related to the CUDA execution provider), feature request (request for unsupported feature or enhancement)

Comments

@wangxiyuan
Contributor

wangxiyuan commented May 10, 2022

Is your feature request related to a problem? Please describe.

Based on the Ascend processor, Huawei builds a series of AI hardware products, shown in the blue rectangles. They are called Atlas. Here I'd like to say more about the Atlas 300. It is a PCI card that is widely used in data and AI processing servers. Our development and testing work is based on it as well.

Then, based on the hardware, the Ascend ecosystem also provides a software layer called CANN, shown as the yellow rectangles in the picture. CANN provides APIs that help developers quickly build AI applications and services on the Ascend platform. It is similar to CUDA in the Nvidia ecosystem.

For ONNX, users first need to convert the model to an Ascend model using a conversion tool called ATC. This is somewhat complex, and sometimes performance is poor or accuracy drops.

It would be good if onnxruntime could support the Ascend processor as a backend. Users could then use onnxruntime on Ascend processors directly.

image

On the software side, CANN is the key piece that both developers and AI frameworks need to know, so let's focus on it. This is the CANN technical stack in the Ascend ecosystem. Last year my colleague zhipeng presented the CANN stack at the ONNX meetup, but that was based on CANN 3.0, which is now out of date. The picture here shows the newest version, CANN 5.0. As you can see, CANN has multiple layers: the service layer, the compilation layer, the execution layer, and the base layer. For example, the service layer provides the operator library, the optimization engine, and the framework adapter.

In general, developers do not need to know all of these layers. You only need to focus on the Ascend Computing Language (ACL), the API layer that lets you control Ascend hardware via CANN.

System information

  • ONNX Runtime version (you are using):
    ONNX Runtime master branch

Describe the solution you'd like

image
Currently, if a user wants to run an ONNX model on Ascend hardware, they must first use the model conversion tool provided by CANN to convert the model from ONNX to the Ascend format. This flow is somewhat complex. The converted model may lose some precision and perform poorly, and in some cases it may not work correctly at all.

To solve this, a better approach is to let ONNX models run on Ascend directly. So in onnxruntime, we'd like to add CANN as a new execution provider. Once that is done, users can run ONNX models on Ascend hardware via onnxruntime. Of course, we'll add the related CI as well. For example, we can donate VM resources with Ascend hardware to the community.
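
From the user's side, the proposed flow could look roughly like the sketch below. It assumes the new EP is registered under the name `CANNExecutionProvider` (this issue does not state the name) and that the CPU provider stays available as a fallback. `choose_providers` and `create_session` are hypothetical helpers for illustration, not part of onnxruntime.

```python
def choose_providers(available, preferred="CANNExecutionProvider"):
    """Return an ordered provider list: the preferred EP first (if this
    build offers it), then the CPU provider as a fallback."""
    providers = []
    if preferred in available:
        providers.append(preferred)
    # The CPU provider is always appended so unsupported operators can still run.
    providers.append("CPUExecutionProvider")
    return providers


def create_session(model_path):
    """Create an InferenceSession that prefers the CANN EP when present.

    Assumption: the EP name is "CANNExecutionProvider".
    """
    # Imported lazily so choose_providers() can be used without onnxruntime installed.
    import onnxruntime as ort

    providers = choose_providers(ort.get_available_providers())
    return ort.InferenceSession(model_path, providers=providers)
```

Calling `create_session("model.onnx")` (a placeholder path) would then load the ONNX file as-is, with no separate conversion step, and `session.get_providers()` reports which providers were actually attached.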

The line below is our roadmap. First, we'll push the basic code upstream, which will complete the end-to-end flow, and the ResNet model should work correctly on the CANN EP. By the end of the year, we'll finish support for all ONNX operators and make sure all models in the ONNX model zoo work well on Ascend.

Next year, we'll focus on optimization work, such as performance improvements.

Based on the execution provider mechanism in onnxruntime, it is easy to integrate the Ascend processor as a new EP.

The new EP can be named CANN. CANN is the AI-oriented heterogeneous compute architecture in the Ascend ecosystem. It provides hierarchical APIs to help users quickly build AI applications. Frankly speaking, it is similar to CUDA in the GPU ecosystem.

Additionally, we'll add CI support as well. We can donate VMs with Ascend processors to the onnxruntime CI system, so the community can keep testing the new CANN EP easily.

We hope the community can accept this feature request and look forward to your feedback.

Thanks.

Describe alternatives you've considered
Use the library provided by Ascend directly, without onnxruntime.

Additional context
Ascend official website
CANN

@KnightYao
Contributor

you think too much

@ashbhandare added the "feature request" label (request for unsupported feature or enhancement) May 10, 2022
@wangxiyuan
Contributor Author

> you think too much

What do you mean?

@wangxiyuan
Contributor Author

wangxiyuan commented May 13, 2022

The basic PR is ready for review now.

About 10 operators are added in the PR. With this change, ResNet-v1.12 runs well on the Ascend backend with onnxruntime.

Could any committer take a look? Thanks.

I'd like to know what I should do to push it forward.

As for the test environment, we can donate Ascend-based VMs to the community as well.
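
A reviewer with Ascend hardware could spot-check the claim that the model runs correctly by comparing the new backend's outputs against the CPU provider. The following is only a sketch: it assumes the EP is registered as `CANNExecutionProvider`, that all model outputs are tensors, and that the caller supplies a suitable `input_feed`. `max_abs_diff` and `compare_with_cpu` are hypothetical helper names, and no acceptable tolerance is implied by this issue.

```python
def max_abs_diff(reference, candidate):
    """Largest element-wise absolute difference between two flat sequences."""
    reference, candidate = list(reference), list(candidate)
    if len(reference) != len(candidate):
        raise ValueError("outputs have different sizes")
    return max((abs(r - c) for r, c in zip(reference, candidate)), default=0.0)


def compare_with_cpu(model_path, input_feed, provider="CANNExecutionProvider"):
    """Run the same model on CPU and on `provider`; return one max-abs-diff per output.

    Assumption: every output is a numpy array, as returned for tensor outputs.
    """
    # Imported lazily so max_abs_diff() can be used without onnxruntime installed.
    import onnxruntime as ort

    cpu = ort.InferenceSession(model_path, providers=["CPUExecutionProvider"])
    dev = ort.InferenceSession(model_path, providers=[provider, "CPUExecutionProvider"])
    cpu_out = cpu.run(None, input_feed)  # None requests all outputs
    dev_out = dev.run(None, input_feed)
    return [
        max_abs_diff(a.ravel().tolist(), b.ravel().tolist())
        for a, b in zip(cpu_out, dev_out)
    ]
```

The same check applies to the accuracy concern raised for converted models: large per-output differences would point to a precision problem on the device path.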

jywu-msft pushed a commit that referenced this issue Sep 22, 2022
**Description**: This PR adds Ascend CANN execution provider support.

**Motivation and Context**
- Why is this change required? What problem does it solve?
As described in the issue, CANN is the API layer for Ascend
processors. Adding a CANN EP allows users to run ONNX models on Ascend hardware
via onnxruntime.
  The detailed changes:
  1. Added CANN EP framework.
  2. Added the basic operators to support ResNet and VGG models.
  3. Added C/C++ and Python API support.
- If it fixes an open issue, please link to the issue here.
   #11477

Author: 
lijiawei <[email protected]>
wangxiyuan <[email protected]>

Co-authored-by: FFrog <[email protected]>
linnealovespie pushed a commit that referenced this issue Sep 30, 2022
@github-actions bot added the ep:ACL (issues related to ACL execution provider) and ep:CUDA (issues related to the CUDA execution provider) labels Jun 1, 2023
@johnnynunez

How is this going?

@FFFrog
Contributor

FFFrog commented Nov 29, 2023

> How is this going?

Hey! Please refer to the doc first if you have any questions. The CI-related work is here.
