
Add CANN EP #12416

Merged: 2 commits into microsoft:main on Sep 22, 2022
Conversation

@wangxiyuan (Contributor) commented Aug 2, 2022

Description: This PR adds Ascend CANN execution provider support.

Motivation and Context

  • Why is this change required? What problem does it solve?
    As described in the linked issue, CANN is the API layer for the Ascend processor. Adding a CANN EP lets users run ONNX models on Ascend hardware via ONNX Runtime.
    The changes in detail:
    1. Added the CANN EP framework.
    2. Added the basic operators needed to support the ResNet and VGG models.
    3. Added C/C++ and Python API support (a usage sketch follows below).
  • If it fixes an open issue, please link to the issue here.
    Add Ascend backend support #11477

Author:
lijiawei [email protected]
wangxiyuan [email protected]
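
For context on item 3, here is a minimal usage sketch of consuming the new EP through the C API. This is not code from the PR's diff: the registration entry point below mirrors the existing per-provider OrtSessionOptionsAppendExecutionProvider_* functions (like the MIGraphX one quoted later in this review), so its exact name and signature are an assumption, and error handling is elided.

// Hypothetical usage sketch: register the CANN EP on a session, mirroring
// the OrtSessionOptionsAppendExecutionProvider_* pattern. The declaration
// below is assumed for illustration; error handling is elided.
#include <onnxruntime_c_api.h>

extern "C" OrtStatus* OrtSessionOptionsAppendExecutionProvider_CANN(
    OrtSessionOptions* options, int device_id);  // assumed entry point

int main() {
  const OrtApi* api = OrtGetApiBase()->GetApi(ORT_API_VERSION);

  OrtEnv* env = nullptr;
  api->CreateEnv(ORT_LOGGING_LEVEL_WARNING, "cann_demo", &env);

  OrtSessionOptions* so = nullptr;
  api->CreateSessionOptions(&so);

  // Place supported ops on Ascend device 0; unsupported ops fall back to CPU.
  OrtSessionOptionsAppendExecutionProvider_CANN(so, /*device_id=*/0);

  OrtSession* session = nullptr;
  api->CreateSession(env, "resnet50.onnx", so, &session);
  // ... create input OrtValues, call api->Run(...), read the outputs ...

  api->ReleaseSession(session);
  api->ReleaseSessionOptions(so);
  api->ReleaseEnv(env);
  return 0;
}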

@wangxiyuan (Contributor Author)

Could any maintainer approve the CI workflows? Thanks.

@justinchuby requested review from edgchen1 and skottmckay and removed the review request for edgchen1 on August 24, 2022
@wangxiyuan (Contributor Author)

@skottmckay @edgchen1 Hi, could you please take a look at this PR? BIG thanks. I'm eager to know what I should do to push this code forward.

@@ -3568,6 +3626,14 @@ ORT_API_STATUS(OrtSessionOptionsAppendExecutionProvider_CUDA, _In_ OrtSessionOpt
*/
ORT_API_STATUS(OrtSessionOptionsAppendExecutionProvider_MIGraphX, _In_ OrtSessionOptions* options, int device_id);

/*
Member

This function should be deleted. The other provider ones like it were left for backwards compatibility. New provider functionality should be done only through the OrtApi as it is versioned.

Contributor Author

Got it

@RyanUnderhill (Member)

@skottmckay @edgchen1 Hi, could you please take a look at this PR? BIG thanks. I'm eager to know what I should do to push this code forward.

To other reviewers: I reviewed the shared provider library/API aspects of this; I didn't look at the provider code itself in great detail.

}

bool GPUDataTransfer::CanCopy(const OrtDevice& src_device, const OrtDevice& dst_device) const {
return src_device.Type() == OrtDevice::GPU || src_device.MemType() == OrtDevice::MemType::CANN_PINNED ||
Member

GPU data transfer? And you want to reuse OrtDevice::GPU?

auto& src_device = src.Location().device;
auto& dst_device = dst.Location().device;

if (dst_device.Type() == OrtDevice::GPU) {
Member

Are we using a tool similar to AMD's HIP to convert the ORT CUDA EP to this EP? The code here looks exactly the same as the CUDA EP.

Contributor Author

Oh, thanks for pointing this out. Ascend is not a GPU; it's an NPU device. We reused the GPU type before, which is not correct. There are three device types in onnxruntime (CPU, GPU, FPGA); how about adding a new one called NPU? It would be good for other NPU integrations as well. (See the sketch below.)
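
To make the proposal concrete, a hedged sketch of what such an extension could look like, assuming OrtDevice keeps its current constant-based style; the value and naming are illustrative, not a merged change:

// Illustrative sketch only: extend OrtDevice's device-type constants
// (currently CPU/GPU/FPGA) with an NPU entry that the CANN EP and other
// NPU-based EPs could share.
struct OrtDevice {
  using DeviceType = int8_t;
  static const DeviceType CPU = 0;
  static const DeviceType GPU = 1;
  static const DeviceType FPGA = 2;
  static const DeviceType NPU = 3;  // proposed: Ascend and other neural processors
  // ... memory type, device id, and accessors unchanged ...
};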

@wangxiyuan (Contributor Author)

@souptc @RyanUnderhill @skottmckay @edgchen1 Hi, sorry to @ you again. The new commit is ready for review; all the GPU-related code has been removed. Big thanks for your help.

@@ -0,0 +1,115 @@
// Copyright (c) Huawei. All rights reserved.
// Copyright (c) Huawei. All rights reserved.

duplicate copyright line

Contributor

Got it. Done

@FFFrog (Contributor)

FFFrog commented Sep 5, 2022

@souptc @RyanUnderhill @skottmckay @edgchen1 Hi, sorry to @ you again. Could you please take a look at this PR?

@FFFrog (Contributor)

FFFrog commented Sep 16, 2022

@souptc @RyanUnderhill @skottmckay @edgchen1 Hi, sorry to @ you again. Could you please take a look at this PR? Thanks a lot.

@jywu-msft (Member)

/azp run Linux CPU CI Pipeline

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@@ -37,6 +37,8 @@ namespace onnxruntime {
constexpr const char* CPU = "Cpu";
constexpr const char* CUDA = "Cuda";
constexpr const char* CUDA_PINNED = "CudaPinned";
constexpr const char* CANN = "Cann";
constexpr const char* CANN_PINNED = "CannPinned";
Member

Just confirming: is there really a need for a pinned memory type here, the same as in CUDA?

Contributor

The Ascend NPU provides the aclrtMallocHost API to allocate pinned memory on the host, which may improve performance in most scenarios. There are cases where data needs to be transferred between the CANN EP and the CPU EP while a model runs, so it's required.
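
For illustration, a short sketch of the mechanism being described, using the public ACL runtime calls; this shows why a separate pinned arena pays off and is not code from the PR:

// Sketch: pinned ("CannPinned") host memory lets host<->device copies run
// asynchronously on a stream instead of staging through pageable memory.
// Error handling elided for brevity.
#include <acl/acl.h>

void* AllocPinnedHost(size_t bytes) {
  void* host = nullptr;
  aclrtMallocHost(&host, bytes);  // page-locked host allocation
  return host;
}

void CopyToDeviceAsync(void* device_dst, const void* pinned_src,
                       size_t bytes, aclrtStream stream) {
  // Because the source is pinned, this copy can overlap with NPU compute.
  aclrtMemcpyAsync(device_dst, bytes, pinned_src, bytes,
                   ACL_MEMCPY_HOST_TO_DEVICE, stream);
}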

Member

thanks for the explanation!

@jywu-msft (Member)

/azp run Linux CPU Minimal Build E2E CI Pipeline, Linux GPU CI Pipeline, Linux GPU TensorRT CI Pipeline, Linux Nuphar CI Pipeline, Linux OpenVINO CI Pipeline, MacOS CI Pipeline, ONNX Runtime Web CI Pipeline, Windows CPU CI Pipeline, Windows GPU CI Pipeline

@jywu-msft (Member)

/azp run Windows GPU TensorRT CI Pipeline, onnxruntime-binary-size-checks-ci-pipeline, onnxruntime-python-checks-ci-pipeline, orttraining-linux-ci-pipeline, orttraining-linux-gpu-ci-pipeline, orttraining-ortmodule-distributed

@azure-pipelines

Azure Pipelines successfully started running 8 pipeline(s).

@azure-pipelines

Azure Pipelines successfully started running 6 pipeline(s).

@@ -413,6 +413,53 @@ typedef struct OrtCUDAProviderOptions {

} OrtCUDAProviderOptions;

/** \brief CANN Provider Options
Member

Do you guys anticipate this struct changing much in the future? i.e., will you add/remove any fields here?
ORT maintains binary compatibility for the C API (i.e., applications linked against libonnxruntime should expect to continue working with newer versions).
That means if you go with the struct approach for passing options, you would have to create a new version of the struct.
If that is okay, please proceed.
Otherwise, consider using an opaque struct instead, so fields can be updated without impacting binary compatibility.
See the comments at
https://github.com/microsoft/onnxruntime/blob/main/include/onnxruntime/core/session/onnxruntime_c_api.h#L3232
and
https://github.com/microsoft/onnxruntime/blob/main/include/onnxruntime/core/session/onnxruntime_c_api.h#L2743
The caller does not directly access the struct; instead, they call an interface to create the struct, update it, and release it when they're done,
e.g. CreateCannProviderOptions, UpdateCannProviderOptions, ReleaseCannOptions. (A caller-side sketch follows.)
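
Caller-side, the opaque pattern could look like the sketch below; the function and key names follow the suggestion above and the existing CUDA V2 options style, so treat them as assumptions rather than the final API:

// Hypothetical caller-side sketch of the opaque-options pattern. The struct
// layout stays private to ORT, so fields can change without breaking binary
// compatibility. Function and key names are assumed.
void ConfigureCann(const OrtApi* api, OrtSessionOptions* session_options) {
  OrtCANNProviderOptions* cann_options = nullptr;  // opaque handle
  api->CreateCANNProviderOptions(&cann_options);

  // Options travel as string key/value pairs, so new fields can be added
  // later without an ABI break.
  const char* keys[] = {"device_id", "npu_mem_limit"};
  const char* values[] = {"0", "2147483648"};
  api->UpdateCANNProviderOptions(cann_options, keys, values, 2);

  api->SessionOptionsAppendExecutionProvider_CANN(session_options, cann_options);
  api->ReleaseCANNProviderOptions(cann_options);  // session keeps its own copy
}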

Contributor

In the future, this structure may change with hardware updates, so it's better to use the opaque structure instead. We'll submit a new commit later; thank you for your comments. Please bear with us.

Contributor

The new commit with the opaque structure is ready for review. Please take a look; thanks a lot.

std::shared_ptr<KernelRegistry> GetKernelRegistry() const override;
std::unique_ptr<onnxruntime::IDataTransfer> GetDataTransfer() const override;

std::vector<std::unique_ptr<ComputeCapability>> GetCapability(
Member

FYI, you will need to make a small change to this due to #12791, which was merged yesterday.
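
For reference, an approximate sketch of the shape the override takes after #12791, where the kernel-registry parameter is replaced by a kernel-lookup interface; treat the exact types as an approximation of that PR, not a quote from this diff:

// Approximate post-#12791 override: the EP queries an IKernelLookup
// interface for matching kernel definitions instead of receiving
// kernel registries directly.
std::vector<std::unique_ptr<ComputeCapability>> GetCapability(
    const onnxruntime::GraphViewer& graph_viewer,
    const IKernelLookup& kernel_lookup) const override;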

Contributor

Done

jywu-msft previously approved these changes Sep 21, 2022
@jywu-msft (Member)

jywu-msft commented Sep 21, 2022

A couple more follow-ups.

  1. You'll need to fix a build break due to the GetCapability() signature change caused by "Update kernel matching logic: decouple from op schemas and remove kernel def hashes" #12791.
  2. Add some documentation. Our docs are in the gh-pages branch; see https://github.com/microsoft/onnxruntime/tree/gh-pages/docs
     You'll need to create a separate docs PR, make the appropriate modifications to build/eps.md, and add an execution-providers/CANN-ExecutionProvider.md.
  3. If you have code examples/Jupyter notebooks etc., they can go in https://github.com/microsoft/onnxruntime-inference-examples

@jywu-msft dismissed their stale review on September 21, 2022

more changes pending

@wangxiyuan (Contributor Author)

Thanks for your suggestions; they're really helpful. We'll refresh the PR ASAP.

Docs, examples, more operators, performance optimization, and so on are on our TODO list. We'll keep contributing in the future.

This PR adds Ascend CANN execution provider support.

Detail:
1. Added the CANN EP framework.
2. Added the basic operators to support the ResNet model.

Co-Author: wangxiyuan <[email protected]>
Detail:
1. Adapted to the latest GetCapability().
2. Used the opaque structure instead; four functions are provided.

Co-Author: wangxiyuan <[email protected]>
@jywu-msft (Member)

/azp run Linux CPU CI Pipeline, Linux CPU Minimal Build E2E CI Pipeline, Linux GPU CI Pipeline, Linux GPU TensorRT CI Pipeline, Linux Nuphar CI Pipeline, Linux OpenVINO CI Pipeline, MacOS CI Pipeline, ONNX Runtime Web CI Pipeline, Windows CPU CI Pipeline, Windows GPU CI Pipeline

@azure-pipelines

Azure Pipelines successfully started running 9 pipeline(s).

@jywu-msft (Member)

/azp run Windows GPU TensorRT CI Pipeline, onnxruntime-binary-size-checks-ci-pipeline, onnxruntime-python-checks-ci-pipeline, orttraining-linux-ci-pipeline, orttraining-linux-gpu-ci-pipeline, orttraining-ortmodule-distributed

@azure-pipelines

Azure Pipelines successfully started running 6 pipeline(s).

@jywu-msft (Member) left a comment

Thanks for the contribution!

@jywu-msft merged commit 952c993 into microsoft:main on Sep 22, 2022
linnealovespie pushed a commit that referenced this pull request Sep 30, 2022
**Description**: This PR adds Ascend CANN execution provider support.

**Motivation and Context**
- Why is this change required? What problem does it solve?
  As described in the linked issue, CANN is the API layer for the Ascend
  processor. Adding a CANN EP lets users run ONNX models on Ascend hardware
  via ONNX Runtime.
  The changes in detail:
  1. Added the CANN EP framework.
  2. Added the basic operators to support the ResNet and VGG models.
  3. Added C/C++ and Python API support.
- If it fixes an open issue, please link to the issue here.
   #11477

Author: 
lijiawei <[email protected]>
wangxiyuan <[email protected]>

Co-authored-by: FFrog <[email protected]>