Add CANN EP #12416
Conversation
Can any maintainer approve the CI workflows? Thanks
@skottmckay @edgchen1 Hi, could you please take a look at this PR? BIG thanks. I'm eager to know what I should do to push this code forward.
@@ -3568,6 +3626,14 @@ ORT_API_STATUS(OrtSessionOptionsAppendExecutionProvider_CUDA, _In_ OrtSessionOpt
 */
ORT_API_STATUS(OrtSessionOptionsAppendExecutionProvider_MIGraphX, _In_ OrtSessionOptions* options, int device_id);

/*
This function should be deleted. The other provider ones like it were left for backwards compatibility. New provider functionality should be done only through the OrtApi as it is versioned.
Got it
To other reviewers: I reviewed the shared provider library/API aspects of it; I didn't look in great detail into the provider code itself.
}

bool GPUDataTransfer::CanCopy(const OrtDevice& src_device, const OrtDevice& dst_device) const {
  return src_device.Type() == OrtDevice::GPU || src_device.MemType() == OrtDevice::MemType::CANN_PINNED ||
GPU data transfer? And you want to reuse OrtDevice::GPU?
auto& src_device = src.Location().device;
auto& dst_device = dst.Location().device;

if (dst_device.Type() == OrtDevice::GPU) {
Are we using a tool similar to AMD's HIP to convert the ORT CUDA EP to this EP? The code here looks exactly the same as the CUDA EP.
Oh, thanks for pointing this out. Ascend is not a GPU; it's an NPU device. We reused the GPU type before, which is not correct. There are three device types in onnxruntime (CPU, GPU, FPGA); how about adding a new one called NPU? It would be good for other NPU integrations as well.
@souptc @RyanUnderhill @skottmckay @edgchen1 Hi, sorry to @ you again. The new commit is ready for review. All the GPU-related code has been removed. Big thanks for your help.
@@ -0,0 +1,115 @@
// Copyright (c) Huawei. All rights reserved.
// Copyright (c) Huawei. All rights reserved.
duplicate copyright line
Got it. Done
@souptc @RyanUnderhill @skottmckay @edgchen1 Hi, sorry to @ you again. Could you please take a look at this PR?
@souptc @RyanUnderhill @skottmckay @edgchen1 Hi, sorry to @ you again. Could you please take a look at this PR? Thanks a lot.
/azp run Linux CPU CI Pipeline
Azure Pipelines successfully started running 1 pipeline(s).
@@ -37,6 +37,8 @@ namespace onnxruntime {
constexpr const char* CPU = "Cpu";
constexpr const char* CUDA = "Cuda";
constexpr const char* CUDA_PINNED = "CudaPinned";
constexpr const char* CANN = "Cann";
constexpr const char* CANN_PINNED = "CannPinned";
Just confirming: is there really a need for a pinned memory type, the same as in CUDA?
The Ascend NPU provides the aclrtMallocHost API to allocate pinned memory on the host, which may improve performance in most scenarios. There are cases where data needs to be transferred between the CANN EP and the CPU EP during model execution, so it's required.
thanks for the explanation!
/azp run Linux CPU Minimal Build E2E CI Pipeline, Linux GPU CI Pipeline, Linux GPU TensorRT CI Pipeline, Linux Nuphar CI Pipeline, Linux OpenVINO CI Pipeline, MacOS CI Pipeline, ONNX Runtime Web CI Pipeline, Windows CPU CI Pipeline, Windows GPU CI Pipeline
/azp run Windows GPU TensorRT CI Pipeline, onnxruntime-binary-size-checks-ci-pipeline, onnxruntime-python-checks-ci-pipeline, orttraining-linux-ci-pipeline, orttraining-linux-gpu-ci-pipeline, orttraining-ortmodule-distributed
Azure Pipelines successfully started running 8 pipeline(s).
Azure Pipelines successfully started running 6 pipeline(s).
@@ -413,6 +413,53 @@ typedef struct OrtCUDAProviderOptions {

} OrtCUDAProviderOptions;

/** \brief CANN Provider Options
Do you guys anticipate this struct changing much in the future? i.e. will you add/remove any fields here?
ORT maintains binary compatibility for C API (i.e. applications linked against libonnxruntime should expect to continue working with newer versions)
That means if you go with the struct approach for passing options, you would have to create a new version of the struct.
if that is okay, please proceed.
otherwise, consider using an opaque struct instead, so fields can be updated without impact to binary compatibility.
see comments at
https://github.com/microsoft/onnxruntime/blob/main/include/onnxruntime/core/session/onnxruntime_c_api.h#L3232
and
https://github.com/microsoft/onnxruntime/blob/main/include/onnxruntime/core/session/onnxruntime_c_api.h#L2743
the caller does not directly access the struct.
instead, they call an interface to create the struct, update it, and release it when they're done.
e.g. CreateCannProviderOptions, UpdateCannProviderOptions, ReleaseCannOptions
In the future, this structure may change with hardware updates, so it's better to use the opaque structure. We'll submit a new commit later; thank you for your comments. Please wait a bit.
The new commit with the opaque structure is ready for review.
Please take a look, thanks a lot.
std::shared_ptr<KernelRegistry> GetKernelRegistry() const override;
std::unique_ptr<onnxruntime::IDataTransfer> GetDataTransfer() const override;

std::vector<std::unique_ptr<ComputeCapability>> GetCapability(
FYI, you will need to make a small change to this due to #12791 which was merged yesterday
Done
A couple more follow-ups.
Thanks for your suggestion. It's really helpful. We'll refresh the PR ASAP. Docs, examples, more operators, performance optimization, and so on are on our TODO list. We'll keep contributing in the future.
This PR adds Ascend CANN execution provider support. Detail: 1. Added CANN EP framework. 2. Added the basic operators to support the resnet model. Co-Author: wangxiyuan <[email protected]>
Detail: 1. Adapt to the latest GetCapability() 2. Use the opaque structure instead; four functions are provided. Co-Author: wangxiyuan <[email protected]>
/azp run Linux CPU CI Pipeline, Linux CPU Minimal Build E2E CI Pipeline, Linux GPU CI Pipeline, Linux GPU TensorRT CI Pipeline, Linux Nuphar CI Pipeline, Linux OpenVINO CI Pipeline, MacOS CI Pipeline, ONNX Runtime Web CI Pipeline, Windows CPU CI Pipeline, Windows GPU CI Pipeline
Azure Pipelines successfully started running 9 pipeline(s).
/azp run Windows GPU TensorRT CI Pipeline, onnxruntime-binary-size-checks-ci-pipeline, onnxruntime-python-checks-ci-pipeline, orttraining-linux-ci-pipeline, orttraining-linux-gpu-ci-pipeline, orttraining-ortmodule-distributed
Azure Pipelines successfully started running 6 pipeline(s).
Thanks for the contribution!
**Description**: This PR adds Ascend CANN execution provider support.

**Motivation and Context**
- Why is this change required? What problem does it solve? As described in the linked issue, CANN is the API layer for the Ascend processor. Adding a CANN EP allows users to run ONNX models on Ascend hardware via onnxruntime. The detailed changes: 1. Added CANN EP framework. 2. Added the basic operators to support ResNet and VGG models. 3. Added C/C++ and Python API support.
- If it fixes an open issue, please link to the issue here. #11477

Author: lijiawei <[email protected]>, wangxiyuan <[email protected]>
Co-authored-by: FFrog <[email protected]>