From e679d998ada8ffb3b9a9ec5c22d01f6770043ccc Mon Sep 17 00:00:00 2001
From: RandySheriffH <48490400+RandySheriffH@users.noreply.github.com>
Date: Wed, 17 Jan 2024 16:01:09 -0800
Subject: [PATCH] Doc TP API and cuda resources (#19172)

Doc TP API and cuda resources.

---------

Co-authored-by: Randy Shuai
---
 docs/performance/tune-performance/threading.md | 4 +++-
 docs/reference/operators/add-custom-op.md      | 2 +-
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/docs/performance/tune-performance/threading.md b/docs/performance/tune-performance/threading.md
index 18620eb4add9f..2b546422c080c 100644
--- a/docs/performance/tune-performance/threading.md
+++ b/docs/performance/tune-performance/threading.md
@@ -201,5 +201,7 @@ int main() {
 
 Note that `CreateThreadCustomized` and `JoinThreadCustomized`, once set, will be applied to both ORT intra op and inter op thread pools uniformly.
 
+## Usage in custom ops
+Since 1.17, custom op developers can parallelize their CPU code with the ORT intra-op thread pool.
 
-
+Please refer to the [API](https://github.com/microsoft/onnxruntime/blob/rel-1.17.0/include/onnxruntime/core/session/onnxruntime_c_api.h#L4543) and the [example](https://github.com/microsoft/onnxruntime/blob/rel-1.17.0/onnxruntime/test/testdata/custom_op_library/cpu/cpu_ops.cc#L87) for usage.
\ No newline at end of file
diff --git a/docs/reference/operators/add-custom-op.md b/docs/reference/operators/add-custom-op.md
index 0cb3626efb38f..b4b43b2324eb5 100644
--- a/docs/reference/operators/add-custom-op.md
+++ b/docs/reference/operators/add-custom-op.md
@@ -133,7 +133,7 @@ void KernelOne(const Ort::Custom::CudaContext& cuda_ctx,
   cuda_add(Z.NumberOfElement(), z_raw, X.Data(), Y.Data(), cuda_ctx.cuda_stream); // launch a kernel inside
 }
 ```
-Details could be found [here](https://github.com/microsoft/onnxruntime/tree/rel-1.16.0/onnxruntime/test/testdata/custom_op_library/cuda).
+A full example can be found [here](https://github.com/microsoft/onnxruntime/tree/rel-1.17.0/onnxruntime/test/testdata/custom_op_library/cuda). To further facilitate development, a wide variety of CUDA EP resources and configurations are exposed via `CudaContext`; please refer to the [header](https://github.com/microsoft/onnxruntime/blob/rel-1.17.0/include/onnxruntime/core/providers/cuda/cuda_resource.h#L8) for details.
 
 For ROCM, it is like: