Doc TP API and cuda resources (#19172)
Doc TP API and cuda resources.

---------

Co-authored-by: Randy Shuai <[email protected]>
RandySheriffH and RandyShuai authored Jan 18, 2024
1 parent e882d61 commit e679d99
Showing 2 changed files with 4 additions and 2 deletions.
4 changes: 3 additions & 1 deletion docs/performance/tune-performance/threading.md
@@ -201,5 +201,7 @@ int main() {

Note that `CreateThreadCustomized` and `JoinThreadCustomized`, once set, will be applied to both ORT intra op and inter op thread pools uniformly.

## Usage in custom ops
Since 1.17, custom op developers can parallelize their CPU code with the ORT intra-op thread pool.


Please refer to the [API](https://github.com/microsoft/onnxruntime/blob/rel-1.17.0/include/onnxruntime/core/session/onnxruntime_c_api.h#L4543) and the [example](https://github.com/microsoft/onnxruntime/blob/rel-1.17.0/onnxruntime/test/testdata/custom_op_library/cpu/cpu_ops.cc#L87) for usage.
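
The general shape of such a parallel loop can be sketched as below. This is only an illustration under stated assumptions: it does not include or call ONNX Runtime, and it assumes the per-iteration callback receives a user-data pointer and an iteration index. `StubParallelFor`, `AddTask`, `AddOne`, and `ParallelAdd` are hypothetical names introduced here, and the stub runs iterations serially where a real thread pool would distribute them across threads. The linked header and example remain the authoritative reference.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Assumed callback shape: caller-supplied data first, iteration index second.
using ParallelFn = void (*)(void*, std::size_t);

// Hypothetical stand-in for the thread pool entry point; NOT an ONNX Runtime API.
// It runs the iterations one after another on the calling thread, whereas a
// real pool would spread them over worker threads.
void StubParallelFor(ParallelFn fn, std::size_t total, void* usr_data) {
  for (std::size_t i = 0; i < total; ++i) {
    fn(usr_data, i);
  }
}

// State shared by every iteration of an element-wise add.
struct AddTask {
  const float* x;
  const float* y;
  float* z;
};

// One iteration: reads and writes only index i, so iterations are independent
// and would be safe to run concurrently without locking.
void AddOne(void* usr_data, std::size_t i) {
  auto* task = static_cast<AddTask*>(usr_data);
  task->z[i] = task->x[i] + task->y[i];
}

// Packs the buffers into AddTask and hands the loop to the stand-in.
std::vector<float> ParallelAdd(const std::vector<float>& x,
                               const std::vector<float>& y) {
  assert(x.size() == y.size());  // inputs must have the same length
  std::vector<float> z(x.size());
  AddTask task{x.data(), y.data(), z.data()};
  StubParallelFor(AddOne, x.size(), &task);
  return z;
}
```

Because each iteration touches a disjoint output element, the same `AddOne` callback would be safe to run from several threads at once, which is the property a pool-backed loop relies on.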
2 changes: 1 addition & 1 deletion docs/reference/operators/add-custom-op.md
@@ -133,7 +133,7 @@ void KernelOne(const Ort::Custom::CudaContext& cuda_ctx,
cuda_add(Z.NumberOfElement(), z_raw, X.Data(), Y.Data(), cuda_ctx.cuda_stream); // launch a kernel inside
}
```
Details could be found [here](https://github.com/microsoft/onnxruntime/tree/rel-1.16.0/onnxruntime/test/testdata/custom_op_library/cuda).
A full example can be found [here](https://github.com/microsoft/onnxruntime/tree/rel-1.17.0/onnxruntime/test/testdata/custom_op_library/cuda). To further facilitate development, a wide variety of CUDA EP resources and configurations are exposed via CudaContext; please refer to the [header](https://github.com/microsoft/onnxruntime/blob/rel-1.17.0/include/onnxruntime/core/providers/cuda/cuda_resource.h#L8) for details.
For ROCM, it is like: