Commit

doc tp API and new cuda resources
RandyShuai committed Jan 17, 2024
1 parent e882d61 commit b456398
Showing 2 changed files with 4 additions and 2 deletions.
5 changes: 3 additions & 2 deletions docs/performance/tune-performance/threading.md
@@ -201,5 +201,6 @@ int main() {

Note that `CreateThreadCustomized` and `JoinThreadCustomized`, once set, will be applied to both ORT intra op and inter op thread pools uniformly.



## Usage in custom ops
Since 1.17, custom op developers can use the ORT intra-op thread pool to accelerate their code on CPU.
Please see the API and example for usage.
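
As a rough illustration of the pattern, the sketch below splits an element-wise add across worker threads through a C-style callback plus a user-data pointer. Everything in it is an assumption made for illustration: `ParallelForStandIn`, `AddArgs`, `AddOne`, and `ParallelAdd` are hypothetical names, and the stand-in uses plain `std::thread` rather than the ORT intra-op thread pool. Consult the ONNX Runtime headers for the actual entry point and its signature.

```cpp
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical stand-in for a thread-pool entry point; NOT an ONNX Runtime API.
// Calls fn(user_data, i) exactly once for every i in [0, total),
// spreading the indices over num_threads worker threads.
void ParallelForStandIn(void (*fn)(void*, std::size_t), std::size_t total,
                        std::size_t num_threads, void* user_data) {
  if (num_threads == 0) num_threads = 1;
  std::vector<std::thread> workers;
  for (std::size_t t = 0; t < num_threads; ++t) {
    // Worker t handles indices t, t + num_threads, t + 2 * num_threads, ...
    workers.emplace_back([=] {
      for (std::size_t i = t; i < total; i += num_threads) fn(user_data, i);
    });
  }
  for (auto& w : workers) w.join();
}

// State shared with the callback through the opaque user-data pointer.
struct AddArgs {
  const float* x;
  const float* y;
  float* z;
};

// Per-index work item: each call writes a distinct output element,
// so no synchronization between workers is needed.
void AddOne(void* user_data, std::size_t i) {
  auto* args = static_cast<AddArgs*>(user_data);
  args->z[i] = args->x[i] + args->y[i];
}

// Element-wise add; assumes x and y have the same length.
std::vector<float> ParallelAdd(const std::vector<float>& x,
                               const std::vector<float>& y) {
  std::vector<float> z(x.size());
  AddArgs args{x.data(), y.data(), z.data()};
  ParallelForStandIn(AddOne, x.size(), 4, &args);
  return z;
}
```

In a real custom op, the kernel's compute function would hand the callback and user-data pointer to the thread pool supplied by ORT instead of creating its own threads, which avoids oversubscribing the CPU.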
1 change: 1 addition & 0 deletions docs/reference/operators/add-custom-op.md
@@ -134,6 +134,7 @@ void KernelOne(const Ort::Custom::CudaContext& cuda_ctx,
}
```
Details can be found [here](https://github.com/microsoft/onnxruntime/tree/rel-1.16.0/onnxruntime/test/testdata/custom_op_library/cuda).
To facilitate development, a wide variety of CUDA EP resources and configurations are exposed via `CudaContext`. Please see the header and usage for details.
For ROCM, it is like:
