Fix GPU profiling per PR

microsoft · Feb 7, 2024 · c87a05c
1 parent d95c0f4
Showing 1 changed file with 2 additions and 2 deletions.
diff --git a/docs/performance/tune-performance/profiling-tools.md b/docs/performance/tune-performance/profiling-tools.md
@@ -64,7 +64,7 @@ As covered in [logging](logging_tracing.md) ONNX supports dynamic enablement of
   - greater than 5 = profiling_level=detailed (individual ops are logged with inference perf hit)  
 - Event: [QNNProfilingEvent](https://github.com/microsoft/onnxruntime/blob/main/onnxruntime/core/providers/qnn/builder/qnn_backend_manager.cc#L1083)
 
-## CUDA Profiling
+## GPU Profiling
 
 To profile CUDA kernels, please add the cupti library to your PATH and use the onnxruntime binary built from source with `--enable_cuda_profiling`.
 To profile ROCm kernels, please add the roctracer library to your PATH and use the onnxruntime binary built from source with `--enable_rocm_profiling`. 
@@ -83,4 +83,4 @@ If an operator called multiple kernels during execution, the performance numbers
 {"cat":"Node", "name":<name of the node>, ...}
 {"cat":"Kernel", "name":<name of the kernel called first>, ...}
 {"cat":"Kernel", "name":<name of the kernel called next>, ...}
-```
+```
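
The trace layout shown in the second hunk (a `Node` event followed by the `Kernel` events it launched) can be post-processed with a short script. The following is a minimal sketch, not part of this commit: it assumes the profile has already been loaded as a list of event dictionaries, that kernel events appear directly after their node event as in the example above, and that each event carries a `dur` field (duration) in the style of the Chrome trace format. The node and kernel names in the sample data are hypothetical.

```python
import json
from collections import defaultdict


def group_kernels_by_node(events):
    """Attach each "Kernel" event to the most recent preceding "Node" event.

    Assumes kernel events follow their node event in order, as in the
    documented example. Events with any other category are ignored.
    """
    grouped = defaultdict(list)
    current_node = None
    for event in events:
        category = event.get("cat")
        if category == "Node":
            current_node = event.get("name")
            grouped[current_node]  # register the node even if it has no kernels
        elif category == "Kernel" and current_node is not None:
            grouped[current_node].append(event)
    return dict(grouped)


def kernel_time_per_node(events):
    """Sum the assumed "dur" field of all kernels attributed to each node."""
    return {
        node: sum(kernel.get("dur", 0) for kernel in kernels)
        for node, kernels in group_kernels_by_node(events).items()
    }


# Hypothetical sample data shaped like the documented output.
sample_events = json.loads("""
[
  {"cat": "Session", "name": "model_run", "dur": 100},
  {"cat": "Node",    "name": "conv1",     "dur": 40},
  {"cat": "Kernel",  "name": "kernel_a",  "dur": 10},
  {"cat": "Kernel",  "name": "kernel_b",  "dur": 20},
  {"cat": "Node",    "name": "relu1",     "dur": 8},
  {"cat": "Kernel",  "name": "kernel_c",  "dur": 5}
]
""")

totals = kernel_time_per_node(sample_events)
print(totals)
```

Grouping by position rather than by an explicit parent identifier keeps the sketch independent of any field not shown in the documentation. If a real profile interleaves events differently, the grouping rule would need to change.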