From 120cb5a80455b5aa13f29df607d4f3572a9de74f Mon Sep 17 00:00:00 2001
From: Tianlei Wu
Date: Sat, 2 Nov 2024 12:51:37 -0700
Subject: [PATCH] [Doc] Add I/O binding example using onnx data type in python API summary (#22695)

### Description
Add I/O binding example using onnx data type in python API summary.
The API is available since the 1.20 release.

### Motivation and Context
Follow up of https://github.com/microsoft/onnxruntime/pull/22306 to add some documentation.

---
 docs/python/api_summary.rst | 29 ++++++++++++++++++++++++++++-
 1 file changed, 28 insertions(+), 1 deletion(-)

diff --git a/docs/python/api_summary.rst b/docs/python/api_summary.rst
index 092b42010a5c6..fb2850c547463 100644
--- a/docs/python/api_summary.rst
+++ b/docs/python/api_summary.rst
@@ -244,9 +244,36 @@ You can also bind inputs and outputs directly to a PyTorch tensor.
     )
 
     session.run_with_iobinding(binding)
-    
+
 You can also see code examples of this API in in the `ONNX Runtime inferences examples `_.
 
+Some ONNX data types (like TensorProto.BFLOAT16, TensorProto.FLOAT8E4M3FN and TensorProto.FLOAT8E5M2) are not supported by NumPy. You can bind an input or output directly to a PyTorch tensor of the corresponding data type
+(like torch.bfloat16, torch.float8_e4m3fn and torch.float8_e5m2) in GPU memory.
+
+.. code-block:: python
+
+    x = torch.ones([3], dtype=torch.float8_e5m2, device='cuda:0')
+    y = torch.empty([3], dtype=torch.bfloat16, device='cuda:0')
+
+    binding = session.io_binding()
+    binding.bind_input(
+        name='X',
+        device_type='cuda',
+        device_id=0,
+        element_type=TensorProto.FLOAT8E5M2,
+        shape=tuple(x.shape),
+        buffer_ptr=x.data_ptr(),
+    )
+    binding.bind_output(
+        name='Y',
+        device_type='cuda',
+        device_id=0,
+        element_type=TensorProto.BFLOAT16,
+        shape=tuple(y.shape),
+        buffer_ptr=y.data_ptr(),
+    )
+    session.run_with_iobinding(binding)
+
 API Details
 ===========
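
Note: the documented snippet relies on a ``session`` created earlier in ``api_summary.rst``. As a rough, self-contained sketch of the same flow, the following assumes a hypothetical ``model.onnx`` with one FLOAT8E5M2 input named ``X`` and one BFLOAT16 output named ``Y``, a CUDA-enabled onnxruntime build (1.20 or later, where ``element_type`` accepts onnx TensorProto types), and a PyTorch build that exposes ``torch.float8_e5m2`` and ``torch.bfloat16``; whether a given model actually runs with these types depends on the execution provider's kernel support.

.. code-block:: python

    import torch
    import onnxruntime as ort
    from onnx import TensorProto

    # Hypothetical model: one FLOAT8E5M2 input 'X', one BFLOAT16 output 'Y'.
    session = ort.InferenceSession("model.onnx", providers=["CUDAExecutionProvider"])

    # Allocate the buffers with torch on the GPU; NumPy has no dtype for these types.
    x = torch.ones([3], dtype=torch.float8_e5m2, device="cuda:0")
    y = torch.empty([3], dtype=torch.bfloat16, device="cuda:0")

    binding = session.io_binding()
    binding.bind_input(
        name="X",
        device_type="cuda",
        device_id=0,
        element_type=TensorProto.FLOAT8E5M2,  # onnx enum instead of a numpy dtype
        shape=tuple(x.shape),
        buffer_ptr=x.data_ptr(),
    )
    binding.bind_output(
        name="Y",
        device_type="cuda",
        device_id=0,
        element_type=TensorProto.BFLOAT16,
        shape=tuple(y.shape),
        buffer_ptr=y.data_ptr(),
    )
    session.run_with_iobinding(binding)

    # The result is written directly into the bound torch tensor.
    print(y)

Because both buffers stay in GPU memory, no host round trip or NumPy conversion is involved.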