
[TensorRT EP] support user_compute_stream in python API #20168

Merged 13 commits into main on Apr 16, 2024

Conversation

@yf711 (Contributor) commented Apr 1, 2024

Description

  • Implement the `user_compute_stream` Python API for the TensorRT EP
    • Setting this option implicitly sets `has_user_compute_stream` to `true`
  • Extend the existing TensorRT EP unit test to verify the `user_compute_stream` option
    • Verified in a local PyTorch environment, passing `torch.cuda.Stream()` into `user_compute_stream`:
```python
...
# Before inference
if torch.cuda.is_available():
    s = torch.cuda.Stream()
    option = {"user_compute_stream": str(s.cuda_stream)}
    sess.set_providers(["TensorrtExecutionProvider"], [option])
    options = sess.get_provider_options()

    assert "TensorrtExecutionProvider" in options
    assert options["TensorrtExecutionProvider"].get("user_compute_stream", "") == str(s.cuda_stream)
    assert options["TensorrtExecutionProvider"].get("has_user_compute_stream", "") == "1"
...
```
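For reference, provider options can also be supplied when the session is created rather than via `set_providers`. A minimal sketch follows; the stream handle value is a placeholder, and the commented-out model path and `onnxruntime` import are assumptions about the local environment, not part of this PR:

```python
def make_trt_provider_options(stream_handle: int) -> list:
    """Build a providers list for InferenceSession.

    Provider option values must be strings, so the raw CUDA stream
    handle (an integer, e.g. torch.cuda.Stream().cuda_stream) is
    stringified before being placed in the options dict.
    """
    return [
        ("TensorrtExecutionProvider", {"user_compute_stream": str(stream_handle)})
    ]


# Placeholder handle for illustration; in practice this would come from
# an existing CUDA stream such as torch.cuda.Stream().cuda_stream.
providers = make_trt_provider_options(140243342)

# import onnxruntime as ort  # requires onnxruntime-gpu
# sess = ort.InferenceSession("model.onnx", providers=providers)  # needs a model + GPU
```

Passing the stream at construction time avoids re-initializing providers on an existing session, which is why both paths end up with the same `user_compute_stream`/`has_user_compute_stream` pair in `get_provider_options()`.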

Motivation and Context

Align with the existing `user_compute_stream` Python implementations for the [CUDA EP](https://github.com/microsoft/onnxruntime/pull/19229) and [ROCm EP](https://github.com/microsoft/onnxruntime/pull/19619).

@yf711 yf711 changed the title [TensorRT EP] user_compute_stream [TensorRT EP] support user_compute_stream in python API Apr 11, 2024
@yf711 yf711 marked this pull request as ready for review April 11, 2024 19:55
@yf711 yf711 requested a review from jywu-msft April 12, 2024 16:52
@jywu-msft (Member) commented:

merge latest main to get Big Models pipelines to pass.

@yf711 yf711 merged commit 54f91ea into main Apr 16, 2024
90 of 94 checks passed
@yf711 yf711 deleted the yifanl/trtep_user_compute_stream branch April 16, 2024 19:49
TedThemistokleous pushed a commit to TedThemistokleous/onnxruntime that referenced this pull request May 7, 2024