Commit
[TensorRT EP] Fix bug for DDS output handling for empty tensor (microsoft#19575)

When a DDS (data-dependent shape) output is an empty tensor (i.e. any of its dimensions is 0), the TRT EP performs neither cudaMemcpyAsync() nor cuda::Impl_Cast(), to avoid accidentally overwriting memory that might belong to other tensors. This PR also refactors the code to allocate only a single byte for any empty tensor.

#TODO: add unit tests to cover the DDS code paths, or do more testing with concurrent, sequential, and threaded faster-rcnn using onnx_test_runner and verify the outputs

--------

Co-authored-by: Chi Lo <[email protected]>
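
The empty-tensor guard described in this message can be sketched roughly as follows. This is a minimal host-side illustration, not the actual TRT EP code: the helper names (IsEmptyTensor, ElementCount, AllocationSize, CopyDdsOutput) are hypothetical, and std::memcpy stands in for cudaMemcpyAsync() so that the sketch compiles without CUDA.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical helper: a tensor is empty when any of its dimensions is 0.
bool IsEmptyTensor(const std::vector<int64_t>& shape) {
  for (int64_t dim : shape) {
    if (dim == 0) return true;
  }
  return false;
}

// Hypothetical helper: total number of elements (1 for a scalar shape {}).
size_t ElementCount(const std::vector<int64_t>& shape) {
  size_t count = 1;
  for (int64_t dim : shape) count *= static_cast<size_t>(dim);
  return count;
}

// Bytes to allocate for an output buffer. An empty tensor still gets a
// single placeholder byte so that its buffer pointer is valid.
size_t AllocationSize(const std::vector<int64_t>& shape, size_t element_size) {
  return IsEmptyTensor(shape) ? 1 : ElementCount(shape) * element_size;
}

// Copies the output into dst and returns the number of bytes copied.
// For an empty tensor the copy is skipped entirely, so dst is never written.
// Assumption: std::memcpy stands in for the device copy in the real code.
size_t CopyDdsOutput(void* dst, const void* src,
                     const std::vector<int64_t>& shape, size_t element_size) {
  assert(dst != nullptr && src != nullptr);
  if (IsEmptyTensor(shape)) return 0;  // no copy and no cast for empty outputs
  const size_t bytes = ElementCount(shape) * element_size;
  std::memcpy(dst, src, bytes);
  return bytes;
}
```

Checking for a zero dimension before computing any byte count keeps the skip decision independent of the element type, so the same guard would apply whether the output is copied directly or cast first.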