[ONNXRuntimeError] : 9 : NOT_IMPLEMENTED : Could not find an implementation for MemcpyToHost(1) node with name 'Memcpy_token_167' #17837
Labels
ep:CUDA
issues related to the CUDA execution provider
Describe the issue
I am using the "nightdessert/WeCheck" model from Hugging Face. I am trying to apply ONNX optimization to the ORT-based model, but I get the error below when I use O4 or GPU-specific optimizations. I also tried the O1, O2, and O3 optimizations, but I see little performance difference between the original model and the optimized models.
Loading the O4-optimized model fails with the following error:
NotImplemented: [ONNXRuntimeError] : 9 : NOT_IMPLEMENTED : Could not find an implementation for MemcpyToHost(1) node with name 'Memcpy_token_167'
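For triage, the message above can be split into its parts (status code, status name, operator type, node name) so the offending node is easy to look up in the optimized graph. The following is a minimal sketch using only the Python standard library. The helper name `parse_ort_error` is hypothetical, the regular expressions are based solely on the message text quoted in this issue, and reading the number in parentheses as the operator version is an assumption.

```python
import re

# Matches the status text embedded in the exception message, e.g.
# "[ONNXRuntimeError] : 9 : NOT_IMPLEMENTED : Could not find ...".
_ORT_ERROR = re.compile(
    r"\[ONNXRuntimeError\]\s*:\s*(?P<code>\d+)\s*:\s*(?P<status>[A-Z_]+)\s*:\s*(?P<detail>.*)",
    re.DOTALL,
)

# Matches the "missing kernel" detail quoted in this issue.
# Assumption: the number in parentheses is the operator version.
_MISSING_KERNEL = re.compile(
    r"Could not find an implementation for (?P<op>\w+)\((?P<version>\d+)\) "
    r"node with name '(?P<name>[^']*)'"
)


def parse_ort_error(message):
    """Return a dict describing the error, or None if the text does not match."""
    match = _ORT_ERROR.search(message)
    if match is None:
        return None
    info = {
        "code": int(match["code"]),
        "status": match["status"],
        "detail": match["detail"].strip(),
    }
    kernel = _MISSING_KERNEL.search(info["detail"])
    if kernel is not None:
        info["op_type"] = kernel["op"]
        info["version"] = int(kernel["version"])
        info["node_name"] = kernel["name"]
    return info
```

Passing `str(exc)` from the caught exception to `parse_ort_error` gives the node name (`Memcpy_token_167` here), which can then be searched for in the saved O4 model.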
I have added code to reproduce the error in any environment with a CUDA GPU.
To reproduce
Urgency
No response
Platform
Linux
OS Version
20.04.5 LTS (Focal Fossa)
ONNX Runtime Installation
Released Package
ONNX Runtime Version or Commit ID
onnxruntime-gpu-1.16.0
ONNX Runtime API
Python
Architecture
X64
Execution Provider
CUDA
Execution Provider Library Version
No response