For large language models, it is common practice to run the model in a loop, with the KV cache growing longer after each iteration. However, doing so raised a "Shape mismatch attempting to re-use buffer" error:
2024-06-30 01:50:54.266368919 [E:onnxruntime:, sequential_executor.cc:516 ExecuteKernel] Non-zero status code returned while running Concat node. Name:'/block/attn/Concat_9' Status Message: /onnxruntime_src/onnxruntime/core/framework/op_kernel.cc:83 virtual OrtValue* onnxruntime::OpKernelContext::OutputMLValue(int, const onnxruntime::TensorShape&) status.IsOK() was false. Shape mismatch attempting to re-use buffer. {1,1,16,128} != {1,2,16,128}. Validate usage of dim_value (values should be > 0) and dim_param (all values with the same string should equate to the same size) in shapes in the model.
The problem seems to be with past_key_values: the runtime appears to allocate a buffer with a static shape, and on the next iteration it tries to reuse that buffer, finds that the shape no longer matches, and raises the error.
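The error message asks to validate dim_value and dim_param usage, so a small sketch of that rule may help when inspecting the model. The code below is a pure-Python illustration, not ONNX Runtime code: the helper `bind_dims` and the names "seq", "past_seq", and "total_seq" are hypothetical, and reading the two shapes in the log as the past cache and the Concat output is an assumption. It only shows why reusing one symbolic name for both the past length and the grown length would conflict, which may or may not be what is happening in this model.

```python
# Illustrative only: a pure-Python model of the dim_value / dim_param rule
# quoted in the error message. It is not ONNX Runtime code.

def bind_dims(declared, actual, bindings=None):
    """Match an actual shape against a declared shape.

    Integers in `declared` act like dim_value (must match exactly);
    strings act like dim_param (the same name must always map to one size).
    Returns the updated name -> size bindings, or raises ValueError.
    """
    bindings = dict(bindings or {})
    if len(declared) != len(actual):
        raise ValueError(f"rank mismatch: {declared} vs {actual}")
    for dim, size in zip(declared, actual):
        if isinstance(dim, int):
            if dim != size:
                raise ValueError(f"dim_value {dim} != {size}")
        elif bindings.setdefault(dim, size) != size:
            raise ValueError(
                f"dim_param {dim!r} already bound to {bindings[dim]}, got {size}"
            )
    return bindings


# Shapes taken from the log: the cache grows along axis 1.
past_shape = (1, 1, 16, 128)      # assumed: past key fed into the step
present_shape = (1, 2, 16, 128)   # assumed: Concat output after one more token

# Hypothetical declarations where both tensors share the name "seq".
shared = bind_dims([1, "seq", 16, 128], past_shape)
try:
    bind_dims([1, "seq", 16, 128], present_shape, shared)
    conflict = None
except ValueError as exc:
    conflict = str(exc)

# Hypothetical declarations with distinct names for the two lengths.
distinct = bind_dims([1, "past_seq", 16, 128], past_shape)
distinct = bind_dims([1, "total_seq", 16, 128], present_shape, distinct)

print(conflict)
print(distinct)
```

If the exported model does declare the same dim_param on the past input and on the concatenated output, giving the two lengths different names (or leaving the output dimension unnamed) would be one thing to try.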
Describe the issue
When a large language model is run in a loop and the KV cache grows longer after each iteration, "Shape mismatch attempting to re-use buffer" is raised.
To reproduce
The model is provided.
Urgency
No response
Platform
Linux
OS Version
Ubuntu 20.04
ONNX Runtime Installation
Released Package
ONNX Runtime Version or Commit ID
1.18.0
ONNX Runtime API
Python
Architecture
X64
Execution Provider
Default CPU
Execution Provider Library Version
No response