[BUG] Inference error: replacing the LLM part with quantized Llama-3.1 70B causes an error (RuntimeError: shape mismatch: value tensor of shape [1024] cannot be broadcast to indexing result of shape [1025]) #643
Comments
Hello, this approach will not work because we have aligned the language model with the image model. Even if you manage to get the code to run, the results will be very poor. In addition, the model dimensions of llama3.0-7b and llama3.1-70b are inconsistent.
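To make the size mismatch between the two language models concrete, here is a minimal sketch of a compatibility check. The dictionary keys and the function name `check_projector_matches_llm` are hypothetical and are not part of the MiniCPM code base. The hidden sizes are the published values for Llama 3 8B (4096) and Llama 3.1 70B (8192). The sketch assumes the vision side emits features whose width was fixed when it was aligned with the 8B model.

```python
# Hypothetical lookup table for illustration only. The hidden sizes are the
# published values for Llama 3 8B and Llama 3.1 70B.
LLM_HIDDEN_SIZE = {
    "llama-3-8b": 4096,
    "llama-3.1-70b": 8192,
}


def check_projector_matches_llm(projector_out_dim: int, llm_name: str) -> None:
    """Raise ValueError if vision features do not fit the LLM embedding width."""
    hidden = LLM_HIDDEN_SIZE[llm_name]
    if projector_out_dim != hidden:
        raise ValueError(
            f"vision features have width {projector_out_dim}, "
            f"but {llm_name} expects embeddings of width {hidden}"
        )


# Assumption: the vision projector was aligned with an 8B model (width 4096).
check_projector_matches_llm(4096, "llama-3-8b")  # passes silently

try:
    check_projector_matches_llm(4096, "llama-3.1-70b")
except ValueError as exc:
    print(exc)
```

A check like this only detects the width mismatch. It says nothing about whether the features are semantically aligned with the new language model, which is the larger problem described in this thread.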
@LDLINGLINGLING Thanks for the response. So it is not possible to connect the MiniCPM vision part to, for example, the 70B quantized model? Previously I connected it to an 8B LLM that had been fine-tuned for the Kazakh language, and the model worked better on that language. To connect a quantized or plain 70B model, do we need to adjust the internal code? Maybe you can give me a better intuition for how to connect a 70B model with MiniCPM.
I think this is possible, but first, the code needs to be changed. Second, you may need a lot of data to retrain the alignment between images and text.
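As a rough illustration of the kind of code change involved, the sketch below maps vision tokens from one width to another with a linear adapter, written in pure Python so it runs without dependencies. The names `make_linear_adapter` and `apply_adapter` are hypothetical, the toy widths 4 and 8 stand in for the real hidden sizes, and the weights are random. A real adapter would be a trained layer inside the model, and it is the retraining on image-text data that makes its output meaningful to the new LLM.

```python
import random


def make_linear_adapter(in_dim, out_dim, seed=0):
    """Create an untrained linear layer as (weight, bias) with random weights."""
    rng = random.Random(seed)
    weight = [[rng.uniform(-0.1, 0.1) for _ in range(in_dim)] for _ in range(out_dim)]
    bias = [0.0] * out_dim
    return weight, bias


def apply_adapter(features, weight, bias):
    """Project each vision token from in_dim to out_dim."""
    in_dim = len(weight[0])
    projected = []
    for token in features:
        if len(token) != in_dim:
            raise ValueError(f"expected token width {in_dim}, got {len(token)}")
        projected.append(
            [sum(w * x for w, x in zip(row, token)) + b for row, b in zip(weight, bias)]
        )
    return projected


# Toy sizes: 4 stands in for the old LLM width, 8 for the new one.
vision_tokens = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]]
weight, bias = make_linear_adapter(in_dim=4, out_dim=8)
llm_ready = apply_adapter(vision_tokens, weight, bias)
print(len(llm_ready), len(llm_ready[0]))
```

The adapter fixes the shapes only. Until it is trained, its outputs are noise from the language model's point of view.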
Thanks.
This may make sense, but it depends on how much data you have for coordinates. If the amount of data is not particularly large, I would recommend that you work directly with our model.
We will check; if that does not work, we will try another approach. Data might be the key. Maybe you can give me some directions on how to connect the MiniCPM vision encoder to a different model and what I would need to change. Sorry, I don't understand at all where to start. Thanks for all your suggestions and fast responses.
Is there an existing issue / discussion for this?
Is there an existing answer for this in FAQ?
Current Behavior
Approach.
Maybe someone knows what is wrong here. Thanks in advance.
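For readers unfamiliar with the error in the title: it means that 1024 values were written into a region selected by indexing that has 1025 slots, and such an assignment only succeeds when the value shape can be broadcast to the target shape. The helper below is a simplified pure-Python sketch of that rule, not PyTorch's implementation, and the name `can_broadcast` is hypothetical. It does not identify which line of the modified model produces the off-by-one count; the thread does not establish that.

```python
def can_broadcast(value_shape, target_shape):
    """Return True if value_shape can be broadcast to target_shape.

    Simplification: a value with more dimensions than the target is rejected.
    """
    if len(value_shape) > len(target_shape):
        return False
    # Compare trailing dimensions: each value dim must be 1 or equal the target dim.
    for v, t in zip(reversed(value_shape), reversed(target_shape)):
        if v != 1 and v != t:
            return False
    return True


print(can_broadcast((1024,), (1025,)))  # the shapes from the error message
print(can_broadcast((1025,), (1025,)))  # matching counts
print(can_broadcast((1,), (1025,)))     # a single value repeated everywhere
```

In practice, an error of this kind suggests comparing the number of selected positions with the number of values being written at the failing assignment.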
Code
Error
Expected Behavior
It worked with the base MiniCPM model but not with a different LLM part.
A response from the model with a different LLM part.
Steps To Reproduce
No response
Environment
Anything else?
No response