Meta-Llama-3-70B-Instruct #470
Hi, does this project support accelerating Meta-Llama-3-70B-Instruct? I followed your method to accelerate Meta-Llama-3-70B-Instruct, but every reply consists only of backslashes.
Comments
Are you using the int4 model? int4 precision doesn't seem to be enough for this model; you could try int4g (int4 grouped quantization).
Thanks, that solved it. The model works after switching to int4 grouped quantization.
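For anyone hitting the same backslash-only output, a minimal conversion sketch is below. It assumes the `llm.from_hf` / `save` interface from the fastllm README; the dtype string "int4g" for grouped int4 quantization and the Hugging Face model id are assumptions, so check the project documentation for the exact names.

```python
# Minimal sketch (assumptions noted): convert the HF checkpoint to a fastllm
# .flm file using int4 grouped quantization instead of plain int4.
from transformers import AutoModelForCausalLM, AutoTokenizer
from fastllm_pytools import llm

path = "meta-llama/Meta-Llama-3-70B-Instruct"  # assumed HF model id
tokenizer = AutoTokenizer.from_pretrained(path, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(path, trust_remote_code=True)

# "int4g" (grouped int4) is assumed to be the dtype name for the quantization
# suggested above; verify it against the fastllm documentation.
flm_model = llm.from_hf(model, tokenizer, dtype="int4g")
flm_model.save("fastllm_int4g_70B.flm")  # file name taken from the command used in this thread
```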
One more question: I run the model with ./main -p fastllm_int4g_70B.flm, and after asking a single question it keeps replying endlessly unless I terminate the program with Ctrl+C. How can I ask questions continuously in an interactive session?
When running this model you probably need to specify --eos_token "<|eot_id|>", because the eos_token defined inside the model is not that token (the official code specifies it the same way).
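For reference, combining the two pieces of information above, the full invocation would look something like ./main -p fastllm_int4g_70B.flm --eos_token "<|eot_id|>" (only the -p and --eos_token options quoted in this thread are used; any other options you pass stay unchanged).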
Thanks, the problem is solved after adding that option. Thank you very much!