It works well with the code below, but that is still not sufficient: it uses around 16 GB of VRAM.
I want to lower the requirement further if possible.
How can I achieve that?
Model path: THUDM/cogvlm2-llama3-chat-19B
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

MODEL_PATH = "THUDM/cogvlm2-llama3-chat-19B"
TORCH_TYPE = torch.bfloat16  # or torch.float16 on GPUs without bfloat16 support

# Load the model with 4-bit bitsandbytes quantization to reduce VRAM usage
model = AutoModelForCausalLM.from_pretrained(
    MODEL_PATH,
    torch_dtype=TORCH_TYPE,
    trust_remote_code=True,
    quantization_config=BitsAndBytesConfig(load_in_4bit=True),
    low_cpu_mem_usage=True,
).eval()
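For context, BitsAndBytesConfig exposes a few options that typically shrink the 4-bit footprint a little further (NF4 quantization, double quantization, a 16-bit compute dtype), and device_map can offload layers to CPU at the cost of speed. A minimal sketch of that direction, not verified against cogvlm2-llama3-chat-19B (custom trust_remote_code models do not always split cleanly across devices):

import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

MODEL_PATH = "THUDM/cogvlm2-llama3-chat-19B"

# NF4 plus double quantization usually trims the 4-bit memory footprint further
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    MODEL_PATH,
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
    quantization_config=bnb_config,
    low_cpu_mem_usage=True,
    device_map="auto",  # lets accelerate offload layers to CPU if the GPU runs out of memory
).eval()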
Who can help?
text models: @ArthurZucker
vision models: @amyeroberts, @qubvel
pipelines: @Rocketknight1
quantization (bitsandbytes, autogpt): @SunMarc @MekkCyber