Can deepspeed's zero_optimization achieve model parallelism? #5710
Unanswered · ojipadeson asked this question in Q&A
My machine has 8 Tesla V100s, but when I load an LLM (Qwen2-7B-Instruct) on a single card, I get an OOM error.
I tried using DeepSpeed to partition the model parameters across the 8 cards, but it was always unsuccessful (still an OOM error). Can zero_optimization easily achieve this?
My config.json:
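For reference, a minimal sketch of the kind of setup in question — sharding the model across all 8 GPUs with ZeRO stage 3, launched via the deepspeed launcher. The model name comes from the question above; the batch sizes, learning rate, and offload settings are illustrative assumptions, not a verified working config:

```python
# Hypothetical minimal script (a sketch under assumed settings, not a
# verified setup). Launch with: deepspeed --num_gpus 8 zero3_sketch.py
import deepspeed
import torch
from transformers import AutoModelForCausalLM

# ZeRO stage 3 partitions parameters, gradients, and optimizer states
# across all ranks; the offload entries push them further out to CPU RAM.
ds_config = {
    "train_batch_size": 8,  # world size (8) x micro batch (1) x grad accum (1)
    "train_micro_batch_size_per_gpu": 1,
    "gradient_accumulation_steps": 1,
    "fp16": {"enabled": True},  # V100 has no bf16 support
    "optimizer": {"type": "AdamW", "params": {"lr": 1e-5}},
    "zero_optimization": {
        "stage": 3,
        "offload_param": {"device": "cpu", "pin_memory": True},
        "offload_optimizer": {"device": "cpu", "pin_memory": True},
        "overlap_comm": True,
        "contiguous_gradients": True,
    },
}

# Note: from_pretrained materializes the full fp16 model in host RAM on
# every rank before DeepSpeed partitions it; deepspeed.zero.Init (or the
# HF Trainer integration) avoids that if CPU memory is also tight.
model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2-7B-Instruct", torch_dtype=torch.float16
)
engine, optimizer, _, _ = deepspeed.initialize(
    model=model,
    model_parameters=model.parameters(),
    config=ds_config,
)
```

On the title question: stage 3 is the setting that actually shards the parameters themselves, so per-GPU parameter memory shrinks with the number of cards; stages 1 and 2 only shard optimizer states and gradients.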