-
Multimodal PPO may currently have some bugs; you can use DPO or KTO instead.
-
I got the following error when running multimodal PPO on llava-1.5-7b.
It's very strange. I checked that the dataset returned by
dataset_module = get_dataset(template, model_args, data_args, training_args, stage="ppo", **tokenizer_module)
contains labels as expected, but the features printed at the start of MultiModalDataCollatorForSeq2Seq have no labels.
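
One way to narrow down where the labels disappear is to wrap whatever collator the trainer receives and report which expected keys are absent from each example before collation. The following is only a minimal sketch: `missing_keys` and `with_key_check` are hypothetical helper names, not part of LLaMA-Factory or transformers, and the default key list is an assumption about what the collator needs.

```python
from typing import Any, Callable, Dict, List, Sequence


def missing_keys(
    features: Sequence[Dict[str, Any]],
    required: Sequence[str] = ("input_ids", "labels"),  # assumed key names
) -> List[List[str]]:
    """For each example dict, list the required keys it does not contain."""
    return [[key for key in required if key not in example] for example in features]


def with_key_check(
    collate_fn: Callable[[List[Dict[str, Any]]], Any],
    required: Sequence[str] = ("input_ids", "labels"),
) -> Callable[[List[Dict[str, Any]]], Any]:
    """Hypothetical debugging wrapper: report missing keys, then collate as usual."""

    def wrapped(features: List[Dict[str, Any]]) -> Any:
        report = missing_keys(features, required)
        if any(report):
            # Each inner list holds the keys missing from the example at that index.
            print(f"collator input is missing keys per example: {report}")
        return collate_fn(features)

    return wrapped
```

If the report shows `labels` missing for every example while the dataset itself still has them, the column is probably being dropped somewhere between the dataset and the collator (for example in the dataloader setup) rather than in preprocessing, though that would need to be confirmed against the actual trainer code.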