Is your feature request related to a problem? Please describe.
Many classes in Aligner contain similar implementations that are duplicated. For example, the log prob calculation in DPO and the get log prob logic in PPO do largely the same work. Instead, we should create a common class or a common utility function to avoid this duplication.
In complicated cases such as TRT-LLM this is even more important, since there are various edge cases that should be handled centrally.
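To make the proposal concrete, here is a rough sketch of what a shared helper could look like. It is framework-free Python for illustration only; the names `token_log_probs`, `sequence_log_prob`, `dpo_log_prob`, and `ppo_log_prob` are hypothetical and are not existing Aligner APIs. A real version would presumably operate on tensors and deal with parallelism and the TRT-LLM edge cases in the same central place.

```python
import math
from typing import List, Optional, Sequence


def token_log_probs(logits: Sequence[Sequence[float]],
                    token_ids: Sequence[int]) -> List[float]:
    """Log-probability of each chosen token under a softmax over its logits row.

    Hypothetical shared helper: one row of logits per sequence position.
    """
    if len(logits) != len(token_ids):
        raise ValueError("logits and token_ids must have the same length")
    result = []
    for row, token in zip(logits, token_ids):
        # Numerically stable log-sum-exp: subtract the row max before exp.
        row_max = max(row)
        log_sum_exp = row_max + math.log(sum(math.exp(x - row_max) for x in row))
        result.append(row[token] - log_sum_exp)
    return result


def sequence_log_prob(logits: Sequence[Sequence[float]],
                      token_ids: Sequence[int],
                      mask: Optional[Sequence[int]] = None) -> float:
    """Sum of per-token log-probs over positions where mask is non-zero."""
    per_token = token_log_probs(logits, token_ids)
    if mask is None:
        mask = [1] * len(per_token)
    if len(mask) != len(per_token):
        raise ValueError("mask must have one entry per token")
    return sum(lp for lp, keep in zip(per_token, mask) if keep)


# Hypothetical call sites: both algorithms delegate to the same helpers
# instead of each carrying its own copy of the calculation.
def dpo_log_prob(logits, token_ids, response_mask):
    return sequence_log_prob(logits, token_ids, response_mask)


def ppo_log_prob(logits, token_ids):
    return token_log_probs(logits, token_ids)
```

With something like this, a fix for an edge case (padding, masking, numerical stability) lands in one function and every algorithm picks it up.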