rbao2018

BaoRong rbao2018

3 followers · 6 following

Shanghai, China

Pinned

self_ref_feedback Public

Code for Improving Large Language Model Alignment from Self-Reference Model Feedback

Python · 6 stars

OpenRLHF/OpenRLHF Public

An Easy-to-use, Scalable and High-performance RLHF Framework (70B+ PPO Full Tuning & Iterative DPO & LoRA & RingAttention)

Python · 2.4k stars · 232 forks