Repositories
Showing 10 of 98 repositories
- sail-sg/LifelongSafetyAlignment (Public)
- sail-sg/oat (Public)
  🌾 OAT: A research-friendly framework for LLM online alignment, including reinforcement learning, preference learning, etc.
- sail-sg/feedback-conditional-policy (Public)
  Code for "Language Models Can Learn from Verbal Feedback Without Scalar Rewards"
- sail-sg/Precision-RL-verl (Public, forked from volcengine/verl)
  Defeating the Training-Inference Mismatch via FP16