
Directional Preference Alignment

Activity Feed


weqweasdas 
authored 4 papers over 1 year ago

RLHF Workflow: From Reward Modeling to Online RLHF

Paper • 2405.07863 • Published May 13, 2024 • 71

Iterative Preference Learning from Human Feedback: Bridging Theory and Practice for RLHF under KL-Constraint

Paper • 2312.11456 • Published Dec 18, 2023 • 1

LMFlow: An Extensible Toolkit for Finetuning and Inference of Large Foundation Models

Paper • 2306.12420 • Published Jun 21, 2023 • 2

RAFT: Reward rAnked FineTuning for Generative Foundation Model Alignment

Paper • 2304.06767 • Published Apr 13, 2023 • 2
Haoxiang-Wang 
authored a paper about 2 years ago

SAM-CLIP: Merging Vision Foundation Models towards Semantic and Spatial Understanding

Paper • 2310.15308 • Published Oct 23, 2023 • 23