2

Yaqi Duan

duanyq

AI & ML interests

None yet

Recent Activity

upvoted a paper about 1 month ago

Don't Waste Mistakes: Leveraging Negative RL-Groups via Confidence Reweighting

Paper • 2510.08696 • Published Oct 9, 2025 • 14

upvoted a paper 11 months ago

PILAF: Optimal Human Preference Sampling for Reward Modeling

Paper • 2502.04270 • Published Feb 6, 2025 • 12

authored a paper 11 months ago

PILAF: Optimal Human Preference Sampling for Reward Modeling

Paper • 2502.04270 • Published Feb 6, 2025 • 12

Organizations

None yet