yoonkyumng

yoonkg

upvoted 5 papers 3 months ago

Diversity-Incentivized Exploration for Versatile Reasoning

Paper • 2509.26209 • Published Sep 30 • 16

It Takes Two: Your GRPO Is Secretly DPO

Paper • 2510.00977 • Published Oct 1 • 31

CE-GPPO: Controlling Entropy via Gradient-Preserving Clipping Policy Optimization in Reinforcement Learning

Paper • 2509.20712 • Published Sep 25 • 19

Video models are zero-shot learners and reasoners

Paper • 2509.20328 • Published Sep 24 • 98

ExGRPO: Learning to Reason from Experience

Paper • 2510.02245 • Published Oct 2 • 80