Can LLMs Guide Their Own Exploration? Gradient-Guided Reinforcement Learning for LLM Reasoning Paper • 2512.15687 • Published 10 days ago • 17
MotionEdit: Benchmarking and Learning Motion-Centric Image Editing Paper • 2512.10284 • Published 17 days ago • 25
Building a Foundational Guardrail for General Agentic Systems via Synthetic Data Paper • 2510.09781 • Published Oct 10 • 26
VOGUE: Guiding Exploration with Visual Uncertainty Improves Multimodal Reasoning Paper • 2510.01444 • Published Oct 1 • 19
CLUE: Non-parametric Verification from Experience via Hidden-State Clustering Paper • 2510.01591 • Published Oct 2 • 27
Evolving Language Models without Labels: Majority Drives Selection, Novelty Promotes Variation Paper • 2509.15194 • Published Sep 18 • 33
CDE: Curiosity-Driven Exploration for Efficient Reinforcement Learning in Large Language Models Paper • 2509.09675 • Published Sep 11 • 28
Parallel-R1: Towards Parallel Thinking via Reinforcement Learning Paper • 2509.07980 • Published Sep 9 • 101
On the Trustworthiness of Generative Foundation Models: Guideline, Assessment, and Perspective Paper • 2502.14296 • Published Feb 20 • 45
Preference Leakage: A Contamination Problem in LLM-as-a-judge Paper • 2502.01534 • Published Feb 3 • 40
Interleaved Scene Graph for Interleaved Text-and-Image Generation Assessment Paper • 2411.17188 • Published Nov 26, 2024 • 20