When to Memorize and When to Stop: Gated Recurrent Memory for Long-Context Reasoning Paper • 2602.10560 • Published Feb 11 • 30
Quantile Advantage Estimation for Entropy-Safe Reasoning Paper • 2509.22611 • Published Sep 26, 2025 • 119