Deep Tabular Research via Continual Experience-Driven Execution Paper • 2603.09151 • Published 14 days ago • 11
A Subgoal-driven Framework for Improving Long-Horizon LLM Agents Paper • 2603.19685 • Published 4 days ago • 14
HopChain: Multi-Hop Data Synthesis for Generalizable Vision-Language Reasoning Paper • 2603.17024 • Published 6 days ago • 96
LumosX: Relate Any Identities with Their Attributes for Personalized Video Generation Paper • 2603.20192 • Published 4 days ago • 22
TabPFN-2.5: Advancing the State of the Art in Tabular Foundation Models Paper • 2511.08667 • Published Nov 11, 2025 • 5
Nemotron-Cascade 2 Collection Post-Training LLMs with Cascade RL and Multi-Domain On-Policy Distillation • 4 items • Updated about 7 hours ago • 31
Nemotron-Cascade 2: Post-Training LLMs with Cascade RL and Multi-Domain On-Policy Distillation Paper • 2603.19220 • Published 5 days ago • 54
🌌 Cosmopedia Collection Resources for Cosmopedia dataset • 6 items • Updated May 5, 2025 • 12
BrowseComp-Plus: A More Fair and Transparent Evaluation Benchmark of Deep-Research Agent Paper • 2508.06600 • Published Aug 8, 2025 • 42
Scaling Language-Free Visual Representation Learning Paper • 2504.01017 • Published Apr 1, 2025 • 33
OLMoASR: Open Models and Data for Training Robust Speech Recognition Models Paper • 2508.20869 • Published Aug 28, 2025 • 1
World Models Can Leverage Human Videos for Dexterous Manipulation Paper • 2512.13644 • Published Dec 15, 2025 • 1
MHPO: Modulated Hazard-aware Policy Optimization for Stable Reinforcement Learning Paper • 2603.16929 • Published 10 days ago • 9
Qwen3.5 Caption [Gliese Series] Collection Expert Image Captioning System • 5 items • Updated 10 days ago • 3
MMDeepResearch-Bench: A Benchmark for Multimodal Deep Research Agents Paper • 2601.12346 • Published Jan 18 • 50
BayesianVLA: Bayesian Decomposition of Vision Language Action Models via Latent Action Queries Paper • 2601.15197 • Published Jan 21 • 55
SkillsBench: Benchmarking How Well Agent Skills Work Across Diverse Tasks Paper • 2602.12670 • Published Feb 13 • 56
HomeSafe-Bench: Evaluating Vision-Language Models on Unsafe Action Detection for Embodied Agents in Household Scenarios Paper • 2603.11975 • Published 12 days ago • 11