QuantiPhy: A Quantitative Benchmark Evaluating Physical Reasoning Abilities of Vision-Language Models Paper • 2512.19526 • Published 4 days ago • 9
MatSpray: Fusing 2D Material World Knowledge on 3D Geometry Paper • 2512.18314 • Published 6 days ago • 7
Physics of Language Models: Part 4.1, Architecture Design and the Magic of Canon Layers Paper • 2512.17351 • Published 7 days ago • 20
MobileWorld: Benchmarking Autonomous Mobile Agents in Agent-User Interactive, and MCP-Augmented Environments Paper • 2512.19432 • Published 4 days ago • 10
Probing Scientific General Intelligence of LLMs with Scientist-Aligned Workflows Paper • 2512.16969 • Published 8 days ago • 105
LongVideoAgent: Multi-Agent Reasoning with Long Videos Paper • 2512.20618 • Published 3 days ago • 47