SpatialTree: How Spatial Abilities Branch Out in MLLMs • Paper • 2512.20617 • Published 3 days ago • 41
DeepSeek-V3.2: Pushing the Frontier of Open Large Language Models • Paper • 2512.02556 • Published 25 days ago • 233
ThinkAct: Vision-Language-Action Reasoning via Reinforced Visual Latent Planning • Paper • 2507.16815 • Published Jul 22 • 40
PhysBrain: Human Egocentric Data as a Bridge from Vision Language Models to Physical Intelligence • Paper • 2512.16793 • Published 8 days ago • 71
TrajSelector: Harnessing Latent Representations for Efficient and Effective Best-of-N in Large Reasoning Model • Paper • 2510.16449 • Published Oct 18 • 34
Euclid's Gift: Enhancing Spatial Perception and Reasoning in Vision-Language Models via Geometric Surrogate Tasks • Paper • 2509.24473 • Published Sep 29 • 17
VLA-Adapter: An Effective Paradigm for Tiny-Scale Vision-Language-Action Model • Paper • 2509.09372 • Published Sep 11 • 242
SimpleVLA-RL: Scaling VLA Training via Reinforcement Learning • Paper • 2509.09674 • Published Sep 11 • 80
Diving into Underwater: Segment Anything Model Guided Underwater Salient Instance Segmentation and A Large-scale Dataset • Paper • 2406.06039 • Published Jun 10, 2024 • 1
Taming SAM for Underwater Instance Segmentation and Beyond • Paper • 2505.15581 • Published May 21 • 1
HunyuanWorld 1.0: Generating Immersive, Explorable, and Interactive 3D Worlds from Words or Pixels • Paper • 2507.21809 • Published Jul 29 • 136
MIRepNet: A Pipeline and Foundation Model for EEG-Based Motor Imagery Classification • Paper • 2507.20254 • Published Jul 27 • 20