Collections

Collections including paper arxiv:2510.15869

MASS: Motion-Aware Spatial-Temporal Grounding for Physics Reasoning and Comprehension in Vision-Language Models

Paper • 2511.18373 • Published Nov 23, 2025 • 5
Multi-Agent Deep Research: Training Multi-Agent Systems with M-GRPO

Paper • 2511.13288 • Published Nov 17, 2025 • 17
Chain-of-Visual-Thought: Teaching VLMs to See and Think Better with Continuous Visual Tokens

Paper • 2511.19418 • Published Nov 24, 2025 • 28
SAM 3: Segment Anything with Concepts

Paper • 2511.16719 • Published Nov 20, 2025 • 125

VLA-Adapter: An Effective Paradigm for Tiny-Scale Vision-Language-Action Model

Paper • 2509.09372 • Published Sep 11, 2025 • 243
Skyfall-GS: Synthesizing Immersive 3D Urban Scenes from Satellite Imagery

Paper • 2510.15869 • Published Oct 17, 2025 • 48

about 19 hours ago

LinFusion: 1 GPU, 1 Minute, 16K Image

Paper • 2409.02097 • Published Sep 3, 2024 • 34
Phidias: A Generative Model for Creating 3D Content from Text, Image, and 3D Conditions with Reference-Augmented Diffusion

Paper • 2409.11406 • Published Sep 17, 2024 • 27
Diffusion Models Are Real-Time Game Engines

Paper • 2408.14837 • Published Aug 27, 2024 • 126
Segment Anything with Multiple Modalities

Paper • 2408.09085 • Published Aug 17, 2024 • 22

Skyfall-GS: Synthesizing Immersive 3D Urban Scenes from Satellite Imagery

Paper • 2510.15869 • Published Oct 17, 2025 • 48

3D Models & Modeling

Towards Scalable and Consistent 3D Editing

Paper • 2510.02994 • Published Oct 3, 2025 • 5
UP2You: Fast Reconstruction of Yourself from Unconstrained Photo Collections

Paper • 2509.24817 • Published Sep 29, 2025 • 8
NANO3D: A Training-Free Approach for Efficient 3D Editing Without Masks

Paper • 2510.15019 • Published Oct 16, 2025 • 63
Skyfall-GS: Synthesizing Immersive 3D Urban Scenes from Satellite Imagery

Paper • 2510.15869 • Published Oct 17, 2025 • 48
