arxiv:2603.23488

One View Is Enough! Monocular Training for In-the-Wild Novel View Generation

Published on Mar 24

Submitted by Nicolas Dufour on Mar 25

Kyutai


Authors:

Adrien Ramanana Rahary ,

Abstract

OVIE enables monocular novel-view synthesis from single images using pseudo-target views generated via geometric scaffolding, achieving superior performance with faster inference compared to previous methods.

AI-generated summary

Monocular novel-view synthesis has long required multi-view image pairs for supervision, limiting training data scale and diversity. We argue it is not necessary: one view is enough. We present OVIE, trained entirely on unpaired internet images. We leverage a monocular depth estimator as a geometric scaffold at training time: we lift a source image into 3D, apply a sampled camera transformation, and project to obtain a pseudo-target view. To handle disocclusions, we introduce a masked training formulation that restricts geometric, perceptual, and textural losses to valid regions, enabling training on 30 million uncurated images. At inference, OVIE is geometry-free, requiring no depth estimator or 3D representation. Trained exclusively on in-the-wild images, OVIE outperforms prior methods in a zero-shot setting, while being 600x faster than the second-best baseline. Code and models are publicly available at https://github.com/AdrienRR/ovie.
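The training-time scaffold described in the abstract (lift with depth, apply a camera transformation, project, then restrict the loss to valid regions) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the pinhole intrinsics `K`, the nearest-pixel splatting with a z-buffer, the function names, and the plain L1 loss standing in for the paper's geometric, perceptual, and textural losses are all assumptions made for illustration.

```python
import numpy as np

def warp_to_pseudo_target(image, depth, K, R, t):
    """Forward-warp `image` into a new camera using per-pixel depth.

    Hypothetical helper, assuming a pinhole camera.
    image: (H, W, 3) floats, depth: (H, W) positive floats,
    K: (3, 3) intrinsics, R: (3, 3) rotation, t: (3,) translation that map
    source-camera coordinates to target-camera coordinates.
    Returns the warped image and a boolean mask of pixels that received a value.
    """
    H, W = depth.shape
    v, u = np.mgrid[0:H, 0:W]  # v = row index, u = column index
    pix = np.stack([u, v, np.ones_like(u)], axis=-1).reshape(-1, 3).astype(np.float64)

    # 1) Lift: back-project each pixel to a 3D point in the source camera frame.
    pts = (pix @ np.linalg.inv(K).T) * depth.reshape(-1, 1)
    # 2) Transform: move the points into the target camera frame.
    pts_t = pts @ R.T + t
    # 3) Project: apply the intrinsics and divide by depth.
    z = pts_t[:, 2]
    proj = pts_t @ K.T
    in_front = z > 1e-6
    safe_z = np.where(in_front, z, 1.0)  # avoid dividing by zero or negative depth
    ut = np.rint(proj[:, 0] / safe_z).astype(int)
    vt = np.rint(proj[:, 1] / safe_z).astype(int)
    ok = in_front & (ut >= 0) & (ut < W) & (vt >= 0) & (vt < H)
    flat = vt[ok] * W + ut[ok]

    # Z-buffer: when several points land on one pixel, keep the nearest.
    zbuf = np.full(H * W, np.inf)
    np.minimum.at(zbuf, flat, z[ok])
    nearest = z[ok] == zbuf[flat]

    target = np.zeros((H * W, 3), dtype=image.dtype)
    target[flat[nearest]] = image.reshape(-1, 3)[ok][nearest]
    # Pixels never written to are disocclusions (holes) and stay masked out.
    mask = np.isfinite(zbuf).reshape(H, W)
    return target.reshape(H, W, 3), mask

def masked_l1(pred, target, mask):
    """Mean absolute error over valid pixels only (stand-in for the paper's losses)."""
    m = mask[..., None].astype(pred.dtype)
    denom = max(m.sum() * pred.shape[-1], 1.0)
    return float((np.abs(pred - target) * m).sum() / denom)
```

In this sketch, a model's prediction for the sampled camera would be compared against `target` only where `mask` is true, so the holes left by disocclusion contribute nothing to the loss. The depth map is needed only to build the pseudo-target, which is consistent with the abstract's statement that inference requires no depth estimator.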


Community

nicolas-dufour

Paper submitter about 6 hours ago

OVIE aims to learn camera-based novel-view generation from monocular images alone.

