MVSGaussian: fast generalizable Gaussian splatting reconstruction from multi-view stereo
Main Authors:
Other Authors:
Format: Conference or Workshop Item
Language: English
Published: 2025
Subjects:
Online Access: https://hdl.handle.net/10356/182591 http://arxiv.org/abs/2405.12218v3
Institution: Nanyang Technological University
Summary: We present MVSGaussian, a new generalizable 3D Gaussian representation approach derived from Multi-View Stereo (MVS) that can efficiently reconstruct unseen scenes. Specifically, 1) we leverage MVS to encode geometry-aware Gaussian representations and decode them into Gaussian parameters. 2) To further enhance performance, we propose a hybrid Gaussian rendering that integrates an efficient volume rendering design for novel view synthesis. 3) To support fast fine-tuning for specific scenes, we introduce a multi-view geometric consistent aggregation strategy to effectively aggregate the point clouds generated by the generalizable model, serving as the initialization for per-scene optimization. Compared with previous generalizable NeRF-based methods, which typically require minutes of fine-tuning and seconds of rendering per image, MVSGaussian achieves real-time rendering with better synthesis quality for each scene. Compared with the vanilla 3D-GS, MVSGaussian achieves better view synthesis with less training computational cost. Extensive experiments on DTU, Real Forward-facing, NeRF Synthetic, and Tanks and Temples datasets validate that MVSGaussian attains state-of-the-art performance with convincing generalizability, real-time rendering speed, and fast per-scene optimization.
---|
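The core idea in the summary — lifting MVS-estimated per-pixel depth into 3D Gaussian centers and decoding per-pixel features into Gaussian parameters — can be sketched roughly as below. This is a minimal illustration only: the feature layout, the sigmoid/exp activation heads, and the toy camera are assumptions for the sketch, not the paper's actual architecture.

```python
import numpy as np

def unproject_depth(depth, K):
    """Lift a per-pixel depth map to 3D Gaussian centers in the camera frame.
    depth: (H, W) depth map; K: (3, 3) camera intrinsics."""
    H, W = depth.shape
    u, v = np.meshgrid(np.arange(W), np.arange(H))
    pix = np.stack([u, v, np.ones_like(u)], axis=-1)   # (H, W, 3) homogeneous pixels
    rays = pix @ np.linalg.inv(K).T                    # back-project pixels to camera rays
    return rays * depth[..., None]                     # scale each ray by its depth -> 3D centers

def decode_gaussians(feat, depth, K):
    """Decode per-pixel features into Gaussian parameters.
    Hypothetical heads: sigmoid -> opacity in (0, 1), exp -> positive scales."""
    means = unproject_depth(depth, K)                  # (H, W, 3) Gaussian centers
    opacity = 1.0 / (1.0 + np.exp(-feat[..., 0]))      # (H, W) per-Gaussian opacity
    scales = np.exp(feat[..., 1:4])                    # (H, W, 3) anisotropic scales
    return means, opacity, scales

# Toy example: 4x4 image, focal length 2, principal point at the image center.
K = np.array([[2.0, 0.0, 2.0],
              [0.0, 2.0, 2.0],
              [0.0, 0.0, 1.0]])
depth = np.full((4, 4), 3.0)      # constant depth plane
feat = np.zeros((4, 4, 4))        # zero features -> neutral activations
means, opacity, scales = decode_gaussians(feat, depth, K)
```

With zero features, the neutral heads give opacity 0.5 and unit scales everywhere; the corner pixel (0, 0) unprojects to (-3, -3, 3) at depth 3. A per-scene optimizer would then refine these initial parameters, which mirrors why a good MVS-derived initialization shortens fine-tuning.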