MoDA: modeling deformable 3D objects from casual videos
In this paper, we focus on the challenges of modeling deformable 3D objects from casual videos. With the popularity of NeRF, many works extend it to dynamic scenes with a canonical NeRF and a deformation model that achieves 3D point transformation between the observation space and the canonical space. Recent works rely on linear blend skinning (LBS) to achieve the canonical-observation transformation. However, the linearly weighted combination of rigid transformation matrices is not guaranteed to be rigid; in practice, unexpected scale and shear factors often appear, and using LBS as the deformation model can lead to skin-collapsing artifacts for bending or twisting motions. To solve this problem, we propose neural dual quaternion blend skinning (NeuDBS) to achieve 3D point deformation, which can perform rigid transformation without skin-collapsing artifacts. To register 2D pixels across different frames, we establish a correspondence between canonical feature embeddings that encode 3D points within the canonical space and 2D image features by solving an optimal transport problem. Besides, we introduce a texture filtering approach for texture rendering that effectively minimizes the impact of noisy colors outside target deformable objects.
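The LBS failure mode the abstract describes is easy to verify numerically: averaging rigid transformation matrices can introduce scale and shear, while averaging unit dual quaternions and renormalizing (the blending underlying dual quaternion blend skinning) always yields a rigid transform. The following is a minimal sketch under assumed conventions ((w, x, y, z) quaternion order, hand-rolled helpers), not the paper's NeuDBS implementation:

```python
import numpy as np

def quat_mul(q, r):
    """Hamilton product of two quaternions in (w, x, y, z) order."""
    w1, x1, y1, z1 = q
    w2, x2, y2, z2 = r
    return np.array([
        w1*w2 - x1*x2 - y1*y2 - z1*z2,
        w1*x2 + x1*w2 + y1*z2 - z1*y2,
        w1*y2 - x1*z2 + y1*w2 + z1*x2,
        w1*z2 + x1*y2 - y1*x2 + z1*w2,
    ])

def to_dual_quat(rot_q, t):
    """Rigid transform (unit rotation quaternion, translation) -> dual quaternion."""
    real = np.asarray(rot_q, dtype=float)
    dual = 0.5 * quat_mul(np.array([0.0, *t]), real)  # encodes the translation
    return real, dual

def dqb(weights, dqs):
    """Dual quaternion blending: linear blend followed by renormalization,
    so the result is again a unit dual quaternion, i.e. a rigid transform."""
    real = sum(w * dq[0] for w, dq in zip(weights, dqs))
    dual = sum(w * dq[1] for w, dq in zip(weights, dqs))
    norm = np.linalg.norm(real)
    return real / norm, dual / norm

# Blend the identity with a 180-degree rotation about z, weights 0.5/0.5.
identity = to_dual_quat([1.0, 0.0, 0.0, 0.0], np.zeros(3))
half_turn = to_dual_quat([0.0, 0.0, 0.0, 1.0], np.zeros(3))  # 180 deg about z
real, dual = dqb([0.5, 0.5], [identity, half_turn])
print(real)  # ~[0.707, 0, 0, 0.707]: a rigid 90-degree rotation about z

# The LBS equivalent averages the rotation matrices instead:
R_identity = np.eye(3)
R_half_turn = np.diag([-1.0, -1.0, 1.0])     # 180 deg about z
print(0.5 * R_identity + 0.5 * R_half_turn)  # diag(0, 0, 1): x and y collapse
```

In this toy case the dual quaternion blend stays rigid, whereas the matrix average degenerates to diag(0, 0, 1) and flattens points onto the z-axis, which is exactly the skin-collapsing artifact described above.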
Main Authors: Song, Chaoyue; Wei, Jiacheng; Chen, Tianyi; Chen, Yiwen; Foo, Chuan-Sheng; Liu, Fayao; Lin, Guosheng
Other Authors: College of Computing and Data Science; Institute for Infocomm Research, A*STAR
Format: Article
Language: English
Published: 2025
Subjects: Computer and Information Science; 3D reconstruction from videos; Deformable neural radiance fields
Online Access: https://hdl.handle.net/10356/182305
Institution: Nanyang Technological University
id: sg-ntu-dr.10356-182305
record_format: dspace
spelling: sg-ntu-dr.10356-182305 (2025-01-21T02:45:24Z)
Title: MoDA: modeling deformable 3D objects from casual videos
Authors: Song, Chaoyue; Wei, Jiacheng; Chen, Tianyi; Chen, Yiwen; Foo, Chuan-Sheng; Liu, Fayao; Lin, Guosheng
Affiliations: College of Computing and Data Science; Institute for Infocomm Research, A*STAR
Subjects: Computer and Information Science; 3D reconstruction from videos; Deformable neural radiance fields
Funding: This research is supported by the Agency for Science, Technology and Research (A*STAR) under its MTC Programmatic Funds (Grant No. M23L7b0021).
Dates: accessioned 2025-01-21T02:45:24Z; available 2025-01-21T02:45:24Z; issued 2024
Type: Journal Article
Citation: Song, C., Wei, J., Chen, T., Chen, Y., Foo, C., Liu, F. & Lin, G. (2024). MoDA: modeling deformable 3D objects from casual videos. International Journal of Computer Vision. https://dx.doi.org/10.1007/s11263-024-02310-5
ISSN: 0920-5691
Handle: https://hdl.handle.net/10356/182305
DOI: 10.1007/s11263-024-02310-5
Scopus EID: 2-s2.0-85211923417
Language: en
Grant: M23L7b0021
Journal: International Journal of Computer Vision
Rights: © 2024 The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature. All rights reserved.
institution: Nanyang Technological University
building: NTU Library
continent: Asia
country: Singapore
content_provider: NTU Library
collection: DR-NTU
language: English
topic: Computer and Information Science; 3D reconstruction from videos; Deformable neural radiance fields
author2: College of Computing and Data Science
format: Article
author: Song, Chaoyue; Wei, Jiacheng; Chen, Tianyi; Chen, Yiwen; Foo, Chuan-Sheng; Liu, Fayao; Lin, Guosheng
author_sort: Song, Chaoyue
title: MoDA: modeling deformable 3D objects from casual videos
title_sort: moda: modeling deformable 3d objects from casual videos
publishDate: 2025
url: https://hdl.handle.net/10356/182305
_version_: 1823108728604327936