MoDA: modeling deformable 3D objects from casual videos

In this paper, we focus on the challenges of modeling deformable 3D objects from casual videos. With the popularity of NeRF, many works extend it to dynamic scenes with a canonical NeRF and a deformation model that transforms 3D points between the observation space and the canonical space. Recent works rely on linear blend skinning (LBS) to achieve this canonical-observation transformation. However, the linearly weighted combination of rigid transformation matrices is not guaranteed to be rigid; in fact, unexpected scale and shear factors often appear, and in practice using LBS as the deformation model often leads to skin-collapsing artifacts for bending or twisting motions. To solve this problem, we propose neural dual quaternion blend skinning (NeuDBS) for 3D point deformation, which performs rigid transformations without skin-collapsing artifacts. To register 2D pixels across different frames, we establish a correspondence between canonical feature embeddings, which encode 3D points within the canonical space, and 2D image features by solving an optimal transport problem. In addition, we introduce a texture filtering approach for texture rendering that effectively minimizes the impact of noisy colors outside the target deformable objects.
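
The abstract's contrast between LBS and dual quaternion blending can be made concrete with a small numeric example. The sketch below is a minimal toy comparison of the two standard blending schemes, not the paper's NeuDBS model; the bone transforms, weights, and helper functions are assumptions made up for illustration (Python with NumPy/SciPy).

    # Toy comparison of linear blend skinning (LBS) and dual quaternion blending (DQB)
    # for one 3D point influenced by two bones. Illustrative only; not the NeuDBS model.
    import numpy as np
    from scipy.spatial.transform import Rotation as R


    def lbs(point, rotations, translations, weights):
        """LBS: blend the rigid matrices, then transform the point.

        The blended matrix sum_k w_k [R_k | t_k] is generally NOT rigid,
        which is the source of skin-collapsing artifacts."""
        blended = np.zeros((3, 4))
        for Rk, tk, wk in zip(rotations, translations, weights):
            blended += wk * np.hstack([Rk, tk[:, None]])
        return blended[:, :3] @ point + blended[:, 3]


    def quat_mul(a, b):
        """Hamilton product of two quaternions in (w, x, y, z) order."""
        w1, x1, y1, z1 = a
        w2, x2, y2, z2 = b
        return np.array([
            w1 * w2 - x1 * x2 - y1 * y2 - z1 * z2,
            w1 * x2 + x1 * w2 + y1 * z2 - z1 * y2,
            w1 * y2 - x1 * z2 + y1 * w2 + z1 * x2,
            w1 * z2 + x1 * y2 - y1 * x2 + z1 * w2,
        ])


    def quat_conj(q):
        return np.array([q[0], -q[1], -q[2], -q[3]])


    def dqb(point, rotations, translations, weights):
        """DQB: blend unit dual quaternions, renormalize, convert back to a rigid transform."""
        q0_blend = np.zeros(4)  # real (rotation) part
        qe_blend = np.zeros(4)  # dual (translation) part
        for Rk, tk, wk in zip(rotations, translations, weights):
            q = R.from_matrix(Rk).as_quat()            # scipy order: (x, y, z, w)
            q0 = np.array([q[3], q[0], q[1], q[2]])    # reorder to (w, x, y, z)
            if q0_blend @ q0 < 0:                      # keep quaternions in one hemisphere
                q0 = -q0
            qe = 0.5 * quat_mul(np.array([0.0, *tk]), q0)  # dual part encodes translation
            q0_blend += wk * q0
            qe_blend += wk * qe
        norm = np.linalg.norm(q0_blend)
        q0_blend, qe_blend = q0_blend / norm, qe_blend / norm
        Rb = R.from_quat([q0_blend[1], q0_blend[2], q0_blend[3], q0_blend[0]]).as_matrix()
        tb = 2.0 * quat_mul(qe_blend, quat_conj(q0_blend))[1:]
        return Rb @ point + tb


    # Demo: blend a fixed bone with a bone rotated 90 degrees about z, weights 0.5/0.5.
    p = np.array([1.0, 0.0, 0.0])
    rots = [np.eye(3), R.from_euler("z", 90, degrees=True).as_matrix()]
    trans = [np.zeros(3), np.zeros(3)]
    w = [0.5, 0.5]
    p_lbs, p_dqb = lbs(p, rots, trans, w), dqb(p, rots, trans, w)
    print("LBS:", p_lbs, "|p| =", np.linalg.norm(p_lbs))  # shrinks toward the joint
    print("DQB:", p_dqb, "|p| =", np.linalg.norm(p_dqb))  # stays at unit distance

For this 50/50 blend, LBS maps the point to (0.5, 0.5, 0), pulling it toward the joint because the blended matrix is no longer rigid, while the dual quaternion blend yields a 45-degree rigid rotation and keeps the point at unit distance, which is the behavior the abstract describes as avoiding skin-collapsing artifacts.

The registration step matches canonical 3D feature embeddings to 2D image features by solving an optimal transport problem. One common way to obtain such a soft correspondence is entropy-regularized OT solved with Sinkhorn iterations; the sketch below is a generic version under that assumption, with the cosine cost, feature sizes, and regularization weight chosen for illustration rather than taken from the paper.

    # Generic Sinkhorn-based soft correspondence between two feature sets (illustrative).
    import numpy as np


    def sinkhorn_correspondence(feats_3d, feats_2d, epsilon=0.1, n_iters=100):
        """Return a soft assignment (transport plan) between two sets of feature vectors."""
        a = feats_3d / np.linalg.norm(feats_3d, axis=1, keepdims=True)
        b = feats_2d / np.linalg.norm(feats_2d, axis=1, keepdims=True)
        cost = 1.0 - a @ b.T                      # cost = 1 - cosine similarity
        K = np.exp(-cost / epsilon)               # Gibbs kernel
        mu = np.full(len(a), 1.0 / len(a))        # uniform marginals
        nu = np.full(len(b), 1.0 / len(b))
        u, v = np.ones(len(a)), np.ones(len(b))
        for _ in range(n_iters):                  # alternating marginal scaling
            u = mu / (K @ v)
            v = nu / (K.T @ u)
        return u[:, None] * K * v[None, :]        # plan diag(u) K diag(v), total mass ~ 1


    # Example: 5 canonical embeddings vs 4 image features in a 16-D feature space.
    P = sinkhorn_correspondence(np.random.randn(5, 16), np.random.randn(4, 16))
    print(P.shape, P.sum())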

Bibliographic Details
Main Authors: Song, Chaoyue, Wei, Jiacheng, Chen, Tianyi, Chen, Yiwen, Foo, Chuan-Sheng, Liu, Fayao, Lin, Guosheng
Other Authors: College of Computing and Data Science
Format: Article
Language: English
Published: 2025
Subjects: Computer and Information Science; 3D reconstruction from videos; Deformable neural radiance fields
Online Access:https://hdl.handle.net/10356/182305
Institution: Nanyang Technological University
Record ID: sg-ntu-dr.10356-182305
Record format: DSpace
Type: Journal Article
Journal: International Journal of Computer Vision
Date of issue: 2024
Citation: Song, C., Wei, J., Chen, T., Chen, Y., Foo, C., Liu, F. & Lin, G. (2024). MoDA: modeling deformable 3D objects from casual videos. International Journal of Computer Vision. https://dx.doi.org/10.1007/s11263-024-02310-5
ISSN: 0920-5691
DOI: 10.1007/s11263-024-02310-5
Scopus ID: 2-s2.0-85211923417
Affiliations: College of Computing and Data Science, Nanyang Technological University; Institute for Infocomm Research, A*STAR
Funding: This research is supported by the Agency for Science, Technology and Research (A*STAR) under its MTC Programmatic Funds (Grant No. M23L7b0021).
Rights: © 2024 The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature. All rights reserved.
Date made available in DR-NTU: 2025-01-21
Collection: DR-NTU (NTU Library, Nanyang Technological University, Singapore)