Retrieval-augmented human motion generation with diffusion model
Main Author:
Other Authors:
Format: Final Year Project
Language: English
Published: Nanyang Technological University, 2023
Subjects:
Online Access: https://hdl.handle.net/10356/167733
Institution: Nanyang Technological University
Summary: Human motion generation is a crucial area of research with the potential to bring lifelike characters and movements to various applications, enhancing user engagement and immersion. However, the intricacy and diversity of human movements, the scarcity of motion data, the difficulty of incorporating human-like traits, and humans' heightened sensitivity to body movements pose persistent challenges in generating plausible human motions. These problems have led to a surge in the development of human motion generation models in recent years, with text-driven motion generation being particularly popular due to its user-friendly nature. However, current text-driven generative approaches suffer from either poor quality or limitations in generalizability and expressiveness. To overcome these challenges, this project draws inspiration from successful diffusion models and retrieval techniques in related fields and proposes ReMoDiffuse, an efficient diffusion-model-based text-driven motion generation framework complemented by a novel retrieval strategy. Specifically, ReMoDiffuse utilizes a diffusion model and integrates a multi-modality retrieval database to refine the denoising process. Extensive experiments demonstrate that the proposed method achieves superior performance in terms of quality, generalizability, and expressiveness.
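The summary describes ReMoDiffuse only at a high level: a text-conditioned diffusion model whose denoising process is additionally informed by motions retrieved from a multi-modality database. The sketch below is a minimal, hypothetical illustration of that general recipe, not the actual ReMoDiffuse architecture described in the report; the embedding dimensions, the toy MLP denoiser, and the simple cosine-similarity retrieval are all assumptions made for the example.

```python
# Hypothetical sketch of retrieval-augmented denoising for text-driven motion
# generation: retrieve motions whose text embeddings match the prompt, then
# condition every reverse diffusion step on the prompt and the retrieved
# motions. All names and dimensions below are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

MOTION_DIM, TEXT_DIM, STEPS, TOP_K = 64, 32, 50, 2

class ToyDenoiser(nn.Module):
    """Predicts the noise in a motion vector given text + retrieved context."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(MOTION_DIM + TEXT_DIM + MOTION_DIM + 1, 128),
            nn.SiLU(),
            nn.Linear(128, MOTION_DIM),
        )

    def forward(self, x_t, text_emb, retrieved_motion, t_frac):
        inp = torch.cat([x_t, text_emb, retrieved_motion, t_frac], dim=-1)
        return self.net(inp)

def retrieve(prompt_emb, db_text_embs, db_motions, k=TOP_K):
    """Return the mean of the k motions whose text embedding is most similar."""
    sims = F.cosine_similarity(prompt_emb, db_text_embs, dim=-1)
    idx = sims.topk(k).indices
    return db_motions[idx].mean(dim=0, keepdim=True)

@torch.no_grad()
def generate(denoiser, prompt_emb, db_text_embs, db_motions):
    """Simplified DDPM-style reverse loop conditioned on retrieved motions."""
    betas = torch.linspace(1e-4, 2e-2, STEPS)
    alphas = 1.0 - betas
    alpha_bars = torch.cumprod(alphas, dim=0)

    retrieved = retrieve(prompt_emb, db_text_embs, db_motions)
    x = torch.randn(1, MOTION_DIM)  # start from pure noise
    for t in reversed(range(STEPS)):
        t_frac = torch.full((1, 1), t / STEPS)
        eps = denoiser(x, prompt_emb, retrieved, t_frac)
        # Standard DDPM posterior mean computed from the predicted noise.
        x = (x - betas[t] / torch.sqrt(1.0 - alpha_bars[t]) * eps) / torch.sqrt(alphas[t])
        if t > 0:
            x = x + torch.sqrt(betas[t]) * torch.randn_like(x)
    return x

if __name__ == "__main__":
    torch.manual_seed(0)
    db_text_embs = torch.randn(100, TEXT_DIM)   # stand-in text embeddings
    db_motions = torch.randn(100, MOTION_DIM)   # stand-in motion features
    prompt_emb = torch.randn(1, TEXT_DIM)       # stand-in prompt embedding
    motion = generate(ToyDenoiser(), prompt_emb, db_text_embs, db_motions)
    print(motion.shape)  # torch.Size([1, 64])
```

For simplicity this sketch retrieves once per prompt and feeds the averaged result into every reverse step; how and where retrieved features enter the denoiser is precisely the design question the project's retrieval strategy addresses, so the full report at the link above should be consulted for the actual method.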