Bailando++: 3D dance GPT with choreographic memory
Our proposed music-to-dance framework, Bailando++, addresses the challenges of driving 3D characters to dance in a way that follows the constraints of choreography norms and maintains temporal coherency with different music genres. Bailando++ consists of two components: a choreographic memory that l...
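Main Authors: | Li, Siyao; Yu, Weijiang; Gu, Tianpei; Lin, Chunze; Wang, Quan; Qian, Chen; Loy, Chen Change; Liu, Ziwei |
---|---|
Other Authors: | School of Computer Science and Engineering |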
Format: | Article |
Language: | English |
Published: | 2024 |
Subjects: | Computer and Information Science; 3D Human Motion; Dance Generation |
Online Access: | https://hdl.handle.net/10356/173444 |
Institution: | Nanyang Technological University |
id |
sg-ntu-dr.10356-173444 |
---|---|
record_format |
dspace |
spelling |
sg-ntu-dr.10356-173444 2024-02-06T07:05:28Z
Bailando++: 3D dance GPT with choreographic memory
Li, Siyao; Yu, Weijiang; Gu, Tianpei; Lin, Chunze; Wang, Quan; Qian, Chen; Loy, Chen Change; Liu, Ziwei
School of Computer Science and Engineering
Computer and Information Science; 3D Human Motion; Dance Generation
Our proposed music-to-dance framework, Bailando++, addresses the challenges of driving 3D characters to dance in a way that follows the constraints of choreography norms and maintains temporal coherency with different music genres. Bailando++ consists of two components: a choreographic memory that learns to summarize meaningful dancing units from 3D pose sequences, and an actor-critic Generative Pre-trained Transformer (GPT) that composes these units into a fluent dance coherent to the music. In particular, to synchronize the diverse motion tempos and music beats, we introduce an actor-critic-based reinforcement learning scheme to the GPT with a novel beat-align reward function. Additionally, we consider learning human dance poses in the rotation domain to avoid body distortions incompatible with human morphology, and introduce a musical contextual encoding to allow the motion GPT to grasp longer-term patterns of music. Our experiments on the standard benchmark show that Bailando++ achieves state-of-the-art performance both qualitatively and quantitatively, with the added benefit of the unsupervised discovery of human-interpretable dancing-style poses in the choreographic memory.
Agency for Science, Technology and Research (A*STAR); Nanyang Technological University; National Research Foundation (NRF)
This work was supported in part by the RIE2020 Industry Alignment Fund - Industry Collaboration Projects (IAF-ICP) Funding Initiative through cash and in-kind contribution from the industry partner(s), in part by the National Research Foundation, Singapore, through its AI Singapore Programme under AISG Award AISG-PhD/2021-01-031[T], and in part by the NTU NAP Grant.
2024-02-05T02:04:10Z 2024-02-05T02:04:10Z 2023 Journal Article
Li, S., Yu, W., Gu, T., Lin, C., Wang, Q., Qian, C., Loy, C. C. & Liu, Z. (2023). Bailando++: 3D dance GPT with choreographic memory. IEEE Transactions on Pattern Analysis and Machine Intelligence, 45(12), 14192-14207. https://dx.doi.org/10.1109/TPAMI.2023.3319435
0162-8828 https://hdl.handle.net/10356/173444 10.1109/TPAMI.2023.3319435 37751342 2-s2.0-85173064445 12 45 14192 14207 en AISG-PhD/2021-01-031[T] NTU-NAP
IEEE Transactions on Pattern Analysis and Machine Intelligence
© 2023 IEEE. All rights reserved. |
institution |
Nanyang Technological University |
building |
NTU Library |
continent |
Asia |
country |
Singapore |
content_provider |
NTU Library |
collection |
DR-NTU |
language |
English |
topic |
Computer and Information Science; 3D Human Motion; Dance Generation |
description |
Our proposed music-to-dance framework, Bailando++, addresses the challenges of driving 3D characters to dance in a way that follows the constraints of choreography norms and maintains temporal coherency with different music genres. Bailando++ consists of two components: a choreographic memory that learns to summarize meaningful dancing units from 3D pose sequences, and an actor-critic Generative Pre-trained Transformer (GPT) that composes these units into a fluent dance coherent to the music. In particular, to synchronize the diverse motion tempos and music beats, we introduce an actor-critic-based reinforcement learning scheme to the GPT with a novel beat-align reward function. Additionally, we consider learning human dance poses in the rotation domain to avoid body distortions incompatible with human morphology, and introduce a musical contextual encoding to allow the motion GPT to grasp longer-term patterns of music. Our experiments on the standard benchmark show that Bailando++ achieves state-of-the-art performance both qualitatively and quantitatively, with the added benefit of the unsupervised discovery of human-interpretable dancing-style poses in the choreographic memory. |
author2 |
School of Computer Science and Engineering |
author_facet |
School of Computer Science and Engineering; Li, Siyao; Yu, Weijiang; Gu, Tianpei; Lin, Chunze; Wang, Quan; Qian, Chen; Loy, Chen Change; Liu, Ziwei |
format |
Article |
author |
Li, Siyao; Yu, Weijiang; Gu, Tianpei; Lin, Chunze; Wang, Quan; Qian, Chen; Loy, Chen Change; Liu, Ziwei |
author_sort |
Li, Siyao |
title |
Bailando++: 3D dance GPT with choreographic memory |
publishDate |
2024 |
url |
https://hdl.handle.net/10356/173444 |
_version_ |
1794549369515540480 |