Bailando++: 3D dance GPT with choreographic memory

Our proposed music-to-dance framework, Bailando++, addresses the challenges of driving 3D characters to dance in a way that follows the constraints of choreography norms and maintains temporal coherency with different music genres. Bailando++ consists of two components: a choreographic memory that l...

Bibliographic Details
Main Authors: Li, Siyao, Yu, Weijiang, Gu, Tianpei, Lin, Chunze, Wang, Quan, Qian, Chen, Loy, Chen Change, Liu, Ziwei
Other Authors: School of Computer Science and Engineering
Format: Article
Language: English
Published: 2024
Subjects: Computer and Information Science; 3D Human Motion; Dance Generation
Online Access: https://hdl.handle.net/10356/173444
Institution: Nanyang Technological University
id sg-ntu-dr.10356-173444
record_format dspace
spelling sg-ntu-dr.10356-173444
2024-02-06T07:05:28Z
Bailando++: 3D dance GPT with choreographic memory
Li, Siyao
Yu, Weijiang
Gu, Tianpei
Lin, Chunze
Wang, Quan
Qian, Chen
Loy, Chen Change
Liu, Ziwei
School of Computer Science and Engineering
Computer and Information Science
3D Human Motion
Dance Generation
Our proposed music-to-dance framework, Bailando++, addresses the challenges of driving 3D characters to dance in a way that follows the constraints of choreography norms and maintains temporal coherency with different music genres. Bailando++ consists of two components: a choreographic memory that learns to summarize meaningful dancing units from 3D pose sequences, and an actor-critic Generative Pre-trained Transformer (GPT) that composes these units into a fluent dance coherent to the music. In particular, to synchronize the diverse motion tempos and music beats, we introduce an actor-critic-based reinforcement learning scheme to the GPT with a novel beat-align reward function. Additionally, we consider learning human dance poses in the rotation domain to avoid body distortions incompatible with human morphology, and introduce a musical contextual encoding to allow the motion GPT to grasp longer-term patterns of music. Our experiments on the standard benchmark show that Bailando++ achieves state-of-the-art performance both qualitatively and quantitatively, with the added benefit of the unsupervised discovery of human-interpretable dancing-style poses in the choreographic memory.
Agency for Science, Technology and Research (A*STAR)
Nanyang Technological University
National Research Foundation (NRF)
This work was supported in part by the RIE2020 Industry Alignment Fund Industry Collaboration Projects (IAFICP) Funding Initiative through cash and in-kind contribution from the industry partner(s), in part by the National Research Foundation, Singapore, through its AI Singapore Programme under AISG Award AISG-PhD/2021-01-031[T], and in part by the NTU NAP Grant.
2024-02-05T02:04:10Z
2024-02-05T02:04:10Z
2023
Journal Article
Li, S., Yu, W., Gu, T., Lin, C., Wang, Q., Qian, C., Loy, C. C. & Liu, Z. (2023). Bailando++: 3D dance GPT with choreographic memory. IEEE Transactions On Pattern Analysis and Machine Intelligence, 45(12), 14192-14207. https://dx.doi.org/10.1109/TPAMI.2023.3319435
0162-8828
https://hdl.handle.net/10356/173444
10.1109/TPAMI.2023.3319435
37751342
2-s2.0-85173064445
12
45
14192
14207
en
AISG-PhD/2021-01-031[T]
NTU-NAP
IEEE Transactions on Pattern Analysis and Machine Intelligence
© 2023 IEEE. All rights reserved.
institution Nanyang Technological University
building NTU Library
continent Asia
country Singapore
content_provider NTU Library
collection DR-NTU
language English
topic Computer and Information Science
3D Human Motion
Dance Generation
spellingShingle Computer and Information Science
3D Human Motion
Dance Generation
Li, Siyao
Yu, Weijiang
Gu, Tianpei
Lin, Chunze
Wang, Quan
Qian, Chen
Loy, Chen Change
Liu, Ziwei
Bailando++: 3D dance GPT with choreographic memory
description Our proposed music-to-dance framework, Bailando++, addresses the challenges of driving 3D characters to dance in a way that follows the constraints of choreography norms and maintains temporal coherency with different music genres. Bailando++ consists of two components: a choreographic memory that learns to summarize meaningful dancing units from 3D pose sequences, and an actor-critic Generative Pre-trained Transformer (GPT) that composes these units into a fluent dance coherent to the music. In particular, to synchronize the diverse motion tempos and music beats, we introduce an actor-critic-based reinforcement learning scheme to the GPT with a novel beat-align reward function. Additionally, we consider learning human dance poses in the rotation domain to avoid body distortions incompatible with human morphology, and introduce a musical contextual encoding to allow the motion GPT to grasp longer-term patterns of music. Our experiments on the standard benchmark show that Bailando++ achieves state-of-the-art performance both qualitatively and quantitatively, with the added benefit of the unsupervised discovery of human-interpretable dancing-style poses in the choreographic memory.
author2 School of Computer Science and Engineering
author_facet School of Computer Science and Engineering
Li, Siyao
Yu, Weijiang
Gu, Tianpei
Lin, Chunze
Wang, Quan
Qian, Chen
Loy, Chen Change
Liu, Ziwei
format Article
author Li, Siyao
Yu, Weijiang
Gu, Tianpei
Lin, Chunze
Wang, Quan
Qian, Chen
Loy, Chen Change
Liu, Ziwei
author_sort Li, Siyao
title Bailando++: 3D dance GPT with choreographic memory
title_short Bailando++: 3D dance GPT with choreographic memory
title_full Bailando++: 3D dance GPT with choreographic memory
title_fullStr Bailando++: 3D dance GPT with choreographic memory
title_full_unstemmed Bailando++: 3D dance GPT with choreographic memory
title_sort bailando++: 3d dance gpt with choreographic memory
publishDate 2024
url https://hdl.handle.net/10356/173444
_version_ 1794549369515540480