Bailando++: 3D dance GPT with choreographic memory

Our proposed music-to-dance framework, Bailando++, addresses the challenges of driving 3D characters to dance in a way that follows the constraints of choreography norms and maintains temporal coherency with different music genres.

Bibliographic Details
Main Authors:	Li, Siyao; Yu, Weijiang; Gu, Tianpei; Lin, Chunze; Wang, Quan; Qian, Chen; Loy, Chen Change; Liu, Ziwei
Other Authors:	School of Computer Science and Engineering
Format:	Article
Language:	English
Published:	2024
Subjects:	Computer and Information Science; 3D Human Motion; Dance Generation
Online Access:	https://hdl.handle.net/10356/173444
Institution:	Nanyang Technological University

id	sg-ntu-dr.10356-173444
record_format	dspace
timestamps	2024-02-06T07:05:28Z; 2024-02-05T02:04:10Z; 2024-02-05T02:04:10Z
sponsors	Agency for Science, Technology and Research (A*STAR); Nanyang Technological University; National Research Foundation (NRF)
funding	This work was supported in part by the RIE2020 Industry Alignment Fund - Industry Collaboration Projects (IAF-ICP) Funding Initiative through cash and in-kind contribution from the industry partner(s), in part by the National Research Foundation, Singapore, through its AI Singapore Programme under AISG Award AISG-PhD/2021-01-031[T], and in part by the NTU NAP Grant.
grants	AISG-PhD/2021-01-031[T]; NTU-NAP
type	Journal Article
issued	2023
citation	Li, S., Yu, W., Gu, T., Lin, C., Wang, Q., Qian, C., Loy, C. C. & Liu, Z. (2023). Bailando++: 3D dance GPT with choreographic memory. IEEE Transactions on Pattern Analysis and Machine Intelligence, 45(12), 14192-14207. https://dx.doi.org/10.1109/TPAMI.2023.3319435
journal	IEEE Transactions on Pattern Analysis and Machine Intelligence
volume	45
issue	12
pages	14192-14207
issn	0162-8828
doi	10.1109/TPAMI.2023.3319435
pmid	37751342
scopus_id	2-s2.0-85173064445
rights	© 2023 IEEE. All rights reserved.
institution	Nanyang Technological University
building	NTU Library
continent	Asia
country	Singapore
content_provider	NTU Library
collection	DR-NTU
language	English
topic	Computer and Information Science; 3D Human Motion; Dance Generation
description	Our proposed music-to-dance framework, Bailando++, addresses the challenges of driving 3D characters to dance in a way that follows the constraints of choreography norms and maintains temporal coherency with different music genres. Bailando++ consists of two components: a choreographic memory that learns to summarize meaningful dancing units from 3D pose sequences, and an actor-critic Generative Pre-trained Transformer (GPT) that composes these units into a fluent dance coherent to the music. In particular, to synchronize the diverse motion tempos and music beats, we introduce an actor-critic-based reinforcement learning scheme to the GPT with a novel beat-align reward function. Additionally, we consider learning human dance poses in the rotation domain to avoid body distortions incompatible with human morphology, and introduce a musical contextual encoding to allow the motion GPT to grasp longer-term patterns of music. Our experiments on the standard benchmark show that Bailando++ achieves state-of-the-art performance both qualitatively and quantitatively, with the added benefit of the unsupervised discovery of human-interpretable dancing-style poses in the choreographic memory.
author2	School of Computer Science and Engineering
format	Article
author	Li, Siyao; Yu, Weijiang; Gu, Tianpei; Lin, Chunze; Wang, Quan; Qian, Chen; Loy, Chen Change; Liu, Ziwei
author_sort	Li, Siyao
title	Bailando++: 3D dance GPT with choreographic memory
publishDate	2024
url	https://hdl.handle.net/10356/173444
_version_	1794549369515540480
