Sequence-to-sequence learning for motion prediction and generation
The research field for computational understanding and modelling of human motion has garnered increasing importance in the last decade, with a plethora of applications in sports science, animation, robotics, surveillance and autonomous driving. In this thesis, we engage the sequence-to-sequence lear...
Main Author: | Wu, Shuang |
---|---|
Other Authors: | Lu Shijian |
Format: | Thesis-Doctor of Philosophy |
Language: | English |
Published: | Nanyang Technological University 2022 |
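Subjects: | Engineering::Computer science and engineering::Computing methodologies::Image processing and computer vision; Engineering::Computer science and engineering::Computing methodologies::Artificial intelligence |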
Online Access: | https://hdl.handle.net/10356/159102 |
Institution: | Nanyang Technological University |
id |
sg-ntu-dr.10356-159102 |
---|---|
record_format |
dspace |
spelling |
sg-ntu-dr.10356-1591022022-06-07T08:14:40Z Sequence-to-sequence learning for motion prediction and generation Wu, Shuang Lu Shijian School of Computer Science and Engineering Bioinformatics Institute, A*STAR Cheng Li Eisenhaber Frank Shijian.Lu@ntu.edu.sg, lcheng5@ualberta.ca, franke@bii.a-star.edu.sg Engineering::Computer science and engineering::Computing methodologies::Image processing and computer vision Engineering::Computer science and engineering::Computing methodologies::Artificial intelligence The research field for computational understanding and modelling of human motion has garnered increasing importance in the last decade, with a plethora of applications in sports science, animation, robotics, surveillance and autonomous driving. In this thesis, we engage the sequence-to-sequence learning paradigm to study motion prediction and motion generation. We first examine multiple articulated pose representation schemes for integrating biomechanical constraints within computational motion models. Our theoretical analysis and empirical studies suggest that the kinematic tree representation with Stiefel manifold parametrizations is most suitable. In motion prediction, we seek to generate future motion given an observed sequence. To handle long-term dependency, we design a hierarchical recurrent network to simultaneously model local contexts and global characteristics. This attains better short-term accuracy along with natural motion predictions in the long-term. On another front, we look to incorporate control into our prediction models. We employ multiple generative adversarial networks to model individual body parts, allowing for fine-grained control and tuning of the prediction spectrum. Finally, we reconsider motion prediction within the framework of stochastic differential equations, which allows for interpretation of model weights as the stochastic diffusion matrix and drift parameters. 
For motion generation, we specifically study generating dance motion conditioned on music input. We introduce an optimal transport objective for evaluating the authenticity of generated dance distributions and a Gromov-Wasserstein objective to match dance with music. These objectives allow our model to synthesize realistic dance motion in harmony with the input music. Furthermore, we consider a dual learning framework to concurrently learn both music-to-dance and dance-to-music generation. Effectively integrating the information from both domains, dual learning boosts the performance of individual tasks, delivering realistic genre-consistent dance generations and viable music compositions. Doctor of Philosophy 2022-06-07T08:14:40Z 2022-06-07T08:14:40Z 2022 Thesis-Doctor of Philosophy Wu, S. (2022). Sequence-to-sequence learning for motion prediction and generation. Doctoral thesis, Nanyang Technological University, Singapore. https://hdl.handle.net/10356/159102 https://hdl.handle.net/10356/159102 en This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC 4.0). application/pdf Nanyang Technological University |
institution |
Nanyang Technological University |
building |
NTU Library |
continent |
Asia |
country |
Singapore |
content_provider |
NTU Library |
collection |
DR-NTU |
language |
English |
topic |
Engineering::Computer science and engineering::Computing methodologies::Image processing and computer vision; Engineering::Computer science and engineering::Computing methodologies::Artificial intelligence |
spellingShingle |
Engineering::Computer science and engineering::Computing methodologies::Image processing and computer vision Engineering::Computer science and engineering::Computing methodologies::Artificial intelligence Wu, Shuang Sequence-to-sequence learning for motion prediction and generation |
description |
The research field for computational understanding and modelling of human motion has garnered increasing importance in the last decade, with a plethora of applications in sports science, animation, robotics, surveillance and autonomous driving. In this thesis, we engage the sequence-to-sequence learning paradigm to study motion prediction and motion generation.
We first examine multiple articulated pose representation schemes for integrating biomechanical constraints within computational motion models. Our theoretical analysis and empirical studies suggest that the kinematic tree representation with Stiefel manifold parametrizations is most suitable.
In motion prediction, we seek to generate future motion given an observed sequence. To handle long-term dependency, we design a hierarchical recurrent network to simultaneously model local contexts and global characteristics. This attains better short-term accuracy along with natural motion predictions in the long term. On another front, we look to incorporate control into our prediction models. We employ multiple generative adversarial networks to model individual body parts, allowing for fine-grained control and tuning of the prediction spectrum. Finally, we reconsider motion prediction within the framework of stochastic differential equations, which allows for interpretation of model weights as the stochastic diffusion matrix and drift parameters.
For motion generation, we specifically study generating dance motion conditioned on music input. We introduce an optimal transport objective for evaluating the authenticity of generated dance distributions and a Gromov-Wasserstein objective to match dance with music. These objectives allow our model to synthesize realistic dance motion in harmony with the input music. Furthermore, we consider a dual learning framework to concurrently learn both music-to-dance and dance-to-music generation. Effectively integrating the information from both domains, dual learning boosts the performance of individual tasks, delivering realistic genre-consistent dance generations and viable music compositions. |
author2 |
Lu Shijian |
author_facet |
Lu Shijian Wu, Shuang |
format |
Thesis-Doctor of Philosophy |
author |
Wu, Shuang |
author_sort |
Wu, Shuang |
title |
Sequence-to-sequence learning for motion prediction and generation |
title_short |
Sequence-to-sequence learning for motion prediction and generation |
title_full |
Sequence-to-sequence learning for motion prediction and generation |
title_fullStr |
Sequence-to-sequence learning for motion prediction and generation |
title_full_unstemmed |
Sequence-to-sequence learning for motion prediction and generation |
title_sort |
sequence-to-sequence learning for motion prediction and generation |
publisher |
Nanyang Technological University |
publishDate |
2022 |
url |
https://hdl.handle.net/10356/159102 |
_version_ |
1735491092098318336 |