Disentangling action content and style from motion capture sequences of standardised rehabilitation tasks

Bibliographic Details
Main Author: Tan, Shauna Li-Ting
Other Authors: Cham Tat Jen
Format: Final Year Project
Language: English
Published: Nanyang Technological University, 2022
Online Access:https://hdl.handle.net/10356/163035
Institution: Nanyang Technological University
Description
Summary: Physical therapy and rehabilitation will always be a pivotal part of every society, as our ability to move is greatly treasured. However, there is no fully objective way of assessing a person's physical ability; instead, it is measured through clinical observation, diagnostic procedures and standardised tests, all of which involve subjective judgement. As part of the Towards Data-Driven Ability Gap Modelling project under the aegis of the Rehabilitation Research Institute of Singapore (RRIS), this project explores the use of deep neural networks to predict human motion for a specific physical task. The Disentangled Representation for Image-to-Image Translation (DRIT) model has been shown to produce competitive results for image-to-image translation; since the underlying idea is similar, we propose adapting this model to motion capture data instead of images. Using motion capture data of subjects executing two tasks, the 10-metre walk and the step-up, we train the model for domain translation between these two tasks in an unsupervised fashion. Given data of one subject performing the 10-metre walk task and data of another subject performing the step-up task, we can generate a prediction of how each subject would carry out the other task. This is done by first mapping the data onto two different spaces: a shared latent content space and a task-specific latent attribute space. We then carry out two cross-translations by swapping the content codes between the two tasks and decoding the result. Although each training iteration uses a pair of sequences, one from each task, the data is considered unpaired. While the results of this project are not ideal, they serve as a starting point for adapting DRIT to motion capture data and closing the gap in data-driven ability modelling.
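To make the cross-translation step concrete, the following is a minimal PyTorch sketch of a DRIT-style disentanglement adapted to motion capture sequences. It is an illustration under stated assumptions, not the author's actual implementation: the module names (ContentEncoder, AttributeEncoder, Decoder), the use of GRUs, and the tensor shapes (batch, frames, joint features) are all assumptions made for clarity.

```python
# Hypothetical sketch of DRIT-style cross-translation for mocap sequences.
# Module names, network choices, and shapes are illustrative assumptions.
import torch
import torch.nn as nn

class ContentEncoder(nn.Module):
    """Maps a mocap sequence (B, T, feat_dim) into the shared latent
    content space; a single encoder is shared by both tasks."""
    def __init__(self, feat_dim, content_dim):
        super().__init__()
        self.rnn = nn.GRU(feat_dim, content_dim, batch_first=True)

    def forward(self, x):                 # x: (B, T, feat_dim)
        out, _ = self.rnn(x)
        return out                        # per-frame content codes (B, T, content_dim)

class AttributeEncoder(nn.Module):
    """Maps a mocap sequence into a task-specific latent attribute space;
    in DRIT, each domain (here, each task) has its own attribute encoder."""
    def __init__(self, feat_dim, attr_dim):
        super().__init__()
        self.rnn = nn.GRU(feat_dim, attr_dim, batch_first=True)

    def forward(self, x):
        _, h = self.rnn(x)
        return h[-1]                      # sequence-level attribute code (B, attr_dim)

class Decoder(nn.Module):
    """Reconstructs a sequence from a content code plus an attribute code;
    one decoder per task."""
    def __init__(self, content_dim, attr_dim, feat_dim):
        super().__init__()
        self.out = nn.Linear(content_dim + attr_dim, feat_dim)

    def forward(self, c, a):              # c: (B, T, Cd), a: (B, Ad)
        a_rep = a.unsqueeze(1).expand(-1, c.size(1), -1)
        return self.out(torch.cat([c, a_rep], dim=-1))

def cross_translate(x_walk, x_step, enc_c, enc_a_walk, enc_a_step,
                    dec_walk, dec_step):
    """Swap content codes between the two tasks, as in DRIT: each task's
    decoder combines the other task's content with its own attribute."""
    c_walk, c_step = enc_c(x_walk), enc_c(x_step)
    a_walk, a_step = enc_a_walk(x_walk), enc_a_step(x_step)
    pred_step = dec_step(c_walk, a_step)  # walk subject predicted doing step-up
    pred_walk = dec_walk(c_step, a_walk)  # step-up subject predicted doing the walk
    return pred_walk, pred_step
```

In this reading, the shared content code carries the subject-specific information that transfers across tasks, while the attribute code carries the task-specific motion pattern, so exchanging content codes yields each subject's predicted performance of the other task.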