Study on quality evaluation and model testing of human motion dataset under manufacturing scenario
In the field of human trajectory prediction, most existing research focuses on urban roadways or indoor public spaces, often overlooking task-specific behaviors and interactions in industrial environments. To address this issue, our study utilized two datasets collected by Nanyang Technologica...
Main Author: | Zhang, Li |
---|---|
Other Authors: | Su Rong |
Format: | Thesis-Master by Coursework |
Language: | English |
Published: | Nanyang Technological University, 2024 |
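Subjects: | Engineering; Human trajectory prediction; Industrial environment; Convolutional neural network (CNN); Transformer model; Human-robot collaboration (HRC) |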
Online Access: | https://hdl.handle.net/10356/181401 |
Institution: | Nanyang Technological University |
id |
sg-ntu-dr.10356-181401 |
---|---|
record_format |
dspace |
spelling |
sg-ntu-dr.10356-181401 2024-12-06T15:49:12Z Study on quality evaluation and model testing of human motion dataset under manufacturing scenario Zhang, Li Su Rong School of Electrical and Electronic Engineering RSu@ntu.edu.sg Engineering Human trajectory prediction Industrial environment Convolutional neural network (CNN) Transformer model Human-robot collaboration (HRC) Master's degree 2024-12-02T02:05:20Z 2024-12-02T02:05:20Z 2024 Thesis-Master by Coursework Zhang, L. (2024). Study on quality evaluation and model testing of human motion dataset under manufacturing scenario. Master's thesis, Nanyang Technological University, Singapore. https://hdl.handle.net/10356/181401 https://hdl.handle.net/10356/181401 en application/pdf Nanyang Technological University |
institution |
Nanyang Technological University |
building |
NTU Library |
continent |
Asia |
country |
Singapore |
content_provider |
NTU Library |
collection |
DR-NTU |
language |
English |
topic |
Engineering Human trajectory prediction Industrial environment Convolutional neural network (CNN) Transformer model Human-robot collaboration (HRC) |
spellingShingle |
Engineering Human trajectory prediction Industrial environment Convolutional neural network (CNN) Transformer model Human-robot collaboration (HRC) Zhang, Li Study on quality evaluation and model testing of human motion dataset under manufacturing scenario |
description |
In the field of human trajectory prediction, most existing research focuses on
urban roadways or indoor public spaces and often overlooks the task-specific
behaviors and interactions found in industrial environments. To address this gap,
our study used two datasets collected by Nanyang Technological University (NTU)
for human movement analysis in manufacturing environments: the Fixed Detective
Perspective (FDP) dataset and the First Person Perspective (FPP) dataset. We
extracted three features (pose, trajectory, and ego motion) from these datasets
and used them as inputs to a modified Convolutional Neural Network (CNN) and a
modified Transformer model for human trajectory prediction. The experiments showed
that the CNN is better suited to tasks with strict training-time requirements,
while the Transformer model is better suited to tasks that demand higher accuracy.
Moreover, with the Transformer model in its optimal hyperparameter configuration,
the FDP-trained model achieved a Mean Absolute Error (MAE) of 84.1 pixels,
compared to 158.9 pixels for the FPP-trained model, indicating that the FDP
dataset, owing to its reduced self-motion noise, is the more suitable input.
Furthermore, for scenarios in which the image of a human operator is incomplete
due to occlusion, the Transformer model trained on the sub-dataset with occluded
humans had an MAE of 180.7 pixels, while the model trained on the sub-dataset of
human movement without occlusion had an MAE of 90.4 pixels, highlighting the
challenge that occlusion poses in industrial environments.
In the ablation study, three feature combinations (key points + pose, key points
+ ego motion, and key points + pose + ego motion) were used as inputs to the
Transformer model. The model trained with key points + pose achieved an MAE of
11.82 pixels, the model trained with key points + ego motion had an MAE of 37.04
pixels, and the model trained with key points + pose + ego motion produced the
lowest MAE of 10.79 pixels. All of these combinations substantially outperformed
the model trained solely on trajectory, which had an MAE of 83.98 pixels. These
results indicate that the pose feature plays a crucial role in improving the
accuracy of the Transformer-based human trajectory prediction model, making it
a key feature for enhancing predictive performance in industrial environments. |
author2 |
Su Rong |
author_facet |
Su Rong Zhang, Li |
format |
Thesis-Master by Coursework |
author |
Zhang, Li |
author_sort |
Zhang, Li |
title |
Study on quality evaluation and model testing of human motion dataset under manufacturing scenario |
title_short |
Study on quality evaluation and model testing of human motion dataset under manufacturing scenario |
title_full |
Study on quality evaluation and model testing of human motion dataset under manufacturing scenario |
title_fullStr |
Study on quality evaluation and model testing of human motion dataset under manufacturing scenario |
title_full_unstemmed |
Study on quality evaluation and model testing of human motion dataset under manufacturing scenario |
title_sort |
study on quality evaluation and model testing of human motion dataset under manufacturing scenario |
publisher |
Nanyang Technological University |
publishDate |
2024 |
url |
https://hdl.handle.net/10356/181401 |
_version_ |
1819112937744236544 |
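
The abstract reports all results as Mean Absolute Error (MAE) in pixels. The record does not say exactly how the thesis computes this figure, so the following is only a minimal sketch under an assumed definition: the average absolute difference between predicted and observed pixel coordinates, taken over both x and y. The function names and the toy trajectory are hypothetical. Only the two MAE values at the end (83.98 and 10.79 pixels) come from the abstract.

```python
from typing import Sequence, Tuple

Point = Tuple[float, float]  # (x, y) position in pixel coordinates


def mean_absolute_error(pred: Sequence[Point], truth: Sequence[Point]) -> float:
    """Average absolute per-coordinate error in pixels (assumed definition)."""
    if len(pred) != len(truth) or not pred:
        raise ValueError("trajectories must be non-empty and of equal length")
    total = 0.0
    for (px, py), (tx, ty) in zip(pred, truth):
        total += abs(px - tx) + abs(py - ty)
    # Two coordinates per time step, so divide by 2 * number of steps.
    return total / (2 * len(pred))


def relative_reduction(baseline: float, improved: float) -> float:
    """Fraction by which `improved` lowers the error relative to `baseline`."""
    return (baseline - improved) / baseline


# Toy trajectory with made-up numbers (not taken from the thesis).
predicted = [(100.0, 200.0), (110.0, 205.0), (120.0, 210.0)]
observed = [(102.0, 198.0), (113.0, 205.0), (118.0, 214.0)]
toy_mae = mean_absolute_error(predicted, observed)

# MAE values reported in the abstract, in pixels.
trajectory_only = 83.98
all_features = 10.79
gain = relative_reduction(trajectory_only, all_features)

print(f"toy MAE: {toy_mae:.2f} px")
print(f"relative MAE reduction from adding pose and ego motion: {gain:.1%}")
```

Under this reading, the drop from 83.98 to 10.79 pixels in the ablation study corresponds to a reduction in error of roughly 87 percent relative to the trajectory-only model.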