Image processing using artificial intelligence
Main Author: | |
---|---|
Other Authors: | |
Format: | Final Year Project |
Language: | English |
Published: | Nanyang Technological University, 2024 |
Subjects: | |
Online Access: | https://hdl.handle.net/10356/181618 |
Institution: | Nanyang Technological University |
Summary: | Human pose estimation (HPE) is an important computer vision task that determines the positions and orientations of the human body, in 2D or 3D, from images and videos. This project explores the application of artificial intelligence (AI) techniques to 3D HPE, specifically the MixSTE (Seq2seq Mixed Spatio-Temporal Encoder) model, which estimates poses from video sequences. MixSTE combines spatial and temporal feature extraction to predict human poses accurately by modeling complex body dynamics over time.
The main goal of this work is to develop and assess MixSTE for human pose estimation in videos, focusing on enhancing the accuracy and reliability of pose predictions, even in challenging conditions involving occlusions and diverse body movements. The proposed system uses a sequence-to-sequence (seq2seq) architecture to effectively encode and decode spatial and temporal information, providing a significant advancement over existing methods that often struggle with temporal inconsistencies.
The experiments were performed on benchmark datasets such as Human3.6M, and the results indicate that the proposed approach achieves high accuracy in 3D pose estimation, outperforming several state-of-the-art methods on Mean Per Joint Position Error (MPJPE), a metric for which lower values are better. This work demonstrates the potential of MixSTE for real-world applications, including activity recognition, human-computer interaction, and animation, contributing to the broader field of AI-driven human motion analysis. |
---|
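
The summary reports results in terms of MPJPE without defining it. As a minimal sketch, MPJPE can be computed as the Euclidean distance between each predicted joint and its ground-truth position, averaged over all joints and frames. The function name `mpjpe`, the data layout, and the toy coordinates below are assumptions made for illustration and are not taken from the project; benchmark protocols also commonly align poses at a root joint before measuring, a step omitted here.

```python
import math


def mpjpe(predicted, target):
    """Mean Euclidean distance between matching joints over all frames.

    Both arguments are lists of frames; each frame is a list of
    (x, y, z) joint coordinates in the same unit (e.g. millimetres).
    """
    if len(predicted) != len(target):
        raise ValueError("frame counts differ")
    total = 0.0
    count = 0
    for pred_frame, true_frame in zip(predicted, target):
        if len(pred_frame) != len(true_frame):
            raise ValueError("joint counts differ")
        for pred_joint, true_joint in zip(pred_frame, true_frame):
            total += math.dist(pred_joint, true_joint)  # per-joint error
            count += 1
    if count == 0:
        raise ValueError("no joints to compare")
    return total / count


# Hypothetical toy data: one frame with two joints, in millimetres.
target = [[(0.0, 0.0, 0.0), (100.0, 0.0, 0.0)]]
predicted = [[(3.0, 4.0, 0.0), (100.0, 0.0, 12.0)]]

# Per-joint errors are 5 mm and 12 mm, so the mean is 8.5 mm.
print(mpjpe(predicted, target))
```

Because the result is an average distance in the unit of the input coordinates, a method with a lower MPJPE places joints closer to the ground truth on average.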