Human pose estimation using artificial intelligence

Bibliographic Details
Main Author: Zheng, Zhoudong
Other Authors: Yap Kim Hui
Format: Final Year Project
Language: English
Published: Nanyang Technological University 2024
Online Access: https://hdl.handle.net/10356/176189
Description
Summary: Over the past 10 years, human pose estimation (HPE) using artificial intelligence (AI) has attracted growing attention and has been applied in areas such as human-computer interaction, motion analysis, healthcare, and security. The ultimate goal of HPE is to use input data, such as images and videos, to identify the various body parts and build a representation of the human body, such as a skeleton or a mesh. A comparison of state-of-the-art (SOTA) deep learning models, including convolutional neural networks (CNNs) and transformer-based architectures, shows that prior approaches learn spatio-temporal correlations insufficiently because they cannot effectively represent each joint's inter-frame relationship. Among these models, MixSTE and its baseline model, VideoPose3D, offer innovative ideas for addressing these problems. The purpose of this report is therefore to improve the accuracy and robustness of 3D HPE by improving on the current models. The present model thoroughly and adaptively captures long-range spatio-temporal interactions among the skeletal joints by using a Dual-stream Spatio-temporal Transformer (DSTformer) coupled with a motion encoder.