Machine learning for human behavior analysis

Bibliographic Details
Main Author: Chen, Zien
Other Authors: Tan Yap Peng
Format: Thesis-Master by Coursework
Language: English
Published: Nanyang Technological University 2022
Subjects:
Online Access:https://hdl.handle.net/10356/158900
Institution: Nanyang Technological University
Description
Summary: Many fitness bloggers now upload instructional yoga videos for beginners to follow. Yoga poses are generally designed to stretch different parts of the body, so following the wrong videos wastes time and effort. At present, however, selecting the right videos relies on manual recognition, which is time-consuming and requires domain expertise. Moreover, new yoga gestures are created constantly and cannot be handled by simple pose recognition or detection. This dissertation designs a system to classify yoga exercise videos. It adopts VGG16, a 16-layer convolutional network from the Visual Geometry Group, as its classification model. One-shot learning is used to locate gestures of interest in test video samples; these gestures are then compared against small support datasets using an m-way k-shot few-shot learning method. The system thus labels each yoga video with the part of the body it exercises, for the benefit of yoga learners. In addition, the dissertation provides a supervision function for learners: users can input videos recording their own gestures and receive a judgment of whether they are performed correctly. The output score is an evaluation indicator similar to mAP. This part relies on supervised learning, and the dissertation adopts Faster R-CNN as its object detection model, which achieves 90.90% accuracy in our experiments. Keywords: yoga, VGG16, classification, one-shot learning, few-shot learning, supervised learning, mAP, accuracy.
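The m-way k-shot comparison described in the summary can be sketched as nearest-prototype classification: each of the m classes contributes k support samples, the mean embedding of each class forms its prototype, and a query is assigned to the class with the closest prototype. This is a minimal illustrative sketch, not the dissertation's implementation; the `embed` function below is a hypothetical stand-in for the VGG16 feature extractor, and the class names and toy vectors are invented for demonstration.

```python
import math

def embed(frame):
    # Stand-in embedding: in the dissertation this role is played by
    # VGG16 feature activations; here frames are already feature vectors.
    return [float(x) for x in frame]

def prototype(support_samples):
    """Mean embedding of the k support samples for one class."""
    embs = [embed(s) for s in support_samples]
    dim = len(embs[0])
    return [sum(e[i] for e in embs) / len(embs) for i in range(dim)]

def classify(query, support_sets):
    """support_sets maps each of the m class labels to its k support samples."""
    q = embed(query)
    protos = {label: prototype(samples)
              for label, samples in support_sets.items()}
    def dist(a, b):
        # Euclidean distance between query embedding and a class prototype.
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    return min(protos, key=lambda label: dist(q, protos[label]))

# Toy 2-way 2-shot episode: two hypothetical body-part classes,
# each with two support "frames" (already-embedded feature vectors).
support = {
    "back_stretch": [[0.9, 0.1], [1.0, 0.0]],
    "leg_stretch":  [[0.1, 0.9], [0.0, 1.0]],
}
print(classify([0.8, 0.2], support))  # closest prototype: back_stretch
```

With k = 1 the same routine degenerates to one-shot matching against a single exemplar per class, which matches how the summary uses one-shot learning to pick out gestures of interest before the m-way k-shot comparison.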