Temporal feature extraction for video-based activity recognition

With the growth of modern media, video understanding has become an active research topic. Convolutional neural networks (CNNs) have proven very effective for image classification. But simply applying a traditional CNN to video action recognition is not feasible because it c...

Bibliographic Details
Main Author:	Chen, Zhiyang
Other Authors:	Mao Kezhi
Format:	Thesis-Master by Coursework
Language:	English
Published:	Nanyang Technological University 2023
Subjects:	Engineering::Electrical and electronic engineering
Online Access:	https://hdl.handle.net/10356/164378
Institution:	Nanyang Technological University

id	sg-ntu-dr.10356-164378
record_format	dspace
spelling	sg-ntu-dr.10356-164378; 2023-01-19T08:43:03Z; Temporal feature extraction for video-based activity recognition; Chen, Zhiyang; Mao Kezhi; School of Electrical and Electronic Engineering; EKZMao@ntu.edu.sg; Engineering::Electrical and electronic engineering; Master of Science (Signal Processing); 2023-01-19T08:43:02Z; 2022; Thesis-Master by Coursework; Chen, Z. (2022). Temporal feature extraction for video-based activity recognition. Master's thesis, Nanyang Technological University, Singapore. https://hdl.handle.net/10356/164378; en; application/pdf; Nanyang Technological University
building	NTU Library
continent	Asia
country	Singapore
content_provider	NTU Library
collection	DR-NTU
description	With the growth of modern media, video understanding has become an active research topic. Convolutional neural networks (CNNs) have proven very effective for image classification, but simply applying a traditional CNN to video action recognition is not feasible because it cannot learn motion information. In this dissertation, we study the two current mainstream temporal feature extraction methods, two-stream CNNs and 3D CNNs, together with their variants. Our work supports the following conclusions: (i) 3D CNN models are more prone to overfitting, and a small video dataset is not sufficient to train a deep 3D CNN model; transferring and fine-tuning a pre-trained model helps to mitigate this problem. (ii) The performance of a two-stream CNN can be improved by building interaction features between the two streams' features after a late convolutional layer. (iii) Factorizing a 3D convolution into separate 2D and 1D convolutions can boost the performance of a 3D CNN. (iv) Using optical flow as input to a 3D CNN can also improve prediction accuracy.
_version_	1756370600034566144
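
Conclusion (iii) in the description refers to factorizing a 3D convolution into a 2D spatial convolution followed by a 1D temporal convolution. The record does not say which factorization variant the thesis uses, so the following is only a minimal sketch of the parameter arithmetic, assuming a t x d x d kernel, no bias terms, and an intermediate channel count chosen so that the factorized block roughly matches the parameter budget of the full 3D convolution. All function names are hypothetical.

```python
def conv3d_params(t, d, c_in, c_out):
    """Weights in a full 3D convolution with a t x d x d kernel (bias ignored)."""
    return t * d * d * c_in * c_out


def matched_mid_channels(t, d, c_in, c_out):
    """Intermediate channel count that keeps the factorized block at or
    below the parameter count of the full 3D convolution (assumed rule)."""
    return (t * d * d * c_in * c_out) // (d * d * c_in + t * c_out)


def factorized_params(t, d, c_in, c_out, mid):
    """Weights in a 1 x d x d spatial conv (c_in -> mid) followed by
    a t x 1 x 1 temporal conv (mid -> c_out), bias ignored."""
    spatial = d * d * c_in * mid
    temporal = t * mid * c_out
    return spatial + temporal


if __name__ == "__main__":
    t, d, c_in, c_out = 3, 3, 64, 64
    mid = matched_mid_channels(t, d, c_in, c_out)
    print("full 3D conv weights:   ", conv3d_params(t, d, c_in, c_out))
    print("intermediate channels:  ", mid)
    print("factorized conv weights:", factorized_params(t, d, c_in, c_out, mid))
```

Under these assumptions the factorized block has about the same number of weights as the full 3D convolution, so any gain would come from the extra nonlinearity and easier optimization rather than from added capacity; whether that matches the thesis's setup is not stated in this record.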