Scene understanding based on visual and acoustic data

Convolutional Neural Networks (CNN) is the latest development of neural network. It is a deep learning network having multiple layers of convolution operations to extract image or video features and make predictions. With recent years of rapid development, CNN can be seen everywhere with various app...

Full description

Saved in:
Bibliographic Details
Main Author: Feng, Nijing
Other Authors: Mao Kezhi
Format: Final Year Project
Language:English
Published: 2019
Subjects:
Online Access:http://hdl.handle.net/10356/77672
Tags: Add Tag
No Tags, Be the first to tag this record!
Institution: Nanyang Technological University
Language: English
Description
Summary:Convolutional Neural Networks (CNN) is the latest development of neural network. It is a deep learning network having multiple layers of convolution operations to extract image or video features and make predictions. With recent years of rapid development, CNN can be seen everywhere with various applications in real life. New models of CNNs are being developed continuously. In this report, the basic principles of CNN will be discussed, experiments performed will be described, attempts to improve CNN performance with additional acoustic data will be explained.