Finding visual attention regions in videos
When a person looks at a video or an image, some regions are more prominent than others. The human gaze is drawn to these regions first before it moves on to other parts of the video or image. These regions are called salient regions. In this report...
Main Author: | Ang, Kenny Wen Bin |
---|---|
Other Authors: | Deepu Rajan |
Format: | Final Year Project |
Language: | English |
Published: | 2010 |
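Subjects: | DRNTU::Engineering::Computer science and engineering::Computing methodologies::Image processing and computer vision |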
Online Access: | http://hdl.handle.net/10356/35693 |
Institution: | Nanyang Technological University |
id |
sg-ntu-dr.10356-35693 |
---|---|
record_format |
dspace |
spelling |
sg-ntu-dr.10356-35693 2023-03-03T20:52:15Z Finding visual attention regions in videos Ang, Kenny Wen Bin Deepu Rajan School of Computer Engineering DRNTU::Engineering::Computer science and engineering::Computing methodologies::Image processing and computer vision Bachelor of Engineering (Computer Engineering) 2010-04-23T01:06:50Z 2010-04-23T01:06:50Z 2010 2010 Final Year Project (FYP) http://hdl.handle.net/10356/35693 en Nanyang Technological University 48 p. application/pdf |
institution |
Nanyang Technological University |
building |
NTU Library |
continent |
Asia |
country |
Singapore |
content_provider |
NTU Library |
collection |
DR-NTU |
language |
English |
topic |
DRNTU::Engineering::Computer science and engineering::Computing methodologies::Image processing and computer vision |
spellingShingle |
DRNTU::Engineering::Computer science and engineering::Computing methodologies::Image processing and computer vision Ang, Kenny Wen Bin Finding visual attention regions in videos |
description |
When a person looks at a video or an image, some regions are more prominent than others. The human gaze is drawn to these regions first before it moves on to other parts of the video or image. These regions are called salient regions. This report describes a project on finding visual attention regions in videos.
A video is made up of a sequence of frames, and playing several frames per second produces the appearance of motion. According to Shannon’s information theory, a unique (low-probability) event carries high information. Hence, if a particular region in a video frame is unique, it stands out in the video and attracts the human gaze. To apply Shannon’s information theory, every frame must be divided into spatiotemporal events: each frame is split into patches of equal size, and each patch carries part of the video's information. A patch that is unique therefore carries more information.
Each video has a spatial score and a temporal score, which are added to form the spatiotemporal saliency score. In the resulting saliency map, salient regions appear with brighter pixel intensity than the rest. By thresholding the spatiotemporal saliency score, the model shows only the salient regions and discards the rest.
Lastly, different video sequences are tested to check whether the results are accurate. The method used in this project may differ from those in other research papers. For example, some treat salient regions as the moving regions in a video, whereas the method used in this project highlights the regions with the most information, not only the moving regions. |
author2 |
Deepu Rajan |
author_facet |
Deepu Rajan; Ang, Kenny Wen Bin |
format |
Final Year Project |
author |
Ang, Kenny Wen Bin |
author_sort |
Ang, Kenny Wen Bin |
title |
Finding visual attention regions in videos |
title_short |
Finding visual attention regions in videos |
title_full |
Finding visual attention regions in videos |
title_fullStr |
Finding visual attention regions in videos |
title_full_unstemmed |
Finding visual attention regions in videos |
title_sort |
finding visual attention regions in videos |
publishDate |
2010 |
url |
http://hdl.handle.net/10356/35693 |
_version_ |
1759853033699672064 |
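The approach summarised in the description field (equal-size patches, Shannon information, a spatial score plus a temporal score, then a threshold) can be illustrated with a minimal sketch. This is not the report's implementation, which the record does not give. Everything below is an assumption made for illustration: the function names, the use of mean patch intensity as the patch feature, the histogram over `bins` intensity levels, the use of the absolute frame difference for the temporal score, and the `fraction` threshold rule. Frames are assumed to be 2-D grayscale arrays with values in 0..255.

```python
import numpy as np


def patch_self_information(frame, patch=8, bins=16):
    """Return -log2 p for each patch, where p is the share of patches in the
    frame whose mean intensity falls in the same histogram bin.
    (Mean intensity as the patch feature is an illustrative assumption.)"""
    frame = np.asarray(frame, dtype=float)
    h, w = frame.shape
    h, w = h - h % patch, w - w % patch  # crop to a whole number of patches
    blocks = frame[:h, :w].reshape(h // patch, patch, w // patch, patch)
    means = blocks.mean(axis=(1, 3))  # one value per patch
    idx = np.clip((means / 256.0 * bins).astype(int), 0, bins - 1)
    counts = np.bincount(idx.ravel(), minlength=bins)
    p = counts / counts.sum()
    # Every bin referenced by idx is occupied, so p[idx] is never zero.
    return -np.log2(p[idx])


def spatiotemporal_saliency(prev_frame, frame, patch=8, bins=16):
    """Sum of a spatial score (current frame) and a temporal score
    (absolute difference to the previous frame; an assumption)."""
    spatial = patch_self_information(frame, patch, bins)
    diff = np.abs(np.asarray(frame, dtype=float)
                  - np.asarray(prev_frame, dtype=float))
    temporal = patch_self_information(diff, patch, bins)
    return spatial + temporal


def salient_mask(score, fraction=0.5):
    """Keep patches scoring at least `fraction` of the maximum score
    (hypothetical threshold rule); an all-zero score map keeps nothing."""
    return (score >= fraction * score.max()) & (score > 0)
```

Calling `salient_mask(spatiotemporal_saliency(prev_frame, frame))` on two consecutive frames yields one boolean per patch. Rare patches receive high scores whether or not they move, which matches the description's point that the method highlights regions with the most information rather than only moving regions.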