Exploiting approximation, caching and specialization to accelerate vision sensing applications
Over the past few years, deep learning has emerged as the state-of-the-art solution for many challenging computer vision tasks such as face recognition and object detection. Despite their outstanding performance, deep neural networks (DNNs) are computationally intensive, which prevents them from being widel...
Main Author: | HUYNH, Nguyen Loc |
---|---|
Format: | text |
Language: | English |
Published: | Institutional Knowledge at Singapore Management University, 2019 |
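Subjects: | Deep learning; deep neural network; mobile deep learning; model approximation; model pruning; specialized model; multi-exit models; anytime neural network; Programming Languages and Compilers; Software Engineering |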
Online Access: | https://ink.library.smu.edu.sg/etd_coll/242 https://ink.library.smu.edu.sg/cgi/viewcontent.cgi?article=1242&context=etd_coll |
Institution: | Singapore Management University |
id |
sg-smu-ink.etd_coll-1242 |
---|---|
record_format |
dspace |
spelling |
sg-smu-ink.etd_coll-1242 2020-03-13T08:17:17Z Exploiting approximation, caching and specialization to accelerate vision sensing applications HUYNH, Nguyen Loc Over the past few years, deep learning has emerged as the state-of-the-art solution for many challenging computer vision tasks such as face recognition and object detection. Despite their outstanding performance, deep neural networks (DNNs) are computationally intensive, which prevents them from being widely adopted on billions of mobile and embedded devices with scarce resources. To address that limitation, we focus on building systems and optimization algorithms to accelerate those models, making them more computationally efficient. First, this thesis explores the computational capabilities of the different processors (or co-processors) available on modern mobile devices. It recognizes that by leveraging mobile Graphics Processing Units (mGPUs), we can reduce the time consumed in the deep learning inference pipeline by an order of magnitude. We further investigated a variety of optimizations that work on mGPUs for further acceleration and built the DeepSense framework to demonstrate their use. Second, we discovered that video streams often contain invariant regions (e.g., background, static objects) across multiple video frames. Processing those regions from frame to frame wastes a lot of computational power. We proposed a convolutional caching technique and built the DeepMon framework, which quickly determines the static regions and intelligently skips the computations on those regions during the deep neural network processing pipeline. The thesis also explores how to make deep learning models more computationally efficient by pruning unnecessary parameters. Many studies have shown that most of the computation occurs within convolutional layers, which are widely used in convolutional neural networks (CNNs) for many computer vision tasks. We designed a novel D-Pruner algorithm that allows us to score the parameters based on how important they are to the final performance. Parameters with little impact are removed, yielding smaller, faster and more computationally efficient models. Finally, we investigated the feasibility of using multi-exit models (MXNs), which consist of many neural networks with shared layers, as an efficient implementation to accelerate many existing computer vision tasks. We show that applying techniques such as aggregating results across exits and threshold-based early exiting with MXNs can significantly reduce the inference latency in indexed video querying and face recognition systems. 2019-09-01T07:00:00Z text application/pdf https://ink.library.smu.edu.sg/etd_coll/242 https://ink.library.smu.edu.sg/cgi/viewcontent.cgi?article=1242&context=etd_coll http://creativecommons.org/licenses/by-nc-nd/4.0/ Dissertations and Theses Collection (Open Access) eng Institutional Knowledge at Singapore Management University Deep learning; deep neural network; mobile deep learning; model approximation; model pruning; specialized model; multi-exit models; anytime neural network; Programming Languages and Compilers; Software Engineering |
institution |
Singapore Management University |
building |
SMU Libraries |
continent |
Asia |
country |
Singapore |
content_provider |
SMU Libraries |
collection |
InK@SMU |
language |
English |
topic |
Deep learning; deep neural network; mobile deep learning; model approximation; model pruning; specialized model; multi-exit models; anytime neural network; Programming Languages and Compilers; Software Engineering |
description |
Over the past few years, deep learning has emerged as the state-of-the-art solution for many challenging computer vision tasks such as face recognition and object detection. Despite their outstanding performance, deep neural networks (DNNs) are computationally intensive, which prevents them from being widely adopted on billions of mobile and embedded devices with scarce resources. To address that limitation, we focus on building systems and optimization algorithms to accelerate those models, making them more computationally efficient. First, this thesis explores the computational capabilities of the different processors (or co-processors) available on modern mobile devices. It recognizes that by leveraging mobile Graphics Processing Units (mGPUs), we can reduce the time consumed in the deep learning inference pipeline by an order of magnitude. We further investigated a variety of optimizations that work on mGPUs for further acceleration and built the DeepSense framework to demonstrate their use. Second, we discovered that video streams often contain invariant regions (e.g., background, static objects) across multiple video frames. Processing those regions from frame to frame wastes a lot of computational power. We proposed a convolutional caching technique and built the DeepMon framework, which quickly determines the static regions and intelligently skips the computations on those regions during the deep neural network processing pipeline. The thesis also explores how to make deep learning models more computationally efficient by pruning unnecessary parameters. Many studies have shown that most of the computation occurs within convolutional layers, which are widely used in convolutional neural networks (CNNs) for many computer vision tasks. We designed a novel D-Pruner algorithm that allows us to score the parameters based on how important they are to the final performance. Parameters with little impact are removed, yielding smaller, faster and more computationally efficient models. Finally, we investigated the feasibility of using multi-exit models (MXNs), which consist of many neural networks with shared layers, as an efficient implementation to accelerate many existing computer vision tasks. We show that applying techniques such as aggregating results across exits and threshold-based early exiting with MXNs can significantly reduce the inference latency in indexed video querying and face recognition systems. |
format |
text |
author |
HUYNH, Nguyen Loc |
title |
Exploiting approximation, caching and specialization to accelerate vision sensing applications |
publisher |
Institutional Knowledge at Singapore Management University |
publishDate |
2019 |
url |
https://ink.library.smu.edu.sg/etd_coll/242 https://ink.library.smu.edu.sg/cgi/viewcontent.cgi?article=1242&context=etd_coll |
_version_ |
1712300934150750208 |