Hardware : efficient techniques for FAST corner detector

Corner detection is the foundation of many computer vision applications such as high-speed object recognition or object analysis. These applications are beginning to find their way into battery-operated devices such as smartphones, drones, mobile robots, etc. The corner detection algorithm for these a...

Bibliographic Details
Main Author: Lim, Teck Chuan
Other Authors: Lam Siew Kei
Format: Final Year Project
Language: English
Published: 2017
Subjects:
Online Access: http://hdl.handle.net/10356/70140
Institution: Nanyang Technological University
id sg-ntu-dr.10356-70140
record_format dspace
spelling sg-ntu-dr.10356-70140 2023-03-03T20:39:14Z Hardware : efficient techniques for FAST corner detector Lim, Teck Chuan Lam Siew Kei School of Computer Science and Engineering DRNTU::Engineering::Computer science and engineering Bachelor of Engineering (Computer Engineering) 2017-04-12T03:53:26Z 2017-04-12T03:53:26Z 2017 Final Year Project (FYP) http://hdl.handle.net/10356/70140 en Nanyang Technological University 87 p. application/pdf
institution Nanyang Technological University
building NTU Library
continent Asia
country Singapore
content_provider NTU Library
collection DR-NTU
language English
topic DRNTU::Engineering::Computer science and engineering
description Corner detection is the foundation of many computer vision applications such as high-speed object recognition or object analysis. These applications are beginning to find their way into battery-operated devices such as smartphones, drones and mobile robots. The corner detection algorithm for these applications must therefore meet real-time constraints while ensuring low power consumption. Software implementations on embedded platforms often fail to meet these conflicting requirements simultaneously. For example, low-end embedded microcontrollers consume less power but are slow, whereas high-end processors consume more power to achieve high speed. To meet the real-time and low-power requirements, well-known hardware acceleration methods for corner detection algorithms such as FAST, Shi-Tomasi, SUSAN and Harris have been presented in the literature. This thesis focuses on hardware implementation of the FAST corner detector, which has been reported to have the lowest execution time. Six hardware designs are proposed for the FAST corner detection architecture, aiming to reduce resource usage, computation time and power dissipation. The unrolled hardware design eliminates the 7x7 convolution buffer of the baseline architecture, reducing the resources utilized by 5.3%. The merged hardware design shares the scoring units of the unrolled implementation by utilizing multi-pumping [18]. Subsequently, another hardware design, Two's Complement Merged (TCM), removes the redundant multiplexers by introducing a 2's complement operation at the end of the computation. With these optimizations, the total resources were reduced further by 27%. A simpler design (XNOR TCM) then introduces XNOR logic to simplify the complex pixel scoring module of the TCM approach. This simplification reduced the switching activity, which in turn reduced dynamic power dissipation by 47.5% compared to the baseline architecture. To reduce the computation time, the delayed TCM design employs pipelining in the critical path, building on the reduced resource utilization achieved in the previous designs. It reduced total thermal power dissipation by 18.5% and total resource usage by 14.9%, while the minimum period differed by only 6.3% from the baseline architecture. Finally, the heuristics design reduces the resources utilized in the non-maximal suppression module by introducing three scoring units. These hardware designs were implemented and demonstrated on the TERASIC DE2i-150 FPGA development kit.
author2 Lam Siew Kei
author_facet Lam Siew Kei
Lim, Teck Chuan
format Final Year Project
author Lim, Teck Chuan
author_sort Lim, Teck Chuan
title Hardware : efficient techniques for FAST corner detector
title_sort hardware : efficient techniques for fast corner detector
publishDate 2017
url http://hdl.handle.net/10356/70140
_version_ 1759855792736960512
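
For readers unfamiliar with the detector named in the description, the following is a minimal software sketch of the standard FAST segment test, not the thesis's hardware design. It assumes the commonly used formulation: 16 pixels on a radius-3 circle around the candidate, an intensity threshold t, and a corner declared when at least n contiguous circle pixels are all brighter than center + t or all darker than center - t. The names CIRCLE, longest_circular_run and is_fast_corner, and the defaults t=20 and n=9, are illustrative assumptions and do not come from the record.

```python
# Offsets (dx, dy) of the 16 pixels on a radius-3 circle, clockwise from the top.
# Assumption: this is the usual FAST circle; the thesis may order pixels differently.
CIRCLE = [
    (0, -3), (1, -3), (2, -2), (3, -1), (3, 0), (3, 1), (2, 2), (1, 3),
    (0, 3), (-1, 3), (-2, 2), (-3, 1), (-3, 0), (-3, -1), (-2, -2), (-1, -3),
]


def longest_circular_run(flags):
    """Length of the longest run of True values, treating the list as a ring."""
    best = run = 0
    # Walking the list twice lets a run wrap from the last element to the first.
    for flag in flags + flags:
        run = run + 1 if flag else 0
        best = max(best, run)
    # A fully True ring would otherwise be counted twice.
    return min(best, len(flags))


def is_fast_corner(img, x, y, t=20, n=9):
    """Segment test at (x, y); img is a list of rows of grey levels.

    The caller must keep (x, y) at least 3 pixels away from the image border.
    """
    center = img[y][x]
    ring = [img[y + dy][x + dx] for dx, dy in CIRCLE]
    brighter = [p > center + t for p in ring]
    darker = [p < center - t for p in ring]
    return (longest_circular_run(brighter) >= n
            or longest_circular_run(darker) >= n)
```

The radius-3 circle fits inside a 7x7 pixel window, which appears consistent with the 7x7 buffer of the baseline architecture mentioned in the description, although the record does not state how that buffer is used.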