Vision-based analytics for improved AI-driven IoT applications
Main Author: | |
---|---|
Format: | text |
Language: | English |
Published: | Institutional Knowledge at Singapore Management University, 2020 |
Subjects: | |
Online Access: | https://ink.library.smu.edu.sg/etd_coll/321 https://ink.library.smu.edu.sg/cgi/viewcontent.cgi?article=1321&context=etd_coll |
Institution: | Singapore Management University |
Summary: | The proliferation of Internet of Things (IoT) sensor systems, driven primarily by cheaper embedded hardware platforms and the wide availability of lightweight software platforms, has opened the door to large-scale data collection. The availability of massive amounts of data has in turn given way to rapidly improving machine learning models, e.g., You Only Look Once (YOLO) and Single-Shot Detectors (SSD). There has been a growing trend of applying machine learning techniques, e.g., object detection, image classification, and face detection, to data collected from camera sensors, enabling a plethora of vision-sensing applications such as self-driving cars, automatic crowd monitoring, traffic-flow analysis, and occupancy detection. While these vision-sensing applications are quite useful, their real-world deployment can be challenging for various reasons, namely the performance drop of DNNs on data collected in the wild, the high energy consumption of vision sensors, and the privacy concerns raised by captured audio/video data. This dissertation explores how a combination of IoT sensors and machine-learning models can help resolve some of these challenges. It proposes novel vision-analytics techniques aimed at improving the large-scale adoption of vision sensing, with their potential performance improvements demonstrated using two different vision-sensing systems, namely SmrtFridge and CollabCam.
First, this dissertation describes the SmrtFridge system, which uses a combination of embedded RGB and infrared (IR) camera sensors and a machine-learning model for automatic food-item identification and residual-quantity sensing. SmrtFridge adopts a user-interaction-driven sensing approach, triggered whenever a user interacts with (adds or removes) a food item. Using two different processing pipelines, one motion-vector based and one IR based, SmrtFridge isolates the food item from the other background objects that may be present in the captured images. The segmented items are then assigned a food label by an image classifier. SmrtFridge shows that these segmentation techniques can convert item identification from a complex object-detection problem into a relatively simpler object-classification problem. SmrtFridge also proposes a novel IR-based residual-quantity estimation technique that can quantify the residual content inside food containers (transparent or opaque) of various shapes, sizes, and material types.
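The abstract's details of SmrtFridge's motion-vector pipeline are not given here; as a rough illustration of the general idea of motion-driven foreground isolation, the following sketch uses simple frame differencing (an assumption, not the dissertation's actual method) to crop an interacted-with item out of a frame, so that a classifier sees only the item rather than the whole scene:

```python
import numpy as np

def motion_mask(prev_frame, curr_frame, threshold=25):
    """Boolean mask of pixels whose intensity changed between two
    grayscale frames (simple frame differencing)."""
    diff = np.abs(curr_frame.astype(np.int16) - prev_frame.astype(np.int16))
    return diff > threshold

def crop_to_motion(frame, mask):
    """Crop the frame to the bounding box of the moving region,
    turning a detection problem into a classification input."""
    ys, xs = np.nonzero(mask)
    if ys.size == 0:
        return None  # no motion: nothing to classify
    return frame[ys.min():ys.max() + 1, xs.min():xs.max() + 1]

# Toy example: a bright "item" appears in the lower-right corner.
prev = np.zeros((8, 8), dtype=np.uint8)
curr = prev.copy()
curr[5:8, 5:8] = 200
patch = crop_to_motion(curr, motion_mask(prev, curr))
print(patch.shape)  # (3, 3)
```

The cropped patch, rather than the full frame, would then be passed to an image classifier for labeling.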
Second, this dissertation presents CollabCam, a novel multi-camera collaboration framework for energy-efficient visual (RGB) sensing in large-scale camera deployments. CollabCam exploits the partially overlapping fields of view (FoVs) of cameras to selectively reduce imaging resolution in their mutually common regions. This resolution reduction can lower a camera sensor's overall energy consumption in image capture, optional storage, and network transmission. CollabCam proposes novel techniques for (a) autonomous and accurate estimation of the overlapping regions between a pair of cameras, (b) mixed-resolution sensing, where selected regions of an image are captured at lower resolution while the remaining regions are captured at the default (higher) resolution, and (c) collaborative object inference, where a modified DNN model, called CollabDNN, utilizes the perspectives of other collaborating cameras to improve object detection on low-resolution images. Applying CollabCam's techniques to two publicly available datasets demonstrates potentially high energy savings for a multi-camera system and takes a step towards making energy-efficient, large-scale vision-sensing systems a reality. |
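The abstract does not specify how mixed-resolution sensing is realized in hardware; the sketch below only emulates the idea with average pooling, downsampling a hypothetical overlap region of a frame while leaving the rest at full resolution, and counting the pixel samples saved (the function names and the pooling choice are illustrative assumptions, not CollabCam's actual pipeline):

```python
import numpy as np

def downsample(region, factor):
    """Average-pool a 2-D region by an integer factor (assumed to
    divide both dimensions), emulating low-resolution capture."""
    h, w = region.shape
    return region.reshape(h // factor, factor, w // factor, factor).mean(axis=(1, 3))

def mixed_resolution(frame, overlap_box, factor=4):
    """Capture `frame` at full resolution except inside `overlap_box`
    (top, bottom, left, right), which is stored at reduced resolution."""
    t, b, l, r = overlap_box
    low = downsample(frame[t:b, l:r].astype(float), factor)
    # Samples saved: the low-res region stores factor^2 times fewer pixels.
    saved = (b - t) * (r - l) * (1 - 1 / factor**2)
    return low, saved

# Toy 64x64 frame whose upper-left 32x32 quadrant overlaps a peer camera.
frame = np.arange(64 * 64, dtype=np.uint8).reshape(64, 64)
low, saved = mixed_resolution(frame, (0, 32, 0, 32), factor=4)
print(low.shape, int(saved))  # (8, 8) 960
```

In this toy case the overlapping quadrant stores 64 samples instead of 1024, the kind of per-frame reduction that would accumulate into capture, storage, and transmission energy savings across a deployment.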