A new data transmission paradigm for visual analysis in edge-cloud collaboration
Main Author: | |
---|---|
Other Authors: | |
Format: | Thesis (Doctor of Philosophy) |
Language: | English |
Published: | Nanyang Technological University, 2021 |
Subjects: | |
Online Access: | https://hdl.handle.net/10356/153055 |
Institution: | Nanyang Technological University |
Summary: Edge-cloud collaboration, where sensor data is acquired at the edge while analysis is completed in the cloud, has become a popular paradigm for deep-learning-based visual analysis applications. Data communication, which serves as the fundamental infrastructure, plays a central role in edge-cloud collaboration. To achieve a better balance among computing load, bandwidth usage, and generalization ability, I propose a new paradigm of transmitting intermediate deep learning features instead of visual signals or the ultimately utilized features, which motivates research on, and standardization of, compression techniques for intermediate deep learning features.
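As a rough illustration of the trade-off this paradigm addresses, the sketch below compares the uncompressed payload sizes of the three candidate transmission options for a single frame. The resolution, stride, channel count, and descriptor size are illustrative assumptions, not figures from the thesis.

```python
import numpy as np

# Hypothetical payload sizes for one frame under the three paradigms.
H, W = 1080, 1920

# Option 1: transmit the raw visual signal (8-bit RGB).
raw_bytes = H * W * 3

# Option 2: transmit an intermediate feature map, e.g. an assumed early
# backbone stage with stride 8 and 256 channels, one byte per element.
feat = np.zeros((256, H // 8, W // 8), dtype=np.uint8)
feature_bytes = feat.nbytes

# Option 3: transmit only the ultimately utilized feature, e.g. an
# assumed 1024-dim float32 task-specific descriptor.
final_bytes = 1024 * 4

print(raw_bytes, feature_bytes, final_bytes)
```

Note that the uncompressed intermediate features can be even larger than the raw signal, while the final descriptor is tiny but tied to a single task; this is exactly why dedicated compression of intermediate features is needed to make the middle option practical.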
To improve data transmission efficiency, I develop a video-codec-based coding framework for intermediate deep learning feature compression. I also provide an overview of, and propose new coding tools for, the PreQuantization and Repack modules in this framework, with extensive comparative experiments analyzing their pros and cons. The optimal combination of the proposed modes achieves a compression ratio of over 50x with less than 1% task performance drop, so the bitstream of intermediate deep learning features can be much smaller than that of the corresponding visual signals. It is also worth mentioning that the proposed coding framework and coding tools have been partially adopted into the ongoing AVS (Audio Video Coding Standard Workgroup) Visual Feature Coding standard, and have provided evidence for the MPEG Video Coding for Machines (VCM) standard.
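A minimal sketch of what PreQuantization and Repack do conceptually: quantize a float feature tensor to 8 bits, then tile its channels into a single 2D grid that an off-the-shelf video codec can treat as a frame. The uniform min-max quantizer and the square tiling layout are assumptions for illustration; the thesis compares several modes for each module.

```python
import numpy as np

def prequantize(feat, bits=8):
    # PreQuantization sketch: uniform min-max quantization of a float
    # feature tensor to `bits`-bit integers. Returns the quantized tensor
    # plus the (offset, scale) needed to dequantize at the cloud end.
    lo, hi = float(feat.min()), float(feat.max())
    scale = (2 ** bits - 1) / (hi - lo + 1e-12)
    q = np.round((feat - lo) * scale).astype(np.uint8)
    return q, lo, scale

def repack(q):
    # Repack sketch: tile a (C, h, w) tensor's channels into one 2D
    # grid "frame" so a conventional video codec can compress it.
    c, h, w = q.shape
    cols = int(np.ceil(np.sqrt(c)))
    rows = int(np.ceil(c / cols))
    frame = np.zeros((rows * h, cols * w), dtype=q.dtype)
    for i in range(c):
        r, col = divmod(i, cols)
        frame[r * h:(r + 1) * h, col * w:(col + 1) * w] = q[i]
    return frame

feat = np.random.randn(64, 28, 28).astype(np.float32)  # toy feature tensor
q, lo, scale = prequantize(feat)
frame = repack(q)
print(frame.shape)
```

The resulting 224x224 single-channel frame can then be fed to a standard codec; the cloud end reverses the tiling and dequantizes with the transmitted (offset, scale).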
Moreover, to train more robust and generic backbone neural networks for feature extraction at the edge, I present an image quality assessment (IQA) based label smoothing method that tunes the objective functions used in neural network training. To provide better task-specific models on top of the intermediate deep features at the cloud end, I also propose a deep holographic network with a holographic composition operator that improves task performance at lower memory cost. Extensive evaluations demonstrate the effectiveness of the proposed methods.
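The abstract does not spell out the holographic composition operator; a common choice in the holographic-representation literature is circular correlation, which composes two d-dimensional vectors into a single d-dimensional vector rather than a d x d outer product, hence the memory saving. A minimal sketch under that assumption:

```python
import numpy as np

def holographic_compose(a, b):
    # Circular correlation via the FFT: the classic "holographic"
    # composition from holographic reduced representations. It fuses two
    # d-dim feature vectors into one d-dim vector, instead of storing
    # their d*d outer product.
    return np.real(np.fft.ifft(np.conj(np.fft.fft(a)) * np.fft.fft(b)))

rng = np.random.default_rng(0)
d = 8
a, b = rng.standard_normal(d), rng.standard_normal(d)
c = holographic_compose(a, b)

# Direct definition for comparison: c[k] = sum_i a[i] * b[(i + k) % d]
c_direct = np.array([sum(a[i] * b[(i + k) % d] for i in range(d))
                     for k in range(d)])
print(np.allclose(c, c_direct))
```

The FFT form runs in O(d log d) versus O(d^2) for the direct sum, and the output stays d-dimensional, which is what keeps memory cost low when such compositions are stacked inside a network.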