INCREASING VIDEO ANALYTICS ACCURACY VIA AUTOMATED VIDEO COMPRESSION PARAMETER OPTIMIZATION WITH DEEP NEURAL NETWORK DRIVEN STREAMING

Bibliographic Details
Main Author: Krishna, Farhan
Format: Final Project
Language: Indonesia
Online Access: https://digilib.itb.ac.id/gdl/view/82322
Institution: Institut Teknologi Bandung
id id-itb.:82322
institution Institut Teknologi Bandung
building Institut Teknologi Bandung Library
continent Asia
country Indonesia
content_provider Institut Teknologi Bandung
collection Digital ITB
language Indonesia
description With the rapid development and adoption of video streaming technology, video analytics must become more accurate and performant, but limited computing and network resources, bandwidth in particular, are the main obstacle. Resources for video analytics are allocated statically, and video analytics adapts poorly to the computing and network resources demanded by video streams whose content keeps changing. A system is therefore required that can adaptively increase the inference accuracy of video analytics. As one solution, an integrated system was designed and developed that increases inference accuracy through bandwidth allocation and optimization of the video compression parameter configuration. This Final Project covers the subsystem that optimizes parameter configurations using DNN-driven streaming (DDS). DDS, developed by Du et al., uses feedback from a DNN to adjust video coding and compression parameters so that high inference accuracy is achieved efficiently. The DDS design by Du et al. still uses the default video compression parameter configuration, which leaves room for further development and research. The DDS subsystem has three design requirements: E2E latency that does not exceed 1 second, increased inference accuracy as a result of optimizing the video parameter configuration, and the use of Linux as an open source OS. The DDS subsystem receives a video input and performs an initial encoding and compression pass to produce a low-quality video stream. This stream is sent to the server, where it is evaluated by an object detection model, namely Faster R-CNN. The evaluation results are returned as feedback identifying the regions of each frame that are relevant to increasing inference accuracy and are therefore prioritized for quality improvement. Using this feedback, the DDS subsystem re-encodes and recompresses the video to produce a high-quality video stream that achieves higher inference accuracy. The subsystem is developed on Linux, in accordance with the open source OS requirement, runs a number of Python programs, and uses FFmpeg for video compression. It is first run in emulation mode to obtain ground truth results and then in implementation mode, whose results are compared with the ground truth to obtain the subsystem's inference accuracy. Testing first compares the optimized configuration against the unoptimized (default) configuration, and then examines the entropy coding, motion estimation, direct prediction, and macroblock size parameters, E2E latency, and the relative performance of the H.264 and H.265 video compression standards. Testing used a number of video datasets categorized by the speed of object movement into high motion and low motion. Relative to the default parameters, inference accuracy increased by 12.14% for H.264 and 11.65% for H.265, which meets the design requirement of increased accuracy from parameter configuration optimization. E2E latency with H.264 is lower than with H.265 but still exceeds 1 second, so the E2E latency requirement is not met. Both the design and the implementation leave room for improvement in future research.
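The description names four tuned H.264 parameter groups (entropy coding, motion estimation, direct prediction, macroblock size) and FFmpeg as the encoder, but gives no concrete settings. The following is a minimal sketch of how such a configuration might be turned into an FFmpeg command line from Python, assuming the libx264 encoder and its `cabac`, `me`, `direct`, and `partitions` options. The function name, file names, and QP values are hypothetical and are not taken from the Final Project.

```python
from typing import List

# Subset of x264 option values accepted by this sketch (an assumption of the
# example, not the configuration space explored in the Final Project).
ME_METHODS = {"dia", "hex", "umh", "esa", "tesa"}
DIRECT_MODES = {"none", "spatial", "temporal", "auto"}


def build_x264_command(src: str, dst: str, qp: int, cabac: bool = True,
                       me: str = "hex", direct: str = "spatial",
                       partitions: str = "all") -> List[str]:
    """Return an ffmpeg argument list for one H.264 encode (nothing is run)."""
    if me not in ME_METHODS:
        raise ValueError(f"unknown motion estimation method: {me}")
    if direct not in DIRECT_MODES:
        raise ValueError(f"unknown direct prediction mode: {direct}")
    if not 0 <= qp <= 51:
        raise ValueError(f"qp out of range for 8-bit H.264: {qp}")

    # x264 takes its tuning options as a colon-separated key=value string.
    x264_params = ":".join([
        f"cabac={int(cabac)}",        # entropy coding: 1 = CABAC, 0 = CAVLC
        f"me={me}",                   # motion estimation search method
        f"direct={direct}",           # direct prediction mode for B-frames
        f"partitions={partitions}",   # macroblock partition sizes to consider
    ])
    return ["ffmpeg", "-y", "-i", src,
            "-c:v", "libx264", "-qp", str(qp),
            "-x264-params", x264_params, dst]


if __name__ == "__main__":
    # Hypothetical low-quality first pass and high-quality second pass.
    low = build_x264_command("input.mp4", "low.mp4", qp=40)
    high = build_x264_command("input.mp4", "high.mp4", qp=24, me="umh", direct="auto")
    print(" ".join(low))
    print(" ".join(high))
```

The function only assembles the argument list, so a caller could pass it to `subprocess.run` on a machine where FFmpeg with libx264 is installed. An H.265 variant would need the corresponding libx265 options, which this sketch does not cover.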
format Final Project
author Krishna, Farhan
title INCREASING VIDEO ANALYTICS ACCURACY VIA AUTOMATED VIDEO COMPRESSION PARAMETER OPTIMIZATION WITH DEEP NEURAL NETWORK DRIVEN STREAMING
_version_ 1822282196371636224
spelling id-itb.:82322 2024-07-08T08:04:30Z INCREASING VIDEO ANALYTICS ACCURACY VIA AUTOMATED VIDEO COMPRESSION PARAMETER OPTIMIZATION WITH DEEP NEURAL NETWORK DRIVEN STREAMING Krishna, Farhan Indonesia Final Project video analytics, video compression, DNN driven streaming, feedback INSTITUT TEKNOLOGI BANDUNG https://digilib.itb.ac.id/gdl/view/82322 text