INCREASING VIDEO ANALYTICS ACCURACY VIA AUTOMATED VIDEO COMPRESSION PARAMETER OPTIMIZATION WITH DEEP NEURAL NETWORK DRIVEN STREAMING
Main Author: | Krishna, Farhan |
---|---|
Format: | Final Project |
Language: | Indonesia |
Online Access: | https://digilib.itb.ac.id/gdl/view/82322 |
Institution: | Institut Teknologi Bandung |
id | id-itb.:82322 |
---|---|
institution | Institut Teknologi Bandung |
building | Institut Teknologi Bandung Library |
continent | Asia |
country | Indonesia |
content_provider | Institut Teknologi Bandung |
collection | Digital ITB |
language | Indonesia |
description |
With the rapid development and adoption of video streaming technology, video
analytics must become more accurate and performant, but limited computing and
network resources, bandwidth in particular, are the main obstacle. Resources for
video analytics are typically allocated statically, so the analytics pipeline adapts
poorly to the computing and network demands of video streams whose content keeps
changing. A system is therefore needed that can adaptively increase the
inference accuracy of video analytics.
To address this problem, an integrated system was designed and developed that
increases inference accuracy through bandwidth resource allocation and
optimization of the video compression parameter configuration. This Final Project
covers the subsystem that optimizes parameter configurations using DNN-driven
streaming (DDS). DDS, developed by Du et al., uses feedback from a DNN to adjust
video coding and compression parameters so that high inference accuracy is
achieved efficiently. The DDS design by Du et al. leaves the video compression
parameter configuration at its defaults, which leaves room for further development
and research. The DDS subsystem has three design requirements: E2E latency of no
more than 1 second, higher inference accuracy as a result of optimizing the video
parameter configuration, and the use of Linux as an open source OS.
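As a rough illustration of what DNN feedback can look like, the Python sketch below picks the regions a server might ask to have resent in higher quality. It is not taken from the project or from Du et al.: the `Detection` type, the function names, and the confidence thresholds `low` and `high` are all hypothetical, and the only fixed fact used is that H.264 macroblocks are 16x16 pixels.

```python
from dataclasses import dataclass
from typing import List, Tuple

MB = 16  # H.264 macroblock edge length in pixels

Box = Tuple[int, int, int, int]  # (x1, y1, x2, y2) in pixels


@dataclass(frozen=True)
class Detection:
    """Hypothetical detector output for one object in a low quality frame."""
    box: Box
    score: float  # detector confidence in [0, 1]


def align_to_macroblocks(box: Box, width: int, height: int, mb: int = MB) -> Box:
    """Expand a box outward to the macroblock grid and clip it to the frame."""
    x1, y1, x2, y2 = box
    ax1 = max(0, (x1 // mb) * mb)
    ay1 = max(0, (y1 // mb) * mb)
    ax2 = min(width, -(-x2 // mb) * mb)   # round up to a multiple of mb
    ay2 = min(height, -(-y2 // mb) * mb)
    return (ax1, ay1, ax2, ay2)


def feedback_regions(detections: List[Detection], width: int, height: int,
                     low: float = 0.3, high: float = 0.8) -> List[Box]:
    """Return macroblock-aligned boxes for detections the model is unsure about.

    Assumption: scores below `low` are treated as noise and scores of at least
    `high` as already reliable, so only the band in between is worth re-encoding.
    """
    return [align_to_macroblocks(d.box, width, height)
            for d in detections if low <= d.score < high]
```

Aligning to the macroblock grid reflects that an encoder changes quality per block rather than per pixel; the thresholds would need tuning per model and dataset.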
The DDS subsystem receives a video input and performs an initial encoding and
compression pass to produce a low quality video stream. This stream is sent to
the server, where it is evaluated by the object detection model, Faster R-CNN.
The evaluation results are returned as feedback that identifies the regions of each
frame that are relevant to increasing inference accuracy and are therefore
prioritized for quality improvement. Using this feedback, the DDS subsystem
re-encodes and recompresses the video to produce a high quality video stream that
can achieve higher inference accuracy. In accordance with the open source OS
requirement, the DDS subsystem is developed on Linux, where it runs a number of
Python programs and uses FFmpeg for video compression. The DDS subsystem is first
run in emulation mode to obtain ground truth results, then in implementation mode,
whose results are compared with the ground truth to obtain the subsystem's
inference accuracy values.
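The comparison against ground truth can be sketched as follows. The abstract does not state which accuracy metric the project uses, so this example assumes an F1 score with greedy IoU matching at a threshold of 0.5, a common choice for object detection; the function names and the threshold are illustrative only.

```python
from typing import Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0


def f1_against_ground_truth(pred: Sequence[Box], truth: Sequence[Box],
                            iou_threshold: float = 0.5) -> float:
    """F1 score of predicted boxes, matching each ground truth box at most once."""
    if not pred and not truth:
        return 1.0  # nothing to find and nothing reported
    unmatched = list(truth)
    true_pos = 0
    for p in pred:
        # Greedy matching: take the best remaining ground truth box for p.
        best = max(unmatched, key=lambda t: iou(p, t), default=None)
        if best is not None and iou(p, best) >= iou_threshold:
            unmatched.remove(best)
            true_pos += 1
    if true_pos == 0:
        return 0.0
    precision = true_pos / len(pred)
    recall = true_pos / len(truth)
    return 2 * precision * recall / (precision + recall)
```

Here `truth` would hold the detections from the emulation-mode run and `pred` those from the implementation-mode run on the same frame; per-frame scores can then be averaged over a video.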
The DDS subsystem is first tested to compare the optimized configuration against
the unoptimized (default) configuration. Tests then cover the entropy coding,
motion estimation, direct prediction, and macroblock size parameters, E2E latency,
and a performance comparison between the H.264 and H.265 video compression
standards. Testing used a number of video datasets categorized by the speed of
object movement into high motion and low motion. Relative to the default
parameters, inference accuracy increased by 12.14% for H.264 and 11.65% for H.265,
which meets the design requirement of increasing accuracy through parameter
configuration optimization. E2E latency is lower with H.264 than with H.265 but
still exceeds 1 second, so the design requirement on subsystem E2E latency has not
been met.
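For the H.264 case, the tested parameter families can be expressed as an FFmpeg command line. The sketch below only builds the argument list and does not run FFmpeg. It assumes the libx264 encoder with its `-qp` and `-x264-params` options and the x264 keys `cabac`, `me`, `direct`, and `partitions`; the function name, file names, and default values are placeholders, not the configuration found by the project. Mapping "macroblock size" to x264 partition settings is also an assumption.

```python
from typing import List

ME_METHODS = {"dia", "hex", "umh", "esa", "tesa"}      # x264 motion estimation modes
DIRECT_MODES = {"none", "spatial", "temporal", "auto"}  # x264 direct prediction modes


def build_x264_command(src: str, dst: str, *, qp: int,
                       cabac: bool = True,
                       me: str = "hex",
                       direct: str = "spatial",
                       partitions: str = "p8x8,b8x8,i8x8,i4x4") -> List[str]:
    """Build an ffmpeg argument list for one H.264 parameter configuration."""
    if me not in ME_METHODS:
        raise ValueError(f"unknown motion estimation method: {me}")
    if direct not in DIRECT_MODES:
        raise ValueError(f"unknown direct prediction mode: {direct}")
    x264_params = ":".join([
        f"cabac={int(cabac)}",        # entropy coding: 1 = CABAC, 0 = CAVLC
        f"me={me}",                   # motion estimation search method
        f"direct={direct}",           # direct prediction mode for B-frames
        f"partitions={partitions}",   # macroblock partition types to consider
    ])
    return ["ffmpeg", "-y", "-i", src,
            "-c:v", "libx264", "-qp", str(qp),
            "-x264-params", x264_params, dst]
```

A list such as `build_x264_command("in.mp4", "low.mp4", qp=36, me="umh")` can be passed to `subprocess.run` on a machine where FFmpeg is installed; H.265 would need a different encoder and parameter names.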
The implementation and design still leave room for development and improvement
in future research. |
format | Final Project |
author | Krishna, Farhan |
keywords | video analytics, video compression, DNN driven streaming, feedback |
timestamp | 2024-07-08T08:04:30Z |