Blind video quality prediction by uncovering human video perceptual representation
Blind video quality assessment (VQA) has become an increasingly demanding problem in automatically assessing the quality of ever-growing in-the-wild videos. Although efforts have been made to measure temporal distortions, the core to distinguish between VQA and image quality assessment (IQA), the lack of modeling of how the human visual system (HVS) relates to the temporal quality of videos hinders the precise mapping of predicted temporal scores to the human perception. Inspired by the recent discovery of the temporal straightness law of natural videos in the HVS, this paper intends to model the complex temporal distortions of in-the-wild videos in a simple and uniform representation by describing the geometric properties of videos in the visual perceptual domain. A novel videolet, with perceptual representation embedding of a few consecutive frames, is designed as the basic quality measurement unit to quantify temporal distortions by measuring the angular and linear displacements from the straightness law. By combining the predicted score on each videolet, a perceptually temporal quality evaluator (PTQE) is formed to measure the temporal quality of the entire video. Experimental results demonstrate that the perceptual representation in the HVS is an efficient way of predicting subjective temporal quality. Moreover, when combined with spatial quality metrics, PTQE achieves top performance over popular in-the-wild video datasets. More importantly, PTQE requires no additional information beyond the video being assessed, making it applicable to any dataset without parameter tuning. Additionally, the generalizability of PTQE is evaluated on video frame interpolation tasks, demonstrating its potential to benefit temporal-related enhancement tasks.
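As a rough illustration of the displacement measurements the abstract describes, the sketch below computes how far a videolet's trajectory of per-frame embeddings departs from a straight, uniform path. This is not the authors' implementation: the function name, the use of generic embeddings, and the normalized-standard-deviation definition of linear displacement are assumptions made here for illustration.

```python
import numpy as np

def trajectory_displacements(embeddings: np.ndarray):
    """Angular and linear displacements of a videolet trajectory.

    embeddings: (T, D) array, one perceptual feature vector per frame
    of a short clip ("videolet"). Returns the mean turning angle in
    radians between successive displacement vectors, and the spread of
    step lengths; a perfectly straight, uniform trajectory gives (0, 0).
    """
    diffs = np.diff(embeddings, axis=0)             # (T-1, D) frame-to-frame displacements
    norms = np.linalg.norm(diffs, axis=1)           # step lengths along the trajectory
    unit = diffs / np.clip(norms[:, None], 1e-8, None)

    # Angular displacement: angle between consecutive unit steps
    # (a straight perceptual trajectory keeps this near zero).
    cosines = np.clip(np.sum(unit[:-1] * unit[1:], axis=1), -1.0, 1.0)
    angular = float(np.mean(np.arccos(cosines)))

    # Linear displacement (assumed definition): deviation of step
    # lengths from uniform motion, as a coefficient of variation.
    linear = float(np.std(norms) / (np.mean(norms) + 1e-8))
    return angular, linear

# Example: an 8-frame videolet with 128-D embeddings from any
# perceptual front-end; straighter trajectories score lower.
ang, lin = trajectory_displacements(np.random.randn(8, 128))
```

Under the straightness law cited in the abstract, natural videos trace near-straight paths in the HVS's perceptual space, so larger angular and linear displacements can be read as stronger temporal distortion.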
Main Authors: Liao, Liang; Xu, Kangmin; Wu, Haoning; Chen, Chaofeng; Sun, Wenxiu; Yan, Qiong; Kuo, Jay C.-C.; Lin, Weisi
Other Authors: School of Computer Science and Engineering
Format: Article
Language: English
Published: 2024
Subjects: Computer and Information Science; Video quality assessment; Human visual system
Online Access: https://hdl.handle.net/10356/181046
Institution: Nanyang Technological University
Record ID: sg-ntu-dr.10356-181046
Collection: DR-NTU
Type: Journal Article
Journal: IEEE Transactions on Image Processing
Volume: 33
Pages: 4998-5013
ISSN: 1941-0042
DOI: 10.1109/TIP.2024.3445738
PubMed ID: 39236121
Scopus ID: 2-s2.0-85203558671
Deposited: 2024-11-12
Citation: Liao, L., Xu, K., Wu, H., Chen, C., Sun, W., Yan, Q., Kuo, J. C. & Lin, W. (2024). Blind video quality prediction by uncovering human video perceptual representation. IEEE Transactions on Image Processing, 33, 4998-5013. https://dx.doi.org/10.1109/TIP.2024.3445738
Funding: This work was supported in part under the RIE2020 Industry Alignment Fund–Industry Collaboration Projects (IAF-ICP) Funding Initiative of the Agency for Science, Technology and Research (A*STAR), in part by the National Natural Science Foundation of China under Grant 62202349, and in part by the Young Elite Scientists Sponsorship Program by CAST under Grant 2023QNRC001.
Rights: © 2024 IEEE. All rights reserved.