Blind video quality prediction by uncovering human video perceptual representation

Blind video quality assessment (VQA) has become an increasingly demanding problem in automatically assessing the quality of ever-growing in-the-wild videos. Although efforts have been made to measure temporal distortions, the core factor distinguishing VQA from image quality assessment (IQA), the lack of modeling of how the human visual system (HVS) relates to the temporal quality of videos hinders the precise mapping of predicted temporal scores to human perception.


Bibliographic Details
Main Authors: Liao, Liang, Xu, Kangmin, Wu, Haoning, Chen, Chaofeng, Sun, Wenxiu, Yan, Qiong, Kuo, Jay C.-C., Lin, Weisi
Other Authors: School of Computer Science and Engineering
Format: Article
Language: English
Published: 2024
Subjects:
Online Access:https://hdl.handle.net/10356/181046
Institution: Nanyang Technological University
Language: English
id sg-ntu-dr.10356-181046
record_format dspace
spelling sg-ntu-dr.10356-1810462024-11-12T05:23:18Z Blind video quality prediction by uncovering human video perceptual representation Liao, Liang Xu, Kangmin Wu, Haoning Chen, Chaofeng Sun, Wenxiu Yan, Qiong Kuo, Jay C.-C. Lin, Weisi School of Computer Science and Engineering Computer and Information Science Video quality assessment Human visual system Blind video quality assessment (VQA) has become an increasingly demanding problem in automatically assessing the quality of ever-growing in-the-wild videos. Although efforts have been made to measure temporal distortions, the core factor distinguishing VQA from image quality assessment (IQA), the lack of modeling of how the human visual system (HVS) relates to the temporal quality of videos hinders the precise mapping of predicted temporal scores to human perception. Inspired by the recent discovery of the temporal straightness law of natural videos in the HVS, this paper intends to model the complex temporal distortions of in-the-wild videos in a simple and uniform representation by describing the geometric properties of videos in the visual perceptual domain. A novel videolet, with perceptual representation embedding of a few consecutive frames, is designed as the basic quality measurement unit to quantify temporal distortions by measuring the angular and linear displacements from the straightness law. By combining the predicted score on each videolet, a perceptually temporal quality evaluator (PTQE) is formed to measure the temporal quality of the entire video. Experimental results demonstrate that the perceptual representation in the HVS is an efficient way of predicting subjective temporal quality. Moreover, when combined with spatial quality metrics, PTQE achieves top performance over popular in-the-wild video datasets. More importantly, PTQE requires no additional information beyond the video being assessed, making it applicable to any dataset without parameter tuning. Additionally, the generalizability of PTQE is evaluated on video frame interpolation tasks, demonstrating its potential to benefit temporal-related enhancement tasks. Agency for Science, Technology and Research (A*STAR) This work was supported in part under the RIE2020 Industry Alignment Fund–Industry Collaboration Projects (IAF-ICP) Funding Initiative, in part by the National Natural Science Foundation of China under Grant 62202349, and in part by the Young Elite Scientists Sponsorship Program by CAST under Grant 2023QNRC001. 2024-11-12T05:23:18Z 2024-11-12T05:23:18Z 2024 Journal Article Liao, L., Xu, K., Wu, H., Chen, C., Sun, W., Yan, Q., Kuo, J. C. & Lin, W. (2024). Blind video quality prediction by uncovering human video perceptual representation. IEEE Transactions On Image Processing, 33, 4998-5013. https://dx.doi.org/10.1109/TIP.2024.3445738 1941-0042 https://hdl.handle.net/10356/181046 10.1109/TIP.2024.3445738 39236121 2-s2.0-85203558671 33 4998 5013 en IAF-ICP IEEE Transactions on Image Processing © 2024 IEEE. All rights reserved.
institution Nanyang Technological University
building NTU Library
continent Asia
country Singapore
Singapore
content_provider NTU Library
collection DR-NTU
language English
topic Computer and Information Science
Video quality assessment
Human visual system
spellingShingle Computer and Information Science
Video quality assessment
Human visual system
Liao, Liang
Xu, Kangmin
Wu, Haoning
Chen, Chaofeng
Sun, Wenxiu
Yan, Qiong
Kuo, Jay C.-C.
Lin, Weisi
Blind video quality prediction by uncovering human video perceptual representation
description Blind video quality assessment (VQA) has become an increasingly demanding problem in automatically assessing the quality of ever-growing in-the-wild videos. Although efforts have been made to measure temporal distortions, the core factor distinguishing VQA from image quality assessment (IQA), the lack of modeling of how the human visual system (HVS) relates to the temporal quality of videos hinders the precise mapping of predicted temporal scores to human perception. Inspired by the recent discovery of the temporal straightness law of natural videos in the HVS, this paper intends to model the complex temporal distortions of in-the-wild videos in a simple and uniform representation by describing the geometric properties of videos in the visual perceptual domain. A novel videolet, with perceptual representation embedding of a few consecutive frames, is designed as the basic quality measurement unit to quantify temporal distortions by measuring the angular and linear displacements from the straightness law. By combining the predicted score on each videolet, a perceptually temporal quality evaluator (PTQE) is formed to measure the temporal quality of the entire video. Experimental results demonstrate that the perceptual representation in the HVS is an efficient way of predicting subjective temporal quality. Moreover, when combined with spatial quality metrics, PTQE achieves top performance over popular in-the-wild video datasets. More importantly, PTQE requires no additional information beyond the video being assessed, making it applicable to any dataset without parameter tuning. Additionally, the generalizability of PTQE is evaluated on video frame interpolation tasks, demonstrating its potential to benefit temporal-related enhancement tasks.
author2 School of Computer Science and Engineering
author_facet School of Computer Science and Engineering
Liao, Liang
Xu, Kangmin
Wu, Haoning
Chen, Chaofeng
Sun, Wenxiu
Yan, Qiong
Kuo, Jay C.-C.
Lin, Weisi
format Article
author Liao, Liang
Xu, Kangmin
Wu, Haoning
Chen, Chaofeng
Sun, Wenxiu
Yan, Qiong
Kuo, Jay C.-C.
Lin, Weisi
author_sort Liao, Liang
title Blind video quality prediction by uncovering human video perceptual representation
title_short Blind video quality prediction by uncovering human video perceptual representation
title_full Blind video quality prediction by uncovering human video perceptual representation
title_fullStr Blind video quality prediction by uncovering human video perceptual representation
title_full_unstemmed Blind video quality prediction by uncovering human video perceptual representation
title_sort blind video quality prediction by uncovering human video perceptual representation
publishDate 2024
url https://hdl.handle.net/10356/181046
_version_ 1816858959766618112