Reinforcement learning-based interactive video search

Despite the rapid progress in text-to-video search due to the advancement of cross-modal representation learning, the existing techniques still fall short in helping users to rapidly identify the search targets. Particularly, in the situation that a system suggests a long list of similar candidates,...

Bibliographic Details
Main Authors:	MA, Zhixin, WU, Jiaxin, HOU, Zhijian, NGO, Chong-wah
Format:	text
Language:	English
Published:	Institutional Knowledge at Singapore Management University 2022
Subjects:	Feature enhancement Interactive video retrieval Query understanding Reinforcement learning Artificial Intelligence and Robotics Graphics and Human Computer Interfaces
Online Access:	https://ink.library.smu.edu.sg/sis_research/7503 https://ink.library.smu.edu.sg/context/sis_research/article/8506/viewcontent/reinforcement_learning.pdf
Institution:	Singapore Management University

id	sg-smu-ink.sis_research-8506
record_format	dspace
spelling	sg-smu-ink.sis_research-8506 2023-04-04T02:49:04Z Reinforcement learning-based interactive video search MA, Zhixin WU, Jiaxin HOU, Zhijian NGO, Chong-wah 2022-06-01T07:00:00Z text application/pdf https://ink.library.smu.edu.sg/sis_research/7503 info:doi/10.1007/978-3-030-98355-0_53 https://ink.library.smu.edu.sg/context/sis_research/article/8506/viewcontent/reinforcement_learning.pdf http://creativecommons.org/licenses/by-nc-nd/4.0/ Research Collection School Of Computing and Information Systems eng Institutional Knowledge at Singapore Management University Feature enhancement Interactive video retrieval Query understanding Reinforcement learning Artificial Intelligence and Robotics Graphics and Human Computer Interfaces
institution	Singapore Management University
building	SMU Libraries
continent	Asia
country	Singapore
content_provider	SMU Libraries
collection	InK@SMU
language	English
topic	Feature enhancement Interactive video retrieval Query understanding Reinforcement learning Artificial Intelligence and Robotics Graphics and Human Computer Interfaces
spellingShingle	Feature enhancement Interactive video retrieval Query understanding Reinforcement learning Artificial Intelligence and Robotics Graphics and Human Computer Interfaces MA, Zhixin WU, Jiaxin HOU, Zhijian NGO, Chong-wah Reinforcement learning-based interactive video search
description	Despite the rapid progress in text-to-video search due to the advancement of cross-modal representation learning, the existing techniques still fall short in helping users to rapidly identify the search targets. In particular, when a system suggests a long list of similar candidates, the user needs to painstakingly inspect every search result. The experience is frustrating because similar clips must be watched repeatedly and, worse, the search targets may be overlooked due to mental fatigue. This paper explores reinforcement learning (RL)-based searching to relieve the user from the burden of brute-force inspection. Specifically, the system maintains a graph connecting shots based on their temporal and semantic relationships. Using the navigation paths outlined by the graph, an RL agent learns to seek a path that maximizes the reward based on continuous user feedback. In each round of interaction, the system recommends the single most likely video candidate for the user to inspect. In addition to RL, two incremental changes are introduced to improve the VIREO search engine. First, the dual-task cross-modal representation learning has been revised to index phrases and to model user queries and unlikelihood relationships more effectively. Second, two more deep features, extracted from SlowFast and Swin-Transformer respectively, are incorporated into dual-task model training. Substantial improvement is observed for the automatic Ad-hoc search (AVS) task on the V3C1 dataset.
format	text
author	MA, Zhixin WU, Jiaxin HOU, Zhijian NGO, Chong-wah
author_facet	MA, Zhixin WU, Jiaxin HOU, Zhijian NGO, Chong-wah
author_sort	MA, Zhixin
title	Reinforcement learning-based interactive video search
title_short	Reinforcement learning-based interactive video search
title_full	Reinforcement learning-based interactive video search
title_fullStr	Reinforcement learning-based interactive video search
title_full_unstemmed	Reinforcement learning-based interactive video search
title_sort	reinforcement learning-based interactive video search
publisher	Institutional Knowledge at Singapore Management University
publishDate	2022
url	https://ink.library.smu.edu.sg/sis_research/7503 https://ink.library.smu.edu.sg/context/sis_research/article/8506/viewcontent/reinforcement_learning.pdf
_version_	1770576359561625600
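
The description above mentions a graph that links shots by temporal and semantic relationships, and an agent that follows the graph using user feedback. The record gives no implementation details, so the following is only a toy sketch of that idea under stated assumptions: the names build_shot_graph and interactive_search, the cosine-similarity threshold, and the learning rate are all hypothetical, and a simple greedy value-estimate update stands in for the learned RL policy. It is not the authors' method.

```python
import math
from collections import defaultdict


def cosine(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_shot_graph(shots, sim_threshold=0.9):
    """Link shots that are temporally adjacent or semantically similar.

    shots: list of (video_id, shot_index, feature_vector) tuples.
    sim_threshold is an assumed cut-off, not a value from the paper.
    """
    graph = defaultdict(set)
    for i, (vid_i, idx_i, feat_i) in enumerate(shots):
        for j in range(i + 1, len(shots)):
            vid_j, idx_j, feat_j = shots[j]
            temporal = vid_i == vid_j and abs(idx_i - idx_j) == 1
            semantic = cosine(feat_i, feat_j) >= sim_threshold
            if temporal or semantic:
                graph[i].add(j)
                graph[j].add(i)
    return graph


def interactive_search(graph, start, feedback, rounds=5, lr=0.5):
    """Show one shot per round and move along graph edges.

    feedback(shot) returns a reward in [0, 1] for the shot just shown.
    The reward nudges the value estimates of the shot's neighbours, and
    the next shot is the unseen one with the highest estimate (ties go
    to the lowest shot number). Returns the shots in inspection order.
    """
    value = defaultdict(float)
    seen = []
    current = start
    for _ in range(rounds):
        seen.append(current)
        reward = feedback(current)
        for neighbour in graph[current]:
            value[neighbour] += lr * (reward - value[neighbour])
        candidates = [n for n in value if n not in seen]
        if not candidates:
            break
        current = max(candidates, key=lambda n: (value[n], -n))
    return seen


# Hypothetical data: two videos ("a", "b") with two shots each.
shots = [
    ("a", 0, [1.0, 0.0]),
    ("a", 1, [0.0, 1.0]),
    ("b", 0, [1.0, 0.1]),
    ("b", 1, [0.5, 0.5]),
]
graph = build_shot_graph(shots)
rewards = {0: 0.5, 1: 0.0, 2: 0.8, 3: 1.0}  # simulated user feedback
path = interactive_search(graph, start=0, feedback=rewards.get)
```

In this sketch the simulated feedback is a fixed lookup table; in an interactive system it would come from the user's reaction to each recommended shot.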

Similar Items