Leveraging LLMs and generative models for interactive known-item video search
While embedding techniques such as CLIP have considerably boosted search performance, user strategies in interactive video search still largely operate on a trial-and-error basis. Users are often required to manually adjust their queries and carefully inspect the search results, a process that relies heavily on the user's capability and proficiency. Recent advances in large language models (LLMs) and generative models offer promising avenues for enhancing interactivity in video retrieval and reducing personal bias in query interpretation, particularly in known-item search. Specifically, LLMs can expand and diversify the semantics of a query while avoiding grammar mistakes and language barriers. In addition, generative models can imagine, or visualize, a verbose query as images. We integrate these new LLM capabilities into our existing system and evaluate their effectiveness on the V3C1 and V3C2 datasets.
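The query-expansion idea in the abstract lends itself to a short sketch. Below is a minimal, hypothetical illustration, not the authors' implementation: an LLM (stubbed out here; any chat-completion API would do) paraphrases the known-item query, each variant is embedded with CLIP's text encoder, and video shots are ranked by the best similarity across variants. The checkpoint name, the `expand_query` stub, and the random shot index are all assumptions for illustration. A sketch of the generative-visualization path follows the record details below.

```python
# Hypothetical sketch of LLM query expansion feeding CLIP text-to-shot search.
import torch
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

def expand_query(query: str) -> list[str]:
    """Stand-in for an LLM call that paraphrases and diversifies the query;
    the returned variants are illustrative, not model output."""
    return [
        query,
        "a man in a red jacket rides a bicycle along a coastal road",
        "cyclist wearing red, seaside highway, cloudy sky",
    ]

@torch.no_grad()
def rank_shots(query: str, shot_embeddings: torch.Tensor) -> torch.Tensor:
    """Embed all query variants, score them against precomputed shot
    embeddings, and return shot indices sorted by the best variant score."""
    variants = expand_query(query)
    inputs = processor(text=variants, return_tensors="pt", padding=True)
    text_emb = model.get_text_features(**inputs)
    text_emb = text_emb / text_emb.norm(dim=-1, keepdim=True)
    sims = text_emb @ shot_embeddings.T          # (num_variants, num_shots)
    best = sims.max(dim=0).values                # best variant per shot
    return best.argsort(descending=True)

# Toy index: random unit vectors standing in for CLIP image features of shots.
shots = torch.randn(1000, 512)
shots = shots / shots.norm(dim=-1, keepdim=True)
print(rank_shots("man in red jacket cycling by the sea", shots)[:10])
```

Taking the maximum over variants (rather than the mean) favors shots that strongly match any one phrasing, which fits known-item search, where a single correct target is sought; this fusion choice is an assumption, not taken from the paper.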
Main Authors: MA, Zhixin; WU, Jiaxin; NGO, Chong-wah
Format: text
Language: English
Published: Institutional Knowledge at Singapore Management University, 2024
Subjects: Generative Model; Interactive Video Retrieval; Known-Item Search; Large Language Models; Artificial Intelligence and Robotics; Graphics and Human Computer Interfaces
Online Access: https://ink.library.smu.edu.sg/sis_research/8748
https://ink.library.smu.edu.sg/context/sis_research/article/9751/viewcontent/24_MMM_av.pdf
DOI: 10.1007/978-3-031-53302-0_35
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Collection: Research Collection School Of Computing and Information Systems, InK@SMU
Institution: Singapore Management University
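The abstract's second idea, "imagining" the verbose query as images, can be sketched the same way: a text-to-image model renders the query, and CLIP's image encoder embeds the result for image-to-image matching against the same shot index as above. Again a hypothetical sketch under assumed checkpoint names; the paper's actual pipeline may differ.

```python
# Hypothetical sketch of the "visualize the query" path from the abstract.
import torch
from diffusers import StableDiffusionPipeline
from transformers import CLIPModel, CLIPProcessor

pipe = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5")
clip = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

@torch.no_grad()
def imagine_and_rank(query: str, shot_embeddings: torch.Tensor) -> torch.Tensor:
    """Render the verbose query as an image, embed it with CLIP's image
    encoder, and rank shots by image-to-image cosine similarity."""
    image = pipe(query).images[0]                 # generative "imagination"
    inputs = processor(images=image, return_tensors="pt")
    emb = clip.get_image_features(**inputs)
    emb = emb / emb.norm(dim=-1, keepdim=True)
    return (emb @ shot_embeddings.T).squeeze(0).argsort(descending=True)
```

Fusing this image-based ranking with the text-variant ranking above (for example by simple score averaging) is one plausible way to reduce dependence on any single phrasing of the query, which is the interactivity gain the abstract describes.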