Towards temporal sentence grounding in videos

Temporal sentence grounding in videos (TSGV), also known as natural language video localization (NLVL) or video moment retrieval (VMR), aims to retrieve a temporal moment (i.e., a fraction of a video) that semantically corresponds to a language query from an untrimmed video. Connecting computer vision and...

Bibliographic Details
Main Author: Zhang, Hao
Other Authors: Sun Aixin
Format: Thesis-Doctor of Philosophy
Language: English
Published: Nanyang Technological University 2022
Online Access: https://hdl.handle.net/10356/163788