Text-driven video prediction

Current video generation models usually convert signals indicating appearance and motion received from inputs (e.g., image and text) or latent spaces (e.g., noise vectors) into consecutive frames, fulfilling a stochastic generation process for the uncertainty introduced by latent code sampling. Howe...

Bibliographic Details
Main Authors:	SONG, Xue; CHEN, Jingjing; ZHU, Bin; JIANG, Yu-gang
Format:	text
Language:	English
Published:	Institutional Knowledge at Singapore Management University 2024
Subjects:	Text-driven Video Prediction; controllable video generation; motion inference; Artificial Intelligence and Robotics; Graphics and Human Computer Interfaces
Online Access:	https://ink.library.smu.edu.sg/sis_research/9356 https://ink.library.smu.edu.sg/context/sis_research/article/10356/viewcontent/Text_drivenVideoPrediction_sv__2_.pdf
Institution:	Singapore Management University

Similar Items

Synchronization of lecture videos and electronic slides by video text analysis
by: WANG, Feng, et al.
Published: (2003)

Video event detection using motion relativity and visual relatedness
by: WANG, Feng, et al.
Published: (2008)

Serendipity-driven celebrity video hyperlinking
by: YANG, Shujun, et al.
Published: (2016)

Video text detection and segmentation for optical character recognition
by: NGO, Chong-wah, et al.
Published: (2005)

Wavelet-gradient-fusion for video text binarization
by: Roy, S., et al.
Published: (2013)

Lecture video enhancement and editing by integrating posture, gesture, and text
by: WANG, Feng, et al.
Published: (2007)

A new method for word segmentation from arbitrarily-oriented video text lines
by: Sharma, N., et al.
Published: (2013)

VrdONE : One-stage video visual relation detection
by: JIANG, Xinjie, et al.
Published: (2024)

Concept-driven multi-modality fusion for video search
by: WEI, Xiao-Yong, et al.
Published: (2011)

Text2Human: text-driven controllable human image generation
by: Jiang, Yuming, et al.
Published: (2022)

On the annotation of web videos by efficient near-duplicate search
by: ZHAO, Wan-Lei, et al.
Published: (2010)

Towards textually describing complex video contents with audio-visual concept classifiers
by: TAN, Chun Chet, et al.
Published: (2011)

DualFormer: Local-global stratified transformer for efficient video recognition
by: LIANG, Yuxuan, et al.
Published: (2022)

Long-term leap attention, short-term periodic shift for video classification
by: ZHANG, Hao, et al.
Published: (2022)

Recent advances in content-based video analysis
by: NGO, Chong-wah, et al.
Published: (2001)

Extraction of Text from Images and Videos
by: PHAN QUY TRUNG
Published: (2014)

Detection and interpretation of text information in noisy video sequences
by: Chan, U., et al.
Published: (2014)

Rushes video summarization by object and event understanding
by: WANG, Feng, et al.
Published: (2007)

Leveraging LLMs and generative models for interactive known-item video search
by: MA, Zhixin, et al.
Published: (2024)

Stargazer: An interactive camera robot for capturing how-to videos based on subtle instructor cues
by: LI, Jiannan, et al.
Published: (2023)

Structuring low-quality videotaped lectures for cross-reference browsing by video text analysis
by: WANG, Feng, et al.
Published: (2008)

Open-vocabulary video anomaly detection
by: WU, Peng, et al.
Published: (2024)

A new method for arbitrarily-oriented text detection in video
by: Sharma, N., et al.
Published: (2013)

Watching 360° videos together
by: TANG, Anthony, et al.
Published: (2017)

Localizing volumetric motion for action recognition in realistic videos
by: WU, Xiao, et al.
Published: (2009)

Video event detection using motion relativity and feature selection
by: WANG, Feng, et al.
Published: (2014)

Mix-DANN and dynamic-modal-distillation for video domain adaptation
by: YIN, Yuehao, et al.
Published: (2022)

Video script identification based on text lines
by: Phan, T.Q., et al.
Published: (2013)

Reinforcement learning-based interactive video search
by: MA, Zhixin, et al.
Published: (2022)

Vireo @ video browser showdown 2019
by: NGUYEN, Phuong Anh, et al.
Published: (2019)

Dynamic temporal filtering in video models
by: LONG, Fuchen, et al.
Published: (2022)

Summarizing rushes videos by motion, object, and event understanding
by: WANG, Feng, et al.
Published: (2012)

PosMLP-Video: Spatial and temporal relative position encoding for efficient video recognition
by: HAO, Yanbin, et al.
Published: (2024)

Feature prediction diffusion model for video anomaly detection
by: YAN, Cheng, et al.
Published: (2023)

An eigen value based approach for text detection in video
by: Guru, D.S., et al.
Published: (2013)

Self-trained deep ordinal regression for end-to-end video anomaly detection
by: PANG, Guansong, et al.
Published: (2020)

Interactive video corpus moment retrieval using reinforcement learning
by: MA, Zhixin, et al.
Published: (2022)

New spatial-gradient-features for video script identification
by: Zhao, D., et al.
Published: (2013)

SwapVid: Integrating video viewing and document exploration with direct manipulation
by: MURAKAMI, Taichi, et al.
Published: (2024)

Placing videos on a semantic hierarchy for search result navigation
by: TAN, Song, et al.
Published: (2014)