Multi-stream social-aware transformers for deterministic trajectory prediction

With the development of artificial intelligence technology, intelligent robots are being used more widely in daily life. For any delivery robot operating in crowded environments, accurate and fast pedestrian trajectory prediction is the basis of autonomous tasks and poses considerable challenges. (...

Bibliographic Details
Main Author:	Chen, Xun
Other Authors:	Wang Dan Wei
Format:	Thesis-Master by Coursework
Language:	English
Published:	Nanyang Technological University 2024
Subjects:	Engineering::Computer science and engineering::Computing methodologies::Artificial intelligence
Online Access:	https://hdl.handle.net/10356/172974
Description
Summary:	With the development of artificial intelligence technology, intelligent robots are being used more widely in daily life. For any delivery robot operating in crowded environments, accurate and fast pedestrian trajectory prediction is the basis of autonomous tasks and poses considerable challenges. (1) For the pedestrian trajectory prediction task, most previous works model the problem with probabilistic generative models (such as CVAE/Diffusion) and measure accuracy with metrics such as the best result out of 20 samples. This leaves a considerable gap to actual deployment. In this work, the task is modeled as a seq2seq translation problem that outputs a single accurate prediction, which is more amenable to real-world deployment and also reduces model complexity. (2) The difficulty of this task lies in its inherent spatio-temporal and social dimensions. Modeling the temporal dimension alone would miss interactions between agents. Most solutions alternate information exchange across the two dimensions and achieve decent results, but this may lead to information loss. Approaches that exchange information simultaneously in both dimensions incur high computational complexity (quadratic in total length). Drawing inspiration from multi-modal fusion network architectures, a novel multi-stream Transformer architecture is proposed that fuses information from multiple input streams into a single stream and then decodes it back to multiple output streams. This multi-stream Transformer architecture significantly reduces computational complexity for real-time deployment while achieving results very close to state-of-the-art on well-established datasets. Keywords: Trajectory prediction, seq2seq model, multi-stream Transformer, Real-time.
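
Point (1) of the summary contrasts best-of-20 evaluation with a single deterministic prediction. The following is a minimal sketch of that difference, not code from the thesis: the function names `ade`, `fde`, and `best_of_k_ade` are illustrative, the displacement-error definitions are the ones commonly used in trajectory prediction, and the toy coordinates are invented for the example rather than taken from any dataset or result.

```python
import math

def ade(pred, truth):
    """Average displacement error: mean Euclidean distance over all time steps."""
    return sum(math.dist(p, t) for p, t in zip(pred, truth)) / len(truth)

def fde(pred, truth):
    """Final displacement error: Euclidean distance at the last time step."""
    return math.dist(pred[-1], truth[-1])

def best_of_k_ade(samples, truth):
    """Minimum ADE over K sampled trajectories.

    Selecting the minimum requires the ground truth, which a deployed
    robot does not have when it must act on a prediction.
    """
    return min(ade(s, truth) for s in samples)

# Toy data (assumed for illustration): a pedestrian walking along the x-axis.
truth = [(1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]

# A deterministic model emits exactly one trajectory.
single = [(1.0, 1.0), (2.0, 1.0), (3.0, 1.0)]

# A generative model emits K trajectories (K = 3 here instead of 20).
samples = [
    single,
    [(1.0, 0.5), (2.0, 0.5), (3.0, 0.5)],
    [(1.0, -2.0), (2.0, -2.0), (3.0, -2.0)],
]

single_ade = ade(single, truth)             # error of the one prediction the robot would use
oracle_ade = best_of_k_ade(samples, truth)  # error of the sample picked with hindsight

print(f"single-prediction ADE: {single_ade:.2f}")
print(f"best-of-{len(samples)} ADE: {oracle_ade:.2f}")
```

The best-of-K score can only improve as K grows, because the selection is made with hindsight, whereas the single-prediction score reflects the one trajectory a planner would actually consume.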
