Context models for pedestrian intention prediction by factored latent-dynamic conditional random fields

Smooth handling of pedestrian interactions is a key requirement for Autonomous Vehicles (AV) and Advanced Driver Assistance Systems (ADAS). Such systems call for early and accurate prediction of a pedestrian's crossing/not-crossing behaviour in front of the vehicle. Existing approaches to pedes...

Bibliographic Details
Main Author:	Satyajit Neogi
Other Authors:	Justin Dauwels
Format:	Thesis-Doctor of Philosophy
Language:	English
Published:	Nanyang Technological University 2020
Subjects:	Engineering::Electrical and electronic engineering
Online Access:	https://hdl.handle.net/10356/143222
Institution:	Nanyang Technological University

id	sg-ntu-dr.10356-143222
record_format	dspace
spelling	sg-ntu-dr.10356-143222 2023-07-04T17:20:52Z Context models for pedestrian intention prediction by factored latent-dynamic conditional random fields Satyajit Neogi Justin Dauwels School of Electrical and Electronic Engineering JDAUWELS@ntu.edu.sg Engineering::Electrical and electronic engineering Doctor of Philosophy 2020-08-14T01:54:24Z 2020-08-14T01:54:24Z 2020 Thesis-Doctor of Philosophy Satyajit Neogi. (2020). Context models for pedestrian intention prediction by factored latent-dynamic conditional random fields. Doctoral thesis, Nanyang Technological University, Singapore. https://hdl.handle.net/10356/143222 10.32657/10356/143222 en This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC 4.0). application/pdf Nanyang Technological University
institution	Nanyang Technological University
building	NTU Library
continent	Asia
country	Singapore
content_provider	NTU Library
collection	DR-NTU
language	English
topic	Engineering::Electrical and electronic engineering
description	Smooth handling of pedestrian interactions is a key requirement for Autonomous Vehicles (AV) and Advanced Driver Assistance Systems (ADAS). Such systems call for early and accurate prediction of a pedestrian's crossing/not-crossing behaviour in front of the vehicle. Existing approaches to pedestrian behaviour prediction make use of pedestrian motion, the pedestrian's location in the scene and static context variables such as traffic lights and zebra crossings. We stress the necessity of early prediction for smooth operation of such systems, and introduce the influence of vehicle interactions on pedestrian intention for this purpose. In this thesis, we show a discernible advance in prediction time aided by the inclusion of such vehicle interaction context. We apply our methods to two public datasets, viz., the Daimler dataset and the JAAD dataset. We also contribute two datasets towards pedestrian behaviour prediction research, viz., the NTU dataset and the Little India dataset, and apply our methods to these datasets. While the existing best system predicts pedestrian stopping behaviour with 70% accuracy 0.38 seconds before the actual events, our system achieves the same accuracy across multiple datasets at least 0.9 seconds, on average, before the actual events. We formulate the pedestrian behaviour prediction problem as a sequence labeling task. Conditional Random Fields (CRF) are frequently applied for labeling and segmenting sequence data. In the existing literature, hidden variables have been introduced in a labeled CRF structure in order to model the latent dynamics within class labels, thus improving the labeling performance. Such a model is known as a Latent-Dynamic CRF (LDCRF). We propose a generalization of LDCRF, called Factored LDCRF (FLDCRF), a structure that allows multiple latent dynamics of the class labels to interact with each other. Including such latent-dynamic interactions leads to improved performance on single-label and multi-label sequence modeling tasks.
We validate our FLDCRF models on standard single-label and multi-label sequence tagging experiments across two different datasets, UCI gesture phase data and UCI opportunity data, before applying them to pedestrian behaviour prediction. FLDCRF outperforms all state-of-the-art sequence models considered, namely CRF, LDCRF, LSTM, LSTM-CRF, Factorial CRF, Coupled CRF and a multi-label LSTM model, in all our experiments. In addition, FLDCRF offers easier model selection and is more consistent across validation and test data than LSTM models. FLDCRF is also much faster to train than LSTM, even without a GPU. FLDCRF outperforms the best LSTM model by 4% in F1-score on a single-label task on the UCI gesture phase data, and outperforms LSTM by ~2% (F1-score) on average on the multi-label sequence tagging experiment on UCI opportunity data. FLDCRF models also outperform LSTM models on the pedestrian behaviour prediction task across multiple datasets. Interacting latent dynamics in an FLDCRF can be exploited for modeling multi-agent interactions in a social environment. FLDCRF accommodates interacting discrete latent state spaces in its structure. The same idea can be extended to interacting heterogeneous (discrete and continuous) state space models.
author2	Justin Dauwels
author_facet	Justin Dauwels Satyajit Neogi
format	Thesis-Doctor of Philosophy
author	Satyajit Neogi
author_sort	Satyajit Neogi
title	Context models for pedestrian intention prediction by factored latent-dynamic conditional random fields
publisher	Nanyang Technological University
publishDate	2020
url	https://hdl.handle.net/10356/143222
_version_	1772827650930245632

Similar Items