BehAVExplor: Behavior diversity guided testing for autonomous driving systems

Testing Autonomous Driving Systems (ADSs) is a critical task for ensuring the reliability and safety of autonomous vehicles. Existing methods mainly focus on searching for safety violations while the diversity of the generated test cases is ignored, which may generate many redundant test cases and f...

Full description

Saved in:
Bibliographic Details
Main Authors: CHENG, Mingfei, ZHOU, Yuan, XIE, Xiaofei
Format: text
Language:English
Published: Institutional Knowledge at Singapore Management University 2023
Subjects:
Online Access:https://ink.library.smu.edu.sg/sis_research/8246
Tags: Add Tag
No Tags, Be the first to tag this record!
Institution: Singapore Management University
Language: English
Description
Summary:Testing Autonomous Driving Systems (ADSs) is a critical task for ensuring the reliability and safety of autonomous vehicles. Existing methods mainly focus on searching for safety violations while the diversity of the generated test cases is ignored, which may generate many redundant test cases and failures. Such redundant failures can reduce testing performance and increase failure analysis costs. In this paper, we present a novel behavior-guided fuzzing technique (BehAVExplor) to explore the different behaviors of the ego vehi- cle (i.e., the vehicle controlled by the ADS under test) and detect diverse violations. Specifically, we design an efficient unsupervised model, called BehaviorMiner, to characterize the behavior of the ego vehicle. BehaviorMiner extracts the temporal features from the given scenarios and performs a clustering-based abstraction to group behaviors with similar features into abstract states. A new test case will be added to the seed corpus if it triggers new behav- iors (e.g., cover new abstract states). Due to the potential conflict between the behavior diversity and the general violation feedback, we further propose an energy mechanism to guide the seed selec- tion and the mutation. The energy of a seed quantifies how good it is. We evaluated BehAVExplor on Apollo, an industrial-level ADS, and LGSVL simulation environment. Empirical evaluation results show that BehAVExplor can effectively find more diverse violations than the state-of-the-art.