Testing automated driving systems by breaking many laws efficiently

An automated driving system (ADS), as the brain of an autonomous vehicle (AV), should be tested thoroughly ahead of deployment. ADS must satisfy a complex set of rules to ensure road safety, e.g., the existing traffic laws and possibly future laws that are dedicated to AVs. To comprehensively test a...

Full description

Saved in:
Bibliographic Details
Main Authors: ZHANG, Xiaodong, ZHAO, Wei, SUN, Yang, SUN, Jun, SHEN, Yulong, DONG, Xuewen, YANG, Zijiang
Format: text
Language:English
Published: Institutional Knowledge at Singapore Management University 2023
Subjects:
Online Access:https://ink.library.smu.edu.sg/sis_research/8078
https://ink.library.smu.edu.sg/context/sis_research/article/9081/viewcontent/3597926.3598108.pdf
Tags: Add Tag
No Tags, Be the first to tag this record!
Institution: Singapore Management University
Language: English
Description
Summary:An automated driving system (ADS), as the brain of an autonomous vehicle (AV), should be tested thoroughly ahead of deployment. ADS must satisfy a complex set of rules to ensure road safety, e.g., the existing traffic laws and possibly future laws that are dedicated to AVs. To comprehensively test an ADS, we would like to systematically discover diverse scenarios in which certain traffic law is violated. The challenge is that (1) there are many traffic laws (e.g., 13 testable articles in Chinese traffic laws and 16 testable articles in Singapore traffic laws, with 81 and 43 violation situations respectively); and (2) many of traffic laws are only relevant in complicated specific scenarios. Existing approaches to testing ADS either focus on simple oracles such as no-collision or have limited capacity in generating diverse law-violating scenarios. In this work, we propose ABLE, a new ADS testing method inspired by the success of GFlowNet, which Aims to Break many Laws Efficiently by generating diverse scenarios. Different from vanilla GFlowNet, ABLE drives the testing process with dynamically updated testing objectives (based on a robustness semantics of signal temporal logic) as well as active learning, so as to effectively explore the vast search space. We evaluate ABLE based on Apollo and LGSVL, and the results show that ABLE outperforms the state-of-the-art by violating 17% and 25% more laws when testing Apollo 6.0 and Apollo 7.0, most of which are hard-to-violate laws, respectively.