DistXplore: Distribution-guided testing for evaluating and enhancing deep learning systems

Deep learning (DL) models are trained on sampled data, where the distribution of training data differs from that of real-world data (i.e., the distribution shift), which reduces the model's robustness. Various testing techniques have been proposed, including distribution-unaware and distributio...

Bibliographic Details
Main Authors:	WANG, Longtian, XIE, Xiaofei, DU, Xiaoning, TIAN, Meng, GUO, Qing, YANG, Zheng, SHEN, Chao
Format:	text
Language:	English
Published:	Institutional Knowledge at Singapore Management University 2023
Subjects:	Deep learning Neural networks Software defect analysis Software testing and debugging Artificial Intelligence and Robotics
Online Access:	https://ink.library.smu.edu.sg/sis_research/8516 https://ink.library.smu.edu.sg/context/sis_research/article/9519/viewcontent/3611643.3616266.pdf
Institution:	Singapore Management University

id	sg-smu-ink.sis_research-9519
record_format	dspace
spelling	sg-smu-ink.sis_research-9519 2024-01-22T15:08:28Z 2023-12-01T08:00:00Z text application/pdf https://ink.library.smu.edu.sg/sis_research/8516 info:doi/10.1145/3611643.3616266 https://ink.library.smu.edu.sg/context/sis_research/article/9519/viewcontent/3611643.3616266.pdf http://creativecommons.org/licenses/by-nc-nd/4.0/ Research Collection School Of Computing and Information Systems eng
institution	Singapore Management University
building	SMU Libraries
continent	Asia
country	Singapore
content_provider	SMU Libraries
collection	InK@SMU
language	English
topic	Deep learning Neural networks Software defect analysis Software testing and debugging Artificial Intelligence and Robotics
description	Deep learning (DL) models are trained on sampled data, where the distribution of training data differs from that of real-world data (i.e., the distribution shift), which reduces the model's robustness. Various testing techniques have been proposed, including distribution-unaware and distribution-aware methods. However, distribution-unaware testing lacks effectiveness because it does not explicitly consider the distribution of test cases and may generate redundant errors (within the same distribution). Distribution-aware testing techniques primarily focus on generating test cases that follow the training distribution, missing out-of-distribution data that may also be valid and should be considered in the testing process. In this paper, we propose a novel distribution-guided approach for generating valid test cases with diverse distributions, which can better evaluate the model's robustness (i.e., generating hard-to-detect errors) and enhance the model's robustness (i.e., enriching training data). Unlike existing testing techniques that optimize individual test cases, DistXplore optimizes test suites that represent specific distributions. To evaluate and enhance the model's robustness, we design two metrics: distribution difference, which maximizes the similarity in distribution between two different classes of data to generate hard-to-detect errors, and distribution diversity, which increases the distribution diversity of generated test cases for enhancing the model's robustness. To evaluate the effectiveness of DistXplore in model evaluation and enhancement, we compare DistXplore with 14 state-of-the-art baselines on 10 models across 4 datasets. The evaluation results show that DistXplore not only detects a larger number of errors (e.g., 2×+ on average) but also achieves a higher improvement in empirical robustness (e.g., 5.2% more accuracy improvement than the baselines on average).
format	text
author	WANG, Longtian XIE, Xiaofei DU, Xiaoning TIAN, Meng GUO, Qing YANG, Zheng SHEN, Chao
author_facet	WANG, Longtian XIE, Xiaofei DU, Xiaoning TIAN, Meng GUO, Qing YANG, Zheng SHEN, Chao
author_sort	WANG, Longtian
title	DistXplore: Distribution-guided testing for evaluating and enhancing deep learning systems
title_sort	distxplore: distribution-guided testing for evaluating and enhancing deep learning systems
publisher	Institutional Knowledge at Singapore Management University
publishDate	2023
url	https://ink.library.smu.edu.sg/sis_research/8516 https://ink.library.smu.edu.sg/context/sis_research/article/9519/viewcontent/3611643.3616266.pdf
_version_	1789483257338789888
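
The description above says that whole test suites, rather than individual test cases, are scored by how close their distribution is to that of another class. The record does not state how that distribution difference is measured, so the following is only an illustrative sketch under an assumption: it uses a biased squared maximum mean discrepancy (MMD) with an RBF kernel as a stand-in measure. The names `rbf`, `mean_kernel`, `mmd2`, and `sample`, the `gamma` value, and the synthetic 2-D data are all hypothetical and are not taken from the paper.

```python
import math
import random


def rbf(u, v, gamma):
    """RBF kernel value between two feature vectors."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(u, v)))


def mean_kernel(xs, ys, gamma):
    """Average kernel value over all pairs drawn from xs and ys."""
    total = sum(rbf(x, y, gamma) for x in xs for y in ys)
    return total / (len(xs) * len(ys))


def mmd2(xs, ys, gamma=0.5):
    """Biased estimate of squared MMD; assumed stand-in for 'distribution difference'."""
    return (
        mean_kernel(xs, xs, gamma)
        + mean_kernel(ys, ys, gamma)
        - 2.0 * mean_kernel(xs, ys, gamma)
    )


def sample(rng, center, n):
    """Synthetic 2-D 'features' around a class center (illustration only)."""
    return [[rng.gauss(c, 1.0) for c in center] for _ in range(n)]


rng = random.Random(0)
target = sample(rng, (3.0, 3.0), 100)      # features of the target class
far_suite = sample(rng, (0.0, 0.0), 100)   # suite still resembling its source class
near_suite = sample(rng, (3.0, 3.0), 100)  # suite whose distribution resembles the target

far_score = mmd2(far_suite, target)
near_score = mmd2(near_suite, target)
print(f"far suite:  {far_score:.4f}")
print(f"near suite: {near_score:.4f}")
```

A search procedure guided by such a score would prefer the suite with the smaller value, since its distribution sits closer to the target class. How DistXplore actually generates and selects suites is not specified in this record.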
