What is the vocabulary of flaky tests?

Flaky tests are tests whose outcomes are non-deterministic. Despite the recent research activity on this topic, no effort has been made on understanding the vocabulary of flaky tests. This work proposes to automatically classify tests as flaky or not based on their vocabulary. Static classification...

Bibliographic Details
Main Authors:	PINTO, Gustavo; MIRANDA, Breno; DISSANAYAKE, Supun; D'AMORIM, Marcelo; TREUDE, Christoph; BERTOLINO, Antonia
Format:	text
Language:	English
Published:	Institutional Knowledge at Singapore Management University 2020
Subjects:	Regression testing; Test flakiness; Text classification; Software Engineering
Online Access:	https://ink.library.smu.edu.sg/sis_research/8809 https://ink.library.smu.edu.sg/context/sis_research/article/9812/viewcontent/msr20a.pdf
Institution:	Singapore Management University

id	sg-smu-ink.sis_research-9812
record_format	dspace
spelling	sg-smu-ink.sis_research-9812 2024-05-30T07:39:20Z What is the vocabulary of flaky tests? PINTO, Gustavo MIRANDA, Breno DISSANAYAKE, Supun D'AMORIM, Marcelo TREUDE, Christoph BERTOLINO, Antonia Flaky tests are tests whose outcomes are non-deterministic. Despite the recent research activity on this topic, no effort has been made on understanding the vocabulary of flaky tests. This work proposes to automatically classify tests as flaky or not based on their vocabulary. Static classification of flaky tests is important, for example, to detect the introduction of flaky tests and to search for flaky tests after they are introduced in regression test suites. We evaluated performance of various machine learning algorithms to solve this problem. We constructed a data set of flaky and non-flaky tests by running every test case, in a set of 64k tests, 100 times (6.4 million test executions). We then used machine learning techniques on the resulting data set to predict which tests are flaky from their source code. Based on features, such as counting stemmed tokens extracted from source code identifiers, we achieved an F-measure of 0.95 for the identification of flaky tests. The best prediction performance was obtained when using Random Forest and Support Vector Machines. In terms of the code identifiers that are most strongly associated with test flakiness, we noted that job, action, and services are commonly associated with flaky tests. Overall, our results provide initial yet strong evidence that static detection of flaky tests is effective. 
2020-10-01T07:00:00Z text application/pdf https://ink.library.smu.edu.sg/sis_research/8809 info:doi/10.1145/3379597.3387482 https://ink.library.smu.edu.sg/context/sis_research/article/9812/viewcontent/msr20a.pdf http://creativecommons.org/licenses/by-nc-nd/4.0/ Research Collection School Of Computing and Information Systems eng Institutional Knowledge at Singapore Management University Regression testing Test flakiness Text classification Software Engineering
institution	Singapore Management University
building	SMU Libraries
continent	Asia
country	Singapore
content_provider	SMU Libraries
collection	InK@SMU
language	English
topic	Regression testing; Test flakiness; Text classification; Software Engineering
spellingShingle	Regression testing Test flakiness Text classification Software Engineering PINTO, Gustavo MIRANDA, Breno DISSANAYAKE, Supun D'AMORIM, Marcelo TREUDE, Christoph BERTOLINO, Antonia What is the vocabulary of flaky tests?
description	Flaky tests are tests whose outcomes are non-deterministic. Despite the recent research activity on this topic, no effort has been made on understanding the vocabulary of flaky tests. This work proposes to automatically classify tests as flaky or not based on their vocabulary. Static classification of flaky tests is important, for example, to detect the introduction of flaky tests and to search for flaky tests after they are introduced in regression test suites. We evaluated performance of various machine learning algorithms to solve this problem. We constructed a data set of flaky and non-flaky tests by running every test case, in a set of 64k tests, 100 times (6.4 million test executions). We then used machine learning techniques on the resulting data set to predict which tests are flaky from their source code. Based on features, such as counting stemmed tokens extracted from source code identifiers, we achieved an F-measure of 0.95 for the identification of flaky tests. The best prediction performance was obtained when using Random Forest and Support Vector Machines. In terms of the code identifiers that are most strongly associated with test flakiness, we noted that job, action, and services are commonly associated with flaky tests. Overall, our results provide initial yet strong evidence that static detection of flaky tests is effective.
format	text
author	PINTO, Gustavo; MIRANDA, Breno; DISSANAYAKE, Supun; D'AMORIM, Marcelo; TREUDE, Christoph; BERTOLINO, Antonia
author_facet	PINTO, Gustavo; MIRANDA, Breno; DISSANAYAKE, Supun; D'AMORIM, Marcelo; TREUDE, Christoph; BERTOLINO, Antonia
author_sort	PINTO, Gustavo
title	What is the vocabulary of flaky tests?
title_short	What is the vocabulary of flaky tests?
title_full	What is the vocabulary of flaky tests?
title_fullStr	What is the vocabulary of flaky tests?
title_full_unstemmed	What is the vocabulary of flaky tests?
title_sort	what is the vocabulary of flaky tests?
publisher	Institutional Knowledge at Singapore Management University
publishDate	2020
url	https://ink.library.smu.edu.sg/sis_research/8809 https://ink.library.smu.edu.sg/context/sis_research/article/9812/viewcontent/msr20a.pdf
_version_	1814047536292298752
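
Illustrative sketch (not part of the record): the description mentions labelling tests by rerunning them, counting stemmed tokens taken from source code identifiers, and reporting an F-measure. The Python below is one rough way those three steps might look, not the authors' pipeline. Every function name is hypothetical, the suffix-stripping "stemmer" is a crude stand-in for a real one, the rule "flaky means the reruns disagree" is an assumption, and the classifier itself (Random Forest or SVM in the description) is left out.

```python
import re
from collections import Counter


def split_identifier(identifier):
    """Split a camelCase or snake_case identifier into lowercase word tokens."""
    parts = re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", identifier)
    return [p.lower() for p in parts]


def crude_stem(token):
    """Assumed stand-in for a real stemmer: strip one common English suffix."""
    for suffix in ("ing", "es", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def token_counts(test_source):
    """Count stemmed tokens over every identifier-like word in a test body."""
    counts = Counter()
    for identifier in re.findall(r"[A-Za-z_][A-Za-z0-9_]*", test_source):
        for token in split_identifier(identifier):
            counts[crude_stem(token)] += 1
    return counts


def is_flaky(outcomes):
    """Assumed labelling rule: a test is flaky if its reruns do not all agree."""
    return len(set(outcomes)) > 1


def f_measure(true_pos, false_pos, false_neg):
    """Harmonic mean of precision and recall (F1); 0.0 when undefined."""
    if true_pos == 0:
        return 0.0
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    source = "void testJobAction() { jobService.run(); }"
    print(token_counts(source).most_common(3))
    print(is_flaky([True] * 99 + [False]))
    print(f_measure(95, 5, 5))
```

The per-test counts from token_counts would be the feature vectors handed to whichever classifier is used; the vocabulary terms named in the description (job, action, services) would show up as individual features.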

Similar Items