A differential testing approach for evaluating abstract syntax tree mapping algorithms

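Abstract syntax tree (AST) mapping algorithms are widely used to analyze changes in source code. Despite the foundational role of AST mapping algorithms, little effort has been made to evaluate the accuracy of AST mapping algorithms, i.e., the extent to which an algorithm captures the evolution of code. We observe that a program element often has only one best-mapped program element. Based on this observation, we propose a hierarchical approach to automatically compare the similarity of mapped statements and tokens by different algorithms. By performing the comparison, we determine if each of the compared algorithms generates inaccurate mappings for a statement or its tokens. We invite 12 external experts to determine if three commonly used AST mapping algorithms generate accurate mappings for a statement and its tokens for 200 statements. Based on the experts’ feedback, we observe that our approach achieves a precision of 0.98-1.00 and a recall of 0.65-0.75. Furthermore, we conduct a large-scale study with a dataset of ten Java projects containing a total of 263,165 file revisions. Our approach determines that GumTree, MTDiff and IJM generate inaccurate mappings for 20%-29%, 25%-36% and 21%-30% of the file revisions, respectively. Our experimental results show that state-of-the-art AST mapping algorithms still need improvements.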
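
The abstract for this record rests on the observation that a program element often has only one best-mapped program element, so disagreement between algorithms can expose an inaccurate mapping. The sketch below illustrates that idea at the statement level only. It is not the authors' implementation: the token-sequence similarity measure, the function names `similarity` and `flag_inaccurate`, and the algorithm labels `algo_a`, `algo_b`, `algo_c` are all assumptions made for illustration, and the record does not describe the paper's actual similarity measure or hierarchy.

```python
from difflib import SequenceMatcher
from typing import Dict, List, Optional


def similarity(a: str, b: str) -> float:
    """Similarity in [0, 1] between two statements, compared as
    whitespace-separated token sequences (an assumed measure)."""
    return SequenceMatcher(None, a.split(), b.split()).ratio()


def flag_inaccurate(
    source_stmt: str,
    mapped_by_algorithm: Dict[str, Optional[str]],
    tolerance: float = 1e-9,
) -> List[str]:
    """Return the algorithms whose mapped statement is less similar to
    the source statement than the best mapping any algorithm produced.

    A value of None means the algorithm left the statement unmapped,
    which is scored as similarity 0.0 here (an assumption).
    """
    scores = {
        name: similarity(source_stmt, target) if target is not None else 0.0
        for name, target in mapped_by_algorithm.items()
    }
    best = max(scores.values())
    return sorted(name for name, score in scores.items() if score < best - tolerance)


# Hypothetical example: three algorithms map the same source statement.
source = "int total = count + 1 ;"
candidates = {
    "algo_a": "int total = count + 2 ;",
    "algo_b": "String name = null ;",
    "algo_c": "int total = count + 2 ;",
}
print(flag_inaccurate(source, candidates))
```

Because the check is purely comparative, it can only flag an algorithm when at least one other algorithm finds a better partner. That may be one reason an approach of this kind can have high precision but lower recall, as the figures in the abstract suggest.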

Bibliographic Details
Main Authors: FAN, Yuanrui, XIA, Xin, LO, David, HASSAN, Ahmed E., WANG, Yuan, LI, Shanping
Format: text
Language: English
Published: Institutional Knowledge at Singapore Management University 2021
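Subjects: abstract syntax tree; program element mapping; software evolution; Artificial Intelligence and Robotics; Databases and Information Systems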
Online Access: https://ink.library.smu.edu.sg/sis_research/6879
https://ink.library.smu.edu.sg/context/sis_research/article/7882/viewcontent/A_Differential_Testing_Approach_for_Evaluating_Abstract_Syntax_Tree_Mapping_Algorithms.pdf
Institution: Singapore Management University
id sg-smu-ink.sis_research-7882
record_format dspace
spelling sg-smu-ink.sis_research-78822022-02-07T11:05:15Z A differential testing approach for evaluating abstract syntax tree mapping algorithms FAN, Yuanrui XIA, Xin LO, David HASSAN, Ahmed E. WANG, Yuan LI, Shanping Abstract syntax tree (AST) mapping algorithms are widely used to analyze changes in source code. Despite the foundational role of AST mapping algorithms, little effort has been made to evaluate the accuracy of AST mapping algorithms, i.e., the extent to which an algorithm captures the evolution of code. We observe that a program element often has only one best-mapped program element. Based on this observation, we propose a hierarchical approach to automatically compare the similarity of mapped statements and tokens by different algorithms. By performing the comparison, we determine if each of the compared algorithms generates inaccurate mappings for a statement or its tokens. We invite 12 external experts to determine if three commonly used AST mapping algorithms generate accurate mappings for a statement and its tokens for 200 statements. Based on the experts’ feedback, we observe that our approach achieves a precision of 0.98-1.00 and a recall of 0.65-0.75. Furthermore, we conduct a large-scale study with a dataset of ten Java projects containing a total of 263,165 file revisions. Our approach determines that GumTree, MTDiff and IJM generate inaccurate mappings for 20%-29%, 25%-36% and 21%-30% of the file revisions, respectively. Our experimental results show that state-of-the-art AST mapping algorithms still need improvements.
2021-05-01T07:00:00Z text application/pdf https://ink.library.smu.edu.sg/sis_research/6879 info:doi/10.1109/ICSE43902.2021.00108 https://ink.library.smu.edu.sg/context/sis_research/article/7882/viewcontent/A_Differential_Testing_Approach_for_Evaluating_Abstract_Syntax_Tree_Mapping_Algorithms.pdf http://creativecommons.org/licenses/by-nc-nd/4.0/ Research Collection School Of Computing and Information Systems eng Institutional Knowledge at Singapore Management University abstract syntax tree program element mapping software evolution Artificial Intelligence and Robotics Databases and Information Systems
institution Singapore Management University
building SMU Libraries
continent Asia
country Singapore
content_provider SMU Libraries
collection InK@SMU
language English
topic abstract syntax tree
program element mapping
software evolution
Artificial Intelligence and Robotics
Databases and Information Systems
description Abstract syntax tree (AST) mapping algorithms are widely used to analyze changes in source code. Despite the foundational role of AST mapping algorithms, little effort has been made to evaluate the accuracy of AST mapping algorithms, i.e., the extent to which an algorithm captures the evolution of code. We observe that a program element often has only one best-mapped program element. Based on this observation, we propose a hierarchical approach to automatically compare the similarity of mapped statements and tokens by different algorithms. By performing the comparison, we determine if each of the compared algorithms generates inaccurate mappings for a statement or its tokens. We invite 12 external experts to determine if three commonly used AST mapping algorithms generate accurate mappings for a statement and its tokens for 200 statements. Based on the experts’ feedback, we observe that our approach achieves a precision of 0.98-1.00 and a recall of 0.65-0.75. Furthermore, we conduct a large-scale study with a dataset of ten Java projects containing a total of 263,165 file revisions. Our approach determines that GumTree, MTDiff and IJM generate inaccurate mappings for 20%-29%, 25%-36% and 21%-30% of the file revisions, respectively. Our experimental results show that state-of-the-art AST mapping algorithms still need improvements.
format text
author FAN, Yuanrui
XIA, Xin
LO, David
HASSAN, Ahmed E.
WANG, Yuan
LI, Shanping
title A differential testing approach for evaluating abstract syntax tree mapping algorithms
publisher Institutional Knowledge at Singapore Management University
publishDate 2021
url https://ink.library.smu.edu.sg/sis_research/6879
https://ink.library.smu.edu.sg/context/sis_research/article/7882/viewcontent/A_Differential_Testing_Approach_for_Evaluating_Abstract_Syntax_Tree_Mapping_Algorithms.pdf
_version_ 1770576112099786752