A large scale study of multiple programming languages and code quality

Nowadays, most software uses multiple programming languages to implement certain functionalities based on the strengths and weaknesses of different languages. Previous research has studied the impact of individual programming languages on software quality; however, there has been little or no...

Bibliographic Details
Main Authors: KOCHHAR, Pavneet Singh, WIJEDASA, Withthige Dinusha Ruchira, LO, David
Format: text
Language: English
Published: Institutional Knowledge at Singapore Management University 2016
Subjects: Computer bugs; Java; Programming; Software quality; Google; Software Engineering
Online Access: https://ink.library.smu.edu.sg/sis_research/3755
https://ink.library.smu.edu.sg/context/sis_research/article/4757/viewcontent/1855a563.pdf
Institution: Singapore Management University
id sg-smu-ink.sis_research-4757
record_format dspace
spelling sg-smu-ink.sis_research-4757 2018-06-01T05:20:57Z
2016-03-01T08:00:00Z text application/pdf https://ink.library.smu.edu.sg/sis_research/3755 info:doi/10.1109/SANER.2016.112 https://ink.library.smu.edu.sg/context/sis_research/article/4757/viewcontent/1855a563.pdf http://creativecommons.org/licenses/by-nc-nd/4.0/ Research Collection School Of Computing and Information Systems eng Institutional Knowledge at Singapore Management University Computer bugs Java Programming Software quality Google Software Engineering
institution Singapore Management University
building SMU Libraries
continent Asia
country Singapore
content_provider SMU Libraries
collection InK@SMU
language English
topic Computer bugs
Java
Programming
Software quality
Google
Software Engineering
description Nowadays, most software uses multiple programming languages to implement certain functionalities based on the strengths and weaknesses of different languages. Previous research has studied the impact of individual programming languages on software quality; however, there has been little or no research on the impact of multiple languages on the quality of software. Does the use of multiple languages cause more bugs? Do certain languages, when used with other languages, make software more bug prone? What are the relationships between multi-language usage and various bug categories? In this study, we perform a large-scale empirical investigation to shed light on the answers to these questions. We gather a large dataset consisting of popular projects from GitHub (628 projects, 85 million SLOC, 134 thousand authors, 3 million commits, in 17 languages) to understand the impact of using multiple languages on software quality. We build multiple regression models to study the effects of using different languages on the number of bug-fixing commits while controlling for factors such as project size, team size, project age, and the number of commits. Our results show that, in general, implementing a project with more languages has a significant effect on project quality, as it increases defect proneness. Moreover, we find specific languages that are statistically significantly more defect prone when they are used in a multi-language setting. These include popular languages like C++, Objective-C, and Java. Furthermore, we note that the use of more languages significantly increases bug proneness across all bug categories. The effect is strongest for memory, concurrency, and algorithm bugs.
format text
author KOCHHAR, Pavneet Singh
WIJEDASA, Withthige Dinusha Ruchira
LO, David
title A large scale study of multiple programming languages and code quality
publisher Institutional Knowledge at Singapore Management University
publishDate 2016
url https://ink.library.smu.edu.sg/sis_research/3755
https://ink.library.smu.edu.sg/context/sis_research/article/4757/viewcontent/1855a563.pdf
_version_ 1770573712437805056