What makes a popular academic AI repository?

Many AI researchers are publishing code, data and other resources that accompany their papers in GitHub repositories. In this paper, we refer to these repositories as academic AI repositories. Our preliminary study shows that highly cited papers are more likely to have popular academic AI repositori...

Bibliographic Details
Main Authors: FAN, Yuanrui, XIA, Xin, LO, David, HASSAN, Ahmed E., LI, Shanping
Format: text
Language: English
Published: Institutional Knowledge at Singapore Management University 2021
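Subjects: Academic AI repository, Software popularity, Mining software repositories, Artificial Intelligence and Robotics, Software Engineering
Online Access: https://ink.library.smu.edu.sg/sis_research/6713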
https://ink.library.smu.edu.sg/context/sis_research/article/7716/viewcontent/2010.02472.pdf
Institution: Singapore Management University
id sg-smu-ink.sis_research-7716
record_format dspace
spelling sg-smu-ink.sis_research-7716
2022-01-27T11:16:03Z
2021-01-01T08:00:00Z
text
application/pdf
https://ink.library.smu.edu.sg/sis_research/6713
info:doi/10.1007/s10664-020-09916-6
https://ink.library.smu.edu.sg/context/sis_research/article/7716/viewcontent/2010.02472.pdf
http://creativecommons.org/licenses/by-nc-nd/4.0/
Research Collection School Of Computing and Information Systems
eng
institution Singapore Management University
building SMU Libraries
continent Asia
country Singapore
content_provider SMU Libraries
collection InK@SMU
language English
topic Academic AI repository
Software popularity
Mining software repositories
Artificial Intelligence and Robotics
Software Engineering
description Many AI researchers publish code, data and other resources that accompany their papers in GitHub repositories. In this paper, we refer to these repositories as academic AI repositories. Our preliminary study shows that highly cited papers are more likely to have popular academic AI repositories (and vice versa). Hence, we conduct an empirical study of academic AI repositories to highlight the good software engineering practices of popular academic AI repositories for AI researchers. We collect 1,149 academic AI repositories; we label the 20% of repositories with the most stars as popular and the bottom 70% of repositories as unpopular. The remaining 10% of repositories serve as a gap between popular and unpopular academic AI repositories. We propose 21 features to characterize the software engineering practices of academic AI repositories. Our experimental results show that popular and unpopular academic AI repositories differ statistically significantly in 11 of the studied features, indicating that the two groups of repositories have significantly different software engineering practices. Furthermore, we find that the number of links to other GitHub repositories in the README file, the number of images in the README file and the inclusion of a license are the most important features for differentiating the two groups of academic AI repositories. Our dataset and code are made publicly available to share with the community.
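The labeling scheme in the description (top 20% by stars as popular, bottom 70% as unpopular, the 10% in between as a gap) can be illustrated with a minimal Python sketch. This is not the authors' code: the function name `label_by_stars`, the rounding of group sizes, and the handling of ties (input order) are all assumptions made for illustration.

```python
from typing import Dict


def label_by_stars(
    stars: Dict[str, int],
    top_frac: float = 0.20,
    bottom_frac: float = 0.70,
) -> Dict[str, str]:
    """Label repositories as popular / gap / unpopular by star rank.

    Hypothetical helper: group sizes are rounded to the nearest
    integer, and ties keep their input order (sorted() is stable).
    """
    # Rank repository names from most to fewest stars.
    ranked = sorted(stars, key=lambda name: stars[name], reverse=True)
    n = len(ranked)
    n_top = round(n * top_frac)
    n_bottom = round(n * bottom_frac)

    labels: Dict[str, str] = {}
    for rank, name in enumerate(ranked):
        if rank < n_top:
            labels[name] = "popular"
        elif rank >= n - n_bottom:
            labels[name] = "unpopular"
        else:
            # Repositories between the two groups form the gap.
            labels[name] = "gap"
    return labels


# Toy data only; these star counts are invented for the example.
demo = {f"repo{i}": s for i, s in enumerate(range(100, 0, -10))}
demo_labels = label_by_stars(demo)
```

Leaving a gap between the two groups keeps repositories with near-identical star counts from landing on opposite sides of a single cutoff.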
format text
author FAN, Yuanrui
XIA, Xin
LO, David
HASSAN, Ahmed E.
LI, Shanping
title What makes a popular academic AI repository?
publisher Institutional Knowledge at Singapore Management University
publishDate 2021
url https://ink.library.smu.edu.sg/sis_research/6713
https://ink.library.smu.edu.sg/context/sis_research/article/7716/viewcontent/2010.02472.pdf