Why and how developers fork what from whom in GitHub

Forking is the creation of a new software repository by copying another repository. Though forking is controversial in the traditional open source software (OSS) community, it is encouraged and is a built-in feature in GitHub. Developers freely fork repositories, use the code as their own, and make changes....

Bibliographic Details
Main Authors: JIANG, Jing; LO, David; HE, Jiahuan; XIA, Xin; KOCHHAR, Pavneet Singh; ZHANG, Li
Format: text
Language: English
Published: Institutional Knowledge at Singapore Management University 2017
Subjects: Fork; Open source software; GitHub; Databases and Information Systems
Online Access: https://ink.library.smu.edu.sg/sis_research/3705
https://ink.library.smu.edu.sg/context/sis_research/article/4707/viewcontent/WhyHowDevelopersForkGitHub_2017.pdf
Institution: Singapore Management University
id sg-smu-ink.sis_research-4707
record_format dspace
spelling sg-smu-ink.sis_research-4707 2019-01-09T01:33:57Z
2017-02-01T08:00:00Z text application/pdf https://ink.library.smu.edu.sg/sis_research/3705 info:doi/10.1007/s10664-016-9436-6 https://ink.library.smu.edu.sg/context/sis_research/article/4707/viewcontent/WhyHowDevelopersForkGitHub_2017.pdf http://creativecommons.org/licenses/by-nc-nd/4.0/ Research Collection School Of Computing and Information Systems eng Institutional Knowledge at Singapore Management University Fork Open source software GitHub Databases and Information Systems
institution Singapore Management University
building SMU Libraries
continent Asia
country Singapore
content_provider SMU Libraries
collection InK@SMU
language English
topic Fork
Open source software
GitHub
Databases and Information Systems
description Forking is the creation of a new software repository by copying another repository. Though forking is controversial in the traditional open source software (OSS) community, it is encouraged and is a built-in feature in GitHub. Developers freely fork repositories, use the code as their own, and make changes. A deep understanding of repository forking can provide important insights for the OSS community and GitHub. In this paper, we explore why and how developers fork what from whom in GitHub. We collect a dataset containing 236,344 developers and 1,841,324 forks. We conduct surveys and analyze the programming languages and owners of forked repositories. Our main observations are: (1) Developers fork repositories to submit pull requests, fix bugs, add new features, keep copies, etc. Developers find repositories to fork from various sources: search engines, external sites (e.g., Twitter, Reddit), social relationships, etc. More than 42% of the developers we surveyed agree that an automated recommendation tool is useful for helping them pick repositories to fork, while more than 44.4% do not value a recommendation tool. Developers care about repository owners when they fork repositories. (2) A repository written in a developer’s preferred programming language is more likely to be forked. (3) Developers mostly fork repositories from creators. Compared with unattractive repository owners, attractive repository owners include a higher percentage of organizations, have more followers, and registered earlier in GitHub. Our results show that forking is mainly used for making contributions to original repositories, and that it is beneficial for the OSS community. Moreover, our results show the value of recommendation and provide important insights for GitHub to recommend repositories.
format text
author JIANG, Jing
LO, David
HE, Jiahuan
XIA, Xin
KOCHHAR, Pavneet Singh
ZHANG, Li
title Why and how developers fork what from whom in GitHub
publisher Institutional Knowledge at Singapore Management University
publishDate 2017
url https://ink.library.smu.edu.sg/sis_research/3705
https://ink.library.smu.edu.sg/context/sis_research/article/4707/viewcontent/WhyHowDevelopersForkGitHub_2017.pdf
_version_ 1770573676851232768