18 million links in commit messages: purpose, evolution, and decay

Commit messages contain diverse and valuable types of knowledge in all aspects of software maintenance and evolution. Links are an example of such knowledge. Previous work on “9.6 million links in source code comments” showed that links are prone to decay, become outdated, and lack bidirectional tra...

وصف كامل

محفوظ في:
التفاصيل البيبلوغرافية
المؤلفون الرئيسيون: XIAO, Tao, BALTES, Sebastian, HATA, Hideaki, TREUDE, Christoph, KULA, Raula, ISHIO, Takashi, MATSUMOTO, Kenichi
التنسيق: text
اللغة:English
منشور في: Institutional Knowledge at Singapore Management University 2023
الموضوعات:
الوصول للمادة أونلاين:https://ink.library.smu.edu.sg/sis_research/8781
https://ink.library.smu.edu.sg/context/sis_research/article/9784/viewcontent/18million.pdf
الوسوم: إضافة وسم
لا توجد وسوم, كن أول من يضع وسما على هذه التسجيلة!
المؤسسة: Singapore Management University
اللغة: English
الوصف
الملخص:Commit messages contain diverse and valuable types of knowledge in all aspects of software maintenance and evolution. Links are an example of such knowledge. Previous work on “9.6 million links in source code comments” showed that links are prone to decay, become outdated, and lack bidirectional traceability. We conducted a large-scale study of 18,201,165 links from commits in 23,110 GitHub repositories to investigate whether they suffer the same fate. Results show that referencing external resources is prevalent and that the most frequent domains other than github.com are the external domains of Stack Overflow and Google Code. Similarly, links serve as source code context to commit messages, with inaccessible links being frequent. Although repeatedly referencing links is rare (4%), 14% of links that are prone to evolve become unavailable over time; e.g., tutorials or articles and software homepages become unavailable over time. Furthermore, we find that 70% of the distinct links suffer from decay; the domains that occur the most frequently are related to Subversion repositories. We summarize that links in commits share the same fate as links in code, opening up avenues for future work.