Automated vulnerability detection system based on commit messages
Main Author: | |
---|---|
Other Authors: | |
Format: | Theses and Dissertations |
Language: | English |
Published: | 2019 |
Subjects: | |
Online Access: | https://hdl.handle.net/10356/104726 http://hdl.handle.net/10220/48651 |
Institution: | Nanyang Technological University |
Summary: | Vulnerabilities in Open Source Software (OSS) are major culprits behind today's cyber-attacks and security breaches. To avoid repetitive development and speed up release cycles, software teams increasingly rely on OSS. However, many OSS users are unaware of the vulnerable components they are using. It can sometimes take weeks or even months for a Common Vulnerabilities and Exposures (CVE) entry to be determined and finally patched. Thus, to mitigate cyber-attacks, it is important to understand both known CVEs and unknown vulnerabilities. In this thesis, we first conducted a large-scale crawl of Git commits from popular open source repositories such as Linux. Second, because no prior dataset of security-relevant Git commits existed, we developed a web-based triage system that lets security researchers label the commits manually. Finally, after the commits were cleaned and labelled, we implemented a deep neural network to automatically identify vulnerability-fixing commits (VFCs) based on their commit messages. The approach achieves significantly better precision than the state of the art while improving the recall rate by 16.8%. We conclude with a thorough quantitative and qualitative analysis of the results and discuss the lessons learned and room for future work. |
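
To make the classification task in the summary concrete, the following is a minimal sketch of how predictions over commit messages could be scored with precision and recall. It is not the thesis's method: the record does not describe the neural network's architecture, so a naive keyword matcher stands in for the classifier here. The keyword pattern, the function names `is_candidate_vfc` and `precision_recall`, and the four sample messages with their labels are all hypothetical and exist only for illustration.

```python
import re

# ASSUMPTION: a hypothetical keyword list used as a naive stand-in baseline.
# The thesis uses a deep neural network whose details are not given here.
SECURITY_PATTERN = re.compile(
    r"\b(cve-\d{4}-\d+|overflow|use[- ]after[- ]free|vulnerab\w*|exploit\w*)\b",
    re.IGNORECASE,
)


def is_candidate_vfc(message: str) -> bool:
    """Flag a commit message as a possible vulnerability-fixing commit."""
    return SECURITY_PATTERN.search(message) is not None


def precision_recall(predicted, actual):
    """Return (precision, recall) for two equal-length lists of booleans."""
    tp = sum(p and a for p, a in zip(predicted, actual))
    fp = sum(p and not a for p, a in zip(predicted, actual))
    fn = sum(a and not p for p, a in zip(predicted, actual))
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# ASSUMPTION: invented sample messages with invented manual labels
# (True means a reviewer marked the commit as vulnerability-fixing).
SAMPLES = [
    ("Fix buffer overflow in packet parser", True),
    ("Update README with build instructions", False),
    ("Validate length before memcpy", True),
    ("Fix integer overflow warning in test helper", False),
]

predictions = [is_candidate_vfc(message) for message, _ in SAMPLES]
labels = [label for _, label in SAMPLES]
precision, recall = precision_recall(predictions, labels)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

On these invented samples the keyword matcher misses a security fix that never names a vulnerability class and flags a harmless commit that mentions "overflow". Gaps of this kind are one plausible motivation for learning the classifier from manually labelled commit messages instead.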