Multi-Granularity Detector for Vulnerability Fixes

With the increasing reliance on Open Source Software, users are exposed to third-party library vulnerabilities. Software Composition Analysis (SCA) tools have been created to alert users of such vulnerabilities. SCA requires the identification of vulnerability-fixing commits. Prior works have propos...

Bibliographic Details
Main Authors:	NGUYEN, Truong Giang; CONG, Thanh Le; KANG, Hong Jin; WIDYASARI, Ratnadira; YANG, Chengran; ZHAO, Zhipeng; XU, Bowen; ZHOU, Jiayuan; XIA, Xin; HASSAN, Ahmed E.; LO, David
Format:	text
Language:	English
Published:	Institutional Knowledge at Singapore Management University 2023
Subjects:	Vulnerability-fixing commit classification; Machine learning; Deep learning; Software security; Artificial Intelligence and Robotics; Information Security
Online Access:	https://ink.library.smu.edu.sg/sis_research/8508 https://ink.library.smu.edu.sg/context/sis_research/article/9511/viewcontent/2305.13884.pdf
Institution:	Singapore Management University

id	sg-smu-ink.sis_research-9511
record_format	dspace
spelling	sg-smu-ink.sis_research-9511 2024-01-22T15:11:22Z Multi-Granularity Detector for Vulnerability Fixes 2023-08-01T07:00:00Z text application/pdf https://ink.library.smu.edu.sg/sis_research/8508 info:doi/10.1109/TSE.2023.3281275 https://ink.library.smu.edu.sg/context/sis_research/article/9511/viewcontent/2305.13884.pdf http://creativecommons.org/licenses/by-nc-nd/4.0/ Research Collection School Of Computing and Information Systems eng Institutional Knowledge at Singapore Management University
institution	Singapore Management University
building	SMU Libraries
continent	Asia
country	Singapore
content_provider	SMU Libraries
collection	InK@SMU
language	English
topic	Vulnerability-fixing commit classification; Machine learning; Deep learning; Software security; Artificial Intelligence and Robotics; Information Security
description	With the increasing reliance on Open Source Software, users are exposed to third-party library vulnerabilities. Software Composition Analysis (SCA) tools have been created to alert users to such vulnerabilities. SCA requires the identification of vulnerability-fixing commits. Prior works have proposed methods that automatically identify such commits. However, identifying them is highly challenging, as only a very small minority of commits fix vulnerabilities. Moreover, code changes can be noisy and difficult to analyze. We observe that noise can occur at different levels of detail, making it challenging to detect vulnerability fixes accurately. To address these challenges and boost the effectiveness of prior works, we propose MiDas (Multi-Granularity Detector for Vulnerability Fixes). Unlike prior works, MiDas constructs a separate neural network for each level of code change granularity (commit, file, hunk, and line), following their natural organization, and then uses an ensemble model that combines all base models to output the final prediction. This design allows MiDas to better cope with the noisy and highly imbalanced nature of vulnerability-fixing commit data. In addition, to reduce the human effort required to inspect code changes, we have designed an effort-aware adjustment of MiDas's outputs based on commit length. The evaluation results demonstrate that MiDas outperforms the current state-of-the-art baseline on Java and Python datasets in terms of AUC by 4.9% and 13.7%, respectively. Furthermore, in terms of two effort-aware metrics, EffortCost@L and Popt@L, MiDas outperforms the state-of-the-art baseline by up to 28.2% and 15.9% on Java and by up to 60% and 51.4% on Python, respectively.
format	text
author	NGUYEN, Truong Giang; CONG, Thanh Le; KANG, Hong Jin; WIDYASARI, Ratnadira; YANG, Chengran; ZHAO, Zhipeng; XU, Bowen; ZHOU, Jiayuan; XIA, Xin; HASSAN, Ahmed E.; LO, David
author_facet	NGUYEN, Truong Giang; CONG, Thanh Le; KANG, Hong Jin; WIDYASARI, Ratnadira; YANG, Chengran; ZHAO, Zhipeng; XU, Bowen; ZHOU, Jiayuan; XIA, Xin; HASSAN, Ahmed E.; LO, David
author_sort	NGUYEN, Truong Giang
title	Multi-Granularity Detector for Vulnerability Fixes
publisher	Institutional Knowledge at Singapore Management University
publishDate	2023
url	https://ink.library.smu.edu.sg/sis_research/8508 https://ink.library.smu.edu.sg/context/sis_research/article/9511/viewcontent/2305.13884.pdf
_version_	1789483255783751680
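
The description above outlines two ideas: an ensemble over per-granularity base models, and an effort-aware adjustment based on commit length. The sketch below is only an illustration of that outline, not the authors' implementation. It assumes each base model has already produced a probability for a commit; the `CommitScores` type, the equal weights, and the `log2` length penalty are all hypothetical choices that the record does not specify.

```python
import math
from dataclasses import dataclass


@dataclass
class CommitScores:
    """Hypothetical container for one commit's base-model outputs."""
    commit_id: str
    commit_level: float   # probability from a commit-level model
    file_level: float     # probability from a file-level model
    hunk_level: float     # probability from a hunk-level model
    line_level: float     # probability from a line-level model
    changed_lines: int    # commit length, used as the inspection-effort proxy


def ensemble_score(c, weights=(0.25, 0.25, 0.25, 0.25)):
    """Weighted average of the four granularity scores (assumed weights)."""
    parts = (c.commit_level, c.file_level, c.hunk_level, c.line_level)
    return sum(w * p for w, p in zip(weights, parts))


def effort_aware_score(c):
    """Down-weight long commits; the log2 penalty is an assumption."""
    return ensemble_score(c) / math.log2(2 + c.changed_lines)


def rank_for_inspection(commits):
    """Return commit ids ordered from highest to lowest adjusted score."""
    ordered = sorted(commits, key=effort_aware_score, reverse=True)
    return [c.commit_id for c in ordered]
```

With this penalty, a short commit with a slightly lower ensemble score can rank ahead of a very long commit, which is the general intent of an effort-aware ordering. A learned ensemble (for example, a logistic regression over the base outputs) could replace the fixed weights.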
