CoLeFunDa: Explainable silent vulnerability fix identification

It is common practice for OSS users to leverage and monitor security advisories to discover newly disclosed OSS vulnerabilities and their corresponding patches for vulnerability remediation. It is common for vulnerability fixes to be publicly available one week earlier than their disclosure. This ga...

Bibliographic Details
Main Authors: ZHOU, Jiayuan, PACHECO, Michael, CHEN, Jinfu, HU, Xing, XIA, Xin, LO, David, HASSAN, Ahmed E.
Format: text
Language: English
Published: Institutional Knowledge at Singapore Management University 2023
Subjects:
Online Access: https://ink.library.smu.edu.sg/sis_research/8513
https://ink.library.smu.edu.sg/context/sis_research/article/9516/viewcontent/CoLeFunDa_Explainable_Silent_Vulnerability_Fix_Identification.pdf
Institution: Singapore Management University
id sg-smu-ink.sis_research-9516
record_format dspace
spelling sg-smu-ink.sis_research-9516 2024-01-22T15:09:40Z CoLeFunDa: Explainable silent vulnerability fix identification ZHOU, Jiayuan PACHECO, Michael CHEN, Jinfu HU, Xing XIA, Xin LO, David HASSAN, Ahmed E. 2023-05-01T07:00:00Z text application/pdf https://ink.library.smu.edu.sg/sis_research/8513 info:doi/10.1109/ICSE48619.2023.00214 https://ink.library.smu.edu.sg/context/sis_research/article/9516/viewcontent/CoLeFunDa_Explainable_Silent_Vulnerability_Fix_Identification.pdf http://creativecommons.org/licenses/by-nc-nd/4.0/ Research Collection School Of Computing and Information Systems eng Institutional Knowledge at Singapore Management University
institution Singapore Management University
building SMU Libraries
continent Asia
country Singapore
content_provider SMU Libraries
collection InK@SMU
language English
topic Contrastive learning
Data augmentation
Disclosure policies
Down-stream
OSS vulnerability
Potential impacts
Security advisories
User need
Vulnerability disclosure
Vulnerability remediations
Databases and Information Systems
Information Security
description It is common practice for OSS users to monitor security advisories to discover newly disclosed OSS vulnerabilities and their corresponding patches for vulnerability remediation. Vulnerability fixes are commonly publicly available one week earlier than their disclosure. This gap in time gives attackers an opportunity to exploit the vulnerability. Hence, OSS users need to sense the fix as early as possible so that the vulnerability can be remediated before it is exploited. However, it is common for OSS to adopt a vulnerability disclosure policy under which the majority of vulnerabilities are fixed silently, meaning the commit with the fix does not indicate any vulnerability information. In this case, even if a fix is identified, it is hard for OSS users to understand the vulnerability and evaluate its potential impact. To improve early sensing of vulnerabilities, identifying silent fixes and providing their corresponding explanations (e.g., the corresponding Common Weakness Enumeration (CWE) category and exploitability rating) are equally important. However, both are challenging because the available data is limited and diverse. To tackle this challenge, we propose CoLeFunDa, a framework consisting of a Contrastive Learner and FunDa, a novel approach for Function change Data augmentation. FunDa first augments the fix data (i.e., code changes) at the function level with unsupervised and supervised strategies. The contrastive learner then uses contrastive learning to train a function change encoder, FCBERT, from the diverse fix data. Finally, we fine-tune FCBERT for three downstream tasks: silent fix identification, CWE category classification, and exploitability rating classification. Our results show that CoLeFunDa outperforms all the state-of-the-art baselines in all downstream tasks. We also conduct a survey to verify the effectiveness of CoLeFunDa in practical usage. The results show that CoLeFunDa can categorize 62.5% (25 out of 40) of CVEs with correct CWE categories within the top 2 recommendations.
format text
author ZHOU, Jiayuan
PACHECO, Michael
CHEN, Jinfu
HU, Xing
XIA, Xin
LO, David
HASSAN, Ahmed E.
author_sort ZHOU, Jiayuan
title CoLeFunDa: Explainable silent vulnerability fix identification
title_sort colefunda: explainable silent vulnerability fix identification
publisher Institutional Knowledge at Singapore Management University
publishDate 2023
_version_ 1789483256651972608