Using CodeBERT model for vulnerability detection

This report presents an experimental study aimed at gaining a deeper understanding of the parameters used when fine-tuning a pre-trained model, while also trying to achieve a model with the same or better accuracy than that stated in the repository, by fine-tuning it under varying parameter settings. Existing research shows a clear and growing need for models that detect vulnerabilities in code intelligence tasks with decent accuracy, in order to increase programmer productivity and reduce the risks of reusing code that is freely available on code-sharing platforms. CodeBERT is a BERT-style (Bidirectional Encoder Representations from Transformers) pre-trained model for Natural Language (NL) and Programming Language (PL) that learns general-purpose representations supporting downstream NL-PL applications such as natural language code search and code documentation generation. It is built on a Transformer-based neural architecture and trained with a hybrid objective function that enables the use of both “bimodal” and “unimodal” data. CodeBERT is evaluated by fine-tuning the model’s parameters: results show that fine-tuning CodeBERT achieves state-of-the-art performance on both NL code search and code documentation generation. CodeBERT is also evaluated in a zero-shot setting, with the pre-trained model’s parameters fixed, to find out what type of knowledge it has learnt; results show that CodeBERT consistently outperforms previous pre-trained models on NL-PL probing. With the benchmarks of CodeBERT already recorded in the repository, the purpose of this experimental study is to reach and possibly exceed those benchmarks: researching the parameters to understand them better, changing them one at a time, graphing the results, and studying their effects on the fine-tuning process and, in turn, on the final accuracy of the model.
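
The study's actual training code is not part of this record; as a rough illustration of the fine-tuning described above, here is a minimal sketch using the Hugging Face Transformers library and the public microsoft/codebert-base checkpoint. The toy dataset, learning rate, batch size, sequence length, and epoch count are illustrative assumptions, not the report's settings.

    # Minimal sketch (illustrative, not the report's pipeline): fine-tune
    # CodeBERT as a binary classifier (vulnerable vs. not vulnerable).
    import torch
    from torch.utils.data import DataLoader, Dataset
    from transformers import RobertaTokenizer, RobertaForSequenceClassification

    tokenizer = RobertaTokenizer.from_pretrained("microsoft/codebert-base")
    model = RobertaForSequenceClassification.from_pretrained(
        "microsoft/codebert-base", num_labels=2  # two labels: vulnerable / safe
    )

    class CodeDataset(Dataset):
        """Tokenizes (source_code, label) pairs for the classifier."""
        def __init__(self, samples):
            self.samples = samples
        def __len__(self):
            return len(self.samples)
        def __getitem__(self, i):
            code, label = self.samples[i]
            enc = tokenizer(code, truncation=True, max_length=256,
                            padding="max_length", return_tensors="pt")
            return {"input_ids": enc["input_ids"].squeeze(0),
                    "attention_mask": enc["attention_mask"].squeeze(0),
                    "labels": torch.tensor(label)}

    # Toy examples only; a real study would load a labelled vulnerability corpus.
    train_set = CodeDataset([("strcpy(buf, user_input);", 1),
                             ("strncpy(buf, user_input, sizeof(buf) - 1);", 0)])
    loader = DataLoader(train_set, batch_size=2, shuffle=True)

    # Learning rate, batch size, and epoch count are exactly the kind of
    # parameters such a study varies one at a time.
    optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
    model.train()
    for epoch in range(3):
        for batch in loader:
            optimizer.zero_grad()
            out = model(**batch)  # returns cross-entropy loss when labels are passed
            out.loss.backward()
            optimizer.step()

For the zero-shot probing mentioned in the abstract, the pre-trained weights would instead be frozen (for example, setting requires_grad = False on the base model's parameters) so that only the knowledge captured during pre-training is measured.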

Bibliographic Details
Main Author: Zhou, ZhiWei
Other Authors: Liu Yang (School of Computer Science and Engineering)
Format: Final Year Project (FYP)
Degree: Bachelor of Engineering (Computer Science)
Language: English
Published: Nanyang Technological University, 2022
Subjects: Engineering::Computer science and engineering
Online Access: https://hdl.handle.net/10356/156815
Citation: Zhou, Z. (2022). Using CodeBERT model for vulnerability detection. Final Year Project (FYP), Nanyang Technological University, Singapore. https://hdl.handle.net/10356/156815
Institution: Nanyang Technological University