LARGE LANGUAGE MODEL-BASED TESTING FOR CROSS-SITE SCRIPTING VULNERABILITIES
Main Author:
Format: Final Project
Language: Indonesian
Online Access: https://digilib.itb.ac.id/gdl/view/82472
Institution: Institut Teknologi Bandung
Summary: Cross-site scripting is one of the vulnerabilities that has appeared in each of the last four publications of the OWASP Top 10. Moreover, cross-site scripting is the most reported vulnerability on the CVE Details site, and it has shown an increasing trend over the last ten years. It can therefore be concluded that a comprehensive testing mechanism is needed to prevent this vulnerability.
To address the prevalence of cross-site scripting vulnerabilities, many studies and tools have been produced. However, these tools still suffer from high false positive rates and depend on the technology stack in use. For example, XSStrike does not confirm vulnerabilities with an actual attack and cannot even create payloads for DOM-based cross-site scripting. Another example is the work of Mohammadi et al. (2017), which can only be applied to software built with JSP.
This Final Project proposes a cross-site scripting detection tool that validates each vulnerability by directly attacking the target. Using payloads created by a large language model with the few-shot prompting technique, the proposed tool performs comparably to the baseline, XSStrike, in server-side cross-site scripting detection and outperforms XSStrike in client-side cross-site scripting detection. In total, the tool detects 59 out of 92 cross-site scripting vulnerabilities in the Google Firing Range testbed and reduces the false positive rate to zero in a test against the open-source software Airflow. All of this is achieved at an additional cost of only $0.01 per page tested. However, the automatic prompt engineering technique used in this Final Project is not exactly the same as the original formulation, and the small number of vulnerabilities tested in Airflow suggests directions for future research.
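
To illustrate the approach described in the summary, the sketch below shows one plausible way to implement few-shot payload generation with a large language model. It is a minimal sketch under stated assumptions, not the tool from this Final Project: the `openai` client, the model name, and the example context/payload pairs are all illustrative assumptions. A real tool would then inject the returned payload into the target and confirm that it actually executes, as the abstract describes.

```python
# Minimal sketch (not the thesis's actual implementation) of few-shot
# prompting for XSS payload generation: the prompt shows the LLM a few
# example HTML injection contexts paired with working payloads, then asks
# for a payload tailored to a new context.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Few-shot examples: (injection context, payload that escapes that context).
FEW_SHOT_EXAMPLES = [
    ('<input value="INJECT">', '"><script>alert(1)</script>'),
    ("<script>var q = 'INJECT';</script>", "';alert(1);//"),
    ("<div>INJECT</div>", "<img src=x onerror=alert(1)>"),
]

def build_prompt(context: str) -> str:
    """Assemble a few-shot prompt asking for a payload for `context`."""
    lines = ["Given an HTML injection context, produce an XSS payload "
             "that executes alert(1). Examples:"]
    for ctx, payload in FEW_SHOT_EXAMPLES:
        lines.append(f"Context: {ctx}\nPayload: {payload}")
    lines.append(f"Context: {context}\nPayload:")
    return "\n\n".join(lines)

def generate_payload(context: str) -> str:
    """Ask the LLM for a candidate payload; a detection tool would then
    inject it into the target page and check whether it executes."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative choice, not the thesis's model
        messages=[{"role": "user", "content": build_prompt(context)}],
    )
    return response.choices[0].message.content.strip()

if __name__ == "__main__":
    print(generate_payload('<a href="INJECT">link</a>'))
```

Confirming the payload by executing it against the target is what lets this style of tool drive the false positive rate toward zero: a vulnerability is only reported when the injected script actually runs.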