Provenance graph generation for intrusion detection

In this digital age, cyberattacks are becoming more complex, and are accompanied by increasingly severe consequences. Traditional intrusion detection systems are struggling to identify sophisticated threats such as zero-day attacks or Advanced Persistent Threats (APTs) efficiently and effectively. T...


Bibliographic Details
Main Author: Chew, Perlyn Jie Ying
Other Authors: Ke Yiping, Kelly
Format: Final Year Project
Language: English
Published: Nanyang Technological University 2023
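Subjects: Engineering::Computer science and engineering::Computer systems organization::Computer system implementation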
Online Access: https://hdl.handle.net/10356/171750
Institution: Nanyang Technological University
id sg-ntu-dr.10356-171750
record_format dspace
spelling sg-ntu-dr.10356-171750 2023-11-10T15:36:59Z
School of Computer Science and Engineering
ypke@ntu.edu.sg
Bachelor of Engineering (Computer Science)
2023-11-07T04:58:56Z
Final Year Project (FYP)
Chew, P. J. Y. (2023). Provenance graph generation for intrusion detection. Final Year Project (FYP), Nanyang Technological University, Singapore. https://hdl.handle.net/10356/171750
application/pdf
building NTU Library
continent Asia
country Singapore
content_provider NTU Library
collection DR-NTU
topic Engineering::Computer science and engineering::Computer systems organization::Computer system implementation
description In this digital age, cyberattacks are becoming more complex and carry increasingly severe consequences. Traditional intrusion detection systems struggle to identify sophisticated threats such as zero-day attacks and Advanced Persistent Threats (APTs) efficiently and effectively, so modern approaches are required. Provenance graphs are a promising data source for modern intrusion detection because they capture comprehensive information on both malicious and benign system activities. Provenance describes the history or lineage of an object and records how digital objects arrive at their existing state. These graphs present complex dependencies and relationships as a directed acyclic graph that can be analysed with machine learning methods. However, there are few end-to-end pipelines that automatically generate provenance data and transform it into graph representations suitable for machine learning. The Flurry framework is a contemporary approach, built upon the provenance capture system CamFlow, that improves the reproducibility and ease of generating provenance graphs for machine learning. Recognising the potential of provenance graphs and the challenges in their generation, this research aims to implement Flurry and improve the generation and capture of provenance graphs for intrusion detection. Intrusion scenarios were designed and then simulated on multiple security-sensitive applications across various operating systems. Extensive datasets of provenance graphs were produced by dynamically executing various attacks on Fedora and Ubuntu, then used to train and validate state-of-the-art graph-based models to evaluate their effectiveness and accuracy. Specifically, the provenance graphs were exported as a dataset for a Graph Convolutional Network (GCN) in this project. The results affirm Flurry as an excellent framework for generating provenance graphs. Additionally, the strong performance of cutting-edge graph-based models in tasks like graph classification and anomaly detection underscores the potential of provenance graphs as an ideal data source for contemporary intrusion detection systems.