Investigation on effective solutions against insider attacks

A common flaw of current insider threat detection is its high demand for data storage. This report investigates how effectively dimensionality reduction techniques reduce the storage demanded by the machine learning methods used for insider threat detection. The dimensiona...

Bibliographic Details
Main Author: Ang, Jun Hao
Other Authors: Felicity Chan
Format: Final Year Project
Language: English
Published: 2018
Subjects: DRNTU::Engineering
Online Access: http://hdl.handle.net/10356/74243
Institution: Nanyang Technological University
id sg-ntu-dr.10356-74243
record_format dspace
spelling sg-ntu-dr.10356-74243 2023-03-03T20:57:57Z Investigation on effective solutions against insider attacks Ang, Jun Hao Felicity Chan School of Computer Science and Engineering Li Fang DRNTU::Engineering Bachelor of Engineering (Computer Science) 2018-05-14T04:50:53Z 2018-05-14T04:50:53Z 2018 Final Year Project (FYP) http://hdl.handle.net/10356/74243 en Nanyang Technological University 54 p. application/pdf
institution Nanyang Technological University
building NTU Library
continent Asia
country Singapore
content_provider NTU Library
collection DR-NTU
language English
topic DRNTU::Engineering
spellingShingle DRNTU::Engineering
Ang, Jun Hao
Investigation on effective solutions against insider attacks
description A common flaw of current insider threat detection is its high demand for data storage. This report investigates how effectively dimensionality reduction techniques reduce the storage demanded by the machine learning methods used for insider threat detection. The dimensionality reduction techniques discussed are two feature selection methods, Recursive Feature Elimination (RFE) and the Chi-Square Test, and two feature extraction methods, Principal Component Analysis (PCA) and Linear Discriminant Analysis (LDA). The machine learning algorithms discussed are a supervised method, K-Nearest Neighbour (KNN), and an unsupervised method, K-Means Clustering (KMC). The dataset used is a labelled phishing website dataset with 10,000 rows and 30 features. In practice, the accuracy of an insider threat detection system matters more than its data storage demand, but improving accuracy while also reducing storage demand is a bonus. Therefore, in the experiments conducted for this report, the effectiveness of a dimensionality reduction technique is evaluated by the maximum reduction in data storage it achieves, regardless of the amount of improvement in accuracy. Under this evaluation, the experimental results show that both feature selection methods, RFE and the Chi-Square Test, generally performed well with both KNN and KMC, whereas among the feature extraction methods PCA performed well only with KNN and LDA performed exceptionally well only with KMC. From the results, it can be concluded that feature selection methods perform more stably than feature extraction methods, but that the improvements in accuracy and data storage reduction achieved by feature extraction methods are far greater than those achieved by feature selection methods. One recommendation for future projects is to evaluate the previously mentioned dimensionality reduction techniques, together with embedded feature selection methods and other feature extraction methods, on supervised, unsupervised and reinforcement learning.
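The evaluation idea in the description (keep fewer features, then compare classifier accuracy against the storage saved) can be illustrated with a minimal sketch. This is not the report's code or data: the toy dataset is synthetic, the function names (`chi2_score`, `select_top_k`, `knn_predict`, `make_toy_data`) are hypothetical, and the choices of 5 retained features, 5 neighbours, and measuring storage as the fraction of feature columns kept are assumptions made only for illustration.

```python
import random
from collections import Counter


def chi2_score(column, labels):
    """Pearson chi-square statistic between one categorical feature and the label."""
    n = len(labels)
    observed = Counter(zip(column, labels))
    value_totals = Counter(column)
    label_totals = Counter(labels)
    score = 0.0
    for value, value_total in value_totals.items():
        for label, label_total in label_totals.items():
            expected = value_total * label_total / n
            score += (observed[(value, label)] - expected) ** 2 / expected
    return score


def select_top_k(rows, labels, k):
    """Return the (sorted) indices of the k features with the highest chi-square score."""
    n_features = len(rows[0])
    scores = [chi2_score([r[j] for r in rows], labels) for j in range(n_features)]
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:k])


def project(rows, keep):
    """Keep only the selected feature columns."""
    return [[r[j] for j in keep] for r in rows]


def knn_predict(train_rows, train_labels, query, k=5):
    """Majority vote among the k nearest training rows (squared Euclidean distance)."""
    def dist(i):
        return sum((a - b) ** 2 for a, b in zip(train_rows[i], query))
    nearest = sorted(range(len(train_rows)), key=dist)[:k]
    return Counter(train_labels[i] for i in nearest).most_common(1)[0][0]


def knn_accuracy(train_rows, train_labels, test_rows, test_labels, k=5):
    """Fraction of test rows whose predicted label matches the true label."""
    hits = sum(knn_predict(train_rows, train_labels, q, k) == y
               for q, y in zip(test_rows, test_labels))
    return hits / len(test_labels)


def make_toy_data(n_rows=400, n_features=30, n_informative=5, seed=0):
    """Synthetic stand-in data (assumption): the first few features track the label."""
    rng = random.Random(seed)
    rows, labels = [], []
    for _ in range(n_rows):
        y = rng.choice([-1, 1])
        row = []
        for j in range(n_features):
            if j < n_informative:
                # Informative feature: agrees with the label 85% of the time.
                row.append(y if rng.random() < 0.85 else -y)
            else:
                # Noise feature: unrelated to the label.
                row.append(rng.choice([-1, 0, 1]))
        rows.append(row)
        labels.append(y)
    return rows, labels


rows, labels = make_toy_data()
split = 300
train_X, test_X = rows[:split], rows[split:]
train_y, test_y = labels[:split], labels[split:]

keep = select_top_k(train_X, train_y, k=5)
baseline_acc = knn_accuracy(train_X, train_y, test_X, test_y)
reduced_acc = knn_accuracy(project(train_X, keep), train_y, project(test_X, keep), test_y)
storage_reduction = 1 - len(keep) / len(rows[0])  # fraction of feature columns dropped

print("kept features:", keep)
print("accuracy with all features:", baseline_acc)
print("accuracy with selected features:", reduced_acc)
print("storage reduction:", round(storage_reduction, 3))
```

Varying `k` in `select_top_k` and recording the smallest feature count at which accuracy does not fall below the baseline would mirror the "maximum storage reduction" criterion, under the assumptions stated above.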
author2 Felicity Chan
author_facet Felicity Chan
Ang, Jun Hao
format Final Year Project
author Ang, Jun Hao
author_sort Ang, Jun Hao
title Investigation on effective solutions against insider attacks
title_short Investigation on effective solutions against insider attacks
title_full Investigation on effective solutions against insider attacks
title_fullStr Investigation on effective solutions against insider attacks
title_full_unstemmed Investigation on effective solutions against insider attacks
title_sort investigation on effective solutions against insider attacks
publishDate 2018
url http://hdl.handle.net/10356/74243
_version_ 1759857655572070400