DORA: Feature selection for network-based intrusion detection models

Intrusion detection systems (IDS) use models as a basis for detecting intrusions. To ensure that these models are comprehensive enough, a large, high-dimensional data set must be fed to the system. In this study, the data set will contain a large amount of normal traffic data and a sufficient number o...

Bibliographic Details
Main Authors:	Acosta, Juan Carlos A., Diguangco, Wilma Patricia A., Obal, Dan Paolo B., Reforeal, Henri Frederic T.
Format:	text
Language:	English
Published:	Animo Repository 2012
Online Access:	https://animorepository.dlsu.edu.ph/etd_bachelors/14784
Institution:	De La Salle University

id	oai:animorepository.dlsu.edu.ph:etd_bachelors-15426
record_format	eprints
spelling	oai:animorepository.dlsu.edu.ph:etd_bachelors-15426 2021-11-24T02:32:11Z DORA: Feature selection for network-based intrusion detection models Acosta, Juan Carlos A. Diguangco, Wilma Patricia A. Obal, Dan Paolo B. Reforeal, Henri Frederic T. 2012-01-01T08:00:00Z text https://animorepository.dlsu.edu.ph/etd_bachelors/14784 Bachelor's Theses English Animo Repository
institution	De La Salle University
building	De La Salle University Library
continent	Asia
country	Philippines
content_provider	De La Salle University Library
collection	DLSU Institutional Repository
language	English
description	Intrusion detection systems (IDS) use models as a basis for detecting intrusions. To ensure that these models are comprehensive enough, a large, high-dimensional data set must be fed to the system. In this study, the data set will contain a large amount of normal traffic data and a sufficient amount of network intrusion data to ensure that the model can correctly classify intrusions. Data sets are often noisy: they contain a lot of redundant data along with irrelevant features that can compromise the classification accuracy and performance of the generated model. To avoid this, the redundant data must be filtered out and the irrelevant features dropped. The goal of this study is to determine the best features for an intrusion detection model, which depends heavily on the feature selection algorithms tested against the same data set. The findings show that the combined feature set of packet headers and n-grams can dramatically increase the classification accuracy of the model being built. The results also showed that selecting only the best features from the entire feature set can increase the classification accuracy of the intrusion detection model even further. Based on the test results, and given the data set, the best-performing algorithm is Decision Trees, while the best feature selection algorithm is N-Gram Information Gain.
format	text
author	Acosta, Juan Carlos A. Diguangco, Wilma Patricia A. Obal, Dan Paolo B. Reforeal, Henri Frederic T.
author_sort	Acosta, Juan Carlos A.
title	DORA: Feature selection for network-based intrusion detection models
publisher	Animo Repository
publishDate	2012
url	https://animorepository.dlsu.edu.ph/etd_bachelors/14784
_version_	1718383386882473984


Similar Items