PERFORMANCE COMPARISON OF DIFFERENT FEATURE SETS FOR NETWORK TRAFFIC CLASSIFICATION USING RECURSIVE FEATURE ELIMINATION FEATURE SELECTION AND ONE-VS-REST RANDOM FOREST ALGORITHM
Main Author: | Robbani, Arba |
---|---|
Format: | Theses |
Language: | Indonesia |
Online Access: | https://digilib.itb.ac.id/gdl/view/60926 |
Institution: | Institut Teknologi Bandung |
id |
id-itb.:60926 |
---|---|
spelling |
id-itb.:60926; 2021-09-21T12:13:08Z; Robbani, Arba; Indonesia; Theses; feature sets, one-vs-rest, random forest, multiclass, imbalance data, network traffic, classification, flow-based, session-based, time-based, packet-based; INSTITUT TEKNOLOGI BANDUNG; https://digilib.itb.ac.id/gdl/view/60926; text |
institution |
Institut Teknologi Bandung |
building |
Institut Teknologi Bandung Library |
continent |
Asia |
country |
Indonesia |
content_provider |
Institut Teknologi Bandung |
collection |
Digital ITB |
language |
Indonesia |
description |
Network traffic classification is the process of identifying the network applications,
such as Yahoo, YouTube, Facebook, and Skype, that generate traffic. Network
management needs it to allocate resources and to know which applications are in
use, which helps network operators provide good Quality of Service, secure the
network, and monitor it. This thesis focuses on layer 7 of the OSI model and uses
only TCP data. In recent years, much machine learning research has addressed this
problem using supervised, unsupervised, or deep learning approaches. Here,
different feature sets are compared to find the best performance for network
traffic classification, using Recursive Feature Elimination for feature selection
and One-Vs-Rest Random Forest classifiers. Six feature sets are compared:
flow-based, session-based, time-based, packet-based, flow+session-based, and
packet+time-based. The multiclass data are also class-imbalanced, and
classification is made harder by the imbalanced distribution, the presence of
outliers, and irrelevant features. The proposed method addresses these problems.
In the experiments, the flow-based feature set performs best for network traffic
classification, with an F1-score of 0.81, a GM of 0.85, and a model build time of
2634.987 s. The packet-based, flow+session-based, and packet+time-based feature
sets also yield good classifiers but need more time to build the model.
|
format |
Theses |
author |
Robbani, Arba |
title |
PERFORMANCE COMPARISON OF DIFFERENT FEATURE SETS FOR NETWORK TRAFFIC CLASSIFICATION USING RECURSIVE FEATURE ELIMINATION FEATURE SELECTION AND ONE-VS-REST RANDOM FOREST ALGORITHM |
url |
https://digilib.itb.ac.id/gdl/view/60926 |
_version_ |
1822931514084556800 |
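
The description reports an F1-score and a GM for a class-imbalanced multiclass problem but does not define either. The sketch below shows one common reading, offered as an assumption rather than as the thesis's actual evaluation code: the F1-score is macro-averaged over classes, and GM is the geometric mean of per-class recall. The function names and the toy labels are hypothetical and chosen only for illustration.

```python
from collections import Counter
from math import prod


def per_class_counts(y_true, y_pred):
    """Return {label: (tp, fp, fn)} for every label seen in either list."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class p gains a false positive
            fn[t] += 1  # true class t gains a false negative
    return {c: (tp[c], fp[c], fn[c]) for c in labels}


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 (assumed averaging mode)."""
    scores = []
    for tp, fp, fn in per_class_counts(y_true, y_pred).values():
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def g_mean(y_true, y_pred):
    """Geometric mean of per-class recall (assumed meaning of GM)."""
    recalls = []
    for tp, fp, fn in per_class_counts(y_true, y_pred).values():
        if tp + fn == 0:
            continue  # class never occurs in y_true, so recall is undefined
        recalls.append(tp / (tp + fn))
    return prod(recalls) ** (1 / len(recalls))


# Toy, deliberately imbalanced labels (hypothetical, not thesis data).
y_true = ["youtube"] * 6 + ["facebook"] * 3 + ["skype"] * 1
y_pred = (["youtube"] * 5 + ["facebook"]        # one youtube flow missed
          + ["facebook"] * 2 + ["youtube"]      # one facebook flow missed
          + ["skype"])

print(f"macro F1 = {macro_f1(y_true, y_pred):.3f}")
print(f"G-mean   = {g_mean(y_true, y_pred):.3f}")
```

Both metrics weight every class equally, so a rare application that is classified poorly lowers the score even when the dominant classes are predicted well. That is why such metrics are commonly preferred over plain accuracy on imbalanced traffic data.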