Investigating the Impact of Different Representations of Data on Neural Network and Regression
This research investigated the impact of different data representations on the performance of neural networks and regression on datasets that have a binary or Boolean class target. In addition, the performance of a particular predictive data mining model could be affected by the change o...
Main Author: | Fallah, Ehab A. Omer El |
---|---|
Format: | Thesis |
Language: | English |
Published: | 2008 |
Subjects: | QA76 Computer software |
Online Access: | http://etd.uum.edu.my/790/1/Ehab_A._Omer_El_Fallah.pdf http://etd.uum.edu.my/790/2/Ehab_A._Omer_El_Fallah.pdf http://etd.uum.edu.my/790/ |
Institution: | Universiti Utara Malaysia |
id |
my.uum.etd.790 |
---|---|
record_format |
eprints |
spelling |
my.uum.etd.790 2013-07-24T12:09:02Z http://etd.uum.edu.my/790/ Investigating the Impact of Different Representations of Data on Neural Network and Regression Fallah, Ehab A. Omer El QA76 Computer software 2008-06 Thesis NonPeerReviewed application/pdf en http://etd.uum.edu.my/790/1/Ehab_A._Omer_El_Fallah.pdf application/pdf en http://etd.uum.edu.my/790/2/Ehab_A._Omer_El_Fallah.pdf Fallah, Ehab A. Omer El (2008) Investigating the Impact of Different Representations of Data on Neural Network and Regression. Masters thesis, Universiti Utara Malaysia. |
institution |
Universiti Utara Malaysia |
building |
UUM Library |
collection |
Institutional Repository |
continent |
Asia |
country |
Malaysia |
content_provider |
Universiti Utara Malaysia |
content_source |
UUM Electronic Theses |
url_provider |
http://etd.uum.edu.my/ |
language |
English |
topic |
QA76 Computer software |
description |
This research investigated the impact of different data representations on the performance of neural networks and regression on datasets that have a binary or Boolean class target. The performance of a particular predictive data mining model can be affected by a change of data representation. Seven data representations were used: as-is, min-max normalization, standard deviation normalization, sigmoidal normalization, thermometer representation, flag representation, and simple binary representation. All seven were applied to two datasets, the Wisconsin breast cancer dataset and the German credit dataset. The neural network performed better than logistic regression on both datasets when the thermometer and flag representations are excluded. For datasets with a binary or Boolean target class, the flag or thermometer representation is recommended when logistic regression analysis is performed, while the as-is representation, min-max normalization, standard deviation normalization, or sigmoidal normalization is recommended for neural network analysis. |
format |
Thesis |
author |
Fallah, Ehab A. Omer El |
title |
Investigating the Impact of Different Representations of Data on Neural Network and Regression |
publishDate |
2008 |
url |
http://etd.uum.edu.my/790/1/Ehab_A._Omer_El_Fallah.pdf http://etd.uum.edu.my/790/2/Ehab_A._Omer_El_Fallah.pdf http://etd.uum.edu.my/790/ |
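The record's description names seven data representations but does not define them. The following is a minimal Python sketch of six of them (the as-is representation simply leaves values unchanged), written under common textbook definitions that are assumptions here, not taken from the thesis: sigmoidal normalization is assumed to be the logistic function applied to the z-score, the flag representation is assumed to be one-hot encoding, and the simple binary representation is assumed to be the fixed-width binary digits of an integer level. All function names are hypothetical.

```python
import math
from statistics import mean, pstdev


def min_max(values):
    """Rescale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    span = hi - lo
    return [(v - lo) / span if span else 0.0 for v in values]


def z_score(values):
    """Standard deviation normalization: zero mean, unit (population) std."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma if sigma else 0.0 for v in values]


def sigmoidal(values):
    """Assumed definition: logistic function applied to the z-score."""
    return [1.0 / (1.0 + math.exp(-z)) for z in z_score(values)]


def thermometer(level, n_levels):
    """Set the first `level` of `n_levels` bits; level is 1-based."""
    return [1 if i < level else 0 for i in range(n_levels)]


def flag(level, n_levels):
    """Assumed definition: one-hot encoding; only bit `level` is set."""
    return [1 if i == level - 1 else 0 for i in range(n_levels)]


def simple_binary(level, n_bits):
    """Binary digits of `level`, most significant bit first."""
    return [(level >> shift) & 1 for shift in range(n_bits - 1, -1, -1)]
```

For an ordinal attribute with five levels, `thermometer(3, 5)` gives `[1, 1, 1, 0, 0]` and `flag(3, 5)` gives `[0, 0, 1, 0, 0]`, while `simple_binary(3, 4)` gives `[0, 0, 1, 1]`. The three binary encodings trade input width for how much of the ordering they expose to the model, which is one plausible reason a representation could affect logistic regression and a neural network differently.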