Data prediction and recalculation of missing data in soft set / Muhammad Sadiq Khan
Uncertain data cannot be processed by using the regular tools and techniques of clear data. Special techniques like fuzzy set, rough set, and soft set need to be utilized when dealing with uncertain data, and each special technique comes with its own advantages and snags. Soft set is considered as t...
Main Author: | Muhammad Sadiq Khan |
---|---|
Format: | Thesis |
Published: | 2018 |
Subjects: | QA75 Electronic computers. Computer science |
Online Access: | http://studentsrepo.um.edu.my/8720/1/Muhammad_Sadiq_Khan.pdf http://studentsrepo.um.edu.my/8720/6/sadiq.pdf http://studentsrepo.um.edu.my/8720/ |
Institution: | Universiti Malaya |
id |
my.um.stud.8720 |
---|---|
record_format |
eprints |
spelling |
my.um.stud.8720 2021-04-11T20:18:20Z Data prediction and recalculation of missing data in soft set / Muhammad Sadiq Khan Muhammad Sadiq , Khan QA75 Electronic computers. Computer science 2018-05 Thesis NonPeerReviewed application/pdf http://studentsrepo.um.edu.my/8720/1/Muhammad_Sadiq_Khan.pdf application/pdf http://studentsrepo.um.edu.my/8720/6/sadiq.pdf Muhammad Sadiq , Khan (2018) Data prediction and recalculation of missing data in soft set / Muhammad Sadiq Khan. PhD thesis, University of Malaya. http://studentsrepo.um.edu.my/8720/ |
institution |
Universiti Malaya |
building |
UM Library |
collection |
Institutional Repository |
continent |
Asia |
country |
Malaysia |
content_provider |
Universiti Malaya |
content_source |
UM Student Repository |
url_provider |
http://studentsrepo.um.edu.my/ |
topic |
QA75 Electronic computers. Computer science |
description |
Uncertain data cannot be processed with the regular tools and techniques designed for precise (crisp) data. Specialized techniques such as fuzzy sets, rough sets, and soft sets are needed to handle uncertain data, and each comes with its own advantages and drawbacks. The soft set is considered the most appropriate of these techniques. A soft set application represents uncertain data in tabular form, with every value represented as 0 or 1. Researchers use the soft set representation in a number of applications, including decision making, parameter reduction, medical diagnosis, and conflict analysis. Soft set binary data may go missing because of communication errors, virus attacks, and similar causes, and soft sets with incomplete data cannot be used in applications. Few researchers have worked on data filling and recalculation for incomplete soft sets; this research focuses on predicting missing values and decision values from non-missing data or aggregates. A soft set must be preprocessed to obtain aggregates, whereas no preprocessing is needed when aggregates are not required. This research therefore discusses the existing techniques in terms of preprocessed and unprocessed soft sets. The approaches currently available in the preprocessed category recalculate partially missing data from aggregates, yet cannot use the set of aggregates to recalculate entire values. This research presents a mathematical technique capable of recalculating all missing values from the available aggregates.
Also investigated are the techniques in the unprocessed category. Among them, DFIS, a novel data filling approach for incomplete soft sets, appears to be the most suitable technique for handling incomplete soft set data. The results show, however, that DFIS has a persistent accuracy problem in prediction. DFIS predicts missing values through associations between parameters, yet makes no distinction between the different associations. It thus ignores the role of the strongest association, which results in low accuracy. This research addresses this DFIS issue with a new technique, prediction through strongest association (PSA). Both techniques were implemented in MATLAB and tested for data filling on benchmark data sets, and the experimental results validate the higher accuracy of PSA over DFIS. Further, this research applies PSA to online social networks (OSNs) and detects a new kind of network community formed by nodes that are associated with each other. The new network community is named 'virtual community' and the inter-associated nodes are named 'prime nodes'. Researchers have found that the unavailability of complete OSN node data lowers the accuracy of ranking algorithms. This research therefore predicts new links in two OSN data sets (Facebook and Twitter) through association between prime nodes using PSA. By completing the OSNs in this way, the study demonstrates that the performance of well-known ranking algorithms (k-Core and PageRank) can be significantly improved. |
format |
Thesis |
author |
Muhammad Sadiq , Khan |
title |
Data prediction and recalculation of missing data in soft set / Muhammad Sadiq Khan |
publishDate |
2018 |
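The abstract in this record describes filling missing 0/1 values in a soft set table through associations between parameters, and stresses the role of the strongest association. The sketch below is only a simplified illustration of that general idea. It is not the thesis's PSA algorithm, nor DFIS, since the record does not give their details. The function names, the agreement-ratio measure of association strength, and the sample table are all assumptions made for this example.

```python
from typing import List, Optional, Tuple

# Rows are objects, columns are parameters; None marks a missing value.
Table = List[List[Optional[int]]]


def association(table: Table, a: int, b: int) -> Tuple[float, bool]:
    """Return (strength, consistent) for parameter columns a and b.

    Only rows where both values are known are counted. `consistent` is True
    when the two columns agree at least as often as they disagree.
    (Assumed measure: ratio of agreeing or disagreeing rows.)
    """
    known = [(row[a], row[b]) for row in table
             if row[a] is not None and row[b] is not None]
    if not known:
        return 0.0, True
    agree = sum(1 for x, y in known if x == y) / len(known)
    if agree >= 0.5:
        return agree, True
    return 1.0 - agree, False


def fill_by_strongest_association(table: Table) -> Table:
    """Fill each missing cell from the single most strongly associated column."""
    filled = [row[:] for row in table]  # leave the input untouched
    n_params = len(table[0])
    for i, row in enumerate(table):
        for j, value in enumerate(row):
            if value is not None:
                continue
            best = None  # (strength, partner column, consistent)
            for k in range(n_params):
                if k == j or row[k] is None:
                    continue
                strength, consistent = association(table, j, k)
                if best is None or strength > best[0]:
                    best = (strength, k, consistent)
            if best is not None:
                partner = row[best[1]]
                # Copy the partner value if consistent, otherwise take its complement.
                filled[i][j] = partner if best[2] else 1 - partner
    return filled


# Hypothetical 4-object, 3-parameter soft set with one missing value.
soft_set = [
    [1, 1, 0],
    [0, 0, 1],
    [1, 1, 1],
    [0, None, 0],
]
completed = fill_by_strongest_association(soft_set)
```

In this sample table the second parameter always agrees with the first in the rows where both are known, so the missing cell is filled from the first parameter rather than from the more weakly associated third one.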