Replacing missing values using trustworthy data values from web data sources

Bibliographic Details
Main Authors: Mohd Jaya, Mohd Izham, Sidi, Fatimah, Mat Yusof, Sharmila, Affendey, Lilly Suriani, Ishak, Iskandar, A. Jabar, Marzanah
Format: Article
Language: English
Published: Institute of Physics Publishing 2017
Online Access: http://psasir.upm.edu.my/id/eprint/62958/1/Replacing%20missing%20values%20using%20trustworthy%20data%20values%20from%20web%20data%20sources.pdf
http://psasir.upm.edu.my/id/eprint/62958/
http://iopscience.iop.org/article/10.1088/1742-6596/892/1/012009/pdf
Institution: Universiti Putra Malaysia
Description
Summary: In practice, collected data are usually incomplete and contain missing values. Existing approaches to managing missing values overlook the importance of trustworthy data values when replacing them. Because trusted, complete data is very important in data analysis, we propose a framework for missing value replacement using trustworthy data values from web data sources. The proposed framework adopts an ontology to map data values from web data sources to the incomplete dataset. Because data from the web often conflict with each other, we propose a trust score measurement based on data accuracy and data reliability. The trust score is then used to select trustworthy data values from web data sources for missing value replacement. We implemented the proposed framework using a financial dataset and present the findings in this paper. Our experiment shows that replacing missing values with trustworthy data values is important for solving the missing values problem, especially when the data conflict.
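
The selection step described in the summary can be illustrated with a minimal sketch. The summary does not give the trust score formula, so everything here is an assumption made for illustration: the `Candidate` class, the `trust_score`, `select_trusted`, and `fill_missing` functions, the 0-to-1 scale for accuracy and reliability, and the weighted average that combines them. It is not the authors' method, and it leaves out the ontology mapping step.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Candidate:
    """A value for one missing field, as reported by one web source (hypothetical)."""
    source: str
    value: float
    accuracy: float      # assumed to be in [0, 1]
    reliability: float   # assumed to be in [0, 1]


def trust_score(c: Candidate, w_accuracy: float = 0.5) -> float:
    # Assumption: trust is a weighted average of accuracy and reliability.
    # The summary only says the score is "based on" these two properties.
    return w_accuracy * c.accuracy + (1.0 - w_accuracy) * c.reliability


def select_trusted(candidates: list[Candidate],
                   w_accuracy: float = 0.5) -> Optional[Candidate]:
    # Among conflicting candidates, pick the one with the highest trust score.
    if not candidates:
        return None
    return max(candidates, key=lambda c: trust_score(c, w_accuracy))


def fill_missing(record: dict, field: str, candidates: list[Candidate]) -> dict:
    # Return a copy of the record; replace the field only if it is missing
    # and at least one candidate value is available.
    filled = dict(record)
    if filled.get(field) is None:
        best = select_trusted(candidates)
        if best is not None:
            filled[field] = best.value
    return filled


if __name__ == "__main__":
    # Three hypothetical sources disagree on the same financial figure.
    candidates = [
        Candidate("source_a", 10.0, accuracy=0.9, reliability=0.6),
        Candidate("source_b", 12.0, accuracy=0.7, reliability=0.7),
        Candidate("source_c", 11.0, accuracy=0.5, reliability=0.95),
    ]
    record = {"ticker": "XYZ", "revenue": None}
    print(fill_missing(record, "revenue", candidates))
```

With equal weights, `source_a` has the highest score in this made-up data, so its value fills the missing field. Changing `w_accuracy` shifts the choice toward sources that are more accurate or more reliable, which is the kind of trade-off a trust score is meant to make explicit.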