SourceVote: Fusing multi-valued data via inter-source agreements

Bibliographic Details
Main Authors: FANG, Xiu Susie, SHENG, Quan Z., WANG, Xianzhi, BARHAMGI, Mahmoud, YAO, Lina, NGU, Anne H.H.
Format: text
Language: English
Published: Institutional Knowledge at Singapore Management University 2017
Online Access: https://ink.library.smu.edu.sg/sis_research/3857
https://ink.library.smu.edu.sg/context/sis_research/article/4859/viewcontent/101007_2F978_3_319_69904_2_13.pdf
Institution: Singapore Management University
Description
Summary: Data fusion is a fundamental research problem of identifying true values of data items of interest from conflicting multi-sourced data. Although considerable research efforts have been conducted on this topic, existing approaches generally assume every data item has exactly one true value, which fails to reflect the real world where data items with multiple true values widely exist. In this paper, we propose a novel approach, SourceVote, to estimate value veracity for multi-valued data items. SourceVote models the endorsement relations among sources by quantifying their two-sided inter-source agreements. In particular, two graphs are constructed to model inter-source relations. Then two aspects of source reliability are derived from these graphs and are used for estimating value veracity and initializing existing data fusion methods. Empirical studies on two large real-world datasets demonstrate the effectiveness of our approach.
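
Illustrative sketch (not taken from the paper): the summary does not give the formulas, so the Python below is only one plausible reading of "two graphs" and "two aspects of source reliability". Everything in it is an assumption made for illustration: the toy claims, the function names agreement_graphs, reliability and value_scores, the overlap-based edge weights, the PageRank-style iteration with damping 0.85, and the final scoring rule.

```python
from itertools import permutations

# Hypothetical toy data: source -> {data item -> set of claimed values}.
claims = {
    "s1": {"book1": {"Alice", "Bob"}},
    "s2": {"book1": {"Alice", "Bob"}},
    "s3": {"book1": {"Alice", "Carol"}},
}


def candidate_values(claims):
    """Per item, the union of values claimed by any source."""
    items = {i for by_item in claims.values() for i in by_item}
    return {i: set().union(*(c.get(i, set()) for c in claims.values()))
            for i in items}


def agreement_graphs(claims):
    """Build two weighted directed graphs over sources (assumed weights).

    pos[a][b]: share of a's claimed values that b also claims.
    neg[a][b]: share of values a leaves unclaimed that b also leaves unclaimed.
    A source that omits an item is treated as claiming nothing for it.
    """
    cands = candidate_values(claims)
    pos = {a: {} for a in claims}
    neg = {a: {} for a in claims}
    for a, b in permutations(claims, 2):
        claimed = both_claimed = unclaimed = both_unclaimed = 0
        for item, values in cands.items():
            va, vb = claims[a].get(item, set()), claims[b].get(item, set())
            ra, rb = values - va, values - vb
            claimed += len(va)
            both_claimed += len(va & vb)
            unclaimed += len(ra)
            both_unclaimed += len(ra & rb)
        pos[a][b] = both_claimed / claimed if claimed else 0.0
        neg[a][b] = both_unclaimed / unclaimed if unclaimed else 0.0
    return pos, neg


def reliability(graph, damping=0.85, iters=50):
    """PageRank-style power iteration (an assumed way to rank sources)."""
    nodes = list(graph)
    n = len(nodes)
    score = {s: 1.0 / n for s in nodes}
    for _ in range(iters):
        new = {s: (1.0 - damping) / n for s in nodes}
        for a in nodes:
            total = sum(graph[a].values())
            for b in nodes:
                if b == a:
                    continue
                # A source that agrees with nobody spreads its vote evenly.
                share = graph[a][b] / total if total else 1.0 / (n - 1)
                new[b] += damping * score[a] * share
        score = new
    return score


def value_scores(claims, item, pos_rel, neg_rel):
    """Assumed scoring rule: claimers add their positive-side reliability,
    non-claimers subtract their negative-side reliability."""
    values = candidate_values(claims)[item]
    return {
        v: sum(pos_rel[s] if v in claims[s].get(item, set()) else -neg_rel[s]
               for s in claims)
        for v in values
    }


pos, neg = agreement_graphs(claims)
pos_rel, neg_rel = reliability(pos), reliability(neg)
scores = value_scores(claims, "book1", pos_rel, neg_rel)
for value, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{value}: {score:.3f}")
```

Because several values of one item can score highly at once, a threshold on the scores (rather than picking a single maximum) would be the natural way to accept multiple true values in this sketch.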