Collusion-resistant spatial phenomena crowdsourcing
Main Author: | |
---|---|
Other Authors: | |
Format: | Final Year Project |
Language: | English |
Published: | 2017 |
Subjects: | |
Online Access: | http://hdl.handle.net/10356/70228 |
Institution: | Nanyang Technological University |
Summary: | Data trustworthiness is a crucial issue in real-world crowdsourcing and participatory sensing applications. If it is ignored, different types of worker misbehavior, especially the challenging collusion attacks, can result in biased and inaccurate estimation and decision making. Previous works mostly focus on object labelling crowdsourcing, rating-based opinion crowdsourcing, and estimation of continuous-valued quantities, while little attention has been paid to a more challenging type of task in participatory sensing: spatial field regression.
In this project, we constructed a novel trust-based mixture of Gaussian processes (GP) model for spatial field regression that jointly detects worker misbehavior and accurately reconstructs the spatial field. It can model both stationary and non-stationary spatial fields while accounting for complex malicious attacks. We developed a Markov chain Monte Carlo (MCMC)-based inference algorithm to efficiently perform Bayesian inference for the proposed model. The inference algorithm was implemented in MATLAB.
To evaluate the predictive accuracy of the proposed model, we performed experiments using two real-world datasets of spatial phenomena and compared the model with three baseline models. The experimental results show that the proposed model achieves better predictive accuracy when untrustworthy data is present. The experiments also highlighted the high computational cost and memory usage associated with GP regression, especially non-stationary GP regression. Hence, future work will focus on optimizing memory usage and applying reduced-rank approximation methods to the model. |
---|
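
The summary above describes GP-based spatial field regression in the presence of colluding workers. The following is a minimal sketch of that setting only, not the project's trust-based mixture model or its MCMC inference (which was implemented in MATLAB). It assumes a 1-D field, a squared-exponential kernel, a hypothetical group of colluders who all add the same offset, and oracle trust labels that a real model would have to infer. The function names `rbf_kernel` and `gp_predict` are illustrative, not taken from the project.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=1.0, variance=1.0):
    """Squared-exponential (stationary) covariance between 1-D locations."""
    d = a[:, None] - b[None, :]
    return variance * np.exp(-0.5 * (d / length_scale) ** 2)

def gp_predict(x_train, y_train, x_test, noise_var):
    """GP posterior mean at x_test with per-observation noise variances."""
    K = rbf_kernel(x_train, x_train) + np.diag(noise_var)
    # The Cholesky factorization costs O(n^3) time and O(n^2) memory,
    # which is the cost the summary refers to.
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    return rbf_kernel(x_test, x_train) @ alpha

rng = np.random.default_rng(0)
x = np.linspace(0.0, 10.0, 60)          # worker locations (assumed)
y = np.sin(x) + 0.05 * rng.standard_normal(x.size)

# Hypothetical collusion attack: ten neighbouring workers report the
# same biased offset.
colluders = np.zeros(x.size, dtype=bool)
colluders[20:30] = True
y[colluders] += 2.0

x_test = np.linspace(0.0, 10.0, 200)
f_true = np.sin(x_test)

# Baseline: every report is trusted equally.
naive = gp_predict(x, y, x_test, np.full(x.size, 0.05 ** 2))

# Stand-in for trust modelling: reports flagged as untrustworthy get a
# very large noise variance. The flags are assumed known here; the
# project's model infers trust jointly with the field instead.
noise = np.where(colluders, 1e6, 0.05 ** 2)
trusted = gp_predict(x, y, x_test, noise)

rmse_naive = np.sqrt(np.mean((naive - f_true) ** 2))
rmse_trusted = np.sqrt(np.mean((trusted - f_true) ** 2))
print(f"RMSE, all reports trusted: {rmse_naive:.3f}")
print(f"RMSE, colluders down-weighted: {rmse_trusted:.3f}")
```

Inflating the noise variance of flagged reports is only one simple way to discount untrustworthy data; it is used here because it keeps the example to a single standard GP regression.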