Distributed classification in P2P networks

Bibliographic Details
Main Author: Ang, Hock Hee
Other Authors: Vivekanand Gopalkrishnan
Format: Theses and Dissertations
Language: English
Published: 2015
Online Access: https://hdl.handle.net/10356/62190
Institution: Nanyang Technological University
Description
Summary: In recent years, the popularity of Peer-to-Peer (P2P) networks has grown rapidly, owing to the large amount of usable resources pooled together by the connected peers. Given the data and computing resources available, learning from P2P networks in a distributed manner can substantially improve the performance of classification tasks. However, the scale and dynamic nature of P2P networks raise challenging issues for learning, such as scalability, peer dynamism, asynchronism and fault tolerance. In addition, existing distributed classification solutions are unsuitable for learning in P2P networks. The objective of this thesis is therefore to address the challenges of learning in P2P networks so that anyone can construct an accurate and efficient classification model under any type of P2P environment for a diverse range of applications. The thesis takes a systematic approach: it (i) analyses the various types of P2P environments and highlights the issues unique to them, (ii) studies existing distributed classification approaches and identifies the limitations that make them unsuitable for P2P environments, and (iii) proposes several P2P classification solutions that address the identified challenges and the limitations of existing approaches. These challenges and limitations are addressed using multiple classifier systems (cascade SVM and ensembles of classifiers), which theoretical studies and experiments on real-life and synthetic datasets show to be very effective in the P2P environment. In summary, the thesis provides a comprehensive guide and solutions to learning in P2P environments.