Fraudulent click detection

Online advertising has grown in popularity with the emergence of the Internet, and pay-per-click is one of its popular advertising models; however, it poses problems for advertisers because of fraudulent clicks. It is a difficult and time-consuming task to identify fraudulent clicks manuall...

Bibliographic Details
Main Author: Kang, Eileen Mun Yee.
Other Authors: Chan Syin
Format: Final Year Project
Language: English
Published: 2013
Subjects: DRNTU::Engineering::Computer science and engineering
Online Access: http://hdl.handle.net/10356/52310
Institution: Nanyang Technological University
id sg-ntu-dr.10356-52310
record_format dspace
spelling sg-ntu-dr.10356-52310
2023-03-03T20:46:11Z
Fraudulent click detection
Kang, Eileen Mun Yee.
Chan Syin
School of Computer Engineering
Centre for Computational Intelligence
DRNTU::Engineering::Computer science and engineering
Online advertising has grown in popularity with the emergence of the Internet, and pay-per-click is one of its popular advertising models; however, it poses problems for advertisers because of fraudulent clicks. Identifying fraudulent clicks manually is difficult and time-consuming. In this project, we try to solve this problem using machine learning techniques. Because neural networks are capable of solving classification problems, we study the features used in a multilayer perceptron for fraudulent click detection and observe patterns in the recorded clicks. Three experiments were conducted and their results recorded. The first experiment used the raw input data as features, the second constructed new features from the given set of features, and the third looked into selecting suitable features. The results of the first and second experiments show the importance of using features that are more representative. The correlation of the raw features was investigated in Experiment 1; we found that most of the features are almost uncorrelated with each other and that it is difficult to see patterns in the unprocessed data. In Experiment 2, features were created to capture the characteristics of each publisher, and we observed that fraudulent publishers tend to produce more clicks within certain time intervals; accordingly, the results of Experiment 2 improved on those of Experiment 1. Finally, the last experiment investigated the importance of feature selection in obtaining a subset of features that are more representative, and its results improved on those of Experiment 2. The findings of this project could serve as a guide to the kinds of features that could be used for fraudulent click detection, and offer ideas on how other features could be useful in building a fraudulent click detection system.
Bachelor of Engineering (Computer Science)
2013-05-06T01:49:30Z
2013-05-06T01:49:30Z
2013
2013
Final Year Project (FYP)
http://hdl.handle.net/10356/52310
en
Nanyang Technological University
50 p.
application/pdf
institution Nanyang Technological University
building NTU Library
continent Asia
country Singapore
Singapore
content_provider NTU Library
collection DR-NTU
language English
topic DRNTU::Engineering::Computer science and engineering
spellingShingle DRNTU::Engineering::Computer science and engineering
Kang, Eileen Mun Yee.
Fraudulent click detection
description Online advertising has grown in popularity with the emergence of the Internet, and pay-per-click is one of its popular advertising models; however, it poses problems for advertisers because of fraudulent clicks. Identifying fraudulent clicks manually is difficult and time-consuming. In this project, we try to solve this problem using machine learning techniques. Because neural networks are capable of solving classification problems, we study the features used in a multilayer perceptron for fraudulent click detection and observe patterns in the recorded clicks. Three experiments were conducted and their results recorded. The first experiment used the raw input data as features, the second constructed new features from the given set of features, and the third looked into selecting suitable features. The results of the first and second experiments show the importance of using features that are more representative. The correlation of the raw features was investigated in Experiment 1; we found that most of the features are almost uncorrelated with each other and that it is difficult to see patterns in the unprocessed data. In Experiment 2, features were created to capture the characteristics of each publisher, and we observed that fraudulent publishers tend to produce more clicks within certain time intervals; accordingly, the results of Experiment 2 improved on those of Experiment 1. Finally, the last experiment investigated the importance of feature selection in obtaining a subset of features that are more representative, and its results improved on those of Experiment 2. The findings of this project could serve as a guide to the kinds of features that could be used for fraudulent click detection, and offer ideas on how other features could be useful in building a fraudulent click detection system.
author2 Chan Syin
author_facet Chan Syin
Kang, Eileen Mun Yee.
format Final Year Project
author Kang, Eileen Mun Yee.
author_sort Kang, Eileen Mun Yee.
title Fraudulent click detection
title_short Fraudulent click detection
title_full Fraudulent click detection
title_fullStr Fraudulent click detection
title_full_unstemmed Fraudulent click detection
title_sort fraudulent click detection
publishDate 2013
url http://hdl.handle.net/10356/52310
_version_ 1759856619720540160
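
The description above says that Experiment 2 built features capturing the characteristics of each publisher, and that fraudulent publishers tended to produce more clicks within certain time intervals. The record does not give the actual feature definitions, so the following is only a minimal sketch of what such per-publisher aggregation might look like. The input layout (publisher id, Unix timestamp), the 60-second window, the function name, and the three feature names are all assumptions made for illustration, not the project's method.

```python
from collections import Counter, defaultdict


def publisher_features(clicks, window_seconds=60):
    """Aggregate raw clicks into one feature row per publisher.

    clicks: iterable of (publisher_id, unix_timestamp) pairs.
            This layout is hypothetical; the record does not
            describe the real dataset schema.
    window_seconds: width of the fixed time buckets (assumed value).
    """
    # publisher_id -> Counter mapping bucket index -> clicks in that bucket
    per_window = defaultdict(Counter)
    for publisher_id, ts in clicks:
        per_window[publisher_id][int(ts) // window_seconds] += 1

    features = {}
    for publisher_id, buckets in per_window.items():
        features[publisher_id] = {
            # total number of clicks attributed to the publisher
            "total_clicks": sum(buckets.values()),
            # burstiness: the busiest single time bucket
            "max_clicks_per_window": max(buckets.values()),
            # number of distinct buckets with at least one click
            "active_windows": len(buckets),
        }
    return features


# Made-up sample data, for illustration only.
sample_clicks = [
    ("pub_a", 0), ("pub_a", 5), ("pub_a", 12), ("pub_a", 130),
    ("pub_b", 10), ("pub_b", 400),
]
features = publisher_features(sample_clicks, window_seconds=60)
```

Rows like these, one per publisher, could then be fed to a classifier such as a multilayer perceptron in place of the raw click records; which aggregates are actually informative is the question the feature-selection experiment addresses.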