Forecasting sport-matches through data mining
The odds of forecasting a perfect bracket for sports matches are astronomical. By participating in sports forecasting competitions, we can see how well machine learning, statistical techniques, feature engineering and ensemble learning improve predictions, and in the process might help...
Main Author: | Ng, Jun Xuan |
---|---|
Other Authors: | Pan, Sinno Jialin |
Format: | Final Year Project |
Language: | English |
Published: | 2016 |
Subjects: | DRNTU::Engineering::Computer science and engineering::Data::Coding and information theory |
Online Access: | http://hdl.handle.net/10356/66752 |
Institution: | Nanyang Technological University |
id |
sg-ntu-dr.10356-66752 |
---|---|
record_format |
dspace |
spelling |
sg-ntu-dr.10356-66752 2023-03-03T20:31:26Z Forecasting sport-matches through data mining Ng, Jun Xuan Pan, Sinno Jialin School of Computer Engineering DRNTU::Engineering::Computer science and engineering::Data::Coding and information theory Bachelor of Engineering (Computer Science) 2016-04-25T03:46:53Z 2016-04-25T03:46:53Z 2016 Final Year Project (FYP) http://hdl.handle.net/10356/66752 en Nanyang Technological University 40 p. application/pdf |
institution |
Nanyang Technological University |
building |
NTU Library |
continent |
Asia |
country |
Singapore |
content_provider |
NTU Library |
collection |
DR-NTU |
language |
English |
topic |
DRNTU::Engineering::Computer science and engineering::Data::Coding and information theory |
description |
The odds of forecasting a perfect bracket for sports matches are astronomical. By participating in sports forecasting competitions, we can see how well machine learning, statistical techniques, feature engineering and ensemble learning improve predictions; implementing code to predict match outcomes might, in the process, help to develop new scientific discoveries and business models.
The purpose of this project is to explore different machine learning algorithms and data mining techniques and, at the same time, to achieve a favorable score on the competition’s leaderboard.
The national college basketball prediction competition “March Machine Learning 2014”, hosted by Kaggle, is discussed in this document. Before the code is implemented, the data are pre-processed (e.g. the provided data are analyzed to remove unwanted data). RStudio with the R programming language is used to perform all necessary tasks in this project. After the prediction script was implemented, results showed that, as standalone algorithms, Logistic Regression and XGBoost work best for the competition, and the submitted score ranked 2nd on the leaderboard. Using ensemble learning on both of these algorithms further improves the accuracy of the results compared with using a standalone algorithm. |
author2 |
Pan, Sinno Jialin |
author_facet |
Pan, Sinno Jialin Ng, Jun Xuan |
format |
Final Year Project |
author |
Ng, Jun Xuan |
title |
Forecasting sport-matches through data mining |
publishDate |
2016 |
url |
http://hdl.handle.net/10356/66752 |
_version_ |
1759854572737658880 |