Forecasting sport-matches through data mining II

Bibliographic Details
Main Author: Fu, Jiaxiang
Other Authors: Pan Jialin, Sinno
Format: Final Year Project
Language: English
Published: 2017
Online Access: http://hdl.handle.net/10356/70198
Institution: Nanyang Technological University
Description
Summary: In recent years, machine learning, and deep learning in particular, has been increasingly studied and applied across many areas. One such area is the prediction of sports game results. Many have attempted to develop models to predict the results of the National Collegiate Athletic Association (NCAA) Men's Basketball Tournament. However, existing models require manual effort to first extract features from the data before they can be trained to make predictions. These manual processes typically need to be repeated when a model is applied to new data, for example a new season of games. This project aims to develop a model that can generalize its training to make predictions for multiple seasons without human adaptation. To do so, various stand-alone models as well as carefully designed complex models were implemented and evaluated. After repeated experiments, a final model that combines deep learning techniques with traditional classifiers was able to achieve this task. The final model uses a multi-layer perceptron, a deep learning model, to extract hidden features from the given data, and feeds these features into a gradient boosting model to make the final predictions. A variety of other techniques were also used to ensure the reliability and accuracy of the final model. Tested on five seasons, the model outperformed key benchmarks by large margins and showed a consistent performance gain over them. Its performance ranked near the top of the leaderboard of a competition in which contestants develop models optimized for each individual season. In conclusion, the model proved capable of capturing hidden patterns that are general to games across seasons, and thus of making reliable and informative predictions across seasons without manual adaptation.
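
The two-stage design described in the summary (a multi-layer perceptron that extracts hidden features, followed by a gradient boosting model that makes the final prediction) can be illustrated with a minimal sketch. This is not the project's implementation: the use of scikit-learn, the synthetic data from `make_classification`, the single 16-unit hidden layer, and the helper name `hidden_features` are all assumptions made for illustration only.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# Synthetic stand-in for per-game input data (assumption: NOT the project's data).
X, y = make_classification(n_samples=400, n_features=10,
                           n_informative=6, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

# Stage 1: train an MLP on the raw inputs so that its hidden layer
# learns a feature representation (layer size is an arbitrary choice).
mlp = MLPClassifier(hidden_layer_sizes=(16,), activation="relu",
                    max_iter=1000, random_state=0)
mlp.fit(X_train, y_train)


def hidden_features(model, inputs):
    """Hypothetical helper: ReLU activations of the single hidden layer."""
    return np.maximum(0.0, inputs @ model.coefs_[0] + model.intercepts_[0])


H_train = hidden_features(mlp, X_train)
H_test = hidden_features(mlp, X_test)

# Stage 2: feed the learned features into a gradient boosting classifier,
# which produces the final win probabilities.
gbm = GradientBoostingClassifier(random_state=0)
gbm.fit(H_train, y_train)
win_prob = gbm.predict_proba(H_test)[:, 1]

print(H_train.shape, win_prob.shape)
```

In this sketch both stages are fitted on the same training split for brevity. A more careful setup would probably fit the second stage on held-out hidden features to limit overfitting; the summary does not say how the project handled this.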