Detecting player's position using in-game statistics : a machine learning approach

Background: Technology is ever-changing in the realm of football data analytics. One domain that may require a more data-driven focus is player recruitment. To date, there is little evidence of any classification model that can be used to identify and recruit playe...

Bibliographic Details
Main Author: Muhammad Aqmar Naqib Masrani
Other Authors: -
Format: Final Year Project
Language: English
Published: Nanyang Technological University 2021
Subjects: Science::General
Online Access: https://hdl.handle.net/10356/153162
Institution: Nanyang Technological University
id sg-ntu-dr.10356-153162
record_format dspace
spelling sg-ntu-dr.10356-153162 2021-11-14T20:11:00Z Detecting player's position using in-game statistics : a machine learning approach Muhammad Aqmar Naqib Masrani - John Komar john.komar@nie.edu.sg Science::General Background: Technology is ever-changing in the realm of football data analytics. One domain that may require a more data-driven focus is player recruitment. To date, there is little evidence of any classification model that can be used to identify and recruit players. Aim: (1) To determine which technical performance statistics hold greater importance when distinguishing between playing positions in football. (2) To develop and validate a machine learning model that can accurately classify players' positions from their technical performance statistics. Method: Season-long performance statistics of players in the English Premier League (EPL) and German Bundesliga from 2014 to 2021 were collected. Discriminant analysis was performed on the EPL dataset to identify the performance statistics that significantly distinguished the playing positions. Five classification models were trained and tested on the EPL dataset, and their performance was evaluated. The model with the highest accuracy was then validated by testing it on the Bundesliga dataset. Results: Discriminant analysis found thirty-four technical performance statistics to be significant in distinguishing between positions. The extreme gradient boosting (XGB) model achieved the highest classification accuracy (70.4%) among the models tested on the EPL dataset. When tested on the Bundesliga dataset, the XGB model classified positions with moderately high accuracy (63.9%). Conclusion: Technical performance statistics combined with the XGB model offer a practical and valid tool for coaches and scouts when identifying and recruiting players.
Bachelor of Science (Sport Science and Management) 2021-11-09T23:53:25Z 2021-11-09T23:53:25Z 2021 Final Year Project (FYP) Muhammad Aqmar Naqib Masrani (2021). Detecting player's position using in-game statistics : a machine learning approach. Final Year Project (FYP), Nanyang Technological University, Singapore. https://hdl.handle.net/10356/153162 https://hdl.handle.net/10356/153162 en application/pdf Nanyang Technological University
institution Nanyang Technological University
building NTU Library
continent Asia
country Singapore
Singapore
content_provider NTU Library
collection DR-NTU
language English
topic Science::General
spellingShingle Science::General
Muhammad Aqmar Naqib Masrani
Detecting player's position using in-game statistics : a machine learning approach
description Background: Technology is ever-changing in the realm of football data analytics. One domain that may require a more data-driven focus is player recruitment. To date, there is little evidence of any classification model that can be used to identify and recruit players. Aim: (1) To determine which technical performance statistics hold greater importance when distinguishing between playing positions in football. (2) To develop and validate a machine learning model that can accurately classify players' positions from their technical performance statistics. Method: Season-long performance statistics of players in the English Premier League (EPL) and German Bundesliga from 2014 to 2021 were collected. Discriminant analysis was performed on the EPL dataset to identify the performance statistics that significantly distinguished the playing positions. Five classification models were trained and tested on the EPL dataset, and their performance was evaluated. The model with the highest accuracy was then validated by testing it on the Bundesliga dataset. Results: Discriminant analysis found thirty-four technical performance statistics to be significant in distinguishing between positions. The extreme gradient boosting (XGB) model achieved the highest classification accuracy (70.4%) among the models tested on the EPL dataset. When tested on the Bundesliga dataset, the XGB model classified positions with moderately high accuracy (63.9%). Conclusion: Technical performance statistics combined with the XGB model offer a practical and valid tool for coaches and scouts when identifying and recruiting players.
author2 -
author_facet -
Muhammad Aqmar Naqib Masrani
format Final Year Project
author Muhammad Aqmar Naqib Masrani
author_sort Muhammad Aqmar Naqib Masrani
title Detecting player's position using in-game statistics : a machine learning approach
title_sort detecting player's position using in-game statistics : a machine learning approach
publisher Nanyang Technological University
publishDate 2021
url https://hdl.handle.net/10356/153162
_version_ 1718368086194651136
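
The validation design summarised in the description (fit a position classifier on one league, then test it on a second league) can be illustrated with a minimal sketch. This is not the author's code and it does not use XGB: a nearest-centroid classifier is swapped in so the example runs with the Python standard library alone. The feature names, the position labels and every number below are hypothetical toy values, not data or results from the project.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical toy rows: ([shots, tackles, passes] per 90 minutes, position).
# All values are invented for illustration and are unrelated to the EPL or
# Bundesliga datasets described in the record.
train_league = [
    ([3.1, 0.6, 28.0], "FW"),
    ([2.7, 0.8, 31.0], "FW"),
    ([1.2, 2.1, 55.0], "MF"),
    ([1.0, 2.4, 60.0], "MF"),
    ([0.3, 3.0, 45.0], "DF"),
    ([0.4, 2.8, 48.0], "DF"),
]
holdout_league = [
    ([2.9, 0.7, 30.0], "FW"),
    ([1.1, 2.2, 58.0], "MF"),
    ([0.2, 3.1, 44.0], "DF"),
    ([1.4, 2.6, 47.0], "MF"),
]


def fit_centroids(rows):
    """Average the feature vectors of each position (stand-in for model training)."""
    grouped = defaultdict(list)
    for features, label in rows:
        grouped[label].append(features)
    return {
        label: [mean(column) for column in zip(*vectors)]
        for label, vectors in grouped.items()
    }


def predict(centroids, features):
    """Return the position whose centroid is closest in squared Euclidean distance."""
    def squared_distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(features, centroid))

    return min(centroids, key=lambda label: squared_distance(centroids[label]))


def accuracy(centroids, rows):
    """Fraction of rows whose predicted position matches the recorded one."""
    hits = sum(predict(centroids, features) == label for features, label in rows)
    return hits / len(rows)


# Fit on the first league only, then score both leagues separately.
centroids = fit_centroids(train_league)
print("accuracy on training league:", accuracy(centroids, train_league))
print("accuracy on held-out league:", accuracy(centroids, holdout_league))
```

Scoring the second league without refitting is what makes the second figure a check on generalisation rather than a repeat of the first. Because the toy features are left unscaled, the passes column dominates the distance; a real pipeline would normally standardise features before fitting.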