Recommendation systems based on extreme multi-label classification
| Field | Value |
|---|---|
| Main Author | |
| Other Authors | |
| Format | Final Year Project |
| Language | English |
| Published | Nanyang Technological University, 2021 |
| Subjects | |
| Online Access | https://hdl.handle.net/10356/149716 |
| Institution | Nanyang Technological University |
Summary: This project aims to implement a recommender system using extreme multi-label classification algorithms. In the era of big data, traditional recommender systems are unable to keep up with the scale of the data available. Extreme multi-label classification can tag a given target with the labels most relevant to it, drawn from an extremely large set of labels. This report summarises the design, implementation, and empirical study of extreme multi-label classification algorithms for recommendation systems on the MovieLens 1M benchmark dataset. The project studied two tree-based extreme multi-label classification algorithms, FastXML and AttentionXML, and implemented them in Python for a movie recommender system, in order to investigate the reformulation of the recommendation problem as a multi-label classification task. The dataset was prepared so that each item the system can recommend was treated as a unique label that the classifier can tag to a user. The two algorithms were compared on accuracy and on the computational resources required. AttentionXML achieved an accuracy of 46.6%, 5.2 percentage points higher than FastXML's 41.4%. However, FastXML required less computation to train than AttentionXML, while AttentionXML's models had a smaller memory footprint than FastXML's. This is because AttentionXML spent more computation training a deep model for each layer of its tree, whereas FastXML used more memory to train a larger tree ensemble to compensate for the lower accuracy of each individual tree.
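The reformulation described in the summary — each recommendable item becomes a label that the classifier can tag to a user — can be sketched as follows. This is a minimal illustrative example, not code from the report: the toy rating triples and the 4-star relevance threshold are assumptions, and the real MovieLens 1M preparation may differ.

```python
# Hypothetical sketch: turning MovieLens-style (user, movie, rating) triples
# into a multi-label dataset, where every movie a user rated positively
# becomes one of that user's labels.
from collections import defaultdict

# Toy rating triples in the MovieLens format (user_id, movie_id, rating);
# these values are illustrative, not drawn from the actual dataset.
ratings = [
    (1, 10, 5), (1, 20, 3), (1, 30, 4),
    (2, 10, 4), (2, 40, 5),
    (3, 20, 2), (3, 30, 5), (3, 40, 4),
]

POSITIVE_THRESHOLD = 4  # assumed: ratings >= 4 count as relevant

# Each user becomes one training example; its label set is the movies
# the user rated at or above the threshold.
user_labels = defaultdict(set)
for user, movie, rating in ratings:
    if rating >= POSITIVE_THRESHOLD:
        user_labels[user].add(movie)

# Build a fixed label vocabulary over all movies, then encode each user's
# label set as sorted label indices (a sparse multi-hot representation,
# as extreme multi-label classifiers typically expect).
label_vocab = sorted({movie for _, movie, _ in ratings})
label_index = {movie: i for i, movie in enumerate(label_vocab)}

encoded = {user: sorted(label_index[m] for m in movies)
           for user, movies in user_labels.items()}

print(encoded)  # {1: [0, 2], 2: [0, 3], 3: [2, 3]}
```

In the full system, the label vocabulary would span all items in MovieLens 1M, making the label space large enough to motivate tree-based extreme classifiers such as FastXML and AttentionXML rather than one plain classifier per label.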