Streamsight: a toolkit for offline evaluation of recommender systems

Bibliographic Details
Main Author: Ng, Tze Kean
Other Authors: Sun Aixin
Format: Final Year Project
Language: English
Published: Nanyang Technological University 2024
Online Access: https://hdl.handle.net/10356/181114
Institution: Nanyang Technological University
Description
Summary: Numerous Recommender System (RS) toolkits for offline evaluation have been released over the years, but few place much emphasis on the temporal aspects of evaluation. Current toolkits tend to prioritize complex algorithm implementations and the variety of metrics used to evaluate them. This report instead takes a step back and approaches toolkit design from another angle: how the temporal aspects of the data should be handled in the data split scheme, and how they can be respected during the evaluation of RS. It introduces Streamsight, an open-source Python RS toolkit available on the Python Package Index (PyPI). Streamsight provides a framework that addresses the gaps discussed and implements the solutions proposed in this report. It covers the full workflow for developing and testing RS, centred on a global sliding window as the proposed data split scheme and evaluation method that accounts for time. By observing the temporal element, we aim to bring offline evaluation closer to the dynamic flow of data in the online setting. The library exposes APIs that abstract the underlying implementation for easy, standardized use. The project and API documentation can be found on GitHub and PyPI.
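
As a rough illustration of the global sliding window idea described in the summary, the sketch below splits a timestamped interaction log on a single shared timeline: at each step, everything before the current timestamp is training data and the next window is held out for evaluation. This is not Streamsight's API. The function name `global_sliding_window`, the `(user_id, item_id, timestamp)` tuple layout, and the `start` and `window` parameters are assumptions made for illustration only.

```python
from typing import Iterator, List, Tuple

# An interaction is (user_id, item_id, timestamp). This layout is an
# assumption for the sketch, not the toolkit's own data format.
Interaction = Tuple[int, int, int]


def global_sliding_window(
    interactions: List[Interaction],
    start: int,
    window: int,
) -> Iterator[Tuple[List[Interaction], List[Interaction]]]:
    """Yield (train, test) pairs split on one global timeline.

    At each step, `train` holds every interaction strictly before the
    current timestamp t, and `test` holds those in [t, t + window).
    """
    if window <= 0:
        raise ValueError("window must be positive")
    if not interactions:
        return
    last = max(ts for _, _, ts in interactions)
    t = start
    while t <= last:
        train = [x for x in interactions if x[2] < t]
        test = [x for x in interactions if t <= x[2] < t + window]
        yield train, test
        t += window  # slide the evaluation window forward in time


# Hypothetical toy log, used only to exercise the function.
log = [(1, 10, 0), (2, 11, 3), (1, 12, 5), (3, 10, 8), (2, 13, 11)]
splits = list(global_sliding_window(log, start=4, window=4))
```

Because the cut-off is shared by all users, no training interaction is ever later than a test interaction in the same split, which is the property a per-user or random split does not guarantee.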