Plackett-Luce Regression Mixture model for heterogeneous rankings

Bibliographic Details
Main Authors: TKACHENKO, Maksim, LAUW, Hady W.
Format: text
Language: English
Published: Institutional Knowledge at Singapore Management University 2016
Online Access: https://ink.library.smu.edu.sg/sis_research/3354
https://ink.library.smu.edu.sg/context/sis_research/article/4356/viewcontent/Plackett_luceRegression.pdf
Institution: Singapore Management University
Description
Summary: Learning to rank is an important problem in many scenarios, such as information retrieval, natural language processing, and recommender systems. The objective is to learn a function that ranks a number of instances based on their features. In the vast majority of the learning to rank literature, there is an implicit assumption that the population of ranking instances is homogeneous, and thus can be modeled by a single central ranking function. In this work, we are concerned with learning to rank for a heterogeneous population, which may consist of a number of sub-populations, each of which may rank objects differently. Because these sub-populations are not known in advance, and are effectively latent, the problem turns into simultaneously learning both a set of ranking functions and the latent assignment of instances to functions. To address this problem in a joint manner, we develop a probabilistic graphical model called the Plackett-Luce Regression Mixture (PLRM) model, and describe its inference via the Expectation-Maximization algorithm. Comprehensive experiments on publicly available real-life datasets showcase the effectiveness of PLRM, as opposed to a pipelined approach of clustering followed by learning to rank, as well as approaches that assume a single ranking function for a heterogeneous population.
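
To make the idea in the summary concrete, the following is a minimal sketch of the two ingredients a Plackett-Luce mixture needs: the log-probability of one observed ranking under a single ranking function, and the posterior assignment of that ranking to each latent sub-population (the E-step of Expectation-Maximization). This record does not give the paper's exact parameterization, so the linear scoring function, the function names `pl_log_likelihood` and `responsibilities`, and the arguments `features`, `weights`, and `mix` are all assumptions made for illustration, not the authors' implementation.

```python
import numpy as np


def pl_log_likelihood(features, ranking, w):
    """Log-probability of `ranking` under a Plackett-Luce model.

    Assumption: each item's score is linear in its features (score = x . w).
    `ranking` lists item indices from best to worst.
    """
    scores = features[ranking] @ w  # scores in ranked order, best first
    # At position j the chosen item competes with every item not yet ranked,
    # so the normaliser is a log-sum-exp over the suffix scores[j:].
    suffix_lse = np.logaddexp.accumulate(scores[::-1])[::-1]
    return float(np.sum(scores - suffix_lse))


def responsibilities(features, ranking, weights, mix):
    """E-step: posterior probability that each component produced `ranking`.

    `weights` holds one weight vector per component; `mix` holds the
    mixing proportions (hypothetical names, assumed to sum to 1).
    """
    log_post = np.array([
        np.log(a) + pl_log_likelihood(features, ranking, w)
        for a, w in zip(mix, weights)
    ])
    log_post -= np.logaddexp.reduce(log_post)  # normalise in log space
    return np.exp(log_post)


# Toy example: three items with one feature, two opposing ranking functions.
features = np.array([[1.0], [0.0], [-1.0]])
ranking = np.array([0, 1, 2])  # item 0 ranked first, item 2 last
weights = [np.array([2.0]), np.array([-2.0])]
mix = [0.5, 0.5]
resp = responsibilities(features, ranking, weights, mix)
```

In this toy case the observed ranking agrees with the first weight vector, so that component should receive most of the responsibility. A full EM procedure would alternate this step with an M-step that re-fits each weight vector and the mixing proportions using the responsibilities as weights; that part is omitted here.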