Self-supervised fine-tuning for neural expert finding

Bibliographic Details
Main Authors: SUBAGDJA, Budhitama; DAN, Sanchari; TAN, Ah-hwee
Format: text
Language: English
Published: Institutional Knowledge at Singapore Management University 2024
Online Access: https://ink.library.smu.edu.sg/sis_research/9864
Institution: Singapore Management University
Description
Summary: Expert finding systems allow users to find individuals who have expertise in specific fields or domains. Traditional expert finding methods are mostly based on topic modeling or keyword search, which are limited in their ability to encode contextual knowledge from natural language. To address this limitation, this paper presents Neural Expert Finder (NEF), a novel method that takes a transfer learning approach based on transformer encoder networks to leverage the rich semantic and syntactic patterns of language encoded in pre-trained language models (PLMs). We propose a self-supervised learning approach that fine-tunes the PLMs through contrastive training on both positive and automatically generated negative samples to realize NEF. In addition, we contribute a new benchmark data set for expert finding named SGComp, curated from experts’ university and Google Scholar profiles. Our empirical evaluations demonstrate that the proposed method can effectively capture contextual representations and improve the retrieval of experts most relevant to their corresponding research areas. Both SGComp and three domain-specific public data sets are used to compare NEF against ExpFinder nVSM, a state-of-the-art (SOTA) system in expert finding, and the results demonstrate consistently better performance of the proposed NEF.
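The contrastive training idea mentioned in the summary can be illustrated with a small sketch. The summary does not specify the loss function or the encoder used by NEF, so everything here is an assumption: the InfoNCE-style loss, the cosine similarity, the `temperature` value, and the toy two-dimensional vectors standing in for PLM embeddings. The function names `cosine` and `contrastive_loss` are hypothetical and not taken from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def contrastive_loss(anchor, positive, negatives, temperature=0.1):
    """InfoNCE-style loss (an assumed stand-in, not the paper's stated loss).

    The loss is small when the anchor is more similar to the positive
    sample than to every negative sample.
    """
    logits = [cosine(anchor, positive) / temperature]
    logits += [cosine(anchor, n) / temperature for n in negatives]
    # Numerically stable log-sum-exp over all candidates.
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(x - m) for x in logits))
    # Negative log-probability of picking the positive sample.
    return log_sum - logits[0]


# Toy embeddings (hypothetical): a research-area query, one matching
# expert document, and two automatically generated negatives.
anchor = [1.0, 0.0]
positive = [0.9, 0.1]
negatives = [[0.0, 1.0], [-1.0, 0.2]]

good = contrastive_loss(anchor, positive, negatives)
bad = contrastive_loss(anchor, negatives[0], [positive, negatives[1]])
print(f"loss with the true positive: {good:.5f}")
print(f"loss with a negative treated as positive: {bad:.5f}")
```

In an actual fine-tuning setup the vectors would come from the PLM encoder and the loss would be minimized by gradient descent, so that experts relevant to a research area move closer to it in the embedding space than the generated negatives.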