Term importance for transformer-based QA retrieval: A case study of StackExchange

Bibliographic Details
Main Authors: TAN, Bryan Zhi Yang, LAUW, Hady W.
Format: text
Language: English
Published: Institutional Knowledge at Singapore Management University 2024
Online Access: https://ink.library.smu.edu.sg/sis_research/9855
https://ink.library.smu.edu.sg/context/sis_research/article/10855/viewcontent/webconf24shp__1_.pdf
Institution: Singapore Management University
Description
Summary: Question-answering (QA) retrieval is the task of retrieving the most relevant answer to a given question from a collection of answers. Various approaches to QA retrieval have been developed recently. One successful and popular model is Contextualized Late Interaction over BERT (ColBERT), a transformer-based approach whose query-document scoring mechanism retains the granularity of transformer matching while improving efficiency. However, one key limitation is that it requires further fine-tuning for new query or collection types. In this work, we explore and propose several non-parametric retrieval augmentation methods, based on explicit signals of term importance, that improve over ColBERT's baseline performance. In particular, we consider the QA retrieval task in the context of the StackExchange question-answering forum and verify the effectiveness of our methods in this setting.
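
For readers unfamiliar with late interaction, the scoring idea mentioned in the summary can be sketched as follows. ColBERT is usually described as scoring a query against a document by taking, for each query token embedding, its maximum similarity to any document token embedding, then summing those maxima. The sketch below illustrates that sum-of-maxima score on toy vectors. The `query_weights` parameter is a hypothetical addition showing one way an explicit term-importance signal could enter the score; it is an assumption for illustration only, not the method proposed in this record. The function names and toy embeddings are likewise made up.

```python
import math


def normalize(vec):
    """Scale a vector to unit length so dot products act as cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def maxsim_score(query_vecs, doc_vecs, query_weights=None):
    """Sum-of-maxima late-interaction score.

    For each query token, take its best-matching document token, then sum.
    query_weights is a hypothetical per-token importance weight (for example
    an IDF-like value); None gives every query token a weight of 1.0.
    """
    if query_weights is None:
        query_weights = [1.0] * len(query_vecs)
    q = [normalize(v) for v in query_vecs]
    d = [normalize(v) for v in doc_vecs]
    total = 0.0
    for q_vec, weight in zip(q, query_weights):
        # Best cosine similarity between this query token and any document token.
        best = max(sum(a * b for a, b in zip(q_vec, d_vec)) for d_vec in d)
        total += weight * best
    return total


# Toy 2-dimensional "token embeddings", invented for illustration.
query = [[1.0, 0.0], [0.0, 1.0]]
answer_a = [[1.0, 0.0], [1.0, 1.0]]
answer_b = [[0.0, 1.0], [0.0, 2.0]]

plain_a = maxsim_score(query, answer_a)
plain_b = maxsim_score(query, answer_b)
# Hypothetical weighting that treats the first query token as more important.
weighted_a = maxsim_score(query, answer_a, query_weights=[2.0, 0.5])
```

Because the weights are applied only when scores are combined, a scheme of this shape needs no extra model training, which is the sense in which such an augmentation would be non-parametric. How the weights are actually derived in the paper is not stated in this record.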